Hello, I'm Jiejie
Today I'm going to share a seriously practical write-up!
Pandas is a NumPy-based tool created for data analysis tasks. It provides a large number of functions and methods that let us process data quickly and conveniently.
The 20 functions introduced in this article (grouped into 15 sections) are real workhorses for processing string data, and you will love them once you start using them.
Construct dataset
Here we first construct a data set to demonstrate these 20 functions.
import pandas as pd

# Note: the first name has a leading space and the third a trailing space;
# both are used later to demonstrate strip().
data = {
    'full name': [' Classmate Huang', 'Huang Zhizun', 'Huang Laoxie ', 'Da Mei Chen', 'Sun Shangxiang'],
    'English name': ['Huang tong_xue', 'huang zhi_zun', 'Huang Lao_xie', 'Chen Da_mei', 'sun shang_xiang'],
    'Gender': ['male', 'women', 'men', 'female', 'male'],
    'ID': ['463895200003128433', '429475199912122345', '420934199110102311', '431085200005230122', '420953199509082345'],
    'height': ['mid:175_good', 'low:165_bad', 'low:159_bad', 'high:180_verygood', 'low:172_bad'],
    'Home address': ['Guangshui, Hubei', 'Xinyang, Henan', 'Guangxi Guilin', 'Hubei Xiaogan', 'Guangzhou, Guangdong'],
    'Telephone number': ['13434813546', '19748672895', '16728613064', '14561586431', '19384683910'],
    'income': ['1.1 ten thousand', '8.5 thousand', '0.9 ten thousand', '6.5 thousand', '2.0 ten thousand'],
}
df = pd.DataFrame(data)
df
1. cat function
This function is mainly used for string concatenation:
# Join each name to its address, separated by "---"
df["full name"].str.cat(df["Home address"], sep='-'*3)
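As a minimal sketch of two details the call above does not show, the snippet below uses small made-up Series (`names` and `cities` are illustrative, not part of the dataset) to show how `cat` treats missing values and how it behaves when no second Series is passed.

```python
import pandas as pd

# Hypothetical sample data with one missing name
names = pd.Series(["Huang Zhizun", None, "Sun Shangxiang"])
cities = pd.Series(["Xinyang", "Guilin", "Guangzhou"])

# Element-wise join: a missing value on either side gives a missing result
joined = names.str.cat(cities, sep="---")

# na_rep substitutes a placeholder for missing values instead
joined_filled = names.str.cat(cities, sep="---", na_rep="?")

# With no second Series, cat() collapses the whole column into one string
all_cities = cities.str.cat(sep=", ")
print(all_cities)
```

The single-string form is handy for building a quick summary line from a column.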
2. contains function
This function is mainly used to check whether a string contains a given substring (by default the pattern is treated as a regular expression):
df["Home address"].str.contains("Guang")
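The `case`, `na` and `regex` parameters are often needed when the result is used as a filter. The following is a small sketch on made-up data (`addresses` is illustrative only):

```python
import pandas as pd

# Hypothetical sample data with one missing address
addresses = pd.Series(["Guangshui, Hubei", "Xinyang, Henan", None])

# Default behaviour: regular-expression pattern, case-sensitive
has_guang = addresses.str.contains("Guang")

# case=False ignores case; na=False maps missing values to False,
# so the result can be used directly as a boolean mask
mask = addresses.str.contains("hubei|henan", case=False, na=False)

# regex=False treats the pattern as a literal string
has_comma = addresses.str.contains(",", regex=False, na=False)

print(addresses[mask])
```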
3. startswith and endswith functions
These functions are mainly used to determine whether a string starts or ends with a given substring:
# Note: the first full name begins with a space, so it cannot match the prefix
df["full name"].str.startswith("Huang")
df["English name"].str.endswith("e")
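Because stray whitespace silently defeats prefix and suffix checks, it can help to strip first. A minimal sketch, assuming the whitespace carries no meaning (the `names` Series is made up for illustration):

```python
import pandas as pd

# Hypothetical names: one with a leading space, one with a trailing space
names = pd.Series([" Huang Tongxue", "Huang Zhizun", "Huang Laoxie "])

# The leading space in the first name prevents a match
raw = names.str.startswith("Huang")

# Stripping first makes all three names match
cleaned = names.str.strip().str.startswith("Huang")

# Same idea for suffixes: only the stripped third name ends with "xie"
ends_xie = names.str.strip().str.endswith("xie")

print(raw.tolist(), cleaned.tolist(), ends_xie.tolist())
```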
4. count function
This function is mainly used to count how many times a given character or pattern occurs in each string:
df["Telephone number"].str.count("3")
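The argument to `count` is a regular expression, which the single-digit example does not make obvious. A short sketch on illustrative data:

```python
import pandas as pd

phones = pd.Series(["13434813546", "19748672895"])

# A character class counts every occurrence of any listed digit
threes_or_fours = phones.str.count("[34]")

# Special characters must be escaped to be counted literally
dots = pd.Series(["1.1", "8.5.0"]).str.count(r"\.")

print(threes_or_fours.tolist(), dots.tolist())
```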
5. get function
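This function is mainly used to get the element at a specified position: a character for plain strings, or an item for lists such as those produced by split:
# Last character of each name
df["full name"].str.get(-1)
# Split on ":" into lists, then take the first item of each list
df["height"].str.split(":")
df["height"].str.split(":").str.get(0)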
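One behaviour worth knowing is that `get` returns a missing value, rather than raising, when the position does not exist. A minimal sketch with a made-up row that has no colon:

```python
import pandas as pd

# Hypothetical data: the last row has no ":" separator
heights = pd.Series(["mid:175_good", "low:165_bad", "unknown"])
parts = heights.str.split(":")

first = parts.str.get(0)         # item before the colon
second = parts.str.get(1)        # missing where the list has no second item
last_char = heights.str.get(-1)  # on plain strings, get() returns a character

print(second)
```

The indexing shorthand `parts.str[0]` gives the same result as `parts.str.get(0)`.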
6. len function
This function is mainly used to calculate the length of each string:
df["Gender"].str.len()
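As a small sketch on illustrative data, `len` also works on list values (for example after a split) and leaves missing values missing:

```python
import pandas as pd

genders = pd.Series(["male", "women", None])
lengths = genders.str.len()  # the missing value stays missing

# After a split, len() counts the items in each list
n_parts = pd.Series(["a:b:c", "a:b"]).str.split(":").str.len()

print(n_parts.tolist())
```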
7. upper and lower functions
These functions are mainly used to convert English text to upper or lower case:
df["English name"].str.upper()
df["English name"].str.lower()
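The same accessor offers a few related case helpers. The sketch below, on made-up names, shows `title`, `capitalize` and `swapcase`, plus a common use of `lower` for case-insensitive comparison:

```python
import pandas as pd

# Hypothetical names with inconsistent casing
names = pd.Series(["HUANG zhizun", "sun SHANGXIANG"])

titled = names.str.title()            # first letter of every word upper-cased
capitalized = names.str.capitalize()  # only the first letter of the string
swapped = names.str.swapcase()        # upper becomes lower and vice versa

# Normalising case before comparing avoids missing "HUANG" vs "huang"
is_huang = names.str.lower().str.startswith("huang")

print(titled.tolist())
```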
8. pad+side parameter / center function
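These functions are mainly used to pad a string with a given character on the left, on the right, or on both sides until it reaches the requested width. Strings that already reach the width are left unchanged:
# Default side="left": pads on the left, equivalent to rjust()
df["Home address"].str.pad(24, fillchar="*")
# side="right": pads on the right, equivalent to ljust()
df["Home address"].str.pad(24, side="right", fillchar="*")
# center() pads on both sides
df["Home address"].str.center(24, fillchar="*")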
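To make the relationship between `pad` and its siblings concrete, here is a minimal sketch on two short made-up strings, checking each `side` value against `rjust`, `ljust` and `center`:

```python
import pandas as pd

s = pd.Series(["Wuhan", "Guilin"])

left = s.str.pad(8, side="left", fillchar="*")    # fill goes on the left
right = s.str.pad(8, side="right", fillchar="*")  # fill goes on the right
both = s.str.pad(8, side="both", fillchar="*")    # fill goes on both sides

# Each call matches the corresponding justify method
same_as_rjust = left.equals(s.str.rjust(8, fillchar="*"))
same_as_ljust = right.equals(s.str.ljust(8, fillchar="*"))
same_as_center = both.equals(s.str.center(8, fillchar="*"))

print(left.tolist(), right.tolist())
```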
9. repeat function
This function is mainly used to repeat each string a given number of times:
df["Gender"].str.repeat(3)
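Besides a single integer, `repeat` also accepts one count per row. A brief sketch on illustrative data:

```python
import pandas as pd

s = pd.Series(["ab", "c"])

same_count = s.str.repeat(2)      # every row repeated twice
per_row = s.str.repeat([1, 3])    # first row once, second row three times

print(same_count.tolist(), per_row.tolist())
```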
10. slice_replace function
This function is mainly used to replace the characters in a given position range with a given string:
# Mask characters 4 through 7 (zero-based) of each phone number
df["Telephone number"].str.slice_replace(4, 8, "*"*4)
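The same idea can hide the birth-date digits of an ID number. This is only a sketch: it assumes the 18-digit layout used in the sample data, where positions 6 to 13 hold the date.

```python
import pandas as pd

ids = pd.Series(["463895200003128433"])

# Replace positions 6..13 (the stop index 14 is exclusive) with asterisks
masked = ids.str.slice_replace(6, 14, "*" * 8)

# The replacement need not have the same length as the slice it replaces
shortened = ids.str.slice_replace(6, 14, "#")

print(masked[0], shortened[0])
```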
11. replace function
This function is mainly used to replace every occurrence of a given substring with another string:
df["height"].str.replace(":", "-")
This function also accepts a regular expression, replacing every match of the pattern with the given string:
# regex=True is required in pandas 2.0 and later, where the default is a literal match
df["income"].str.replace(r"\d+\.\d+", "regular", regex=True)
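With `regex=True`, the replacement can also refer back to capture groups or be computed by a function. The following sketch uses made-up values in the same format as the height column:

```python
import pandas as pd

heights = pd.Series(["mid:175_good", "low:165_bad"])

# Backreferences \1, \2, \3 reuse the captured groups in a new order
reordered = heights.str.replace(
    r"(\w+):(\d+)_(\w+)", r"\2 cm (\1, \3)", regex=True
)

# A callable receives each match object and returns the replacement text
plus_one = heights.str.replace(
    r"\d+", lambda m: str(int(m.group()) + 1), regex=True
)

# regex=False keeps special characters such as "." literal
commas = pd.Series(["1.1", "8.5"]).str.replace(".", ",", regex=False)

print(reordered.tolist())
```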
12. split method + expand parameter
This method splits each string on a separator; with expand=True the pieces are spread across several columns:
# Common usage: each row becomes a list
df["height"].str.split(":")
# split with the expand parameter: each piece gets its own column
df[["Height description", "final height"]] = df["height"].str.split(":", expand=True)
df
# split followed by join: re-join the pieces with a new separator
df["height"].str.split(":").str.join("?"*5)
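Two further options are the `n` parameter, which limits the number of splits, and `rsplit`, which splits from the right. A minimal sketch on illustrative data (the column names `label` and `rest` are made up):

```python
import pandas as pd

s = pd.Series(["high:180_verygood", "low:172_bad"])

# n=1 splits at the first ":" only and keeps the remainder intact
label_rest = s.str.split(":", n=1, expand=True)
label_rest.columns = ["label", "rest"]

# rsplit works from the right: take the text after the last "_"
rating = s.str.rsplit("_", n=1).str.get(-1)

print(label_rest)
print(rating.tolist())
```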
13. strip, rstrip and lstrip functions
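These functions are mainly used to remove leading and trailing whitespace, including line breaks (lstrip and rstrip act on the left or right side only):
# Lengths before stripping
df["full name"].str.len()
df["full name"] = df["full name"].str.strip()
# Lengths after stripping: the padded names are now shorter
df["full name"].str.len()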
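These methods also accept a set of characters to remove instead of whitespace. A short sketch on made-up values:

```python
import pandas as pd

s = pd.Series(["  Huang  ", "**Chen**", "xxSunxx\n"])

no_space = s.str.strip()          # whitespace and line breaks on both sides
left_only = s.str.lstrip()        # left side only
no_stars = s.str.strip("*")       # strip a specific character instead
right_x = s.str.rstrip("x\n")     # any of the listed characters, right side

print(no_space.tolist())
```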
14. findall function
This function uses a regular expression to find every match in each string and returns the matches as a list:
df["height"]
# Every run of letters in each value
df["height"].str.findall("[a-zA-Z]+")
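The same call can pull out the numeric part, which can then be converted to integers. This sketch assumes every row contains at least one number; the last line shows that a row without a match gives an empty list.

```python
import pandas as pd

heights = pd.Series(["mid:175_good", "low:165_bad"])

numbers = heights.str.findall(r"\d+")    # one list of digit runs per row
cm = numbers.str.get(0).astype(int)      # first match as an integer

# A row with no match yields an empty list, not a missing value
no_match = pd.Series(["none"]).str.findall(r"\d+")

print(cm.tolist())
```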
15. extract and extractall functions
These functions accept a regular expression and extract the text matched by its capture groups (the parentheses are required):
# extract returns the first match in each row
df["height"].str.extract("([a-zA-Z]+)")
# extractall returns every match, indexed by a MultiIndex of (row, match number)
df["height"].str.extractall("([a-zA-Z]+)")
# With several groups, extract returns one column per group
df["height"].str.extract("([a-zA-Z]+).*?([a-zA-Z]+)", expand=True)
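Named capture groups give the resulting columns readable names. The sketch below parses values in the same format as the height column; the group names `label`, `cm` and `rating` are made up for illustration.

```python
import pandas as pd

heights = pd.Series(["mid:175_good", "low:165_bad"])

# Each named group becomes a column with that name
parsed = heights.str.extract(
    r"(?P<label>[a-zA-Z]+):(?P<cm>\d+)_(?P<rating>[a-zA-Z]+)"
)
parsed["cm"] = parsed["cm"].astype(int)

# extractall keeps every match; the index has two levels (row, match number)
all_words = heights.str.extractall(r"([a-zA-Z]+)")

print(parsed)
print(all_words)
```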