Hello, I'm Jiejie
Today I'm bringing you some seriously hardcore content! Pandas is a tool built on top of NumPy and created for data analysis tasks. It provides a large number of functions and methods that let us process data quickly and conveniently. The 20 functions introduced in this article (organized into 15 groups) are real workhorses for processing string data, and you will love them once you start using them.
Construct dataset
Here we first construct a data set to demonstrate these 20 functions.
import pandas as pd

df = {'full name': [' Classmate Huang', 'Huang Zhizun', 'Huang Laoxie ', 'Da Mei Chen', 'Sun Shangxiang'],
      'English name': ['Huang tong_xue', 'huang zhi_zun', 'Huang Lao_xie', 'Chen Da_mei', 'sun shang_xiang'],
      'Gender': ['male', 'women', 'men', 'female', 'male'],
      'ID': ['463895200003128433', '429475199912122345', '420934199110102311', '431085200005230122', '420953199509082345'],
      'height': ['mid:175_good', 'low:165_bad', 'low:159_bad', 'high:180_verygood', 'low:172_bad'],
      'Home address': ['Guangshui, Hubei', 'Xinyang, Henan', 'Guangxi Guilin', 'Hubei Xiaogan', 'Guangzhou, Guangdong'],
      'Telephone number': ['13434813546', '19748672895', '16728613064', '14561586431', '19384683910'],
      'income': ['1.1 ten thousand', '8.5 thousand', '0.9 ten thousand', '6.5 thousand', '2.0 ten thousand']}
df = pd.DataFrame(df)
df
1. cat function
This function is mainly used for string concatenation:
df["full name"].str.cat(df["Home address"], sep='-'*3)
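As a small self-contained sketch (the two Series below are made-up toy data, not the article's dataset), `str.cat` can also collapse a single column into one string when no second Series is passed, and `na_rep` controls how missing values are rendered:

```python
import pandas as pd

# Hypothetical toy data, used only for illustration.
names = pd.Series(["Ann", "Bob", None])
cities = pd.Series(["Wuhan", "Guilin", "Xinyang"])

# Element-wise concatenation; a missing value on either side yields NaN.
pairs = names.str.cat(cities, sep="---")

# na_rep substitutes a placeholder for missing values before joining.
pairs_filled = names.str.cat(cities, sep="---", na_rep="?")

# With no second Series, the whole column is joined into a single string.
joined = cities.str.cat(sep=",")
print(pairs_filled.tolist())
print(joined)
```

Passing `na_rep` is worth considering whenever either column may contain missing values, since otherwise the whole concatenated value for that row is lost.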
2. contains function
This function is mainly used to check whether a string contains a given substring (by default the pattern is treated as a regular expression):
df["Home address"].str.contains("Guang")
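The sketch below, on made-up toy data, shows three optional parameters of `str.contains` that often matter in practice: `na` (what to return for missing values), `case` (case sensitivity), and `regex` (whether the pattern is a regular expression or a literal substring):

```python
import pandas as pd

# Hypothetical toy data, used only for illustration.
addresses = pd.Series(["Guangshui, Hubei", "Xinyang, Henan", None, "guangzhou"])

# Default behaviour: regular-expression pattern, case-sensitive.
# na=False makes missing values count as "no match" instead of NaN.
has_guang = addresses.str.contains("Guang", na=False)

# case=False ignores case; regex=False treats the pattern as a literal substring.
has_guang_any_case = addresses.str.contains("guang", case=False, regex=False, na=False)

print(has_guang.tolist())
print(has_guang_any_case.tolist())
```

Because the result is a boolean Series, it can be used directly as a row filter, for example `addresses[has_guang]`.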
3. startswith and endswith functions
These functions are mainly used to check whether a string starts or ends with a given substring:
# " Classmate Huang" in the first row begins with a space
df["full name"].str.startswith("Huang")
df["English name"].str.endswith("e")
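Stray whitespace is the usual reason these checks return unexpected results. The following sketch uses three names that mirror the article's data (the variable names are my own) to compare the checks before and after stripping:

```python
import pandas as pd

# Toy data mirroring three names from the dataset, including their stray spaces.
names = pd.Series([" Classmate Huang", "Huang Zhizun", "Huang Laoxie "])

# The leading space in the first value means it does not start with "Huang".
starts_raw = names.str.startswith("Huang")

# The trailing space in the last value hides the "xie" ending.
ends_raw = names.str.endswith("xie")

# Stripping whitespace first makes the check behave as expected.
ends_clean = names.str.strip().str.endswith("xie")

# A boolean result can be used directly as a row filter.
huang_first = names[starts_raw]
print(huang_first.tolist())
```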
4. count function
This function is mainly used to count how many times a given character (or, more generally, a regular-expression pattern) occurs in each string:
df["Telephone number"].str.count("3")
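Since the argument of `str.count` is a regular expression, characters such as `.` have a special meaning and need escaping when a literal is intended. A minimal sketch on toy data (the phone numbers are the first two from the dataset, the dotted string is made up):

```python
import pandas as pd

phones = pd.Series(["13434813546", "19748672895"])

# Count a single literal digit.
threes = phones.str.count("3")

# Count a character class: every "3" or "4".
threes_or_fours = phones.str.count(r"[34]")

# Hypothetical value to show the escaping pitfall.
dotted = pd.Series(["a.b.c"])
any_char = dotted.str.count(".")      # "." matches every character
literal_dot = dotted.str.count(r"\.") # escaped: matches only real dots

print(threes.tolist(), threes_or_fours.tolist())
print(any_char.tolist(), literal_dot.tolist())
```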
5. get function
This function is mainly used to get the element at a specified position, either a character of a string or an item of a list produced by split:
df["full name"].str.get(-1)
df["height"].str.split(":")
df["height"].str.split(":").str.get(0)
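One detail worth knowing is what happens when the requested position does not exist. In the sketch below (toy data; the third value is made up so that it has no colon), `str.get` returns a missing value instead of raising an error:

```python
import pandas as pd

# The last value is hypothetical and deliberately lacks a ":" separator.
heights = pd.Series(["mid:175_good", "low:165_bad", "unknown"])

parts = heights.str.split(":")

# Position 0 exists in every list.
labels = parts.str.get(0)

# Position 1 is missing for "unknown", so the result there is NaN.
details = parts.str.get(1)

# On plain strings, get() returns a single character; negative indexes count from the end.
last_char = heights.str.get(-1)

print(labels.tolist())
print(details.tolist())
```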
6. len function
This function is mainly used to compute the length of each string:
df["Gender"].str.len()
7. upper and lower functions
These functions are mainly used to convert English text to upper or lower case:
df["English name"].str.upper()
df["English name"].str.lower()
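8. pad function with the side parameter / center function
These functions are mainly used to pad a string with a given character on the left, on the right, or on both sides until it reaches a given width. Strings that are already at least that wide are returned unchanged:
df["Home address"].str.pad(10, fillchar="*")  # side="left" is the default; equivalent to rjust()
df["Home address"].str.pad(10, side="right", fillchar="*")  # equivalent to ljust()
df["Home address"].str.center(10, fillchar="*")  # pads both sides; equivalent to pad(side="both")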
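Because padding only shows up on strings shorter than the target width, here is a minimal sketch on short made-up values. It also illustrates which side each variant fills: padding on the left right-justifies the text, and padding on the right left-justifies it.

```python
import pandas as pd

# Hypothetical short strings so that the padding is visible.
codes = pd.Series(["Hubei", "Henan"])

left_padded = codes.str.pad(8, fillchar="*")                 # fill on the left
right_padded = codes.str.pad(8, side="right", fillchar="*")  # fill on the right
both = codes.str.pad(8, side="both", fillchar="*")           # fill on both sides

# A width smaller than the string length leaves the values untouched.
unchanged = codes.str.pad(3, fillchar="*")

print(left_padded.tolist())   # ['***Hubei', '***Henan']
print(right_padded.tolist())  # ['Hubei***', 'Henan***']
```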
9. repeat function
This function is mainly used to repeat each string a given number of times:
df["Gender"].str.repeat(3)
10. slice_replace function
This function is mainly used to replace the characters in a given slice (here positions 4 up to, but not including, 8) with a given string:
df["Telephone number"].str.slice_replace(4, 8, "*"*4)
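A short sketch of how the slice boundaries behave, using the first two phone numbers from the dataset. The replacement does not need to have the same length as the slice it replaces, and `stop` can be omitted to replace everything up to the end of the string:

```python
import pandas as pd

phones = pd.Series(["13434813546", "19748672895"])

# Mask positions 4..7 (stop is exclusive), keeping the length unchanged.
masked = phones.str.slice_replace(4, 8, "*" * 4)

# A shorter replacement shrinks the string.
shortened = phones.str.slice_replace(4, 8, "-")

# Without stop, everything from start to the end is replaced.
tail_masked = phones.str.slice_replace(start=7, repl="####")

print(masked.tolist())  # ['1343****546', '1974****895']
```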
11. replace function
This function is mainly used to replace occurrences of a given substring with another string:
df["height"].str.replace(":", "-")
This function also accepts a regular expression, replacing every match of the pattern with the given string. In pandas 2.0 and later the pattern is treated as a literal string unless regex=True is passed:
df["income"].str.replace(r"\d+\.\d+", "regular", regex=True)
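To make the difference between literal and regular-expression replacement concrete, here is a minimal sketch on two income values from the dataset. Passing `regex` explicitly avoids depending on the default, which differs between pandas versions:

```python
import pandas as pd

incomes = pd.Series(["1.1 ten thousand", "8.5 thousand"])

# Literal replacement: "." is just a dot here.
literal = incomes.str.replace(".", ",", regex=False)

# Regex replacement: every decimal number is replaced by a placeholder.
pattern = incomes.str.replace(r"\d+\.\d+", "N", regex=True)

# Capture groups can be referenced in the replacement with \1, \2, ...
swapped = incomes.str.replace(r"(\d+)\.(\d+)", r"\2.\1", regex=True)

print(literal.tolist())
print(pattern.tolist())
print(swapped.tolist())
```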
12. split method + expand parameter
The split method splits each string on a delimiter; with expand=True the pieces are spread across several columns:
# Common usage
df["height"].str.split(":")
# split method with the expand parameter
df[["Height description", "final height"]] = df["height"].str.split(":", expand=True)
df
# split method combined with the join method
df["height"].str.split(":").str.join("?"*5)
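The following sketch chains two splits to break a value such as `mid:175_good` into three columns. The column names (`level`, `detail`, `cm`, `rating`) are my own choice for illustration, and `n=1` limits the second split to a single cut:

```python
import pandas as pd

heights = pd.Series(["mid:175_good", "high:180_verygood"])

# First split on ":" into two columns.
parts = heights.str.split(":", expand=True)
parts.columns = ["level", "detail"]

# Then split the second piece on "_" at most once (n=1).
parts[["cm", "rating"]] = parts["detail"].str.split("_", n=1, expand=True)

# The numeric piece is still a string until it is converted.
parts["cm"] = parts["cm"].astype(int)

print(parts)
```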
13. strip, rstrip and lstrip functions
These functions are mainly used to remove whitespace, including line breaks, from both ends (strip), the right end (rstrip), or the left end (lstrip) of a string:
df["full name"].str.len()
df["full name"] = df["full name"].str.strip()
df["full name"].str.len()
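A side-by-side sketch of the three variants on toy data (the third name and the `tags` Series are made up). It also shows that an explicit set of characters can be passed instead of stripping whitespace:

```python
import pandas as pd

# Toy data with leading/trailing spaces, a tab, and a newline.
names = pd.Series([" Classmate Huang", "Huang Laoxie ", "\tSun Shangxiang\n"])

both_ends = names.str.strip()    # whitespace removed on both sides
left_only = names.str.lstrip()   # only leading whitespace removed
right_only = names.str.rstrip()  # only trailing whitespace removed

# Passing characters strips those instead of whitespace.
tags = pd.Series(["**bad**", "*good"])
no_stars = tags.str.strip("*")

print(both_ends.tolist())
print(no_stars.tolist())
```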
14. findall function
This function uses a regular expression to search each string and returns a list of all matches:
df["height"]
df["height"].str.findall("[a-zA-Z]+")
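Since each result of `findall` is a list, it combines naturally with `str.get` and `str.len`. A minimal sketch on two values from the dataset, pulling out the numeric part and converting it to an integer:

```python
import pandas as pd

heights = pd.Series(["mid:175_good", "low:165_bad"])

# All alphabetic runs in each string.
words = heights.str.findall("[a-zA-Z]+")

# All digit runs; take the first one and convert it to int.
digits = heights.str.findall(r"\d+")
cm = digits.str.get(0).astype(int)

# Number of matches per row.
word_counts = words.str.len()

print(words.tolist())  # [['mid', 'good'], ['low', 'bad']]
print(cm.tolist())     # [175, 165]
```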
15. extract and extractall functions
These functions accept a regular expression and extract the matching text. The pattern must contain at least one capture group (be sure to add the parentheses). extract returns only the first match, while extractall returns every match:
df["height"].str.extract("([a-zA-Z]+)")
# extractall returns all matches, indexed by a MultiIndex
df["height"].str.extractall("([a-zA-Z]+)")
# extract with the expand parameter
df["height"].str.extract("([a-zA-Z]+).*?([a-zA-Z]+)", expand=True)
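Named capture groups are a convenient way to get meaningful column names out of `extract`. In the sketch below the group names (`level`, `cm`, `rating`) are my own choice for illustration; the second part shows the two-level index that `extractall` produces (original row, match number):

```python
import pandas as pd

heights = pd.Series(["mid:175_good", "low:165_bad"])

# Named groups become the column names of the resulting DataFrame.
named = heights.str.extract(r"(?P<level>[a-zA-Z]+):(?P<cm>\d+)_(?P<rating>[a-zA-Z]+)")

# extractall returns one row per match, indexed by (original row, match number).
all_words = heights.str.extractall(r"([a-zA-Z]+)")

# Second match (match number 1) of the first row (row 0).
second_word_first_row = all_words.loc[(0, 1), 0]

print(named)
print(all_words)
```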
If you found this article useful, don't forget to like, bookmark, and share it, because that is the strongest motivation for me to keep producing more high-quality articles!