Pandas Tour (6): summary of string utility methods

Posted by mistertylersmith on Wed, 04 Dec 2019 01:05:22 +0100

Basic string methods

Hello everyone, I'm back! The previous issues covered the basic operations of pandas. Whenever real data is involved, though, the most common type is the string, so much of the work ends up being string handling. Today I'm sharing my own summary of the most common string methods, and I hope it helps you~ In the examples below, a line starting with >>> shows the result of the expression above it.

Format with str.format() and f-strings

latitude = '37.24N'
longitude = '-115.81W'
'Coordinates {0},{1}'.format(latitude,longitude)
>>>'Coordinates 37.24N,-115.81W'

f'Coordinates {latitude},{longitude}'
>>>'Coordinates 37.24N,-115.81W'

'{0},{1},{2}'.format(*('abc'))
>>>'a,b,c'

coord = {"latitude":latitude,"longitude":longitude}
'Coordinates {latitude},{longitude}'.format(**coord)
>>>'Coordinates 37.24N,-115.81W'

Access an argument's attributes :

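class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
    # str.format() can read attributes of an argument: {self.x} looks up self.x
    def __str__(self):
        return 'Point({self.x},{self.y})'.format(self=self)
    # the same text built with an f-string; the interpreter echoes objects via __repr__
    def __repr__(self):
        return f'Point({self.x},{self.y})'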

test_point = Point(4,2)
test_point
>>>Point(4,2)

str(Point(4,2))
>>>'Point(4,2)'
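
The same field syntax can reach into attributes, indexes and dictionary keys of any argument, not only self. Here is a minimal sketch of that; Point2D is a hypothetical namedtuple made up for this demo and is not part of the code above.

```python
from collections import namedtuple

# Hypothetical helper type, used only for this illustration
Point2D = namedtuple("Point2D", ["x", "y"])
p = Point2D(4, 2)

by_attribute = "x={0.x}, y={0.y}".format(p)    # attribute lookup on argument 0
by_index = "x={0[0]}, y={0[1]}".format(p)      # index lookup on argument 0
# dictionary keys are written without quotes inside the field
by_key = "lat={c[latitude]}".format(c={"latitude": "37.24N"})

print(by_attribute)  # x=4, y=2
print(by_index)      # x=4, y=2
print(by_key)        # lat=37.24N
```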

Replace %s and %r with !s and !r :

" repr() shows the quote {!r}, while str() doesn't:{!s} ".format('a1','a2')
>>> " repr() shows the quote 'a1', while str() doesn't:a2 "
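
For a plain string the only visible difference is the quotes, so here is a small sketch with a class whose __str__ and __repr__ really differ. Temperature is a hypothetical class invented for this example; it just shows which method each conversion ends up calling.

```python
class Temperature:
    """Hypothetical class, used only to show which method each conversion calls."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __str__(self):
        return f"{self.degrees} C"

    def __repr__(self):
        return f"Temperature({self.degrees!r})"


t = Temperature(21.5)
as_str = "{!s}".format(t)    # calls str(t)
as_repr = "{!r}".format(t)   # calls repr(t)
default = "{}".format(t)     # with no conversion, falls back to str(t)
in_list = str([t])           # containers show their items with repr()

print(as_str)    # 21.5 C
print(as_repr)   # Temperature(21.5)
print(default)   # 21.5 C
print(in_list)   # [Temperature(21.5)]
```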

Align :

'{:<30}'.format('left aligned')
>>>'left aligned                  '

'{:>30}'.format('right aligned')
>>>'                 right aligned'

'{:^30}'.format('centered')
>>>'           centered           '

'{:*^30}'.format('centered')
>>>'***********centered***********'
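
Alignment becomes really handy when the width is not known in advance. The sketch below computes the width from the data and passes it in as a nested field; the rows list is made-up sample data.

```python
# Made-up sample data for the illustration
rows = [("apple", 3), ("kiwi", 12), ("watermelon", 7)]

# Width of the longest name, used as a nested replacement field below
width = max(len(name) for name, _ in rows)

# Left-align the name in `width` columns, right-align the count in 3 columns
lines = [f"{name:<{width}} | {count:>3}" for name, count in rows]
print("\n".join(lines))
# apple      |   3
# kiwi       |  12
# watermelon |   7
```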

Replace %x and %o, and convert the value to other bases :

"int:{0:d}, hex:{0:x}, oct:{0:o}, bin:{0:b}".format(42)
>>>'int:42, hex:2a, oct:52, bin:101010'

'{:,}'.format(12345677)
>>>'12,345,677'
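
A few more format specs from the same family that I find useful. This is only a sketch of the options I reach for most often, not a complete list.

```python
n = 42

prefixed = f"{n:#x} {n:#o} {n:#b}"   # '#' adds the 0x / 0o / 0b prefix
padded = f"{n:05d}"                   # zero-pad to 5 digits
signed = f"{n:+d}"                    # always show the sign
fixed = f"{3.14159:.2f}"              # 2 digits after the decimal point
grouped = f"{1234567.891:,.2f}"       # thousands separator plus 2 decimals
from_hex = int("2a", 16)              # and back again: parse a hex string

print(prefixed)   # 0x2a 0o52 0b101010
print(padded)     # 00042
print(signed)     # +42
print(fixed)      # 3.14
print(grouped)    # 1,234,567.89
print(from_hex)   # 42
```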

Percentage :

points = 19
total = 22
'Correct answers: {:.2%}'.format(points/total)
>>>'Correct answers: 86.36%'

Date :

import datetime as dt
f"{dt.datetime.now():%Y-%m-%d}"
>>>'2019-03-27'

f"{dt.datetime.now():%d_%m_%Y}"
>>>'27_03_2019'

today = dt.datetime.today().strftime("%d_%m_%Y")
today
>>>'27_03_2019'
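
Going the other way, from a string back to a datetime, uses strptime with the same format codes. A small sketch with a fixed date (my own choice, so the output is reproducible instead of depending on now()):

```python
import datetime as dt

# Fixed value chosen for the demo so the printed output never changes
fixed_day = dt.datetime(2019, 3, 27, 14, 5)

stamp = f"{fixed_day:%Y-%m-%d %H:%M}"                    # datetime -> str
parsed = dt.datetime.strptime("27_03_2019", "%d_%m_%Y")  # str -> datetime

print(stamp)                                # 2019-03-27 14:05
print(parsed)                               # 2019-03-27 00:00:00
print(parsed.date() == fixed_day.date())    # True
```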

Split without parameters :

"this is a  test".split()
>>>['this', 'is', 'a', 'test']
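
Note that split() without arguments is not the same as split(' '). A quick sketch of the difference, using the same test string:

```python
text = "this is a  test"   # note the double space before 'test'

no_args = text.split()             # runs of whitespace count as one separator
one_space = text.split(" ")        # every single space is a separator
trimmed = "  padded  ".split()     # leading/trailing whitespace is dropped too
first_only = text.split(maxsplit=1)  # stop after the first split

print(no_args)     # ['this', 'is', 'a', 'test']
print(one_space)   # ['this', 'is', 'a', '', 'test']
print(trimmed)     # ['padded']
print(first_only)  # ['this', 'is a  test']
```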

Concatenate :

'do'*2
>>>'dodo'

orig_string ='Hello'
orig_string+',World'
>>>'Hello,World'

full_sentence = orig_string+',World'
full_sentence
>>>'Hello,World'
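
One thing that trips people up: + only works between two strings. A minimal sketch of the error and three common ways around it (the variable names are just for this demo):

```python
count = 3
error = None
try:
    "items: " + count              # str + int is not allowed
except TypeError as exc:
    error = type(exc).__name__

with_str = "items: " + str(count)      # convert explicitly
with_fstring = f"items: {count}"       # or let an f-string do it
joined = "".join(["do", "re", "mi"])   # for many pieces, join a list

print(error)         # TypeError
print(with_str)      # items: 3
print(with_fstring)  # items: 3
print(joined)        # doremi
```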

Check string type, slice, count, strip :

strings = ['do','re','mi']
', '.join(strings)
>>>'do, re, mi'

'z' not in 'abc'
>>> True

ord('a'), ord('#')
>>> (97, 35)

chr(97)
>>>'a'

s = "foodbar"
s[2:5]
>>>'odb'

s[:4] + s[4:]
>>>'foodbar'

s[:4] + s[4:] == s
>>>True

# a full slice of a str gives back the same object in CPython; the id values differ on every run
t = s[:]
id(s)
>>>1547542895336

id(t)
>>>1547542895336

s is t
>>>True
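
Since identity is an interpreter detail, it is safer to compare strings with == and keep is for checks like "is None". A small sketch, where u is a string I build at run time just for the comparison:

```python
s = "foodbar"
t = s[:]                        # full slice
u = "".join(["food", "bar"])    # same text, built at run time

print(t == s, u == s)   # True True: the values are equal
print(u is s)           # can be True or False, depending on the interpreter
```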

s[0:6:2]
>>>'fob'

s[5:0:-2]
>>>'ado'

s = 'tomorrow is monday'
reverse_s = s[::-1]
reverse_s
>>>'yadnom si worromot'

s.capitalize()
>>>'Tomorrow is monday'

s.upper()
>>>'TOMORROW IS MONDAY'

s.title()
>>>'Tomorrow Is Monday'

s.count('o')
>>> 4
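
A few details of these methods that are easy to miss, shown as a short sketch with sample strings of my own:

```python
s = "tomorrow is monday"

title_pitfall = "they're here".title()      # title() restarts after an apostrophe
same_ignoring_case = "Monday".lower() == "MONDAY".lower()
casefolded = "Straße".casefold() == "STRASSE".casefold()  # casefold() is stronger than lower()
in_first_word = s.count("o", 0, 8)          # count only inside s[0:8]
non_overlapping = "aaaa".count("aa")        # matches do not overlap

print(title_pitfall)       # They'Re Here
print(same_ignoring_case)  # True
print(casefolded)          # True
print(in_first_word)       # 3
print(non_overlapping)     # 2
```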

"foobar".startswith('foo')
>>>True

"foobar".endswith('ar')
>>>True

"foobar".endswith('oob',0,4)
>>>True

"foobar".endswith('oob',2,4)
>>>False

"My name is yo, I work at SG".find('yo')
>>>11

# find() returns -1 if the substring is not found
"My name is ya, I work at Gener".find('gent')
>>>-1
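
Because find() returns -1 on failure and 0 for a match at the start, writing "if text.find(x):" gives the wrong answer in both cases. A sketch of the related options, using the sentence from above:

```python
sentence = "My name is yo, I work at SG"

first_o = sentence.find("o")      # first match, searching from the left
last_o = sentence.rfind("o")      # last match, searching from the right
has_work = "work" in sentence     # plain membership test, returns a bool
lower_sg = sentence.find("sg")    # the search is case-sensitive

# index() behaves like find() but raises instead of returning -1
try:
    sentence.index("gent")
    raised = False
except ValueError:
    raised = True

print(first_o, last_o)  # 12 18
print(has_work)         # True
print(lower_sg)         # -1
print(raised)           # True
```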

# Check whether a string consists only of alphanumeric characters
"abc123".isalnum()
>>>True

"abc%123".isalnum()
>>>False

"abcABC".isalpha()
>>>True

"abcABC1".isalpha()
>>>False

'123'.isdigit()
>>>True

'123abc'.isdigit()
>>>False

'abc'.islower()
>>>True

"This Is A Title".istitle()
>>>True

"This is a title".istitle()
>>>False

'ABC'.isupper()
>>>True

# isupper() ignores characters that have no case, such as digits and symbols
'ABC1%'.isupper()
>>>True
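
A common use of isdigit() is checking whether a string can be turned into a number, but it says False for signs and decimal points. A minimal sketch comparing it with simply trying int(); is_int is a hypothetical helper written for this example.

```python
def is_int(text):
    """Hypothetical helper: True if int() can parse the text."""
    try:
        int(text)
    except ValueError:
        return False
    return True


samples = ["123", "-5", "3.14", "12a", ""]
# Map each sample to (isdigit() result, is_int() result)
checks = {text: (text.isdigit(), is_int(text)) for text in samples}

for text, result in checks.items():
    print(repr(text), result)
# '123' (True, True)
# '-5' (False, True)
# '3.14' (False, False)
# '12a' (False, False)
# '' (False, False)
```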

'foo'.center(10)
>>>'   foo    '

'   foo bar baz    '.strip()
>>>'foo bar baz'

'   foo bar baz    '.lstrip()
>>>'foo bar baz    '

'   foo bar baz    '.rstrip()
>>>'   foo bar baz'

"foo abc foo def fo  ljk ".replace('foo','yao')
>>>'yao abc yao def fo  ljk '

'www.realpython.com'.strip('w.moc')
>>>'realpython'

'www.realpython.com'.strip('w.com')
>>>'realpython'

'www.realpython.com'.strip('w.ncom')
>>>'realpyth'
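
The last three results show that the argument of strip() is a set of characters, not a prefix or suffix, which is why 'w.ncom' also eats the trailing "on". To remove an exact prefix or suffix, a sketch like the one below works; strip_affixes is a hypothetical helper of mine. On Python 3.9 and later, str.removeprefix() and str.removesuffix() do the same job.

```python
def strip_affixes(text, prefix="", suffix=""):
    """Hypothetical helper: remove one exact prefix and one exact suffix."""
    if prefix and text.startswith(prefix):
        text = text[len(prefix):]
    if suffix and text.endswith(suffix):
        text = text[:-len(suffix)]
    return text


host = "www.wow.com"
by_chars = host.strip("w.com")                   # every character is in the set
by_affix = strip_affixes(host, "www.", ".com")   # only the exact ends are removed

print(repr(by_chars))  # ''
print(by_affix)        # wow
```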

Convert to and from lists :

', '.join(['foo','bar','baz','qux'])
>>>'foo, bar, baz, qux'

list('corge')
>>>['c', 'o', 'r', 'g', 'e']

':'.join('corge')
>>>'c:o:r:g:e'

'www.foo'.partition('.')
>>>('www', '.', 'foo')

'foo@@bar@@baz'.partition('@@')
>>>('foo', '@@', 'bar@@baz')

'foo@@bar@@baz'.rpartition('@@')
>>>('foo@@bar', '@@', 'baz')

'foo.bar'.partition('@@')
>>>('foo.bar', '', '')

# By default, rsplit() splits a string on whitespace
'foo bar adf yao'.rsplit()
>>>['foo', 'bar', 'adf', 'yao']

'foo.bar.adf.ert'.split('.')
>>>['foo', 'bar', 'adf', 'ert']

'foo\nbar\nadfa\nlko'.splitlines()
>>>['foo', 'bar', 'adfa', 'lko']
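
split() and rsplit() only differ once you limit the number of splits, and splitlines() differs from split('\n') on Windows line endings and trailing newlines. A short sketch with sample values of my own:

```python
filename = "archive.tar.gz"
first_dot = filename.split(".", 1)    # split once, from the left
last_dot = filename.rsplit(".", 1)    # split once, from the right

text = "foo\nbar\r\nbaz\n"
lines = text.splitlines()               # handles \n and \r\n, no empty tail
kept = text.splitlines(keepends=True)   # keep the line endings
naive = text.split("\n")                # leaves '\r' and a trailing ''

print(first_dot)  # ['archive', 'tar.gz']
print(last_dot)   # ['archive.tar', 'gz']
print(lines)      # ['foo', 'bar', 'baz']
print(kept)       # ['foo\n', 'bar\r\n', 'baz\n']
print(naive)      # ['foo', 'bar\r', 'baz', '']
```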

Summary

Beyond the methods summarized above, there are many other practical string methods; look them up as your own needs arise!

I have put the ipynb and py files for this issue on GitHub. If you want to download them, use the following link:

GitHub repository: https://github.com/yaozeliang/pandas_share

Thank you for your continued support. That's a wrap for this issue!
