Implementing AES encryption and decryption with python

Posted by greepit on Sat, 18 Dec 2021 20:03:32 +0100

1. Preface

AES is a symmetric encryption algorithm. Symmetric encryption means that the same secret key is used for both encryption and decryption.

I wrote an article about Python AES encryption and decryption before, but there are many details it did not cover. This time, I systematically explain how to use AES encryption and decryption in terms of parameter types, encryption mode, encoding, padding mode, and so on.

Don't look for a quick fix when reading articles. To solve a problem you can grab a code snippet and apply it, and that may solve the problem quickly, but you will have to search again as soon as a new problem appears, which I think is a waste of time. I believe you will gain a lot from reading this article carefully.

2. Environment setup

pip uninstall crypto
pip uninstall pycryptodome
pip install pycryptodome

The first two uninstall commands are there to prevent problems caused by a conflicting installation environment.

3. Encryption mode

The most commonly used AES modes are ECB and CBC. There are many other modes as well, and all of them are still AES encryption. The difference between ECB and CBC that matters here is that ECB does not need an IV (initialization vector, called the "iv offset" in this article), while CBC does.

4. AES encryption parameters

The following parameters are used in python.

Parameter    Function and data type
Secret key   Used for encryption; the same secret key is required for decryption. Data type: bytes
Plaintext    The data to be encrypted. Data type: bytes
Mode         The commonly used AES modes are ECB and CBC (I only use these two; other modes exist). Data type: a constant defined in the AES module, such as AES.MODE_ECB
iv offset    Not required in ECB mode, required in CBC mode. Data type: bytes

The following is a simple example of ECB mode encryption and decryption:

from Crypto.Cipher import AES

password = b'1234567812345678' # Secret key; the b prefix makes it a bytes literal
text = b'abcdefghijklmnhi' # Content to be encrypted, bytes type
aes = AES.new(password,AES.MODE_ECB) # Create an AES object
# AES.MODE_ECB selects ECB mode
en_text = aes.encrypt(text) # Encrypt the plaintext
print("Ciphertext:",en_text) # The ciphertext, bytes type
den_text = aes.decrypt(en_text) # Decrypt the ciphertext
print("Plaintext:",den_text)

Output:

Ciphertext: b'WU\xe0\x0e\xa3\x87\x12\x95\\]O\xd7\xe3\xd4 )'
Plaintext: b'abcdefghijklmnhi'

The above is encryption and decryption in ECB mode. This example shows that the parameters are subject to several restrictions.

  1. The secret key must be bytes data of 16, 24 or 32 bytes (AES-128, AES-192 or AES-256).
  2. The plaintext must be 16 bytes or a multiple of 16 bytes. If it is not, it needs to be padded. The padding rules are described in the padding mode section later.
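These length rules can be checked before the data ever reaches AES.new(). The following is only a sketch: check_aes_inputs is a hypothetical helper name, not part of pycryptodome, and it relies on the fact that AES keys are 16, 24 or 32 bytes long and that the block size is 16 bytes.

```python
# Hypothetical helper (not part of pycryptodome): validate sizes up front.
AES_BLOCK_SIZE = 16            # AES always works on 16-byte blocks
AES_KEY_SIZES = (16, 24, 32)   # AES-128, AES-192, AES-256


def check_aes_inputs(key, data):
    """Raise an error if key or data would be rejected in ECB/CBC mode."""
    if not isinstance(key, bytes) or not isinstance(data, bytes):
        raise TypeError("key and data must both be bytes")
    if len(key) not in AES_KEY_SIZES:
        raise ValueError("key must be 16, 24 or 32 bytes, got %d" % len(key))
    if len(data) % AES_BLOCK_SIZE != 0:
        raise ValueError("data length %d is not a multiple of 16" % len(data))


check_aes_inputs(b"1234567812345678", b"abcdefghijklmnhi")  # passes silently
```

Calling it with a str instead of bytes, a 5-byte key, or a 10-byte plaintext raises an exception with a readable message instead of failing inside the library.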

Next, an example of CBC mode:

from Crypto.Cipher import AES
password = b'1234567812345678' # Secret key; the b prefix makes it a bytes literal
iv = b'1234567812345678' # iv offset, bytes type
text = b'abcdefghijklmnhi' # Content to be encrypted, bytes type
aes = AES.new(password,AES.MODE_CBC,iv) # Create an AES object
# AES.MODE_CBC selects CBC mode
en_text = aes.encrypt(text)
print("Ciphertext:",en_text) # The ciphertext, bytes type
aes = AES.new(password,AES.MODE_CBC,iv) # Decryption in CBC mode requires creating a new AES object
den_text = aes.decrypt(en_text)
print("Plaintext:",den_text)

Output:

Ciphertext: b'\x93\x8bN!\xe7~>\xb0M\xba\x91\xab74;0'
Plaintext: b'abcdefghijklmnhi'

The CBC example above shows how it differs from ECB mode in code: AES.new() has to be called once for encryption and once more for decryption. Encryption and decryption cannot use the same AES object, otherwise an error is raised: TypeError: decrypt() cannot be called after encrypt().

Summary:

1. During AES encryption and decryption in python, the ciphertext, plaintext, secret key and iv offset passed in must be bytes (byte type) data. python can only accept bytes type data when building AES objects.

2. The secret key must be 16, 24 or 32 bytes long, the iv offset must be 16 bytes, and the plaintext to be encrypted must be a multiple of 16 bytes. Anything that falls short needs to be padded.

3. CBC mode needs a newly created AES object for decryption. To avoid such errors, I write my code so that it creates a new AES object every time, regardless of the mode.

5. Encoding

As mentioned earlier, AES encryption and decryption in Python only accept bytes data. Often the plaintext to be encrypted is Chinese text, or the ciphertext to be decrypted is base64 encoded, so it has to be encoded or decoded before AES can be used. In every case, the data must be converted to bytes before Python can encrypt or decrypt it with AES.

We encrypt and decrypt Chinese plaintext in ECB mode, for example:

from Crypto.Cipher import AES

password = b'1234567812345678' # Secret key; the b prefix makes it a bytes literal
text = "好好学习天天向上".encode('gbk') # "study hard and make progress every day"; in gbk one Chinese character is two bytes, so eight characters are exactly 16 bytes
aes = AES.new(password,AES.MODE_ECB) # Create an AES object
# AES.MODE_ECB selects ECB mode
print(len(text))
en_text = aes.encrypt(text) # Encrypt the plaintext
print("Ciphertext:",en_text) # The ciphertext, bytes type
den_text = aes.decrypt(en_text) # Decrypt the ciphertext
print("Plaintext:",den_text.decode("gbk")) # The decrypted bytes must be decoded as well

Output:

16
Ciphertext: b'=\xdd8k\x86\xed\xec\x17\x1f\xf7\xb2\x84~\x02\xc6C'
Plaintext: 好好学习天天向上

For Chinese plaintext, we can use the encode() function to convert the string into bytes data. Here I chose gbk encoding so that the text is exactly 16 bytes. In utf8 encoding one Chinese character takes 3 bytes, which is why gbk was chosen for this example.

After decryption, the decode() function is also required to decode and convert byte data back to Chinese characters (string type).
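The byte counts are easy to check with len(). A small sketch using only the standard library; the sample phrase 好好学习天天向上 ("study hard and make progress every day", eight Chinese characters) is chosen here for illustration:

```python
text = "好好学习天天向上"   # eight Chinese characters

gbk_bytes = text.encode("gbk")       # two bytes per character
utf8_bytes = text.encode("utf-8")    # three bytes per character

print(len(gbk_bytes))    # 16, exactly one AES block
print(len(utf8_bytes))   # 24, would have to be padded to 32

# decode() must use the same character set that encode() used
assert gbk_bytes.decode("gbk") == text
assert utf8_bytes.decode("utf-8") == text
```

Whichever character set is chosen for encryption has to be known on the decrypting side as well, otherwise decode() fails or produces garbage.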

Now let's look at another case, where the ciphertext is base64 encoded (this is also very common, and many websites use it). We use an example from the website http://tool.chacuo.net/cryptaes/ with the following settings:

Mode: ECB

Password: 1234567812345678

Character sets: gbk encoding

Output: base64

Let's write Python code for the AES decryption:

from Crypto.Cipher import AES
import base64

password = b'1234567812345678' 
aes = AES.new(password,AES.MODE_ECB) 
en_text = b"Pd04a4bt7Bcf97KEfgLGQw=="
en_text = base64.decodebytes(en_text) # base64-decode; the return value is still bytes
den_text = aes.decrypt(en_text)
print("Plaintext:",den_text.decode("gbk")) 

Output:

Plaintext: 好好学习天天向上

Here b"Pd04a4bt7Bcf97KEfgLGQw==" is bytes data. If you are given a string instead, you can use the encode() function to convert it to bytes data:

from Crypto.Cipher import AES
import base64

password = b'1234567812345678' 
aes = AES.new(password,AES.MODE_ECB) 
en_text = "Pd04a4bt7Bcf97KEfgLGQw==".encode() #Convert string to bytes data
en_text = base64.decodebytes(en_text) # base64-decode; the argument is bytes data and the return value is still bytes
den_text = aes.decrypt(en_text) 
print("Plaintext:",den_text.decode("gbk")) 

In both utf8 and gbk encoding, an English character is encoded as one byte. Therefore the encode() function here only serves to convert the data into bytes, which is then decoded with base64.

hexstr, base64 encoding and decoding example:

import base64
import binascii
data = "hello".encode()
data = base64.b64encode(data)
print("base64 code:",data)
data = base64.b64decode(data)
print("base64 decode:",data)
data = binascii.b2a_hex(data)
print("hexstr code:",data)
data = binascii.a2b_hex(data)
print("hexstr decode:",data)

Output:

base64 code: b'aGVsbG8='
base64 decode: b'hello'
hexstr code: b'68656c6c6f'
hexstr decode: b'hello'

One more thing needs explaining: in some AES setups the secret key or the IV is itself encoded in base64 or hexstr. In that case, the first thing to do is to decode it back to bytes. Again, the parameters passed to AES encryption and decryption in Python are bytes data.
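For instance, a key delivered as a hex string or a base64 string has to be decoded before it is passed to AES.new(). A minimal sketch using only the standard library; the encoded values are simply the demo key from this article, encoded here for illustration:

```python
import base64
import binascii

raw_key = b"1234567812345678"

# What a service might hand you instead of the raw bytes
hex_key = binascii.b2a_hex(raw_key).decode()   # str of 32 hex digits
b64_key = base64.b64encode(raw_key).decode()   # str in base64

# Decode back to bytes before building the AES object
key_from_hex = binascii.a2b_hex(hex_key)
key_from_b64 = base64.b64decode(b64_key)

print(hex_key)
print(key_from_hex == raw_key, key_from_b64 == raw_key)
```

A quick len() check on the decoded key is a cheap way to notice that the wrong decoding was applied, since a 16-byte key is 32 characters in hex and 24 characters in base64.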

In addition, I remember that earlier versions of the pycryptodome library accepted string data directly for the IV and the plaintext, but the current version requires bytes data, perhaps for consistency and to make it easier to remember.

6. Padding mode

So far the secret keys, plaintexts and IVs I used were all exactly 16 bytes, that is, the data blocks were aligned. The padding mode solves the problem of misaligned data blocks; which bytes are used for padding depends on the padding mode.

The common AES padding modes are as follows:

Mode           Meaning
ZeroPadding    Pad with b'\x00'. The 0 here is not the character '0' but the byte b'\x00'
PKCS7Padding   When n bytes are needed to reach alignment, n bytes are added, each with the value n
PKCS5Padding   Same as PKCS7Padding; I see no difference between them for AES padding
NoPadding      Data that is already a multiple of 16 bytes is not padded. Data that is not is treated the same as ZeroPadding

Here is a detail that I find many articles get wrong.

The meaning of ZeroPadding: many articles explain that data whose length is a multiple of 16 bytes is not padded, and data that is not is padded with zero bytes. This explanation is wrong; what it describes is NoPadding. ZeroPadding pads up to the next alignment boundary whether or not the data is already aligned. That is, even if you have exactly 16 bytes of data, another 16 zero bytes are added, and the total is 32 bytes. Some libraries do implement ZeroPadding without this extra block, so check the tool you need to be compatible with.

You may ask: why 16 bytes? This is the size of the data block. The website has a corresponding setting, which is 128 bits, that is, 16-byte alignment. It also offers 192 bits (24 bytes) and 256 bits (32 bytes), but note that the AES block size itself is always 128 bits; in AES, 192 and 256 bits are key lengths.

With that explained, from here on I will talk about data block alignment rather than multiples of 16 bytes.

Except for NoPadding, all padding modes pad up to the next data block boundary; there is never a case where no padding is added.
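The always-pad rule can be sketched in a few lines. zero_pad is a hypothetical helper written for this illustration, and it follows this article's definition of ZeroPadding:

```python
BLOCK_SIZE = 16


def zero_pad(data):
    """Pad with zero bytes up to the next block boundary (always adds 1 to 16 bytes)."""
    pad_len = BLOCK_SIZE - len(data) % BLOCK_SIZE
    return data + b"\x00" * pad_len


print(len(zero_pad(b"abc")))                # 16
print(len(zero_pad(b"abcdefghijklmnhi")))   # 32: aligned input still gets a full block
```

Since BLOCK_SIZE - len(data) % BLOCK_SIZE is never 0, no special case is needed for data that is already aligned.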

Byte correspondence table for PKCS7Padding and PKCS5Padding:

Plaintext length (mod 16)   Padding bytes added   Value of each padding byte
0                           16                    0x10
1                           15                    0x0F
2                           14                    0x0E
3                           13                    0x0D
4                           12                    0x0C
5                           11                    0x0B
6                           10                    0x0A
7                           9                     0x09
8                           8                     0x08
9                           7                     0x07
10                          6                     0x06
11                          5                     0x05
12                          4                     0x04
13                          3                     0x03
14                          2                     0x02
15                          1                     0x01

This shows that when the plaintext length is already aligned (mod 16 = 0), padding is still required: 16 bytes with the value 0x10 are added. The ZeroPadding logic is similar, except that the padding bytes have the value 0x00, written b'\x00' in Python.
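The table translates directly into code. Below is a sketch of PKCS7 padding and its removal; pkcs7_pad and pkcs7_unpad are hypothetical names used only for this illustration. Removal reads the last byte to learn how many bytes to drop, and checks that the padding is well formed:

```python
BLOCK_SIZE = 16


def pkcs7_pad(data):
    """Append n bytes, each with the value n, where n is 1..16."""
    n = BLOCK_SIZE - len(data) % BLOCK_SIZE
    return data + bytes([n]) * n


def pkcs7_unpad(data):
    """Remove PKCS7 padding, raising ValueError if it is malformed."""
    if not data or len(data) % BLOCK_SIZE != 0:
        raise ValueError("padded data must be a non-empty multiple of 16 bytes")
    n = data[-1]   # indexing bytes gives an int
    if n < 1 or n > BLOCK_SIZE or data[-n:] != bytes([n]) * n:
        raise ValueError("invalid PKCS7 padding")
    return data[:-n]


padded = pkcs7_pad(b"hello")
print(padded)                # b'hello' followed by eleven 0x0b bytes
print(pkcs7_unpad(padded))   # b'hello'
```

Slicing off exactly n bytes, rather than stripping every trailing byte with that value, keeps plaintext intact when it happens to end with the same byte as the padding.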

After padding, AES can be used for encryption and decryption. Of course, after decryption the padding also has to be removed. In the code below, these steps are implemented by hand.

7. Complete Python implementation

from Crypto.Cipher import AES
import base64
import binascii

# Data class
class MData():
    def __init__(self, data = b"",characterSet='utf-8'):
        # data must be bytes
        self.data = data
        self.characterSet = characterSet
  
    def saveData(self,FileName):
        with open(FileName,'wb') as f:
            f.write(self.data)

    def fromString(self,data):
        self.data = data.encode(self.characterSet)
        return self.data

    def fromBase64(self,data):
        self.data = base64.b64decode(data.encode(self.characterSet))
        return self.data

    def fromHexStr(self,data):
        self.data = binascii.a2b_hex(data)
        return self.data

    def toString(self):
        return self.data.decode(self.characterSet)

    def toBase64(self):
        return base64.b64encode(self.data).decode()

    def toHexStr(self):
        return binascii.b2a_hex(self.data).decode()

    def toBytes(self):
        return self.data

    def __str__(self):
        try:
            return self.toString()
        except Exception:
            return self.toBase64()


### Encapsulation class
class AEScryptor():
    def __init__(self,key,mode,iv = '',paddingMode= "NoPadding",characterSet ="utf-8"):
        '''
        Build an AES object
        key: Secret key, bytes data
        mode: Only two modes are supported, AES.MODE_CBC and AES.MODE_ECB
        iv: iv offset, bytes data
        paddingMode: Padding mode, defaults to NoPadding; one of NoPadding, ZeroPadding, PKCS5Padding, PKCS7Padding
        characterSet: Character set encoding
        '''
        self.key = key
        self.mode = mode
        self.iv = iv
        self.characterSet = characterSet
        self.paddingMode = paddingMode
        self.data = ""

    def __ZeroPadding(self,data):
        data += b'\x00'
        while len(data) % 16 != 0:
            data += b'\x00'
        return data

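    def __StripZeroPadding(self,data):
        # Remove every trailing zero byte. Dropping the last byte first would
        # cut real data in NoPadding mode when the plaintext was already aligned.
        # Plaintext that itself ends in b'\x00' cannot be told apart from padding.
        return data.rstrip(b'\x00')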

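    def __PKCS5_7Padding(self,data):
        # needSize is always 1..16, so aligned data gets a full block of padding
        needSize = 16-len(data) % 16
        return data + needSize.to_bytes(1,'little')*needSize

    def __StripPKCS5_7Padding(self,data):
        # The last byte says how many padding bytes were added
        paddingSize = data[-1]
        # Remove exactly that many bytes; rstrip() could also eat plaintext bytes with the same value
        return data[:-paddingSize]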

    def __paddingData(self,data):
        if self.paddingMode == "NoPadding":
            if len(data) % 16 == 0:
                return data
            else:
                return self.__ZeroPadding(data)
        elif self.paddingMode == "ZeroPadding":
            return self.__ZeroPadding(data)
        elif self.paddingMode == "PKCS5Padding" or self.paddingMode == "PKCS7Padding":
            return self.__PKCS5_7Padding(data)
        else:
            raise ValueError("Unsupported padding mode: " + str(self.paddingMode))

    def __stripPaddingData(self,data):
        if self.paddingMode == "NoPadding":
            return self.__StripZeroPadding(data)
        elif self.paddingMode == "ZeroPadding":
            return self.__StripZeroPadding(data)

        elif self.paddingMode == "PKCS5Padding" or self.paddingMode == "PKCS7Padding":
            return self.__StripPKCS5_7Padding(data)
        else:
            raise ValueError("Unsupported padding mode: " + str(self.paddingMode))

    def setCharacterSet(self,characterSet):
        '''
        Set character set encoding
        characterSet: Character set encoding
        '''
        self.characterSet = characterSet

    def setPaddingMode(self,mode):
        '''
        Set fill mode
        mode: Optional NoPadding,ZeroPadding,PKCS5Padding,PKCS7Padding
        '''
        self.paddingMode = mode

    def decryptFromBase64(self,entext):
        '''
        AES-decrypt a base64 encoded string
        entext: data type str
        '''
        mData = MData(characterSet=self.characterSet)
        self.data = mData.fromBase64(entext)
        return self.__decrypt()

    def decryptFromHexStr(self,entext):
        '''
        AES-decrypt a hexstr encoded string
        entext: data type str
        '''
        mData = MData(characterSet=self.characterSet)
        self.data = mData.fromHexStr(entext)
        return self.__decrypt()

    def decryptFromString(self,entext):
        '''
        AES-decrypt a string
        entext: data type str
        '''
        mData = MData(characterSet=self.characterSet)
        self.data = mData.fromString(entext)
        return self.__decrypt()

    def decryptFromBytes(self,entext):
        '''
        AES-decrypt bytes data
        entext: data type bytes
        '''
        self.data = entext
        return self.__decrypt()

    def encryptFromString(self,data):
        '''
        AES-encrypt a string
        data: String to be encrypted, data type: str
        '''
        self.data = data.encode(self.characterSet)
        return self.__encrypt()

    def __encrypt(self):
        if self.mode == AES.MODE_CBC:
            aes = AES.new(self.key,self.mode,self.iv) 
        elif self.mode == AES.MODE_ECB:
            aes = AES.new(self.key,self.mode) 
        else:
            # Fail loudly instead of returning None to the caller
            raise ValueError("This mode is not supported")

        data = self.__paddingData(self.data)
        enData = aes.encrypt(data)
        return MData(enData)

    def __decrypt(self):
        if self.mode == AES.MODE_CBC:
            aes = AES.new(self.key,self.mode,self.iv) 
        elif self.mode == AES.MODE_ECB:
            aes = AES.new(self.key,self.mode) 
        else:
            # Fail loudly instead of returning None to the caller
            raise ValueError("This mode is not supported")
        data = aes.decrypt(self.data)
        mData = MData(self.__stripPaddingData(data),characterSet=self.characterSet)
        return mData


if __name__ == '__main__':
    key = b"1234567812345678"
    iv =  b"0000000000000000"
    aes = AEScryptor(key,AES.MODE_CBC,iv,paddingMode= "ZeroPadding",characterSet='utf-8')
    
    data = "study hard"
    rData = aes.encryptFromString(data)
    print("Ciphertext:",rData.toBase64())
    rData = aes.decryptFromBase64(rData.toBase64())
    print("Plaintext:",rData)

This is a simple wrapper. The data returned by encryption and decryption can be encoded with toBase64() or toHexStr(). Note that I did not implement padding for the key and the iv; you can implement that yourself with the help of the MData class. For more details, see the comments in the source code.

Topics: Python AES