1. Preface
AES is a symmetric encryption algorithm. Symmetric encryption means that the same secret key is used for both encryption and decryption.
I wrote an article about Python AES encryption and decryption before, but it left out many details. This time I explain systematically how to use AES encryption and decryption, covering parameter types, encryption modes, encoding, padding modes, and so on.
Don't look for a quick fix when reading articles. To solve a problem you can grab a code snippet and apply it, and that may work for now, but you will have to search again when a new problem comes up, which I think is a waste of time. I believe you will gain a lot from reading this article carefully.
2. Environment installation

    pip uninstall crypto
    pip uninstall pycryptodome
    pip install pycryptodome

The first two uninstall commands remove any existing installation, to prevent environment problems before pycryptodome is installed again.
3. Encryption mode
The most commonly used AES modes are ECB and CBC. There are many other modes, and all of them are still AES encryption. The difference between ECB and CBC is that ECB does not need an IV (initialization vector, called the "iv offset" in this article), while CBC does.
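To make the role of the IV concrete, here is a minimal sketch of how the two modes chain blocks. It deliberately uses a toy XOR function as a stand-in for the AES block function, so it is not AES and not secure; `toy_block_encrypt`, `ecb_encrypt` and `cbc_encrypt` are hypothetical names for illustration only.

```python
BLOCK = 16


def toy_block_encrypt(block: bytes, key: bytes) -> bytes:
    """Stand-in for the AES block function: a plain XOR. NOT secure."""
    return bytes(b ^ k for b, k in zip(block, key))


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def ecb_encrypt(plaintext: bytes, key: bytes) -> bytes:
    # Every block is encrypted independently; no IV is involved
    blocks = [plaintext[i:i + BLOCK] for i in range(0, len(plaintext), BLOCK)]
    return b"".join(toy_block_encrypt(b, key) for b in blocks)


def cbc_encrypt(plaintext: bytes, key: bytes, iv: bytes) -> bytes:
    # Each block is XORed with the previous ciphertext block (the IV for
    # the first block) before it goes through the block function
    previous, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        previous = toy_block_encrypt(xor(plaintext[i:i + BLOCK], previous), key)
        out.append(previous)
    return b"".join(out)


key = b"1234567812345678"
iv = b"0000000000000000"
two_equal_blocks = b"abcdefghijklmnhi" * 2

ecb = ecb_encrypt(two_equal_blocks, key)
cbc = cbc_encrypt(two_equal_blocks, key, iv)
print(ecb[:16] == ecb[16:])  # True: equal plaintext blocks give equal output
print(cbc[:16] == cbc[16:])  # False: chaining makes the blocks differ
```

With real AES the structure is the same; only the block function changes.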
4. AES encryption parameters
The following parameters are used in Python.
Parameter | Function and data type |
---|---|
Secret key | Used for encryption; the same secret key is required for decryption. Data type: bytes |
Plaintext | The data to be encrypted. Data type: bytes |
Mode | The commonly used AES modes are ECB and CBC (the only two I use here). Data type: a constant defined in the AES module, for example AES.MODE_ECB |
iv offset | Not needed in ECB mode, required in CBC mode. Data type: bytes |
The following is a simple example of ECB mode encryption and decryption:

    from Crypto.Cipher import AES

    password = b'1234567812345678'  # secret key; the b prefix makes it bytes
    text = b'abcdefghijklmnhi'      # content to encrypt, bytes type
    aes = AES.new(password, AES.MODE_ECB)  # create an AES object; AES.MODE_ECB selects ECB mode
    en_text = aes.encrypt(text)     # encrypt the plaintext
    print("Ciphertext:", en_text)   # ciphertext, bytes type
    den_text = aes.decrypt(en_text) # decrypt the ciphertext
    print("Plaintext:", den_text)

Output:

    Ciphertext: b'WU\xe0\x0e\xa3\x87\x12\x95\\]O\xd7\xe3\xd4 )'
    Plaintext: b'abcdefghijklmnhi'

That is encryption and decryption in ECB mode. From this example we can see several restrictions on the parameters.
- The secret key must be bytes with a length of 16, 24 or 32 bytes (AES-128, AES-192 or AES-256).
- The plaintext must be 16 bytes or a multiple of 16 bytes. If it is shorter, it needs to be padded. The padding rules are described in the padding mode section later.
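The restrictions above can be checked before the cipher is called. The following is a minimal sketch using only the standard library; `check_aes_inputs` is a hypothetical helper, not part of pycryptodome, and it assumes unpadded ECB or CBC input.

```python
AES_BLOCK_SIZE = 16            # AES always works on 16-byte blocks
AES_KEY_SIZES = (16, 24, 32)   # AES-128, AES-192, AES-256


def check_aes_inputs(key: bytes, plaintext: bytes) -> None:
    """Raise early, with a readable message, if the inputs are unusable.

    Hypothetical helper for illustration only.
    """
    if not isinstance(key, bytes) or not isinstance(plaintext, bytes):
        raise TypeError("key and plaintext must both be bytes")
    if len(key) not in AES_KEY_SIZES:
        raise ValueError(f"key is {len(key)} bytes; expected 16, 24 or 32")
    if len(plaintext) % AES_BLOCK_SIZE != 0:
        raise ValueError(
            f"plaintext is {len(plaintext)} bytes; pad it to a multiple of 16"
        )


check_aes_inputs(b"1234567812345678", b"abcdefghijklmnhi")  # passes silently
```

Failing early like this gives a clearer message than the error raised from inside the cipher.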
Here is an example in CBC mode:

    from Crypto.Cipher import AES

    password = b'1234567812345678'  # secret key; the b prefix makes it bytes
    iv = b'1234567812345678'        # iv offset, bytes type
    text = b'abcdefghijklmnhi'      # content to encrypt, bytes type
    aes = AES.new(password, AES.MODE_CBC, iv)  # create an AES object; AES.MODE_CBC selects CBC mode
    en_text = aes.encrypt(text)
    print("Ciphertext:", en_text)   # ciphertext, bytes type
    aes = AES.new(password, AES.MODE_CBC, iv)  # decryption in CBC mode requires a newly created AES object
    den_text = aes.decrypt(en_text)
    print("Plaintext:", den_text)

Output:

    Ciphertext: b'\x93\x8bN!\xe7~>\xb0M\xba\x91\xab74;0'
    Plaintext: b'abcdefghijklmnhi'

The CBC example above shows one difference from ECB mode: AES.new() has to be called twice, once to create the object that encrypts and once to create the object that decrypts. Encryption and decryption cannot share the same AES object; otherwise an error is raised: TypeError: decrypt() cannot be called after encrypt().
Summary:
1. In Python, the ciphertext, plaintext, secret key and iv offset passed to AES must all be bytes. Python only accepts bytes data when building and using AES objects.
2. The secret key must be 16, 24 or 32 bytes long, the iv offset exactly 16 bytes, and the plaintext a multiple of 16 bytes; anything that falls short needs to be padded.
3. CBC mode needs a newly created AES object for decryption. To avoid this kind of error, I write my code to create a new AES object every time, regardless of the mode.
5. Encoding mode
As mentioned earlier, AES encryption and decryption in Python only accept bytes. The plaintext to be encrypted is often Chinese text, and the ciphertext to be decrypted is often base64 encoded, so the data has to be encoded or decoded before AES can be used. In every case, the data must be converted into bytes first.
As an example, we encrypt and decrypt the Chinese plaintext "好好学习天天向上" ("study hard and make progress every day") in ECB mode:

    from Crypto.Cipher import AES

    password = b'1234567812345678'  # secret key; the b prefix makes it bytes
    text = "好好学习天天向上".encode('gbk')  # gbk uses two bytes per Chinese character, so eight characters are exactly 16 bytes
    aes = AES.new(password, AES.MODE_ECB)  # create an AES object; AES.MODE_ECB selects ECB mode
    print(len(text))
    en_text = aes.encrypt(text)  # encrypt the plaintext
    print("Ciphertext:", en_text)  # ciphertext, bytes type
    den_text = aes.decrypt(en_text)  # decrypt the ciphertext
    print("Plaintext:", den_text.decode("gbk"))  # decoding is also required after decryption

Output:

    16
    Ciphertext: b'=\xdd8k\x86\xed\xec\x17\x1f\xf7\xb2\x84~\x02\xc6C'
    Plaintext: 好好学习天天向上

For Chinese plaintext, we can use the encode() function to convert the string into bytes. Here I chose gbk so that the text is exactly 16 bytes; in utf8 each Chinese character takes 3 bytes, so the same text would not be block-aligned.
After decryption, the decode() function is needed to convert the bytes back to Chinese characters (string type).
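The effect of the character set on length can be checked with the standard library alone. This is a small sketch; it assumes the eight-character sentence 好好学习天天向上 as the sample text.

```python
text = "好好学习天天向上"  # assumed sample sentence, eight Chinese characters

gbk_bytes = text.encode("gbk")      # 2 bytes per Chinese character
utf8_bytes = text.encode("utf-8")   # 3 bytes per Chinese character

print(len(gbk_bytes))    # 16: already one full AES block
print(len(utf8_bytes))   # 24: would need padding up to 32

# decode() must use the same character set that encode() used
print(gbk_bytes.decode("gbk") == text)
print(utf8_bytes.decode("utf-8") == text)
```

Decoding with the wrong character set either raises UnicodeDecodeError or yields garbled text, so the character set has to be agreed on together with the key.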
Now let's look at another case: the ciphertext is base64 encoded (this is very common, and many websites do it). We use the online tool at http://tool.chacuo.net/cryptaes/ with the following settings:
Mode: ECB
Password: 1234567812345678
Character set: gbk encoding
Output: base64
Let's write the AES decryption in Python:

    from Crypto.Cipher import AES
    import base64

    password = b'1234567812345678'
    aes = AES.new(password, AES.MODE_ECB)
    en_text = b"Pd04a4bt7Bcf97KEfgLGQw=="
    en_text = base64.decodebytes(en_text)  # base64 decoding; the return value is still bytes
    den_text = aes.decrypt(en_text)
    print("Plaintext:", den_text.decode("gbk"))

Output:

    Plaintext: 好好学习天天向上

b"Pd04a4bt7Bcf97KEfgLGQw==" here is bytes data. If you are given a string instead, use the encode() function to convert it to bytes:

    from Crypto.Cipher import AES
    import base64

    password = b'1234567812345678'
    aes = AES.new(password, AES.MODE_ECB)
    en_text = "Pd04a4bt7Bcf97KEfgLGQw==".encode()  # convert the string to bytes
    en_text = base64.decodebytes(en_text)  # base64 decoding; the parameter is bytes and the return value is still bytes
    den_text = aes.decrypt(en_text)
    print("Plaintext:", den_text.decode("gbk"))

In both utf8 and gbk, an English character is encoded as a single byte. So the encode() function here only serves to convert the data into bytes, which base64 then decodes.
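The claim about English characters can be verified without any AES code. A minimal sketch using the base64 string from the example above:

```python
import base64

b64_text = "Pd04a4bt7Bcf97KEfgLGQw=="

# ASCII characters map to the same single byte in utf-8 and gbk,
# so the choice of character set does not matter for base64 text
same_bytes = b64_text.encode("utf-8") == b64_text.encode("gbk")
print(same_bytes)

raw = base64.b64decode(b64_text.encode())
print(len(raw))  # 16: one AES block, ready to be passed to aes.decrypt()
```

The decoded bytes are the same 16-byte ciphertext that the earlier gbk example printed.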
hexstr, base64 encoding and decoding example:

    import base64
    import binascii

    data = "hello".encode()
    data = base64.b64encode(data)
    print("base64 code:", data)
    data = base64.b64decode(data)
    print("base64 decode:", data)
    data = binascii.b2a_hex(data)
    print("hexstr code:", data)
    data = binascii.a2b_hex(data)
    print("hexstr decode:", data)

Output:

    base64 code: b'aGVsbG8='
    base64 decode: b'hello'
    hexstr code: b'68656c6c6f'
    hexstr decode: b'hello'

Note that in some AES setups the secret key or the IV is itself encoded with base64 or hexstr. In that case, the first thing to do is decode it back to bytes. Again, the parameters Python passes to AES encryption and decryption are bytes.
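Decoding an encoded key or IV back to bytes might look like the sketch below. `key_from_base64` and `key_from_hex` are hypothetical helper names, and the two sample strings are simply the article's 16-byte key encoded both ways.

```python
import base64
import binascii


def key_from_base64(encoded: str) -> bytes:
    """Decode a base64-encoded key or IV into raw bytes (hypothetical helper)."""
    return base64.b64decode(encoded)


def key_from_hex(encoded: str) -> bytes:
    """Decode a hexstr-encoded key or IV into raw bytes (hypothetical helper)."""
    return binascii.a2b_hex(encoded)


# Both strings encode the same 16-byte key used throughout this article
key_a = key_from_base64("MTIzNDU2NzgxMjM0NTY3OA==")
key_b = key_from_hex("31323334353637383132333435363738")
print(key_a, key_b)
```

The decoded value, not the encoded string, is what gets passed to AES.new().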
In addition, I remember that earlier versions of the pycryptodome library accepted string data directly for the IV and the plaintext, but the current version requires bytes, perhaps for consistency and to make the rule easier to remember.
6. Padding mode
So far the secret key, plaintext and IV I used were all exactly 16 bytes, that is, the data blocks were aligned. Padding modes solve the problem of data that is not block-aligned; which bytes are used for padding is what distinguishes the modes.
The common AES padding modes are as follows:
Mode | Meaning |
---|---|
ZeroPadding | Pad with b'\x00'. This is the zero byte, not the character '0' |
PKCS7Padding | When n bytes are needed to reach alignment, append n bytes, each with the value n |
PKCS5Padding | Same as PKCS7Padding; for AES padding I see no difference between the two |
no padding | Data that is already a multiple of 16 bytes is left as is; shorter data is padded the same way as ZeroPadding |
Here is a detail that I find many articles get wrong.
Meaning of ZeroPadding: many articles say that data whose length is a multiple of 16 bytes is not padded, and that shorter data is padded with zero bytes. That description is wrong; it actually describes no padding. ZeroPadding pads up to the next alignment whether or not the data is already aligned. Even if you have exactly 16 bytes of data, another 16 zero bytes are appended, for a total of 32 bytes.
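The distinction drawn above can be sketched in a few lines. This follows the definitions used in this article; `zero_pad` and `no_padding` are hypothetical names, and other tools may define ZeroPadding differently.

```python
BLOCK_SIZE = 16


def zero_pad(data: bytes) -> bytes:
    """ZeroPadding as described above: always add 1..16 zero bytes."""
    pad_len = BLOCK_SIZE - len(data) % BLOCK_SIZE
    return data + b"\x00" * pad_len


def no_padding(data: bytes) -> bytes:
    """'no padding' as described above: add zero bytes only if misaligned."""
    pad_len = -len(data) % BLOCK_SIZE
    return data + b"\x00" * pad_len


print(len(zero_pad(b"a" * 16)))    # 32: aligned data still gets a full block
print(len(no_padding(b"a" * 16)))  # 16: aligned data is left untouched
print(len(zero_pad(b"a" * 5)))     # 16
print(len(no_padding(b"a" * 5)))   # 16
```

One caveat applies to both: when the padding is stripped after decryption, plaintext that itself ends in b'\x00' cannot be told apart from the padding.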
You may ask: why 16 bytes? That is the size of the data block, and the website has a corresponding setting. The value used on the website here is 128 bits, that is, 16-byte alignment; it also offers 192 bits (24 bytes) and 256 bits (32 bytes). Note that in standard AES the block is always 128 bits; 192 and 256 bits are key lengths.
From here on I will speak of data-block alignment rather than multiples of 16 bytes.
Except for no padding, every padding mode pads up to the next block boundary; there is no case in which nothing is added.
Bytes to add for PKCS7Padding and PKCS5Padding:
Plaintext length value (mod 16) | Number of padding bytes added | Value per fill byte |
---|---|---|
0 | 16 | 0x10 |
1 | 15 | 0x0F |
2 | 14 | 0x0E |
3 | 13 | 0x0D |
4 | 12 | 0x0C |
5 | 11 | 0x0B |
6 | 10 | 0x0A |
7 | 9 | 0x09 |
8 | 8 | 0x08 |
9 | 7 | 0x07 |
10 | 6 | 0x06 |
11 | 5 | 0x05 |
12 | 4 | 0x04 |
13 | 3 | 0x03 |
14 | 2 | 0x02 |
15 | 1 | 0x01 |
As the table shows, plaintext that is already aligned (length mod 16 = 0) still gets padded, with 16 bytes of 0x10. ZeroPadding follows the same logic, except that every padding byte is 0x00, written b'\x00' in Python.
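The table translates directly into code. Below is a minimal standard-library sketch of PKCS7 padding for 16-byte blocks; `pkcs7_pad` and `pkcs7_unpad` are hypothetical names, not library functions.

```python
BLOCK_SIZE = 16


def pkcs7_pad(data: bytes) -> bytes:
    """Append n bytes of value n, where n brings data to the next block boundary."""
    pad_len = BLOCK_SIZE - len(data) % BLOCK_SIZE   # 1..16, never 0
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded: bytes) -> bytes:
    """Remove PKCS7 padding, checking that it is well formed."""
    if not padded or len(padded) % BLOCK_SIZE != 0:
        raise ValueError("padded data must be a non-empty multiple of 16 bytes")
    pad_len = padded[-1]   # the last byte says how many bytes were added
    if not 1 <= pad_len <= BLOCK_SIZE:
        raise ValueError("invalid PKCS7 padding")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid PKCS7 padding")
    return padded[:-pad_len]


print(pkcs7_pad(b"abc"))               # 3 bytes + 13 bytes of 0x0d
print(len(pkcs7_pad(b"a" * 16)))       # 32: aligned input gets 16 bytes of 0x10
print(pkcs7_unpad(pkcs7_pad(b"abc")))  # b'abc'
```

Slicing off exactly `pad_len` bytes, rather than stripping every trailing byte with that value, keeps plaintext intact when it happens to end with the same byte as the padding.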
After padding, AES can be used for encryption. After decryption, the padding has to be removed again. In this article these steps are implemented by hand. (pycryptodome also ships Crypto.Util.Padding, whose pad() and unpad() functions cover PKCS7 but not ZeroPadding.)
7. Complete implementation in Python

    from Crypto.Cipher import AES
    import base64
    import binascii


    # Data class: wraps bytes and converts to and from str, base64 and hexstr
    class MData():
        def __init__(self, data=b"", characterSet='utf-8'):
            # data must be bytes
            self.data = data
            self.characterSet = characterSet

        def saveData(self, FileName):
            with open(FileName, 'wb') as f:
                f.write(self.data)

        def fromString(self, data):
            self.data = data.encode(self.characterSet)
            return self.data

        def fromBase64(self, data):
            self.data = base64.b64decode(data.encode(self.characterSet))
            return self.data

        def fromHexStr(self, data):
            self.data = binascii.a2b_hex(data)
            return self.data

        def toString(self):
            return self.data.decode(self.characterSet)

        def toBase64(self):
            return base64.b64encode(self.data).decode()

        def toHexStr(self):
            return binascii.b2a_hex(self.data).decode()

        def toBytes(self):
            return self.data

        def __str__(self):
            # Fall back to base64 when the bytes are not valid text
            try:
                return self.toString()
            except Exception:
                return self.toBase64()


    # Encapsulation class
    class AEScryptor():
        def __init__(self, key, mode, iv=b'', paddingMode="NoPadding", characterSet="utf-8"):
            '''
            Build an AES object
            key: secret key, bytes
            mode: only AES.MODE_CBC and AES.MODE_ECB are supported
            iv: iv offset, bytes
            paddingMode: padding mode, NoPadding by default; one of
                         NoPadding, ZeroPadding, PKCS5Padding, PKCS7Padding
            characterSet: character set encoding
            '''
            self.key = key
            self.mode = mode
            self.iv = iv
            self.characterSet = characterSet
            self.paddingMode = paddingMode
            self.data = b""

        def __ZeroPadding(self, data):
            # Adds 1..16 zero bytes, so aligned data still gets a full block
            return data + b'\x00' * (16 - len(data) % 16)

        def __StripZeroPadding(self, data):
            # Plaintext that itself ends in b'\x00' cannot be told apart from padding
            return data.rstrip(b'\x00')

        def __PKCS5_7Padding(self, data):
            needSize = 16 - len(data) % 16  # 1..16
            return data + bytes([needSize]) * needSize

        def __StripPKCS5_7Padding(self, data):
            # The last byte says how many padding bytes to cut off
            paddingSize = data[-1]
            if not 1 <= paddingSize <= 16:
                raise ValueError("Invalid PKCS5/PKCS7 padding")
            return data[:-paddingSize]

        def __paddingData(self, data):
            if self.paddingMode == "NoPadding":
                if len(data) % 16 == 0:
                    return data
                return self.__ZeroPadding(data)
            elif self.paddingMode == "ZeroPadding":
                return self.__ZeroPadding(data)
            elif self.paddingMode in ("PKCS5Padding", "PKCS7Padding"):
                return self.__PKCS5_7Padding(data)
            raise ValueError("Unsupported padding mode: %s" % self.paddingMode)

        def __stripPaddingData(self, data):
            if self.paddingMode in ("NoPadding", "ZeroPadding"):
                return self.__StripZeroPadding(data)
            elif self.paddingMode in ("PKCS5Padding", "PKCS7Padding"):
                return self.__StripPKCS5_7Padding(data)
            raise ValueError("Unsupported padding mode: %s" % self.paddingMode)

        def setCharacterSet(self, characterSet):
            '''
            Set the character set encoding
            characterSet: character set encoding
            '''
            self.characterSet = characterSet

        def setPaddingMode(self, mode):
            '''
            Set the padding mode
            mode: one of NoPadding, ZeroPadding, PKCS5Padding, PKCS7Padding
            '''
            self.paddingMode = mode

        def decryptFromBase64(self, entext):
            '''
            AES-decrypt a base64-encoded string
            entext: data type str
            '''
            mData = MData(characterSet=self.characterSet)
            self.data = mData.fromBase64(entext)
            return self.__decrypt()

        def decryptFromHexStr(self, entext):
            '''
            AES-decrypt a hexstr-encoded string
            entext: data type str
            '''
            mData = MData(characterSet=self.characterSet)
            self.data = mData.fromHexStr(entext)
            return self.__decrypt()

        def decryptFromString(self, entext):
            '''
            AES-decrypt a string
            entext: data type str
            '''
            mData = MData(characterSet=self.characterSet)
            self.data = mData.fromString(entext)
            return self.__decrypt()

        def decryptFromBytes(self, entext):
            '''
            AES-decrypt binary data
            entext: data type bytes
            '''
            self.data = entext
            return self.__decrypt()

        def encryptFromString(self, data):
            '''
            AES-encrypt a string
            data: string to be encrypted, data type str
            '''
            self.data = data.encode(self.characterSet)
            return self.__encrypt()

        def __createAES(self):
            # A new AES object is created for every operation, whatever the mode
            if self.mode == AES.MODE_CBC:
                return AES.new(self.key, self.mode, self.iv)
            elif self.mode == AES.MODE_ECB:
                return AES.new(self.key, self.mode)
            raise ValueError("This mode is not supported")

        def __encrypt(self):
            aes = self.__createAES()
            data = self.__paddingData(self.data)
            enData = aes.encrypt(data)
            return MData(enData)

        def __decrypt(self):
            aes = self.__createAES()
            data = aes.decrypt(self.data)
            return MData(self.__stripPaddingData(data), characterSet=self.characterSet)


    if __name__ == '__main__':
        key = b"1234567812345678"
        iv = b"0000000000000000"
        aes = AEScryptor(key, AES.MODE_CBC, iv, paddingMode="ZeroPadding", characterSet='utf-8')

        data = "study hard"
        rData = aes.encryptFromString(data)
        print("Ciphertext:", rData.toBase64())
        rData = aes.decryptFromBase64(rData.toBase64())
        print("Plaintext:", rData)

This is only a light wrapper. The data returned by encryption and decryption can be encoded with toBase64() or toHexStr(). I did not pad the key or the IV; you can implement that yourself with the MData class. For more details, see the comments in the source code.