Python implements RSA encryption and decryption of text files

Posted by samona on Fri, 28 Jan 2022 23:53:18 +0100

I have recently been working on a project in Python, and this post records the file encryption and decryption problems I ran into along the way.
Searching for a Python version of the encryption algorithm turns up plenty of results, but most of them are the same article republished or reprinted on different platforms, and the examples only cover the simplest cases. A real project is more complicated than that. In my case, I need to encrypt a database script file generated by mysqldump, and the online examples cannot be used as-is, so I had to work it out myself and write it down.

RSA algorithm

What is RSA algorithm?

The project uses RSA, an asymmetric encryption algorithm. I won't explain the algorithm in detail. The key points are:

  • The public key is used for encryption
  • The private key is used for decryption
  • len_in_byte(raw_data) = len_in_bit(key)/8 - 11. For example, with a 1024-bit key, the content that can be encrypted at one time is at most 1024 / 8 - 11 = 117 bytes

Why subtract 11 bytes?

Because we use PKCS1Padding, which takes up at least 11 bytes of every block, the maximum length of plaintext that can be encrypted at one time is reduced by those 11 bytes.
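The arithmetic above can be checked with a small helper. This is only an illustrative sketch: `max_plaintext_len` and `PKCS1_V15_OVERHEAD` are hypothetical names, not part of any library, and the 11-byte figure applies to PKCS#1 v1.5 padding only.

```python
PKCS1_V15_OVERHEAD = 11  # minimum padding bytes used by PKCS#1 v1.5 encryption


def max_plaintext_len(key_bits):
    """Largest plaintext (in bytes) that fits in one RSA block with PKCS#1 v1.5 padding."""
    return key_bits // 8 - PKCS1_V15_OVERHEAD


for bits in (1024, 2048, 4096):
    print(bits, max_plaintext_len(bits))
```

A longer key raises the limit per block, but the plaintext still has to be segmented once it exceeds that limit.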

What problems might you encounter?

Based on the three points above, we can anticipate the problems involved in encrypting and decrypting a file:

  • The length of plaintext that can be encrypted at one time depends on the key length, so a file cannot simply be read out in full and encrypted in a single call.
  • If the file is large, reading its entire contents into memory at once is not an option. It could leave the server unable to respond to other requests, which is clearly unacceptable.
  • The encrypted text has to be decrypted again later. If it is read back in blocks of a different length, decryption will fail and the database backup file becomes useless, which is the more dangerous outcome.
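The first two points can be sketched with a generator that reads a file in fixed-size pieces instead of loading it whole. This is a minimal illustration, not the code used in the project: `read_chunks` is a hypothetical helper, and the 300-byte temporary file merely stands in for a real backup.

```python
import os
import tempfile


def read_chunks(path, chunk_size=117):
    """Yield successive chunks of at most chunk_size bytes without loading the whole file."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk


# Small demonstration with a temporary 300-byte file
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 300)
    demo_path = tmp.name

chunk_sizes = [len(chunk) for chunk in read_chunks(demo_path)]
os.remove(demo_path)
print(chunk_sizes)  # [117, 117, 66]
```

Only one chunk is held in memory at a time, and every chunk except the last has exactly the agreed length.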

Do It

Install the dependency (Python version 3.7.4):

    pip install pycryptodomex -i https://pypi.tuna.tsinghua.edu.cn/simple/

Import module:

    import base64
    from Cryptodome import Random
    from Cryptodome.PublicKey import RSA
    from Cryptodome.Cipher import PKCS1_v1_5 as Cipher_pkcs1_v1_5
    from Cryptodome.Signature import PKCS1_v1_5 as Signature_pkcs1_v1_5

Generate the public and private keys. Note that the key generated here is 1024 bits long:

    # Source of random bytes used for key generation
    random_generator = Random.new().read
    # Generate an RSA key pair
    rsa = RSA.generate(1024, random_generator)
    private_pem = str(rsa.exportKey(), encoding="utf-8")
    with open("client-private.pem", "w") as f:
        f.write(private_pem)

    public_pem = str(rsa.publickey().exportKey(), encoding="utf-8")
    with open("client-public.pem", "w") as f:
        f.write(public_pem)

Encryption. The incoming plaintext is split into segments here: because the key we generated is 1024 bits long, the plaintext encrypted at one time cannot exceed 117 bytes.

    def rsa_encrypt(plaintext, pub_key):
        '''
        RSA encryption
        :param plaintext: plaintext string
        :param pub_key: public key
        :return: base64 ciphertext, one line per encrypted segment
        '''
        message = plaintext.encode("utf-8")
        default_length = 117  # 1024 / 8 - 11, where 1024 is the key length in bits
        rsakey = RSA.importKey(pub_key)
        cipher = Cipher_pkcs1_v1_5.new(rsakey)
        # No segmentation required
        if len(message) <= default_length:
            return default_rsa_encrypt(cipher, message)
        # Segmentation required: encrypt 117 bytes at a time
        result = []
        for offset in range(0, len(message), default_length):
            segment = message[offset:offset + default_length]
            result.append(default_rsa_encrypt(cipher, segment))
        return "\n".join(result)

    def default_rsa_encrypt(cipher, message):
        # Encrypt one segment and return it as base64 text
        ciphertext = base64.b64encode(cipher.encrypt(message))
        return ciphertext.decode("utf-8")

Decryption:

    def rsa_decrypt(ciphertext, priv_key):
        '''
        RSA decryption
        :param ciphertext: base64 ciphertext
        :param priv_key: private key
        '''
        message = base64.b64decode(ciphertext)
        length = len(message)
        default_length = 128  # 1024 / 8: each ciphertext block is as long as the key
        rsakey = RSA.importKey(priv_key)
        cipher = Cipher_pkcs1_v1_5.new(rsakey)
        if length <= default_length:
            return default_rsa_decrypt(cipher, message)
        # Segmentation required: decrypt 128 bytes at a time
        result = []
        for offset in range(0, length, default_length):
            segment = message[offset:offset + default_length]
            result.append(default_rsa_decrypt(cipher, segment))
        # default_rsa_decrypt already returns str, so no further decoding is needed
        return "".join(result)

    def default_rsa_decrypt(cipher, message):
        # decrypt() returns the sentinel (None here) when the padding check fails
        plaintext = cipher.decrypt(message, None)
        if plaintext is None:
            raise ValueError("RSA decryption failed")
        return plaintext.decode("utf-8")
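The decryption side uses 128 rather than 117 because an RSA ciphertext block is always as long as the key (1024 / 8 = 128 bytes), however short the plaintext was. The sketch below only illustrates the sizes involved: `fake_block` is a stand-in made of zero bytes, not real ciphertext.

```python
import base64

KEY_BITS = 1024
BLOCK_BYTES = KEY_BITS // 8  # 128 bytes of ciphertext per RSA block

# Stand-in for a real ciphertext block: only its length matters here
fake_block = bytes(BLOCK_BYTES)
encoded = base64.b64encode(fake_block)

print(len(encoded))                    # 172 base64 characters per block
print(len(base64.b64decode(encoded)))  # 128 bytes again after decoding
```

So each encrypted segment occupies 172 base64 characters on its own line, and decoding one such line yields exactly one 128-byte block.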

To encrypt and decrypt files, given the problems raised at the beginning, we read and encrypt line by line, and the ciphertext is also written out line by line:

    def rsa_encrypt_file(file_path, save_path, pub_key):
        '''
        RSA-encrypt a text file
        :param file_path: path of the file to encrypt
        :param save_path: path where the encrypted file is stored
        :param pub_key: public key
        '''
        with open(file_path, "r", encoding="utf-8") as f:
            line = f.readline()  # Read one line
            while line:
                context = rsa_encrypt(line, pub_key)  # Encrypt the line
                with open(save_path, "a", encoding="utf-8") as w:
                    w.write(context + "\n")
                line = f.readline()  # Must stay inside the loop, or it never ends

    def rsa_decrypt_file(file_path, save_path, priv_key):
        '''
        RSA-decrypt a text file
        :param file_path: path of the file to decrypt
        :param save_path: path where the decrypted file is stored
        :param priv_key: private key
        '''
        with open(file_path, "r", encoding="utf-8") as f:
            line = f.readline()
            while line:
                context = rsa_decrypt(line.strip("\n"), priv_key)
                with open(save_path, "a", encoding="utf-8") as w:
                    w.write(context)
                line = f.readline()

In my first tests I used a long line of digits typed in at random, and everything worked. But when I ran it on my actual database script file, encryption succeeded while decoding failed after decryption. I was puzzled and assumed it was a character set problem, so I replaced utf-8 with gb2312, and both encryption and decryption succeeded. I was elated, until I encrypted and decrypted another backup file and hit the decoding failure again. I couldn't sleep that night~

Then I noticed the phrase "incomplete multibyte sequence" in the error message and immediately understood. My script file contains Chinese: utf8 encodes a Chinese character in three bytes, and gb2312 encodes it in two. With any multi-byte encoding, cutting the bytes at a fixed length can split a Chinese character into two parts, which then cannot be decoded back into the correct character. With the problem clear, what remained was how to solve it.
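The failure is easy to reproduce in isolation. The following is a small sketch with a made-up three-character string and an arbitrary cut position, not the project's data:

```python
text = "数据库"  # three Chinese characters
data = text.encode("utf-8")
print(len(data))  # 9: three bytes per character

# Cut in the middle of the second character
head, tail = data[:4], data[4:]
try:
    head.decode("utf-8")
    split_ok = True
except UnicodeDecodeError as exc:
    split_ok = False
    print("decode failed:", exc)

# Joining the bytes before decoding restores the original text
restored = (head + tail).decode("utf-8")
print(restored == text)
```

Each piece on its own is not valid text, yet the bytes are all still there: the problem lies only in decoding the pieces separately.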

Because this is a script file, a mistake here could make the script fail to execute and the database restore fail, which would defeat the purpose of the project~

So I came up with an approach: accumulate the characters of each line while tracking their encoded length, and as soon as adding the next character would exceed 117 bytes, start a new piece with that character instead. The code is as follows:

    def cut_string(message, length=117):
        '''
        Split a string into pieces whose utf-8 encoding is at most `length` bytes,
        without ever cutting a character in half
        '''
        result = []
        temp_char = []
        temp_size = 0  # Encoded size of the characters accumulated so far
        for msg in message:  # Traverse each character
            msg_size = len(msg.encode("utf-8"))  # Encoded size of this character
            if temp_size + msg_size <= length:
                # Still within the agreed length: add it to the current piece
                temp_char.append(msg)
                temp_size += msg_size
            else:
                # The agreed length would be exceeded: start the next piece
                result.append("".join(temp_char))
                temp_char = [msg]
                temp_size = msg_size
        result.append("".join(temp_char))
        return result
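The same goal can also be reached at the byte level. The sketch below is an alternative to cut_string, not the author's method: `split_utf8` is a hypothetical helper that encodes once and then backs each cut up to the nearest character boundary, relying on the fact that utf-8 continuation bytes always have the bit pattern 10xxxxxx.

```python
def split_utf8(data, limit=117):
    """Split UTF-8 bytes into chunks of at most `limit` bytes without cutting a character."""
    chunks = []
    start = 0
    while start < len(data):
        end = min(start + limit, len(data))
        cut = end
        # data[cut] would open the next chunk; if it is a continuation byte
        # (0b10xxxxxx), the cut falls inside a character, so move it back
        while cut < len(data) and cut > start and (data[cut] & 0xC0) == 0x80:
            cut -= 1
        if cut == start:
            cut = end  # no boundary found (invalid UTF-8 or tiny limit): hard cut
        chunks.append(data[start:cut])
        start = cut
    return chunks


# Made-up SQL line mixing ASCII and Chinese characters
line = "INSERT INTO t VALUES ('" + "中文数据" * 40 + "');\n"
data = line.encode("utf-8")
chunks = split_utf8(data)
decoded = "".join(chunk.decode("utf-8") for chunk in chunks)
print(len(chunks), decoded == line)
```

Every chunk decodes on its own, so the pieces can be encrypted and decrypted independently and then joined.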

The encryption method needs to be readjusted:

    def rsa_encrypt_file(file_path, save_path, pub_key):
        '''
        RSA-encrypt a text file
        :param file_path: path of the file to encrypt
        :param save_path: path where the encrypted file is stored
        :param pub_key: public key
        '''
        with open(file_path, "r", encoding="utf-8") as f:
            line = f.readline()  # Read one line
            while line:
                cut_lines = cut_string(line)  # Cut so that no Chinese character is split
                for cut_line in cut_lines:
                    context = rsa_encrypt(cut_line, pub_key)  # Encrypt the cut piece
                    with open(save_path, "a", encoding="utf-8") as w:
                        w.write(context + "\n")
                line = f.readline()

That solves the problem. In fact, with the cut_string method in place, the encryption and decryption methods written earlier no longer need to segment their input, but I have kept that code anyway.

The approach above is very inefficient because it encrypts and decrypts line by line. A 300 MB script file takes 40 minutes to encrypt, which is far too slow. So I changed the strategy to compress first and then encrypt, which means reading and writing binary files. The final implementation is as follows:
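To see why compressing first helps, it is enough to count how many 117-byte blocks have to be encrypted before and after compression. The sketch below rests on assumptions: the post does not say which compression tool was used, so gzip from the standard library stands in for it, and the generated SQL text is a made-up substitute for a real mysqldump script.

```python
import gzip
import math

CHUNK = 117  # plaintext bytes per RSA block for a 1024-bit key

# Hypothetical stand-in for a mysqldump script: highly repetitive SQL text
sql = "".join(
    "INSERT INTO demo_table VALUES (%d, 'row %d');\n" % (i, i) for i in range(5000)
).encode("utf-8")

compressed = gzip.compress(sql)

blocks_raw = math.ceil(len(sql) / CHUNK)
blocks_compressed = math.ceil(len(compressed) / CHUNK)
print(blocks_raw, blocks_compressed)

# Compression is lossless, so the script is recovered after decryption
roundtrip_ok = gzip.decompress(compressed) == sql
```

Fewer blocks means fewer RSA operations, which is where most of the time goes. Compressed data is binary, so it must be split by bytes rather than by characters.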

    def rsa_encrypt_binfile(file_path, save_path, pub_key):
        '''
        RSA-encrypt a binary file
        :param file_path: path of the file to encrypt
        :param save_path: path where the encrypted file is stored
        :param pub_key: public key
        '''
        with open(file_path, "rb") as f:
            message = f.read()
        length = len(message)
        default_length = 117  # 1024 / 8 - 11, where 1024 is the key length in bits
        rsakey = RSA.importKey(pub_key)
        cipher = Cipher_pkcs1_v1_5.new(rsakey)
        # Encrypt 117 bytes at a time. A file shorter than that yields a single
        # segment, so no separate branch is needed (a separate branch followed by
        # the loop would encrypt a short file twice)
        result = []
        for offset in range(0, length, default_length):
            segment = message[offset:offset + default_length]
            result.append(base64.b64encode(cipher.encrypt(segment)))

        # One base64 line per encrypted segment
        with open(save_path, "ab+") as w:
            for ciphertext in result:
                w.write(ciphertext + b"\n")

    def rsa_decrypt_binfile(file_path, save_path, priv_key):
        '''
        RSA-decrypt a binary file
        :param file_path: path of the file to decrypt
        :param save_path: path where the decrypted file is stored
        :param priv_key: private key
        '''
        # Import the key and build the cipher once rather than for every line
        rsakey = RSA.importKey(priv_key)
        cipher = Cipher_pkcs1_v1_5.new(rsakey)
        with open(file_path, "rb") as f, open(save_path, "ab+") as w:  # Append write
            for line in f:
                message = base64.b64decode(line.strip(b"\n"))
                # decrypt() returns the sentinel (None here) when the padding check fails
                plaintext = cipher.decrypt(message, None)
                if plaintext is None:
                    raise ValueError("RSA decryption failed")
                w.write(plaintext)

Topics: Python