Cryptography series: a detailed explanation of the bcrypt password hashing algorithm

Posted by resting on Mon, 20 Sep 2021 19:10:02 +0200

Brief introduction

The algorithm introduced today is bcrypt. bcrypt is a password hashing function designed by Niels Provos and David Mazières. It is based on the Blowfish cipher and was presented at USENIX in 1999.

Besides using a salt to resist rainbow table attacks, a very important feature of bcrypt is that it is adaptive: the cost of computing a hash can be tuned. Even as computers become more powerful, hashing can be slowed down by increasing the number of iterations, which keeps brute-force search attacks expensive.

The bcrypt function is the default password hashing algorithm for OpenBSD and other systems, including some Linux distributions (such as SUSE Linux).

How bcrypt works

Let's first review how Blowfish works. Before it can encrypt anything, Blowfish has to generate its P-array (the subkeys) and S-boxes from the key, and this takes some time: each new key requires preprocessing equivalent to encrypting about 4 KB of text. Compared with other block ciphers, this is very slow. However, once the tables are generated, that is, as long as the key stays the same, Blowfish is a very fast block cipher.
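The "about 4 KB" figure can be checked with a little arithmetic. The sketch below assumes the standard Blowfish layout (18 subkeys plus four S-boxes of 256 32-bit words, filled two words at a time by one 64-bit block encryption). The constant names are illustrative only.

```python
# Sizes of the key-dependent Blowfish state (standard layout, assumed here).
P_ENTRIES = 18        # 32-bit subkeys
SBOX_COUNT = 4        # number of S-boxes
SBOX_ENTRIES = 256    # 32-bit words per S-box
WORD_BYTES = 4
BLOCK_BYTES = 8       # one Blowfish block is 64 bits, i.e. two 32-bit words

total_words = P_ENTRIES + SBOX_COUNT * SBOX_ENTRIES
state_bytes = total_words * WORD_BYTES

# Each block encryption fills two 32-bit words of the state.
encryptions_per_key = state_bytes // BLOCK_BYTES

print(f"state size: {state_bytes} bytes")
print(f"block encryptions per key setup: {encryptions_per_key}")
```

So a single key setup costs several hundred block encryptions before the first byte of real data is processed.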

Is it good to be so slow?

Yes. A normal application does not change its key often, so the preprocessing is done only once and every later use is fast.

A malicious attacker, on the other hand, has to pay for the long preprocessing on every new key that is tried, so attacking Blowfish by guessing keys is not cost-effective. This is why Blowfish holds up well against dictionary attacks.

Provos and Mazières took advantage of this property and developed it further. They designed a new key setup algorithm for Blowfish and called the resulting cipher "Eksblowfish" ("expensive key schedule Blowfish"). In the initial key setup of bcrypt, both the salt and the password are used to set the subkeys. Then, starting from that state, a number of rounds of the standard Blowfish key schedule are applied, using the password and the salt alternately as the key, so that each round depends on the key state left by the previous round. In theory, bcrypt is not stronger than Blowfish, but because the number of rekeying rounds in bcrypt is configurable, it can better resist brute-force attacks by increasing that number.

Implementation of bcrypt algorithm

Put simply, the bcrypt hash is the result of encrypting the string OrpheanBeholderScryDoubt 64 times with Blowfish. You may ask: isn't bcrypt used to hash passwords? Why does it encrypt a fixed string?

Don't worry: bcrypt uses the password as the key material for encrypting that string, so the result still depends on the password. Let's look at the basic algorithm of bcrypt:

Function bcrypt
   Input:
      cost:     Number (4..31)                      log2(Iterations). e.g. 12 ==> 2^12 = 4,096 iterations
      salt:     array of Bytes (16 bytes)           random salt
      password: array of Bytes (1..72 bytes)        UTF-8 encoded password
   Output: 
      hash:     array of Bytes (24 bytes)

   //Initialize Blowfish state with expensive key setup algorithm
   //P: array of 18 subkeys (UInt32[18])
   //S: Four substitution boxes (S-boxes), S0...S3. Each S-box is 1,024 bytes (UInt32[256])
   P, S <- EksBlowfishSetup(cost, salt, password)   

   //Repeatedly encrypt the text "OrpheanBeholderScryDoubt" 64 times
   ctext <- "OrpheanBeholderScryDoubt"  //24 bytes ==> three 64-bit blocks
   repeat (64)
      ctext <-  EncryptECB(P, S, ctext) //encrypt using standard Blowfish in ECB mode

   //24-byte ctext is resulting password hash
   return Concatenate(cost, salt, ctext)

The above function bcrypt has three inputs and one output.

Among the inputs, cost controls the number of rounds and can be chosen by the caller: the key setup runs 2^cost iterations, so each increment of cost doubles the work.

salt is a random value mixed into the computation so that the same password produces different hashes.

password is the password we want to hash.

The final output is the encrypted result hash.

With these three inputs, we call the EksBlowfishSetup function to initialize the 18 subkeys and the four 1 KB S-boxes, giving the final P and S.

Then P and S are used to encrypt "OrpheanBeholderScryDoubt" with Blowfish 64 times, which gives the final result.
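To make the "24 bytes, three 64-bit blocks" remark concrete, here is a small sketch that only splits the magic string into blocks and 32-bit words. It does not perform any Blowfish encryption, and the big-endian word order is an assumption of this sketch.

```python
import struct

MAGIC = b"OrpheanBeholderScryDoubt"

# 24 bytes split into three 8-byte (64-bit) Blowfish blocks.
blocks = [MAGIC[i:i + 8] for i in range(0, len(MAGIC), 8)]

# Each block is handled as two 32-bit words; big-endian order is assumed here.
words = struct.unpack(">6I", MAGIC)

print(len(MAGIC), blocks)
print([f"{w:08x}" for w in words])
```

In ECB mode each of the three blocks is encrypted independently, and the whole 24-byte value is fed back in 64 times.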

Next, let's look at the EksBlowfishSetup algorithm:

Function EksBlowfishSetup
   Input:
      password: array of Bytes (1..72 bytes)   UTF-8 encoded password
      salt:     array of Bytes (16 bytes)      random salt
      cost:     Number (4..31)                 log2(Iterations). e.g. 12 ==> 2^12 = 4,096 iterations
   Output: 
      P:        array of UInt32                array of 18 per-round subkeys
      S1..S4:   array of UInt32                array of four SBoxes; each SBox is 256 UInt32 (i.e. 1,024 bytes = 1 KB)

   //Initialize P (Subkeys), and S (Substitution boxes) with the hex digits of pi 
   P, S  <- InitialState() 

   //Permutate P and S based on the password and salt     
   P, S  <- ExpandKey(P, S, salt, password)

   //This is the "Expensive" part of the "Expensive Key Setup".
   //Otherwise the key setup is identical to Blowfish.
   repeat (2^cost)
      P, S  <-  ExpandKey(P, S, 0, password)
      P, S  <- ExpandKey(P, S, 0, salt)

   return P, S

The code is very simple. EksBlowfishSetup takes the three parameters above and returns the final P, containing 18 subkeys, and the four 1 KB S-boxes.

First, initialize to get the initial P and S.

Then call ExpandKey with the salt and the password to produce the first round of P and S.

Then loop 2^cost times, calling ExpandKey with the password and then with the salt as the key on each iteration, and finally return P and S.
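The effect of the cost parameter can be sketched by counting ExpandKey calls in the setup routine described above: one initial call, then two calls per loop iteration. The function name expand_key_calls is made up for this illustration.

```python
def expand_key_calls(cost: int) -> int:
    """Number of ExpandKey calls made by the setup routine described above."""
    if not 4 <= cost <= 31:
        raise ValueError("cost must be between 4 and 31")
    rounds = 1 << cost       # 2 ** cost loop iterations
    return 1 + 2 * rounds    # one initial call, then two calls per iteration

for cost in (4, 10, 12, 13):
    print(cost, 1 << cost, expand_key_calls(cost))
```

Raising cost by one roughly doubles the number of calls, which is what lets the hashing time keep pace with faster hardware.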

Finally, let's take a look at the implementation of ExpandKey:

Function ExpandKey
   Input:
      password: array of Bytes (1..72 bytes)  UTF-8 encoded password
      salt:     Byte[16]                      random salt
      P:        array of UInt32               Array of 18 subkeys
      S1..S4:   UInt32[256] each              Four 1 KB SBoxes
   Output: 
      P:        array of UInt32               Array of 18 per-round subkeys
      S1..S4:   UInt32[256] each              Four 1 KB SBoxes

   //Mix password into the P subkeys array
   for n   <- 1 to 18 do
      Pn   <-  Pn xor password[32(n-1)..32n-1] //treat the password as cyclic

   //Treat the 128-bit salt as two 64-bit halves (the Blowfish block size).
   saltHalf[0]   <-  salt[0..63]  //Lower 64-bits of salt
   saltHalf[1]   <-  salt[64..127]  //Upper 64-bits of salt

   //Initialize an 8-byte (64-bit) buffer with all zeros.
   block   <-  0

   //Mix internal state into P-boxes   
   for n   <-  1 to 9 do
      //xor 64-bit block with a 64-bit salt half
      block   <-  block xor saltHalf[(n-1) mod 2] //each iteration alternating between saltHalf[0], and saltHalf[1]

      //encrypt block using current key schedule
      block   <-  Encrypt(P, S, block) 
      P[2n-1] <-  block[0..31]   //lower 32-bits of block (P is indexed 1..18)
      P[2n]   <-  block[32..63]  //upper 32-bits of block

   //Mix encrypted state into the internal S-boxes of state
   for i   <- 1 to 4 do
      for n   <- 0 to 127 do
         block   <- Encrypt(P, S, block xor saltHalf[(n+1) mod 2]) //as above, continuing the alternation of salt halves
         Si[2n]     <- block[0..31]  //lower 32-bits
         Si[2n+1]   <-  block[32..63]  //upper 32-bits
   return P, S

ExpandKey does the actual work of generating P and S. The procedure is relatively complex; you can study it in detail if you are interested.
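Two of the simpler steps in ExpandKey can be sketched without a Blowfish implementation: reading the key cyclically as 32-bit words and XORing them into the 18 subkeys, and splitting the 16-byte salt into two 64-bit halves. This is only an illustration under stated assumptions: the helper names are hypothetical, big-endian word order is assumed, and the subkeys are zero placeholders rather than the real initial values derived from the hex digits of pi.

```python
from itertools import cycle, islice

def cyclic_words(data: bytes, count: int) -> list[int]:
    """Read `count` big-endian 32-bit words from `data`, wrapping around."""
    stream = cycle(data)
    return [int.from_bytes(bytes(islice(stream, 4)), "big") for _ in range(count)]

def mix_key_into_p(p: list[int], key: bytes) -> list[int]:
    """XOR the cyclic key stream into the subkeys (first loop of ExpandKey)."""
    return [pn ^ kw for pn, kw in zip(p, cyclic_words(key, len(p)))]

def salt_halves(salt: bytes) -> tuple[bytes, bytes]:
    """Split the 16-byte salt into two 8-byte halves, one per Blowfish block."""
    if len(salt) != 16:
        raise ValueError("salt must be 16 bytes")
    return salt[:8], salt[8:]

# Placeholder subkeys: real Blowfish starts from the hex digits of pi.
p_placeholder = [0] * 18
mixed = mix_key_into_p(p_placeholder, b"abc")
print([f"{w:08x}" for w in mixed[:4]])
```

Because the key is treated as cyclic, a short key such as "abc" is simply repeated until all 18 subkeys have been touched.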

Structure of bcrypt hash

We can use bcrypt to hash a password and store the result in the system as a bcrypt hash string. The format of a bcrypt hash string is as follows:

$2b$[cost]$[22 character salt][31 character hash]

For example:

$2a$10$N9qo8uLOickgx2ZMRZoMyeIjZAgcfl7p92ldGxad68LJZdL17lhWy
\__/\/ \____________________/\_____________________________/
 Alg Cost      Salt                        Hash

In the example above, $2a$ is the identifier of the hash algorithm, here bcrypt.

10 is the cost factor, meaning 2^10, that is, 1,024 rounds.

N9qo8uLOickgx2ZMRZoMye is the 16-byte (128-bit) salt, encoded as 22 characters in bcrypt's variant of base64.

The final IjZAgcfl7p92ldGxad68LJZdL17lhWy is the hash, encoded the same way as 31 characters. The computed hash is 24 bytes (192 bits), but 31 characters only have room for its first 23 bytes.
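The layout can be checked by taking the example string apart. The following is a minimal sketch, not a validating parser: the function names are hypothetical, and the decoder assumes bcrypt's own base64 alphabet (which starts with "./" and uses no padding). Note that 31 characters carry 186 bits, so the hash part decodes to 23 bytes.

```python
BCRYPT_ALPHABET = "./ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"

def bcrypt_b64decode(text: str) -> bytes:
    """Decode bcrypt's unpadded base64 variant, dropping leftover bits."""
    bits = 0
    nbits = 0
    out = bytearray()
    for ch in text:
        bits = (bits << 6) | BCRYPT_ALPHABET.index(ch)
        nbits += 6
        if nbits >= 8:
            nbits -= 8
            out.append((bits >> nbits) & 0xFF)
            bits &= (1 << nbits) - 1   # keep only the bits not yet emitted
    return bytes(out)

def parse_bcrypt_hash(hash_string: str) -> dict:
    """Split a bcrypt hash string into version, cost, salt and digest."""
    _, version, cost, rest = hash_string.split("$")
    if len(rest) != 53:
        raise ValueError("expected 22 salt characters plus 31 hash characters")
    return {
        "version": version,
        "cost": int(cost),
        "salt": bcrypt_b64decode(rest[:22]),
        "digest": bcrypt_b64decode(rest[22:]),
    }

parsed = parse_bcrypt_hash("$2a$10$N9qo8uLOickgx2ZMRZoMyeIjZAgcfl7p92ldGxad68LJZdL17lhWy")
print(parsed["version"], parsed["cost"], len(parsed["salt"]), len(parsed["digest"]))
```

Because everything needed for verification (version, cost, salt) is stored in the string itself, a verifier only needs the candidate password and this one field.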

History of the hash format

This hash format follows the Modular Crypt Format used when storing passwords in OpenBSD password files. Originally, the identifiers were defined as follows:

$1$: MD5-based crypt ('md5crypt')
$2$: Blowfish-based crypt ('bcrypt')
$sha1$: SHA-1-based crypt ('sha1crypt')
$5$: SHA-256-based crypt ('sha256crypt')
$6$: SHA-512-based crypt ('sha512crypt')

However, the original specification did not define how to handle non-ASCII characters or the null terminator. The revised specification stipulates that when hashing a string:

The string must be UTF-8 encoded
The null terminator must be included

Because of these changes, the version identifier of bcrypt was changed to $2a$.
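In Python terms, the two rules amount to something like the sketch below. The helper name prepare_password is hypothetical, and this only shows how the key bytes would be built, not how any particular library does it.

```python
def prepare_password(password: str) -> bytes:
    """Encode a password as UTF-8 and append the null terminator."""
    return password.encode("utf-8") + b"\x00"

key = prepare_password("pässword")
print(key, len(key))
```

Non-ASCII characters take more than one byte in UTF-8, so the byte length of the key can exceed the number of characters the user typed.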

Then, in June 2011, a bug was discovered in crypt_blowfish, a PHP implementation of bcrypt: it mishandled characters with the 8th bit set. The maintainers suggested that system administrators update their existing password databases, replacing $2a$ with $2x$ to mark those hashes as bad (they must be checked with the old, buggy algorithm). They also suggested that crypt_blowfish emit the prefix $2y$ for hashes generated by the fixed algorithm. This change is limited to PHP's crypt_blowfish.

Then, in February 2014, a bug was found in OpenBSD's bcrypt implementation as well. It stored the length of the string in an unsigned char (an 8-bit byte), so the length overflowed if the password was longer than 255 bytes.
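The wraparound itself is easy to reproduce. The sketch below only imitates what happens when a length is squeezed into 8 bits; it is not OpenBSD's code, and the function name is invented for the example.

```python
def stored_length_8bit(password: bytes) -> int:
    """Length as it would appear after being stored in an 8-bit unsigned value."""
    return len(password) & 0xFF   # same as len(password) % 256

for n in (72, 255, 256, 260):
    print(n, stored_length_8bit(b"x" * n))
```

A 260-byte password would therefore be treated as if it were only 4 bytes long.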

Because bcrypt was created for OpenBSD, when this bug appeared in their library they decided to bump the version identifier to $2b$.
