How and why to override the equals and hashCode methods in Java

Posted by Dave3765 on Sat, 15 Jan 2022 04:12:49 +0100

Foreword: to compare Java objects by value rather than by reference, a class needs to override both the equals method and the hashCode method, and hashCode implementations commonly use the prime number 31. Let's see why.

1, Requirements:

Compare whether two objects are equal. For the User object below, two instances are considered equal as long as their name and age are equal.

2, Solution:

Override the equals method and the hashCode method of the class:

package com.peppa.user.entity;

import org.springframework.util.StringUtils;

/**
 * User entity
 *
 */
public class User {
    private String id;
    private String name;
    private String age;

    public User(){

    }

    public User(String id, String name, String age){
        this.id = id;
        this.name = name;
        this.age = age;
    }

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getAge() {
        return age;
    }

    public void setAge(String age) {
        this.age = age;
    }

    @Override
    public String toString() {
        return this.id + " " + this.name + " " + this.age;
    }

    @Override
    public boolean equals(Object obj) {
        if(this == obj){
            return true;//Same reference
        }

        if(obj == null){
            return false;//Non-null: x.equals(null) must return false for any non-null reference x.
        }

        if(obj instanceof User){
            User other = (User) obj;
            //If the fields to be compared are equal, the two objects are equal
            if(equalsStr(this.name, other.name)
                    && equalsStr(this.age, other.age)){
                return true;
            }
        }

        return false;
    }

    //Treats null and the empty string as equal; otherwise compares with String.equals
    private boolean equalsStr(String str1, String str2){
        if(StringUtils.isEmpty(str1) && StringUtils.isEmpty(str2)){
            return true;
        }
        if(!StringUtils.isEmpty(str1) && str1.equals(str2)){
            return true;
        }
        return false;
    }

    @Override
    public int hashCode() {
        int result = 17;
        result = 31 * result + (name == null ? 0 : name.hashCode());
        result = 31 * result + (age == null ? 0 : age.hashCode());
        return result;
    }
}

3, Testing

1. Create two objects with the same name and age but different ids; equals returns true.

@Test
public void testEqualsObj(){
    User user1 = new User("1", "xiaohua", "14");
    User user2 = new User("2", "xiaohua", "14");
    System.out.println(user1.equals(user2));//Prints true
}

4, Why override the equals method

If the equals method is not overridden, user1.equals(user2) compares the references of the two objects (i.e. user1 == user2), and two separately created objects are never the same reference. See the Object source code:

public boolean equals(Object obj) {
    return (this == obj);
}
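
The default behaviour can be seen with a small sketch. PlainUser is a hypothetical class introduced only for this illustration; it does not override equals, so it inherits the reference comparison from Object.

```java
class DefaultEqualsDemo {
    // Hypothetical class for illustration: equals is NOT overridden.
    static class PlainUser {
        final String name;

        PlainUser(String name) {
            this.name = name;
        }
    }

    // Two distinct objects with identical content.
    static boolean sameContentEquals() {
        PlainUser a = new PlainUser("xiaohua");
        PlainUser b = new PlainUser("xiaohua");
        return a.equals(b); // Object.equals falls back to a == b
    }

    // Two variables pointing at the same object.
    static boolean sameReferenceEquals() {
        PlainUser a = new PlainUser("xiaohua");
        PlainUser alias = a;
        return a.equals(alias);
    }

    public static void main(String[] args) {
        System.out.println(sameContentEquals());   // false
        System.out.println(sameReferenceEquals()); // true
    }
}
```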

5, Why override the hashCode method

Since the equals method already decides whether two objects are equal, why override the hashCode method as well?

In fact, whenever equals is overridden, hashCode usually needs to be overridden too, in order to maintain the general contract of hashCode, which states that equal objects must have equal hash codes. Why does this matter? Look at the example below.

Suppose the hashCode method of User simply delegates to the parent class (Object), which is equivalent to not overriding it:

@Override
public int hashCode() {
    return super.hashCode();
}

Using HashSet:

@Test
public void testHashCodeObj(){
    User user1 = new User("1", "xiaohua", "14");
    User user2 = new User("2", "xiaohua", "14");
    Set<User> userSet = new HashSet<>();
    userSet.add(user1);
    userSet.add(user2);
    System.out.println(user1.equals(user2));
    System.out.println(user1.hashCode() == user2.hashCode());
    System.out.println(userSet);
}

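Result: equals prints true, but the hash code comparison prints false (the default hash codes of two distinct objects almost always differ), and the set contains both users.

Obviously, this is not the result we want. We hope that if two objects are equal, HashSet also treats them as the same element.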

Looking at the add method of HashSet, we can see that it first uses the hashCode of the object to locate a bucket and only then calls equals on the entries in that bucket, so we need to override the hashCode method to achieve the desired effect.

After overriding the hashCode method as follows, the same test prints true for both comparisons and the set contains only one user:

@Override
public int hashCode() {
    int result = 17;
    result = 31 * result + (name == null ? 0 : name.hashCode());
    result = 31 * result + (age == null ? 0 : age.hashCode());
    return result;
}

Therefore: hashCode is used for fast access to hashed data. For example, when HashSet, HashMap or Hashtable store an object, they use its hashCode value to choose a bucket and then call equals to decide whether an equal object is already present.
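
The effect on HashSet can be reproduced with a minimal sketch. EqualsOnly and EqualsAndHash are hypothetical classes made up for this illustration (they are not the User class above); the first overrides only equals, the second overrides both methods.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class HashSetContractDemo {
    // Hypothetical class: overrides equals but keeps Object.hashCode,
    // which breaks the hashCode contract.
    static class EqualsOnly {
        final String name;

        EqualsOnly(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof EqualsOnly
                    && Objects.equals(name, ((EqualsOnly) o).name);
        }
    }

    // Hypothetical class: overrides both equals and hashCode consistently.
    static class EqualsAndHash {
        final String name;

        EqualsAndHash(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof EqualsAndHash
                    && Objects.equals(name, ((EqualsAndHash) o).name);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(name);
        }
    }

    static int sizeWithEqualsOnly() {
        Set<EqualsOnly> set = new HashSet<>();
        set.add(new EqualsOnly("xiaohua"));
        set.add(new EqualsOnly("xiaohua"));
        // Almost always 2: the two objects land in unrelated buckets,
        // so equals is never even called.
        return set.size();
    }

    static int sizeWithBoth() {
        Set<EqualsAndHash> set = new HashSet<>();
        set.add(new EqualsAndHash("xiaohua"));
        set.add(new EqualsAndHash("xiaohua"));
        // Same hash code -> same bucket -> equals detects the duplicate.
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeWithEqualsOnly());
        System.out.println(sizeWithBoth()); // 1
    }
}
```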

6, How to override hashCode

1. Declare a variable result of type int and initialize it to a non-zero value, such as 17.

2. For each significant field in the class, that is, each field compared in the equals method, perform the following operations:
a. Compute the hash code of the field: fieldHashValue = field.hashCode() (use 0 if the field is null);
b. Combine it into the result: result = 31 * result + fieldHashValue;

3. Return result.
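
The steps above can be sketched as a standalone helper. The class and method names (HashRecipeDemo, manualHash, libraryHash) are made up for this illustration. As a point of comparison, java.util.Objects.hash applies the same 31-based pattern but starts from 1 instead of 17, so it produces different values while still satisfying the contract.

```java
import java.util.Objects;

class HashRecipeDemo {
    // Manual recipe: start from 17, fold in each significant field with 31.
    static int manualHash(String name, String age) {
        int result = 17;
        result = 31 * result + (name == null ? 0 : name.hashCode());
        result = 31 * result + (age == null ? 0 : age.hashCode());
        return result;
    }

    // Library alternative: same multiplier, initial value 1, null counts as 0.
    static int libraryHash(String name, String age) {
        return Objects.hash(name, age);
    }

    public static void main(String[] args) {
        // Equal field values give equal hash codes, even for distinct String objects.
        System.out.println(
                manualHash("xiaohua", "14") == manualHash(new String("xiaohua"), "14")); // true
        System.out.println(manualHash(null, null));  // 16337 = 31 * (31 * 17)
        System.out.println(libraryHash(null, null)); // 961 = 31 * (31 * 1)
    }
}
```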

7, Why use 31

Take a look at the source code of the String hashCode method:

/**
 * Returns a hash code for this string. The hash code for a
 * {@code String} object is computed as
 * <blockquote><pre>
 * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
 * </pre></blockquote>
 * using {@code int} arithmetic, where {@code s[i]} is the
 * <i>i</i>th character of the string, {@code n} is the length of
 * the string, and {@code ^} indicates exponentiation.
 * (The hash value of the empty string is zero.)
 *
 * @return  a hash code value for this object.
 */
public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;

        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}

As the comment says, the hashCode method of an empty string returns 0, and the formula in the comment describes exactly what the loop computes.
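
As a worked example of the formula, take the string "ab": s[0] = 'a' = 97 and s[1] = 'b' = 98, so the hash is 97 * 31 + 98 = 3105. The sketch below recomputes the polynomial with a hypothetical helper, polynomialHash, and compares it with String.hashCode.

```java
class StringHashDemo {
    // Recomputes s[0]*31^(n-1) + ... + s[n-1] using int arithmetic,
    // in the same left-to-right form as the loop in String.hashCode.
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(polynomialHash("ab")); // 3105
        System.out.println("ab".hashCode());      // 3105
        System.out.println("".hashCode());        // 0
    }
}
```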

31 is also used in the String source code. Two reasons are commonly given online:

Reason 1: fewer hash collisions
31 is a prime number that is "neither too big nor too small". If you use a very small multiplier such as 2, the products fall in a very small range, which easily causes hash value collisions. If you select a prime number greater than 100, the computed hash value overflows the range of int more quickly; neither is appropriate. In a test reported online, hash codes were computed for more than 50000 English words (combined from two different versions of Unix dictionaries) with the constants 31, 33, 37, 39 and 41 as multipliers, and each constant produced fewer than 7 collisions, so all of these numbers are candidate multipliers for generating hashCode.

So why choose 31 among 31, 33, 37, 39 and 41? See reason 2.

Reason 2: 31 can be optimized by the JVM
Bit operations are among the cheapest calculations available to the JVM:

* Shift left <<: discards the highest bits on the left and fills with 0 on the right (equivalent to multiplying the left operand by a power of 2).
* Shift right >>: keeps the sign bit and shifts the rest to the right (equivalent to dividing the left operand by a power of 2, rounding down).
* Unsigned shift right >>>: fills with 0 on the left regardless of whether the highest bit is 0 or 1.

So: 31 * i = (i << 5) - i. For example, with i = 2 the left side is 31 * 2 = 62 and the right side is (2 << 5) - 2 = 64 - 2 = 62. Both sides are equal, so the JVM can replace the multiplication with a shift and a subtraction.
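
The shift operators and the identity can be checked with a short sketch. The class and method names are made up for this illustration. Because int arithmetic wraps around on overflow, the identity also holds at the extremes of the int range.

```java
class ShiftIdentityDemo {
    // True when multiplying by 31 matches the shift-and-subtract form.
    static boolean identityHolds(int i) {
        return 31 * i == (i << 5) - i;
    }

    public static void main(String[] args) {
        System.out.println(3 << 2);    // 12: left shift by 2 multiplies by 4
        System.out.println(-8 >> 1);   // -4: signed shift keeps the sign bit
        System.out.println(-8 >>> 1);  // 2147483644: unsigned shift fills with 0
        System.out.println((2 << 5) - 2);                     // 62
        System.out.println(identityHolds(2));                 // true
        System.out.println(identityHolds(Integer.MAX_VALUE)); // true
    }
}
```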
