Basic introduction to Java Set

Posted by Lucnet on Tue, 08 Feb 2022 05:55:13 +0100

Definition of Set

The Java collections framework is a particularly useful set of utility classes. It can store many kinds of objects and implements common data structures such as stacks and queues. Within it, a Set represents an unordered collection that cannot contain duplicates. It is similar to a jar: the program can "throw" objects into the Set one after another, and the Set usually does not remember the order in which they were added. Set is basically the same as Collection and does not add any methods of its own. In fact, a Set is a Collection whose behavior differs in one respect: it does not allow duplicate elements. The Set interface extends the Collection interface, and its commonly used implementations, covered below, are HashSet, LinkedHashSet, TreeSet and EnumSet.

Because the parent interface of Set is Collection, every Collection method is available on a Set, including add(), remove(), contains(), size(), isEmpty(), iterator() and clear().
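As a minimal sketch of these inherited methods in use (the class name SetBasicsDemo and the sample strings are made up for illustration), the following adds, removes and queries elements of a Set:

```java
import java.util.HashSet;
import java.util.Set;

class SetBasicsDemo {
    // Builds a small set using only methods inherited from Collection.
    static Set<String> buildSet() {
        Set<String> set = new HashSet<>();
        set.add("apple");
        set.add("banana");
        set.add("apple");     // duplicate: add() returns false and the set is unchanged
        set.remove("banana");
        return set;
    }

    public static void main(String[] args) {
        Set<String> set = buildSet();
        System.out.println(set.size());            // 1
        System.out.println(set.contains("apple")); // true
        System.out.println(set.isEmpty());         // false
    }
}
```

Note that add() reports through its boolean return value whether the set actually changed, which is how a caller can detect a rejected duplicate.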

HashSet

HashSet is the typical implementation of the Set interface and is what is used most of the time when a Set is needed. HashSet stores its elements according to a hash algorithm, so it offers good access and lookup performance.
HashSet has the following characteristics:

  • The order of elements is not guaranteed. It may differ from the insertion order, and it may change over time.
  • HashSet is not synchronized. If multiple threads access a HashSet at the same time and at least one of them modifies it, the access must be synchronized in code.
  • An element can be null.
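The last characteristic can be checked with a short sketch (the class name HashSetNullDemo is an assumption made for this example). A HashSet accepts null, but like any other value it is stored only once:

```java
import java.util.HashSet;
import java.util.Set;

class HashSetNullDemo {
    static Set<String> build() {
        Set<String> set = new HashSet<>();
        set.add(null); // a HashSet accepts a null element
        set.add(null); // the second null is a duplicate and is ignored
        set.add("x");
        return set;
    }

    public static void main(String[] args) {
        Set<String> set = build();
        System.out.println(set.size());         // 2
        System.out.println(set.contains(null)); // true
    }
}
```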

When an element is added, the HashSet calls the object's hashCode() method and uses the returned value to decide where to store the object. If two elements compare equal through equals() but return different hashCode() values, the HashSet stores them at different locations and both are added successfully. In other words, a HashSet considers two elements equal only when equals() returns true and their hashCode() values are also equal.

package ex.hql.set;

import java.util.HashSet;
import java.util.Set;

class A{
    //Only equals() is overridden: two A objects are equal but keep different hashCode values, so the HashSet stores both
    @Override
    public boolean equals(Object obj) {
        return true;
    }
}

class B{
    //Only hashCode() is overridden: two B objects share a hashCode but are not equal, so the HashSet stores both in the same slot, which degrades performance
    @Override
    public int hashCode() {
        return 1;
    }
}

class C{
    //Both methods are overridden: two C objects share a hashCode and are equal, so the HashSet keeps only one of them
    @Override
    public int hashCode() {
        return 2;
    }

    @Override
    public boolean equals(Object obj) {
        return true;
    }
}

public class HashSetDemo {
    public static void main(String[] args) {
        Set<Object> set=new HashSet<>();
        set.add(new A());
        set.add(new A());
        set.add(new B());
        set.add(new B());
        set.add(new C());
        set.add(new C());
        System.out.println(set);
    }
}


The example creates classes A, B and C. Class A overrides only equals(), class B overrides only hashCode(), and class C overrides both. The program creates a HashSet and adds two instances of each class. The printed set contains two A objects and two B objects, but only one C object.

As this shows, if a class whose objects will be stored in a HashSet overrides equals(), it should also override hashCode(). The rule is: if two objects compare equal through equals(), their hashCode() values must also be equal.
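A minimal sketch of a class that follows this rule is shown here. The Point class and its fields are hypothetical and not part of the original example. Both methods are built from the same fields, so equal objects always share a hash code:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class PointSetDemo {
    // Hypothetical value class, used only for illustration.
    static final class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof Point)) return false;
            Point other = (Point) obj;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            // Computed from exactly the fields that equals() compares.
            return Objects.hash(x, y);
        }
    }

    static int distinctCount() {
        Set<Point> set = new HashSet<>();
        set.add(new Point(1, 2));
        set.add(new Point(1, 2)); // equal to the first point, so it is not added
        set.add(new Point(3, 4));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount()); // 2
    }
}
```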

If a mutable object is added to a HashSet and the program later modifies its instance variables, it may become equal to another element in the set (equals() returns true and the hashCode values match). The HashSet can then end up containing two identical objects, as the following example shows:

package ex.hql.set;


import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class D{
    int number;

    public D(int number) {
        this.number = number;
    }

    @Override
    public String toString() {
        return "D{" +
                "number=" + number +
                '}';
    }

    @Override
    public boolean equals(Object obj) {
        if(this==obj)
            return true;
        if(obj!=null&&obj.getClass()==D.class){
            D d=(D)obj;
            return this.number==d.number;
        }
        return false;
    }

    @Override
    public int hashCode() {
        return this.number;
    }
}


public class HashSetDemo2 {
    public static void main(String[] args) {
        Set<D> set=new HashSet<>();
        set.add(new D(6));
        set.add(new D(-4));
        set.add(new D(8));
        set.add(new D(3));
        System.out.println(set);
        Iterator<D> iterator=set.iterator();
        D d=iterator.next();//Gets the first object returned by the iterator
        d.number=3;
        System.out.println(set);
        set.remove(new D(3));
        System.out.println(set);
        System.out.println("Does the set contain an object whose number is 3? "+set.contains(new D(3)));
        System.out.println("Does the set contain an object whose number is -4? "+set.contains(new D(-4)));
    }
}

When the program tries to remove an object whose number is 3, the HashSet computes that object's hashCode to find where it would be stored, then uses equals() to compare it with the elements kept there and removes the one that matches. Only the second element meets both conditions. The first element now also has number 3, but it is still stored at the position computed from its original hashCode of -4, so it does not match and the second element is the one removed. For the same reason the modified element can no longer be found. A lookup for number 3 uses hashCode 3, which does not match the hashCode of -4 recorded when the element was stored. A lookup for number -4 reaches the right position, but equals() returns false. Both contains() calls therefore return false, and the HashSet can no longer access this element reliably. The lesson is that after adding a mutable object to a HashSet, the program should not modify the instance variables that take part in its hashCode() and equals() calculations.
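When such a field really has to change, one possible workaround is to remove the element first, modify it, and then add it back so that it is stored under its new hash code. The sketch below assumes a hypothetical Item class and update() helper, neither of which appears in the original example:

```java
import java.util.HashSet;
import java.util.Set;

class MutateSafelyDemo {
    // Hypothetical mutable element; equality and hash code both depend on 'number'.
    static final class Item {
        int number;

        Item(int number) {
            this.number = number;
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof Item && ((Item) obj).number == number;
        }

        @Override
        public int hashCode() {
            return number;
        }
    }

    // Removes the element while its hash code is still valid, changes it, then adds it back.
    // Returns false if the element was not in the set or an equal element already exists.
    static boolean update(Set<Item> set, Item item, int newNumber) {
        if (!set.remove(item)) {
            return false;
        }
        item.number = newNumber;
        return set.add(item);
    }

    public static void main(String[] args) {
        Set<Item> set = new HashSet<>();
        Item item = new Item(-4);
        set.add(item);
        set.add(new Item(6));

        update(set, item, 5);
        System.out.println(set.contains(new Item(5)));  // true
        System.out.println(set.contains(new Item(-4))); // false
    }
}
```

Making the fields used by equals() and hashCode() final avoids the problem altogether and is usually the simpler design.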

LinkedHashSet

LinkedHashSet is a subclass of HashSet. It also uses each element's hashCode value to decide where to store it, but it additionally maintains a linked list that records the order of the elements, so iteration returns them in insertion order. Maintaining that list makes insertion and removal slightly slower than in a HashSet.

package ex.hql.set;

import java.util.LinkedHashSet;

public class LinkedHashSetDemo {
    public static void main(String[] args) {
        LinkedHashSet<String> linkedHashSet=new LinkedHashSet<>();
        linkedHashSet.add("First inserted");
        linkedHashSet.add("Second inserted");
        linkedHashSet.add("Third inserted");
        linkedHashSet.add("Fourth inserted");
        linkedHashSet.add("Fifth inserted");
        System.out.println(linkedHashSet);
    }
}

Output:

[First inserted, Second inserted, Third inserted, Fourth inserted, Fifth inserted]
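One common use of this insertion-order guarantee is removing duplicates from a list while keeping the first occurrence of each element in place. The following is a small sketch of that idea. The class name DedupDemo and the helper dedup() are made up for this example:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DedupDemo {
    // Returns the distinct elements of 'input' in order of first appearance.
    static List<String> dedup(List<String> input) {
        Set<String> unique = new LinkedHashSet<>(input);
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<String> result = dedup(Arrays.asList("b", "a", "b", "c", "a"));
        System.out.println(result); // [b, a, c]
    }
}
```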

TreeSet

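TreeSet is an implementation of the SortedSet interface and keeps its elements in sorted order. On top of the Collection methods, it provides methods that depend on that order, such as first(), last(), lower(), higher(), headSet(), tailSet() and subSet(). The following example uses some of them: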

package ex.hql.set;

import java.util.TreeSet;

public class TreeSetDemo {
    public static void main(String[] args) {
        TreeSet<Integer> treeSet=new TreeSet<>();
        treeSet.add(4);
        treeSet.add(-6);
        treeSet.add(19);
        treeSet.add(8);
        System.out.println(treeSet);
        //Output the last (largest) element
        System.out.println(treeSet.last());
        //Output the elements smaller than 4
        System.out.println(treeSet.headSet(4));
        //Output the elements greater than or equal to 8 (8 itself is included if it is in the set)
        System.out.println(treeSet.tailSet(8));
        //Output the elements greater than or equal to -1 and smaller than 2
        System.out.println(treeSet.subSet(-1,2));
    }
}

The output is as follows:

[-6, 4, 8, 19]
19
[-6]
[8, 19]
[]
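TreeSet also offers methods that look up a single neighboring element. The sketch below reuses the same four numbers (the class name TreeSetNeighborsDemo is an assumption) and shows first(), lower(), higher(), floor() and ceiling():

```java
import java.util.TreeSet;

class TreeSetNeighborsDemo {
    static TreeSet<Integer> build() {
        TreeSet<Integer> set = new TreeSet<>();
        set.add(4);
        set.add(-6);
        set.add(19);
        set.add(8);
        return set;
    }

    public static void main(String[] args) {
        TreeSet<Integer> set = build();
        System.out.println(set.first());    // -6, the smallest element
        System.out.println(set.lower(8));   // 4, the greatest element strictly less than 8
        System.out.println(set.higher(8));  // 19, the least element strictly greater than 8
        System.out.println(set.floor(5));   // 4, the greatest element less than or equal to 5
        System.out.println(set.ceiling(5)); // 8, the least element greater than or equal to 5
    }
}
```

Each of these methods except first() returns null when no such element exists, while first() throws NoSuchElementException on an empty set.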

TreeSet supports two sorting methods: natural sorting and custom sorting. Natural sorting is adopted by default.

Natural sorting

TreeSet calls the compareTo(Object obj) method of its elements to compare them and arranges the elements in ascending order. This is natural sorting.
Java provides the Comparable interface, which declares a compareTo(Object obj) method that returns an int. A class that implements the interface must implement this method, and its objects can then be compared with each other. For a call obj1.compareTo(obj2), a return value of 0 means the two objects are equal, a positive integer means obj1 is greater than obj2, and a negative integer means obj1 is less than obj2.
An object added to a TreeSet that uses natural sorting must implement the Comparable interface, otherwise a ClassCastException is thrown:

package ex.hql.set;

import java.util.TreeSet;

class F{

}

public class NatureDemo {
    public static void main(String[] args) {
        TreeSet<Object> set=new TreeSet<>();
        //Throws ClassCastException: F does not implement Comparable
        set.add(new F());
    }

}

A TreeSet should only contain objects of the same type, because its elements must be comparable with each other. The only criterion a TreeSet uses to decide whether two objects are equal is whether compareTo(Object obj) returns 0: if it does, they are equal; otherwise they are not.

package ex.hql.set;
import java.util.TreeSet;

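class F implements Comparable<F>{

    int number;

    public F(int number) {
        this.number = number;
    }

    //equals() always reports that two F objects are equal
    @Override
    public boolean equals(Object obj) {
        return true;
    }

    //compareTo() never returns 0, so a TreeSet never treats two F objects as equal
    @Override
    public int compareTo(F o) {
        return 1;
    }
}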

public class NatureDemo {
    public static void main(String[] args) {
        TreeSet<F> set=new TreeSet<>();
        F f=new F(3);
        //Add the same object twice
        set.add(f);
        set.add(f);
        System.out.println(set);
    }

}

The output lists the same object twice. Although the two elements are equal according to equals(), compareTo() never returns 0, so the TreeSet treats them as different and adds both. Therefore, when overriding these methods, keep them consistent: whenever equals() returns true for two objects, compareTo() should return 0.
As with HashSet, avoid modifying the instance variables that take part in the comparison after an element has been placed in a TreeSet. The set does not reorder its elements, so later operations may behave incorrectly.
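For contrast, here is a minimal sketch of an element class whose compareTo() is consistent with equals(). The Score class is hypothetical and stands in for any class with a single comparable field:

```java
import java.util.TreeSet;

class NaturalOrderDemo {
    // Hypothetical element type, for illustration only.
    static final class Score implements Comparable<Score> {
        final int value;

        Score(int value) {
            this.value = value;
        }

        @Override
        public int compareTo(Score other) {
            // Returns 0 exactly when equals() returns true.
            return Integer.compare(value, other.value);
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof Score && ((Score) obj).value == value;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(value);
        }

        @Override
        public String toString() {
            return "Score(" + value + ")";
        }
    }

    static TreeSet<Score> build() {
        TreeSet<Score> set = new TreeSet<>();
        set.add(new Score(7));
        set.add(new Score(3));
        set.add(new Score(7)); // compareTo() returns 0, so this is treated as a duplicate
        return set;
    }

    public static void main(String[] args) {
        System.out.println(build()); // [Score(3), Score(7)]
    }
}
```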

Custom sorting

Custom sorting relies on the Comparator interface, which declares an int compare(T o1, T o2) method for comparing o1 and o2. If the method returns 0, the two are equal; a positive integer means o1 is greater than o2; a negative integer means o1 is less than o2.
To use custom sorting, pass a Comparator object to the TreeSet constructor when creating the set. That Comparator is then responsible for the ordering of the elements.

package ex.hql.set;

import java.util.Comparator;
import java.util.TreeSet;

class H{
    int number;

    public H(int number) {
        this.number = number;
    }

    @Override
    public String toString() {
        return "H{" +
                "number=" + number +
                '}';
    }
}

public class SpecificDemo {
    public static void main(String[] args) {
        //Sort in descending order; an anonymous inner class is used here, but a lambda expression would also work
        TreeSet<H> set=new TreeSet<H>(new Comparator<H>() {
            @Override
            public int compare(H h1, H h2) {
                //Reversed comparison: the larger number sorts first
                return Integer.compare(h2.number,h1.number);
            }
        });
        set.add(new H(-2));
        set.add(new H(6));
        set.add(new H(-11));
        set.add(new H(8));
        System.out.println(set);
    }
}

The output is as follows:

[H{number=8}, H{number=6}, H{number=-2}, H{number=-11}]
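The same descending order can be expressed with a lambda, as the comment in the listing suggests. This is one possible sketch: the nested H class is a copy of the one above, while the class name LambdaComparatorDemo and the helper sortedNumbers() are made up for this example:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class LambdaComparatorDemo {
    // Copy of the article's H class, nested here so the example is self-contained.
    static final class H {
        int number;

        H(int number) {
            this.number = number;
        }
    }

    // Returns the numbers in the order the TreeSet iterates over them.
    static List<Integer> sortedNumbers() {
        // Reversed comparison: the larger number sorts first.
        Comparator<H> descending = (a, b) -> Integer.compare(b.number, a.number);
        TreeSet<H> set = new TreeSet<>(descending);
        set.add(new H(-2));
        set.add(new H(6));
        set.add(new H(-11));
        set.add(new H(8));

        List<Integer> numbers = new ArrayList<>();
        for (H h : set) {
            numbers.add(h.number);
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(sortedNumbers()); // [8, 6, -2, -11]
    }
}
```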

EnumSet class

EnumSet is a collection class designed specifically for enum types. All elements of an EnumSet must be values of a single enum type, which is specified explicitly or implicitly when the set is created. An EnumSet is also ordered: its elements follow the order in which the values are declared in the enum class. Null elements are not allowed.
EnumSet has no public constructor. Instances are created through static factory methods such as allOf(), noneOf(), of(), range(), complementOf() and copyOf().
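A short sketch of the most common EnumSet factory methods follows. The Season enum is hypothetical and defined only for this example:

```java
import java.util.EnumSet;

class EnumSetDemo {
    // Hypothetical enum, defined only for this example.
    enum Season { SPRING, SUMMER, FALL, WINTER }

    public static void main(String[] args) {
        // Every value of the enum type.
        EnumSet<Season> all = EnumSet.allOf(Season.class);
        // An empty set that can only hold Season values.
        EnumSet<Season> none = EnumSet.noneOf(Season.class);
        // Specific values; the argument order does not affect the iteration order.
        EnumSet<Season> some = EnumSet.of(Season.WINTER, Season.SUMMER);
        // All values from SUMMER to WINTER, both ends included.
        EnumSet<Season> range = EnumSet.range(Season.SUMMER, Season.WINTER);
        // Every value that is not in 'some'.
        EnumSet<Season> rest = EnumSet.complementOf(some);

        System.out.println(all);   // [SPRING, SUMMER, FALL, WINTER]
        System.out.println(none);  // []
        System.out.println(some);  // [SUMMER, WINTER]
        System.out.println(range); // [SUMMER, FALL, WINTER]
        System.out.println(rest);  // [SPRING, FALL]
    }
}
```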

Performance of each Set

HashSet generally performs better than TreeSet, especially for the most common operations such as adding and looking up elements, because TreeSet has to maintain a red-black tree to keep its elements sorted. Use TreeSet only when a Set that stays sorted is required; otherwise use HashSet.
LinkedHashSet is slightly slower than HashSet for insertion and removal because of the cost of maintaining the linked list, but thanks to that list it is faster to traverse.
EnumSet has the best performance of all the Set classes, but it can only hold values of a single enum type.
HashSet, TreeSet and EnumSet are not thread-safe. They can be wrapped with the synchronizedSet method of the Collections utility class (or synchronizedSortedSet for a TreeSet) to make them safe for concurrent use.
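The wrapping can be sketched as follows. The class name SynchronizedSetDemo and the helper fillConcurrently() are assumptions made for this example. Two threads add disjoint ranges of numbers to a wrapped HashSet:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

class SynchronizedSetDemo {
    // Two threads add 1000 distinct numbers each to one synchronized set.
    static int fillConcurrently() {
        Set<Integer> set = Collections.synchronizedSet(new HashSet<>());
        Thread t1 = new Thread(() -> { for (int i = 0; i < 1000; i++) set.add(i); });
        Thread t2 = new Thread(() -> { for (int i = 1000; i < 2000; i++) set.add(i); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently()); // 2000

        // A TreeSet keeps its SortedSet methods when wrapped with synchronizedSortedSet.
        SortedSet<Integer> sorted = Collections.synchronizedSortedSet(new TreeSet<>());
        sorted.add(3);
        sorted.add(1);
        // Iteration is not atomic, so it must hold the wrapper's lock explicitly.
        synchronized (sorted) {
            for (Integer value : sorted) {
                System.out.println(value);
            }
        }
    }
}
```

Individual calls such as add() are synchronized by the wrapper, but iteration spans several calls, which is why the loop takes the lock itself.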

Topics: Java