Source code analysis of the add() method in TreeSet -- deduplication and automatic sorting

Posted by phat_hip_prog on Wed, 22 Dec 2021 20:45:43 +0100

1. Overview of TreeSet class functions

A TreeSet keeps its elements sorted, either by their natural ordering (Comparable) or by a Comparator supplied when the set is created, depending on which constructor is used.

We know that the underlying data structure of TreeSet is a red-black tree. Let's read the source code to see how TreeSet guarantees both automatic sorting and uniqueness of its elements.
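
Before reading the source, here is a small sketch of the two ordering modes. The class name OrderingDemo and its helper methods are made up for this illustration; the no-arg TreeSet constructor, the TreeSet(Comparator) constructor and Comparator.reverseOrder() are standard JDK APIs.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical demo class, not part of the JDK.
public class OrderingDemo {

    // No-arg constructor: elements are sorted by their natural order (Comparable).
    public static TreeSet<Integer> naturalOrder(List<Integer> values) {
        TreeSet<Integer> set = new TreeSet<>();
        set.addAll(values);
        return set;
    }

    // Comparator constructor: the supplied Comparator decides the order.
    public static TreeSet<Integer> reverseOrder(List<Integer> values) {
        TreeSet<Integer> set = new TreeSet<>(Comparator.reverseOrder());
        set.addAll(values);
        return set;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(4, 2, 6, 6, 10);
        System.out.println(naturalOrder(values)); // [2, 4, 6, 10]
        System.out.println(reverseOrder(values)); // [10, 6, 4, 2]
    }
}
```

In both cases the duplicate 6 is stored only once; only the iteration order differs.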

2. Example

import java.util.TreeSet;

public class TreeSetDemo2 {
    public static void main(String[] args) {

        //Create a TreeSet object
        TreeSet<Integer> treeSet = new TreeSet<>();

        //Add various data to TreeSet
        treeSet.add(4);
        treeSet.add(2);
        treeSet.add(6);
        treeSet.add(6);
        treeSet.add(10);
        treeSet.add(9);
        treeSet.add(7);
        treeSet.add(1);
        treeSet.add(4);

        //Iterate over the TreeSet and print each element
        for (Integer tree : treeSet) {
            System.out.print(tree+" ");
        }
    }
}

Output results

1 2 4 6 7 9 10 
Process finished with exit code 0

As we can see, the data was inserted out of order and with duplicates, but the TreeSet holds it sorted and without duplicates.

3. Source code analysis

First, look at the constructors.

What happens under the hood when we execute TreeSet<Integer> treeSet = new TreeSet<>()?
//Only the relevant code is shown below
public class TreeSet<E> extends AbstractSet<E>
    implements NavigableSet<E>, Cloneable, java.io.Serializable{

    /**
     * The backing map.
     */
    private transient NavigableMap<E,Object> m;


    private static final Object PRESENT = new Object();

     /**
     * Constructs a set backed by the specified navigable map.
     */
//2. Assign the TreeMap that was passed in to the member variable m
    TreeSet(NavigableMap<E,Object> m) {
        this.m = m;
    }

    public TreeSet() {
//1. The no-arg constructor calls this class's TreeSet(NavigableMap<E,Object> m) constructor, passing in a new TreeMap
        this(new TreeMap<E,Object>());
             
    }


}

As the constructors above show, creating a TreeSet creates a TreeMap under the hood, which serves as the set's backing map.

So how does TreeSet add data?

treeSet.add(4);

Let's look at the underlying code

public class TreeSet<E> extends AbstractSet<E>
    implements NavigableSet<E>, Cloneable, java.io.Serializable{

    private static final Object PRESENT = new Object();
    
//1. add is called; E is Integer here, and m is the TreeMap created by the constructor above
    public boolean add(E e) {
        return m.put(e, PRESENT)==null;
    }

}

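So adding an element ultimately calls TreeMap's put method, with the element as the key and the shared PRESENT object as the value. add returns true only when put returns null, that is, when the key was not already in the map. Let's take a look at the put method.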
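
The return values are easy to check from the outside. The sketch below uses only the public TreeMap.put and TreeSet.add methods; the class AddReturnValueDemo and its helper addResults are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical demo class, not part of the JDK.
public class AddReturnValueDemo {

    // Adds each value in turn and records what TreeSet.add returned.
    public static List<Boolean> addResults(TreeSet<Integer> set, int... values) {
        List<Boolean> results = new ArrayList<>();
        for (int value : values) {
            results.add(set.add(value));
        }
        return results;
    }

    public static void main(String[] args) {
        // TreeMap.put returns the previous value for the key, or null if there was none.
        TreeMap<Integer, String> map = new TreeMap<>();
        System.out.println(map.put(6, "first"));  // null
        System.out.println(map.put(6, "second")); // first

        // TreeSet.add turns "put returned null" into true, anything else into false.
        TreeSet<Integer> set = new TreeSet<>();
        System.out.println(addResults(set, 4, 2, 6, 6)); // [true, true, true, false]
        System.out.println(set);                          // [2, 4, 6]
    }
}
```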

public class TreeMap<K,V>
    extends AbstractMap<K,V>
    implements NavigableMap<K,V>, Cloneable, java.io.Serializable{
    
    private final Comparator<? super K> comparator;

    private transient Entry<K,V> root;

    public TreeMap() {
        comparator = null;
    }

    public V put(K key, V value) {//key is the element being added (e.g. 4), value is PRESENT
        Entry<K,V> t = root;//Root node, null while the map is empty
        //If the root is null, the new entry becomes the root
        if (t == null) {
            compare(key, key); // type (and possibly null) check

            root = new Entry<>(key, value, null);
            size = 1;
            modCount++;
            return null;
        }
        //On later insertions the root is not null, so execution continues here
        int cmp;//Result of the most recent comparison (assigned inside the loop, no initial value)
        Entry<K,V> parent;//Parent of the insertion point (assigned inside the loop, no initial value)
        // split comparator and comparable paths
        Comparator<? super K> cpr = comparator;//null here, because the no-arg constructor was used
        //cpr is null, so the else branch runs
        if (cpr != null) {
            do {
                parent = t;
                cmp = cpr.compare(key, t.key);
                if (cmp < 0)
                    t = t.left;
                else if (cmp > 0)
                    t = t.right;
                else
                    return t.setValue(value);
            } while (t != null);
        }
        else {
            if (key == null)//null keys are rejected when natural ordering is used
                throw new NullPointerException();
            @SuppressWarnings("unchecked")
            //Cast the key to Comparable. Integer implements Comparable,
            //so the cast succeeds
                Comparable<? super K> k = (Comparable<? super K>) key;
            do {
                parent = t;
                cmp = k.compareTo(t.key);//Compare the new key with the current node's key
                if (cmp < 0)//Smaller: descend into the left subtree
                    t = t.left;
                else if (cmp > 0)//Larger: descend into the right subtree
                    t = t.right;
                else
                    return t.setValue(value);//Equal: no new node is created; the value is overwritten and the old value is returned
            } while (t != null);
        }
        Entry<K,V> e = new Entry<>(key, value, parent);
        if (cmp < 0)
            parent.left = e;
        else
            parent.right = e;
        fixAfterInsertion(e);
        size++;
        modCount++;
        return null;
    }



}

In the above code, we don't need to dig into how the red-black tree rebalances itself (fixAfterInsertion). A separate chapter will be devoted to the tree algorithms. The key line is

Comparable<? super K> k = (Comparable<? super K>) key;

The key is cast to Comparable, and its compareTo method is called to compare it with the keys already in the tree. It follows that if the element's class does not implement the Comparable interface (and no Comparator was given to the constructor), a ClassCastException is thrown.

This time we add Student1 objects, whose class does not implement Comparable:
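
To see the sorting and de-duplication logic without the rebalancing, here is a minimal sketch of the same descent loop on a plain, unbalanced binary search tree. SimpleSortedSet and all of its members are hypothetical names made up for this illustration. It only imitates the comparison loop of put and skips fixAfterInsertion entirely, so it is not a red-black tree.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical teaching class: an unbalanced binary search tree, not the JDK's TreeMap.
public class SimpleSortedSet<E extends Comparable<? super E>> {

    private static final class Node<T> {
        final T value;
        Node<T> left, right;
        Node(T value) { this.value = value; }
    }

    private Node<E> root;
    private int size;

    // Returns true if the value was inserted, false if an equal value already existed.
    public boolean add(E value) {
        if (value == null) throw new NullPointerException();
        if (root == null) {              // empty tree: the new node becomes the root
            root = new Node<>(value);
            size = 1;
            return true;
        }
        Node<E> t = root;
        Node<E> parent;
        int cmp;
        do {
            parent = t;
            cmp = value.compareTo(t.value);
            if (cmp < 0) t = t.left;       // smaller: go left
            else if (cmp > 0) t = t.right; // larger: go right
            else return false;             // compareTo == 0: duplicate, nothing inserted
        } while (t != null);
        Node<E> node = new Node<>(value);
        if (cmp < 0) parent.left = node;
        else parent.right = node;
        size++;                            // no rebalancing step in this sketch
        return true;
    }

    public int size() { return size; }

    // In-order traversal (left, node, right) visits the values in ascending order.
    public List<E> toList() {
        List<E> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private void collect(Node<E> node, List<E> out) {
        if (node == null) return;
        collect(node.left, out);
        out.add(node.value);
        collect(node.right, out);
    }

    public static void main(String[] args) {
        SimpleSortedSet<Integer> set = new SimpleSortedSet<>();
        for (int v : new int[] {4, 2, 6, 6, 10, 9, 7, 1, 4}) set.add(v);
        System.out.println(set.toList()); // [1, 2, 4, 6, 7, 9, 10]
    }
}
```

Every element is placed by comparisons alone, and an element that compares as equal to an existing one is never inserted. Sorting and uniqueness therefore both come from the same compareTo calls.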

import java.util.Objects;

public class Student1{
    private String name;
    private int age;

    public Student1(String name, int age) {
        this.name = name;
        this.age = age;
    }

    public Student1() {
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Student1 student1 = (Student1) o;
        return age == student1.age &&
                Objects.equals(name, student1.name);
    }

    @Override
    public int hashCode() {

        return Objects.hash(name, age);
    }

    @Override
    public String toString() {
        return "Student1{" +
                "name='" + name + '\'' +
                ", age=" + age +
                '}';
    }


}

import java.util.TreeSet;

public class TreeSetDemo1 {
    public static void main(String[] args) {
        Student1 s1 = new Student1("Zhang San", 18);
        Student1 s2 = new Student1("Li Si", 18);
        Student1 s3 = new Student1("Wang Wu", 18);
        Student1 s4 = new Student1("sunspot", 18);
        Student1 s5 = new Student1("Zhang San", 19);
        Student1 s6 = new Student1("Li Si", 18);

        TreeSet<Student1> treeSet = new TreeSet<Student1>();

        treeSet.add(s1);
        treeSet.add(s2);
        treeSet.add(s3);
        treeSet.add(s4);
        treeSet.add(s5);
        treeSet.add(s6);

        for (Student1 tree : treeSet) {
            System.out.println(tree);
        }


    }
}

Running it throws a ClassCastException:

Exception in thread "main" java.lang.ClassCastException: dat24.Student1 cannot be cast to java.lang.Comparable
	at java.util.TreeMap.compare(TreeMap.java:1294)
	at java.util.TreeMap.put(TreeMap.java:538)
	at java.util.TreeSet.add(TreeSet.java:255)
	at dat24.TreeSetDemo1.main(TreeSetDemo1.java:16)

Process finished with exit code 1

As the stack trace shows, the ClassCastException occurs because Student1 does not implement the Comparable interface.

Now let Student1 implement the Comparable interface:
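
Note where the trace points: the exception comes from TreeMap.compare, called from put. In the put source shown earlier, the only call to compare is compare(key, key) in the empty-tree branch, so the very first add already fails, before there is anything else in the set to compare with. The sketch below checks this. FirstAddDemo and Plain are hypothetical names made up for this illustration, and the behaviour is what I would expect from the JDK source quoted in this article; other JDK versions may differ.

```java
import java.util.TreeSet;

// Hypothetical demo class, not part of the JDK.
public class FirstAddDemo {

    // Hypothetical element type that deliberately does NOT implement Comparable.
    static final class Plain { }

    // Returns true if adding a Plain object to an EMPTY TreeSet throws ClassCastException.
    public static boolean firstAddThrows() {
        TreeSet<Plain> set = new TreeSet<>();
        try {
            set.add(new Plain());
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstAddThrows());
    }
}
```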

import java.util.Objects;

public class Student1 implements Comparable<Student1>{
    private String name;
    private int age;

    public Student1(String name, int age) {
        this.name = name;
        this.age = age;
    }

    public Student1() {
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Student1 student1 = (Student1) o;
        return age == student1.age &&
                Objects.equals(name, student1.name);
    }

    @Override
    public int hashCode() {

        return Objects.hash(name, age);
    }

    @Override
    public String toString() {
        return "Student1{" +
                "name='" + name + '\'' +
                ", age=" + age +
                '}';
    }

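    @Override
    public int compareTo(Student1 o) {
        //return 0;   //every element compares as equal, so only the first one is kept
        //return 1;   //every element compares as larger, so nothing is de-duplicated
        //return -1;  //every element compares as smaller, so nothing is de-duplicated

        //The return value should implement our own ordering rule.
        //For example, to sort by age (elements that compare as equal are de-duplicated):
        //return this.age - o.age;

        //But students of the same age do not necessarily have the same name.
        //Integer.compare avoids the overflow that subtraction can cause for extreme values
        int i = Integer.compare(this.age, o.age);
        //Implicit requirement (you have to work it out yourself): when the ages are equal, compare the names
        int i2 = i == 0 ? this.name.compareTo(o.name) : i;
        return i2;
    }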
}

Output results:

Student1{name='Li Si', age=18}
Student1{name='Wang Wu', age=18}
Student1{name='Zhang San', age=18}
Student1{name='sunspot', age=18}
Student1{name='Zhang San', age=19}

Process finished with exit code 0
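
Six students were added but only five were printed: s6 compares as equal to s2 (same age, same name), so put finds an existing key and no new node is created. What counts as a duplicate is decided by the comparison returning 0, not by equals or hashCode. The sketch below makes this visible by counting how many elements survive under two different orderings. UniquenessDemo, Person and the two Comparator constants are hypothetical names made up for this illustration; Comparators are used instead of compareTo only so that the ordering can be swapped easily.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical demo class, not part of the JDK.
public class UniquenessDemo {

    // Hypothetical stand-in for Student1, reduced to the two fields that matter here.
    static final class Person {
        final String name;
        final int age;
        Person(String name, int age) { this.name = name; this.age = age; }
    }

    // Compares by age only: people of the same age compare as equal.
    public static final Comparator<Person> BY_AGE_ONLY =
            Comparator.comparingInt(p -> p.age);

    // Compares by age first, then by name when the ages are equal.
    public static final Comparator<Person> BY_AGE_THEN_NAME =
            BY_AGE_ONLY.thenComparing(p -> p.name);

    // Counts how many of the four sample people are kept under the given ordering.
    public static int countKept(Comparator<Person> order) {
        TreeSet<Person> set = new TreeSet<>(order);
        set.add(new Person("Zhang San", 18));
        set.add(new Person("Li Si", 18));
        set.add(new Person("Wang Wu", 18));
        set.add(new Person("Zhang San", 19));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(countKept(BY_AGE_ONLY));      // 2
        System.out.println(countKept(BY_AGE_THEN_NAME)); // 4
    }
}
```

With the age-only ordering, the three 18-year-olds collapse into a single element even though their names differ. This is why the compareTo above falls back to the name when the ages are equal.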

Summary:

1. TreeSet sorts elements automatically on insertion and guarantees their uniqueness by storing them as the keys of an underlying red-black tree (a TreeMap).

2. To store objects of a custom class in a TreeSet, the class must implement the Comparable interface with a compareTo that reflects the desired ordering. Alternatively, a Comparator can be passed to the TreeSet constructor.

Topics: Java data structure