A thorough understanding of the ArrayList source code

Posted by alexscan on Thu, 10 Feb 2022 14:53:29 +0100

Preface

ArrayList is a List implementation backed by an array. Unlike a plain array, it can grow dynamically, so it is often called a dynamic array.

An ArrayList can store any object type (primitive values are stored as their boxed wrapper types), and it is a sequential container: elements are kept in the order in which they were added. It also allows null elements.
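As a small illustration of the ordering and null behaviour described above, here is a minimal sketch. The class name InsertionOrderDemo and the sample values are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class InsertionOrderDemo {
    // Builds a small list; the element values are arbitrary sample data.
    static List<String> build() {
        List<String> list = new ArrayList<>();
        list.add("b");
        list.add(null);   // null is a legal element
        list.add("a");
        list.add("b");    // duplicates are allowed as well
        return list;
    }

    public static void main(String[] args) {
        // Elements come back in exactly the order they were added.
        System.out.println(build()); // prints [b, null, a, b]
    }
}
```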

Inheritance system

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{...}
  • ArrayList implements List and provides basic operations such as addition, deletion and traversal.
  • ArrayList implements RandomAccess and provides the ability of random access.
  • ArrayList implements Cloneable and can be cloned.
  • ArrayList implements Serializable and can be serialized.
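The marker interfaces listed above can be observed from ordinary client code. The following sketch assumes nothing beyond the standard library; the class MarkerInterfaceDemo and its helper methods are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class MarkerInterfaceDemo {
    // True when the list advertises fast indexed access via RandomAccess.
    static boolean supportsRandomAccess(List<?> list) {
        return list instanceof RandomAccess;
    }

    // clone() returns Object, so a cast is needed; the copy is shallow,
    // meaning the new list shares element references with the source.
    @SuppressWarnings("unchecked")
    static <T> ArrayList<T> shallowCopy(ArrayList<T> source) {
        return (ArrayList<T>) source.clone();
    }

    public static void main(String[] args) {
        ArrayList<String> names = new ArrayList<>();
        names.add("x");
        ArrayList<String> copy = shallowCopy(names);
        copy.add("y"); // does not affect the original list

        System.out.println(supportsRandomAccess(names));                    // prints true
        System.out.println(supportsRandomAccess(new LinkedList<String>())); // prints false
        System.out.println(names + " " + copy);                             // prints [x] [x, y]
    }
}
```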

Source code analysis

Attributes

/**
 * Default capacity
 */
private static final int DEFAULT_CAPACITY = 10;

/**
 * Empty array, used if the passed in capacity is 0
 */
private static final Object[] EMPTY_ELEMENTDATA = {};

/**
 * Empty array, used when no capacity is passed in. When the first element is added, it is replaced by an array of the default capacity
 */
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

/**
 * An array of storage elements
 */
transient Object[] elementData; // non-private to simplify nested class access

/**
 * Number of elements in the collection
 */
private int size;

(1) DEFAULT_CAPACITY: the default capacity is 10. This is the capacity used for a list created through new ArrayList(), allocated when the first element is added.

(2) EMPTY_ELEMENTDATA: an empty array, used when the list is created through new ArrayList(0).

(3) DEFAULTCAPACITY_EMPTY_ELEMENTDATA: also an empty array, used when the list is created through new ArrayList(). The difference from EMPTY_ELEMENTDATA is that when the first element is added, this empty array is replaced by an array of DEFAULT_CAPACITY (10) elements.

(4) elementData: the real place to store elements.

(5) size: the number of elements actually stored, not the length of the elementData array.

Why is the elementData array of ArrayList marked transient?

Because ArrayList expands its capacity automatically, the elementData array is often longer than the number of elements actually stored. If elementData were serialized directly (that is, without transient), the unused slots in the array would be serialized too, wasting space.

ArrayList therefore implements its own writeObject and readObject methods for serialization and deserialization. When writing the array, it uses size as the end marker, so only the elements actually stored in the ArrayList are serialized.
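One way to observe this from the outside is to serialize a list whose backing array is much larger than its size and look at the number of bytes produced. This is a rough sketch; the class SerializedSizeDemo, its helpers and the capacity of 100,000 are assumptions chosen for the demonstration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;

class SerializedSizeDemo {
    // Serializes the list into an in-memory byte array.
    static byte[] serialize(ArrayList<Integer> list) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(list);
        }
        return bytes.toByteArray();
    }

    // Reads the list back from the byte array.
    @SuppressWarnings("unchecked")
    static ArrayList<Integer> deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (ArrayList<Integer>) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<Integer> list = new ArrayList<>(100_000); // large backing array
        list.add(1);
        list.add(2);
        list.add(3);

        byte[] data = serialize(list);
        // Only the three stored elements are written, so the output stays small.
        System.out.println(data.length + " bytes");
        System.out.println(deserialize(data)); // prints [1, 2, 3]
    }
}
```

If the 100,000 unused slots were written as well, the stream would be far larger than the few hundred bytes this typically produces.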

ArrayList(int initialCapacity) construction method

public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        // If the initial capacity passed in is greater than 0, a new array storage element will be created
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        // If the initial capacity passed in is equal to 0, the shared empty array EMPTY_ELEMENTDATA is used
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        // If the initial capacity passed in is less than 0, an exception is thrown
        throw new IllegalArgumentException("Illegal Capacity: " + initialCapacity);
    }
}
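The three branches of this constructor can be exercised from client code. The sketch below uses a hypothetical helper, isRejected, that only reports whether the constructor threw.

```java
import java.util.ArrayList;

class InitialCapacityDemo {
    // Returns true if the constructor rejects the given initial capacity.
    static boolean isRejected(int initialCapacity) {
        try {
            new ArrayList<String>(initialCapacity);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isRejected(16)); // prints false
        System.out.println(isRejected(0));  // prints false
        System.out.println(isRejected(-1)); // prints true
    }
}
```

Passing an initial capacity is mainly useful when the final size is roughly known in advance, since it avoids repeated expansion of the backing array.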

ArrayList() construction method

public ArrayList() {
    // If no initial capacity is passed in, the empty array DEFAULTCAPACITY_EMPTY_ELEMENTDATA is used
    // This array will be expanded to the default size of 10 when adding the first element
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}

ArrayList(Collection<? extends E> c) construction method

/**
* Initialize the elements passed into the collection into the ArrayList
*/
public ArrayList(Collection<? extends E> c) {
    // Convert the collection to an array
    elementData = c.toArray();
    if ((size = elementData.length) != 0) {
        // Check whether the type returned by c.toArray() is Object[]. If not, copy it into a new array of type Object[]
        if (elementData.getClass() != Object[].class)
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    } else {
        // If c is an empty set, it is initialized to an empty array EMPTY_ELEMENTDATA
        this.elementData = EMPTY_ELEMENTDATA;
    }
}
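Because this constructor copies the elements into its own array, the new list is independent of the source collection. A minimal sketch, in which the helper growableCopy is a hypothetical name:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class CopyConstructorDemo {
    // Copies any collection into a new, independent ArrayList.
    static <T> List<T> growableCopy(Collection<? extends T> source) {
        return new ArrayList<>(source);
    }

    public static void main(String[] args) {
        List<String> fixed = Arrays.asList("x", "y"); // fixed-size view of an array
        List<String> copy = growableCopy(fixed);
        copy.add("z");                                // the copy can grow

        System.out.println(fixed); // prints [x, y]
        System.out.println(copy);  // prints [x, y, z]
    }
}
```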

add(E e) method

Add elements to the end with an average time complexity of O(1).

public boolean add(E e) {
    // Check whether capacity expansion is required
    ensureCapacityInternal(size + 1);
    // Insert the element to the last bit
    elementData[size++] = e;
    return true;
}

private void ensureCapacityInternal(int minCapacity) {
    ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
}

private static int calculateCapacity(Object[] elementData, int minCapacity) {
    // If the array is the shared empty array DEFAULTCAPACITY_EMPTY_ELEMENTDATA, use at least the default capacity of 10
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        return Math.max(DEFAULT_CAPACITY, minCapacity);
    }
    return minCapacity;
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;

    if (minCapacity - elementData.length > 0)
        // Capacity expansion
        grow(minCapacity);
}

private void grow(int minCapacity) {
    int oldCapacity = elementData.length;
    // The new capacity is 1.5 times the old capacity
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    // If the new capacity is found to be smaller than the required capacity, the required capacity shall prevail
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    // If the new capacity has exceeded the maximum capacity, the maximum capacity is used
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // Copy out a new array with new capacity
    elementData = Arrays.copyOf(elementData, newCapacity);
}
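To see which capacities this arithmetic produces, the core of grow() can be replayed outside the JDK. The sketch below is a simplified model, not the JDK code itself: the names GrowthSketch, nextCapacity and capacitySteps are made up, and the MAX_ARRAY_SIZE branch is left out.

```java
import java.util.ArrayList;
import java.util.List;

class GrowthSketch {
    // Mirrors the core arithmetic of grow(); the MAX_ARRAY_SIZE check is omitted.
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1); // old + old / 2
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        return newCapacity;
    }

    // Lists the capacities the backing array passes through
    // while elements are appended one at a time.
    static List<Integer> capacitySteps(int startCapacity, int elementsToAdd) {
        List<Integer> steps = new ArrayList<>();
        int capacity = startCapacity;
        steps.add(capacity);
        for (int size = 0; size < elementsToAdd; size++) {
            if (size + 1 > capacity) {
                capacity = nextCapacity(capacity, size + 1);
                steps.add(capacity);
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        // Starting at 10 models new ArrayList() after its first add.
        System.out.println(capacitySteps(10, 50)); // prints [10, 15, 22, 33, 49, 73]
        // Starting at 0 models new ArrayList(0).
        System.out.println(capacitySteps(0, 10));  // prints [0, 1, 2, 3, 4, 6, 9, 13]
    }
}
```

The second line shows why new ArrayList(0) expands more often at the start: with very small capacities, half of the old capacity rounds down to 0 or 1.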

add(int index, E element) method

Add the element to the specified location, and the average time complexity is O(n).

public void add(int index, E element) {
    // Check whether it is out of bounds
    rangeCheckForAdd(index);
    // Check whether capacity expansion is required
    ensureCapacityInternal(size + 1);
    // Move the element at index and all subsequent elements back one position, leaving the index position free
    System.arraycopy(elementData, index, elementData, index + 1,
                     size - index);
    // Insert the element at the location of the index
    elementData[index] = element;
    // Increase size by 1
    size++;
}

private void rangeCheckForAdd(int index) {
    if (index > size || index < 0)
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}

Why is ArrayList slow to add?

From the source code above, we can see that ArrayList can add an element either at a specified index or directly at the end. In both cases it first calls ensureCapacityInternal to check the length; if the array is not long enough, it has to be expanded.

The capacity calculation differs between JDK versions. The JDK 8 code shown here uses a bit operation: shifting right by one bit is the same as dividing by 2, so int newCapacity = oldCapacity + (oldCapacity >> 1); makes the new array about 1.5 times as long as the old one.

When adding at a specified index, the work after these checks is simple: an array copy, System.arraycopy(elementData, index, elementData, index + 1, size - index);. Consider the following example.

Suppose we have an array and need to add an element a at index 4.

As the code shows, it copies the part of the array that starts at index 4 to the position that starts at index 4 + 1,

making room for the element we want to add. It then puts element a at index 4, which completes the add operation.

That is the cost for a small List. If I add an element near the front of a List that holds tens of thousands or millions of elements, all the following elements have to be copied, and it is slower still if the insertion also triggers a capacity expansion.
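The shift described above can be reproduced on a plain array. This is a sketch under stated assumptions: InsertShiftSketch and insert are hypothetical names, the array contents are sample data, and the array is assumed to have at least one free slot, so no expansion is modelled.

```java
import java.util.Arrays;

class InsertShiftSketch {
    // Inserts value at index in a partially filled array and returns the new size.
    // Assumes size < data.length, i.e. there is room for one more element.
    static int insert(Object[] data, int size, int index, Object value) {
        if (index > size || index < 0) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        // Shift the elements from index onwards one slot to the right.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
        return size + 1;
    }

    public static void main(String[] args) {
        Object[] data = {1, 2, 3, 4, 5, 6, null, null}; // 6 elements, capacity 8
        int size = insert(data, 6, 4, "a");
        System.out.println(Arrays.toString(data)); // prints [1, 2, 3, 4, a, 5, 6, null]
        System.out.println(size);                  // prints 7
    }
}
```

The number of elements copied is size - index, which is why inserting near the front of a long list costs more than inserting near the end.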

addAll method

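Appends all elements of another collection to the end of the list. This works like a union of two collections, except that duplicates are kept.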

/**
* Adds all elements in the collection c to the current ArrayList
*/
public boolean addAll(Collection<? extends E> c) {
    // Convert set c to array
    Object[] a = c.toArray();
    int numNew = a.length;
    // Check whether capacity expansion is required
    ensureCapacityInternal(size + numNew);
    // Copy all the elements in c to the end of the array
    System.arraycopy(a, 0, elementData, size, numNew);
    // Size increases the size of c
    size += numNew;
    // If c is not empty, it returns true; otherwise, it returns false
    return numNew != 0;
}
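A short usage sketch of addAll, showing that duplicates are kept and that the return value reports whether anything was added. The class AddAllDemo and the helper concat are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class AddAllDemo {
    // Returns a new list holding the elements of a followed by the elements of b.
    static List<Integer> concat(List<Integer> a, List<Integer> b) {
        List<Integer> result = new ArrayList<>(a);
        result.addAll(b);
        return result;
    }

    public static void main(String[] args) {
        // The value 3 appears in both inputs and is kept twice.
        System.out.println(concat(Arrays.asList(1, 2, 3), Arrays.asList(3, 4))); // prints [1, 2, 3, 3, 4]

        // Adding an empty collection changes nothing, so addAll returns false.
        List<Integer> list = new ArrayList<>();
        System.out.println(list.addAll(Collections.<Integer>emptyList())); // prints false
    }
}
```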

get(int index) method

Gets the element at the specified index position with a time complexity of O(1).

public E get(int index) {
    // Check whether it is out of bounds
    rangeCheck(index);
    // Returns the element at the index position of the array
    return elementData(index);
}

private void rangeCheck(int index) {
    if (index >= size)
        throw new IndexOutOfBoundsException(outOfBoundsMsg(index));
}

E elementData(int index) {
    return (E) elementData[index];
}

(1) Check whether the index is out of bounds. Only the upper bound is checked here: if index is greater than or equal to size, an IndexOutOfBoundsException is thrown. If index is below the lower bound (negative), the array access itself throws an ArrayIndexOutOfBoundsException.

(2) Return the element at the index position.
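Both out-of-bounds cases can be checked from client code. In the sketch below, failureOf is a hypothetical helper. Note that the exact exception class for a negative index is an implementation detail: with the code shown above it is ArrayIndexOutOfBoundsException, while newer JDK versions may report a plain IndexOutOfBoundsException, so the example only relies on the common supertype.

```java
import java.util.ArrayList;
import java.util.List;

class GetBoundsDemo {
    // Returns the simple class name of the exception thrown by get(index),
    // or "none" if the call succeeds.
    static String failureOf(List<String> list, int index) {
        try {
            list.get(index);
            return "none";
        } catch (IndexOutOfBoundsException e) {
            // ArrayIndexOutOfBoundsException is a subclass, so it is caught here too.
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");

        System.out.println(failureOf(list, 0));  // prints none
        System.out.println(failureOf(list, 2));  // prints IndexOutOfBoundsException
        System.out.println(failureOf(list, -1)); // exception class depends on the JDK version
    }
}
```

Index 2 fails even though the backing array is longer than 2, because the check is made against size, not against the array length.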

remove(int index) method

Delete the element at the specified index position, and the time complexity is O(n).

public E remove(int index) {
    // Check whether it is out of bounds
    rangeCheck(index);

    modCount++;
    // Gets the element at the index location
    E oldValue = elementData(index);

    // If index is not the last bit, move the element after index one bit forward
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index, numMoved);

    // Delete the last element to help GC
    elementData[--size] = null; // clear to let GC do its work

    // Return old value
    return oldValue;
}
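The same removal steps can be sketched on a plain array. The names RemoveShiftSketch and removeAt are hypothetical, and the array contents are sample data.

```java
import java.util.Arrays;

class RemoveShiftSketch {
    // Removes the element at index from a partially filled array and returns the new size.
    static int removeAt(Object[] data, int size, int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            // Shift the elements after index one slot to the left.
            System.arraycopy(data, index + 1, data, index, numMoved);
        }
        data[size - 1] = null; // clear the now unused slot so the object can be collected
        return size - 1;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "c", "d", null}; // 4 elements, capacity 5
        int size = removeAt(data, 4, 1);
        System.out.println(Arrays.toString(data)); // prints [a, c, d, null, null]
        System.out.println(size);                  // prints 3
    }
}
```

Removing the last element moves nothing (numMoved is 0), which is why removal from the tail is O(1).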

remove(Object o) method

Delete the element with the specified element value, and the time complexity is O(n).

public boolean remove(Object o) {
    if (o == null) {
        // Traverse the entire array, find the position where the element first appears, and quickly delete it
        for (int index = 0; index < size; index++)
            // If the element to be deleted is null, compare it with null and use==
            if (elementData[index] == null) {
                fastRemove(index);
                return true;
            }
    } else {
        // Traverse the entire array, find the position where the element first appears, and quickly delete it
        for (int index = 0; index < size; index++)
            // If the element to be deleted is not null, compare it and use the equals() method
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}

private void fastRemove(int index) {
    // Unlike remove(int index), there is no bounds check here
    modCount++;
    // If index is not the last bit, move the element after index one bit forward
    int numMoved = size - index - 1;
    if (numMoved > 0)
        System.arraycopy(elementData, index+1, elementData, index, numMoved);
    // Delete the last element to help GC
    elementData[--size] = null; // clear to let GC do its work
}

(1) Find the first element equal to the specified element value;

(2) Delete it with fastRemove(int index), which differs from remove(int index) in that it skips the bounds check on index and does not return the old value.
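With a List<Integer>, the two remove overloads are easy to confuse, because an int argument selects remove(int index) while an Integer argument selects remove(Object o). The sketch below uses hypothetical helper names and sample data to show the difference.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveOverloadDemo {
    // Fresh sample data for each call.
    static List<Integer> sample() {
        return new ArrayList<>(Arrays.asList(10, 20, 1, 1));
    }

    // int argument: calls remove(int index).
    static List<Integer> afterRemoveIndex(int index) {
        List<Integer> list = sample();
        list.remove(index);
        return list;
    }

    // Integer argument: calls remove(Object o), which removes the first match only.
    static List<Integer> afterRemoveValue(Integer value) {
        List<Integer> list = sample();
        list.remove(value);
        return list;
    }

    public static void main(String[] args) {
        System.out.println(afterRemoveIndex(1));                  // prints [10, 1, 1]
        System.out.println(afterRemoveValue(Integer.valueOf(1))); // prints [10, 20, 1]
    }
}
```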

retainAll method

Find the intersection of two sets.

public boolean retainAll(Collection<?> c) {
    // Collection c cannot be null
    Objects.requireNonNull(c);
    // Call the batch delete method. Here true is passed for complement, which means deleting the elements that are not contained in c
    return batchRemove(c, true);
}

/**
* Batch delete elements
* complement true means to delete elements not contained in c
* complement false means to delete the elements contained in c
*/
private boolean batchRemove(Collection<?> c, boolean complement) {
    final Object[] elementData = this.elementData;
    // Use read and write two pointers to traverse the array at the same time
    // The read pointer is incremented by 1 each time, and the write pointer is incremented by 1 when it is put into the element
    // In this way, there is no need for additional space, just operate on the original array
    int r = 0, w = 0;
    boolean modified = false;
    try {
        // Traverse the entire array. If c contains this element, put the element at the position of the write pointer (subject to complement)
        for (; r < size; r++)
            if (c.contains(elementData[r]) == complement)
                elementData[w++] = elementData[r];
    } finally {
        // Normally, r is finally equal to size, unless c.contains() throws an exception
        if (r != size) {
            // If c.contains() throws an exception, all unread elements are copied after the write pointer
            System.arraycopy(elementData, r,
                             elementData, w,
                             size - r);
            w += size - r;
        }
        if (w != size) {
            // Set the element after the write pointer to null to help GC
            for (int i = w; i < size; i++)
                elementData[i] = null;
            modCount += size - w;
            // The new size is equal to the position of the write pointer (because the write pointer is incremented by 1 every time, the new size is exactly equal to the position of the write pointer)
            size = w;
            modified = true;
        }
    }
    // Return true after modification
    return modified;
}

(1) Traverse the elementData array;

(2) If the element is in c, copy this element to the w position of the elementData array and advance w by one;

(3) After traversal, the elements before w are the ones shared by both collections, and the elements from w onwards are not needed any more;

(4) Set the elements from w onwards to null to help GC, and set size to w.
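The read/write pointer technique from these steps can be tried on a plain array. This is a simplified sketch: TwoPointerFilterSketch and filter are hypothetical names, and the try/finally handling for exceptions thrown by c.contains() is left out.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

class TwoPointerFilterSketch {
    // Keeps, in place, the elements whose membership in c equals complement.
    // Returns the new size; slots from the new size onwards are set to null.
    static int filter(Object[] data, int size, Collection<?> c, boolean complement) {
        int w = 0; // write pointer
        for (int r = 0; r < size; r++) { // read pointer
            if (c.contains(data[r]) == complement) {
                data[w++] = data[r];
            }
        }
        for (int i = w; i < size; i++) {
            data[i] = null; // clear leftover slots
        }
        return w;
    }

    public static void main(String[] args) {
        Set<Integer> c = new HashSet<>(Arrays.asList(2, 4, 6));

        Object[] retained = {1, 2, 3, 4, 5};
        int kept = filter(retained, 5, c, true);       // behaves like retainAll
        System.out.println(Arrays.toString(retained)); // prints [2, 4, null, null, null]
        System.out.println(kept);                      // prints 2

        Object[] removed = {1, 2, 3, 4, 5};
        filter(removed, 5, c, false);                  // behaves like removeAll
        System.out.println(Arrays.toString(removed));  // prints [1, 3, 5, null, null]
    }
}
```

Because the write pointer never overtakes the read pointer, no element is overwritten before it has been examined, so no second array is needed.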

removeAll method

Find the one-way difference of two collections: only the elements of the current collection that are not in c are kept, not the elements of c that are missing from the current collection.

public boolean removeAll(Collection<?> c) {
    // Collection c cannot be null
    Objects.requireNonNull(c);
    // Likewise, the batch delete method is called. Here false is passed for complement, which means deleting the elements contained in c
    return batchRemove(c, false);
}

It is similar to the retainAll(Collection<?> c) method, except that here the elements not in c are retained.
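From the caller's side, the two methods can be compared directly. In this sketch the class SetStyleOpsDemo and its helpers are hypothetical, and each helper works on a copy so the inputs stay untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class SetStyleOpsDemo {
    // Elements of a that also occur in b (retainAll).
    static List<Integer> intersection(List<Integer> a, Collection<Integer> b) {
        List<Integer> result = new ArrayList<>(a);
        result.retainAll(b);
        return result;
    }

    // Elements of a that do not occur in b (removeAll).
    static List<Integer> difference(List<Integer> a, Collection<Integer> b) {
        List<Integer> result = new ArrayList<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 2, 3, 4);
        List<Integer> b = Arrays.asList(2, 4, 6);

        System.out.println(intersection(a, b)); // prints [2, 2, 4]
        System.out.println(difference(a, b));   // prints [1, 3]
    }
}
```

Since an ArrayList is not a set, duplicates in the current list survive retainAll, and the value 6, which exists only in b, never appears in either result.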

Summary

(1) ArrayList uses an array to store elements. Each expansion adds half of the current capacity, and ArrayList never shrinks automatically.

(2) ArrayList supports random access. It is extremely fast to access elements through index, and the time complexity is O(1).

(3) ArrayList adds elements to the tail very quickly, with an average time complexity of O(1).

(4) ArrayList is slow to add elements to the middle because the average time complexity to move elements is O(n).

(5) ArrayList removes elements from the tail very quickly, with a time complexity of O(1).

(6) ArrayList is slow to delete elements from the middle because the average time complexity to move elements is O(n).

(7) ArrayList supports union. Just call the addAll(Collection<? extends E> c) method.

(8) ArrayList supports intersection. Just call the retainAll(Collection<?> c) method.

(9) ArrayList supports one-way difference. Just call the removeAll(Collection<?> c) method.
