ArrayList (source code analysis)

Posted by Capoeirista on Thu, 03 Mar 2022 07:44:41 +0100

ArrayList overview
(1) ArrayList is a variable length collection class, which is implemented based on fixed length array.
(2) ArrayList allows null values and duplicate elements. When the number of elements added to ArrayList is greater than the capacity of its underlying array, it will regenerate a larger array through the capacity expansion mechanism.
(3) Because ArrayList is backed by an array, random access by index completes in O(1) time.
(4) ArrayList is not thread safe. In a concurrent environment, multiple threads modifying the same ArrayList at the same time without external synchronization can cause unpredictable exceptions or errors.
Member properties of ArrayList
Before introducing the various methods of ArrayList, let's take a look at its basic member fields. The difference between DEFAULTCAPACITY_EMPTY_ELEMENTDATA and EMPTY_ELEMENTDATA is that when the first element is added, an array that started as DEFAULTCAPACITY_EMPTY_ELEMENTDATA is known to need expansion to the default capacity.
//Default initialization capacity
private static final int DEFAULT_CAPACITY = 10;

//The default empty array, which is mainly used when initializing an empty array by the constructor
private static final Object[] EMPTY_ELEMENTDATA = {};

//Shared empty array instance used for default-sized empty instances. It is kept separate from EMPTY_ELEMENTDATA
//so that we know how much to expand when the first element is added
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

//The elements of ArrayList are stored in this array, and the capacity of ArrayList is the length of this array.
//An empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA will be expanded
//to the default capacity DEFAULT_CAPACITY when the first element is added
transient Object[] elementData; // non-private to simplify nested class access

//Size of arrayList
private int size;
EMPTY_ELEMENTDATA and DEFAULTCAPACITY_EMPTY_ELEMENTDATA are both declared static, so every ArrayList instance shares the same two empty arrays.

ArrayList construction method
(1) Construction method with initialization capacity

If the parameter is greater than 0, elementData is initialized to an array of initialCapacity size
If the parameter is equal to 0, elementData is initialized to an empty array
If the parameter is less than 0, an exception is thrown

//The parameter is initialization capacity
public ArrayList(int initialCapacity) {

//Judge the legitimacy of capacity
if (initialCapacity > 0) {
    //elementData is the array that actually stores elements
    this.elementData = new Object[initialCapacity];
} else if (initialCapacity == 0) {
    //If the passed length is 0, you can directly use the member variable you have defined (an empty array)
    this.elementData = EMPTY_ELEMENTDATA;
} else {
    throw new IllegalArgumentException("Illegal Capacity: "+
                                       initialCapacity);
}

}
(2) Parameterless constructor

The constructor initializes elementData to the empty array DEFAULTCAPACITY_EMPTY_ELEMENTDATA
When the add method is called to add the first element, the capacity is expanded
The capacity is expanded to DEFAULT_CAPACITY = 10

//The parameterless constructor uses the shared empty array that marks the default capacity of 10. No array length is set in the constructor; the capacity is expanded when the add method is called later
public ArrayList() {

this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;

}
(3) Constructor that takes a Collection parameter
//Converts the given Collection into an ArrayList (in fact, copies the elements of the Collection into an array).
//If the incoming collection is null, a NullPointerException is thrown (when calling the c.toArray() method)
public ArrayList(Collection<? extends E> c) {

elementData = c.toArray();
if ((size = elementData.length) != 0) {
    //c.toArray() might not return an Object[] array, so use the Arrays.copyOf() method in that case
    if (elementData.getClass() != Object[].class)
        elementData = Arrays.copyOf(elementData, size, Object[].class);
} else {
    //If the array length is 0 after the collection is converted to an array, you can directly initialize elementData with your own empty member variable
    this.elementData = EMPTY_ELEMENTDATA;
}

}
The above constructors are relatively simple to understand. Pay attention to what the first two constructors do: both initialize the underlying array elementData (this.elementData = XXX). The difference is that the parameterless constructor initializes elementData to an empty array, and the first insertion expands it to the default capacity, while the constructor with a capacity parameter initializes elementData to an array of the given size (>= 0). Generally, the default constructor is sufficient. If you know roughly how many elements will be inserted into the ArrayList, you can use the constructor with a capacity parameter.
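To make the choice between the three constructors concrete, here is a minimal sketch. The class name ConstructorDemo and the helper presized are made up for this illustration; only the ArrayList constructors themselves come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class ConstructorDemo {
    // Builds a list of n integers; the capacity hint avoids repeated growth.
    static List<Integer> presized(int n) {
        List<Integer> list = new ArrayList<>(n); // capacity n, size 0
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        // Parameterless: the backing array is only allocated on the first add.
        List<String> byDefault = new ArrayList<>();
        // Capacity known in advance: one allocation up front.
        List<Integer> sized = presized(1_000);
        // From a Collection: the elements are copied into a new array.
        List<String> copied = new ArrayList<>(Arrays.asList("a", "b", "c"));

        System.out.println(byDefault.size()); // 0
        System.out.println(sized.size());     // 1000
        System.out.println(copied);           // [a, b, c]
    }
}
```

Note that the capacity argument only sizes the backing array; the list is still empty (size 0) until elements are added.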
As mentioned above, when using parameterless construction, the capacity will be expanded when calling the add method, so let's take a look at the add method and the details of capacity expansion
add method of ArrayList
General flow of add method
//Adds the specified element to the end of the list
public boolean add(E e) {

//Because the element needs to be added, the capacity may be insufficient after adding, so it needs to be judged (expanded) before adding
ensureCapacityInternal(size + 1);  // Increments modCount!! (fast fail will be introduced later)
elementData[size++] = e;
return true;

}
We can see that add checks the capacity before adding the element, so let's take a look at the details of the ensureCapacityInternal method
Analysis of ensureCapacityInternal method
private void ensureCapacityInternal(int minCapacity) {

//Check whether elementData is the default empty array
//(when using the parameterless constructor, elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA)
//If yes, take the larger of size + 1 (size + 1 = 1 when calling add for the first time) and DEFAULT_CAPACITY,
//so the minimum capacity is 10 for the first add
if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
    minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
}
ensureExplicitCapacity(minCapacity);

}
When you add the first element, minCapacity is 1 (size + 1 = 0 + 1). After the Math.max() comparison, minCapacity is 10. Then ensureExplicitCapacity is called to update the value of modCount and judge whether capacity expansion is needed
Analysis of ensureExplicitCapacity method
private void ensureExplicitCapacity(int minCapacity) {

modCount++; //This is the modCount increment mentioned in the comment of the add method
// overflow-conscious code
if (minCapacity - elementData.length > 0)
    grow(minCapacity);//Here is the method of capacity expansion

}
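The comparison above is written as a subtraction rather than as minCapacity > elementData.length. A small sketch of why that can matter, assuming the extreme case where size + 1 has overflowed int; the class and method names are hypothetical and only the two comparison forms are taken from the source above.

```java
// Hypothetical demo class comparing the two ways of writing the capacity check.
class OverflowConsciousDemo {
    // Naive check: gives the wrong answer when minCapacity has wrapped to a negative value.
    static boolean naiveNeedsGrow(int minCapacity, int length) {
        return minCapacity > length;
    }

    // The form used in ensureExplicitCapacity: the subtraction wraps around as well.
    static boolean overflowConsciousNeedsGrow(int minCapacity, int length) {
        return minCapacity - length > 0;
    }

    public static void main(String[] args) {
        int length = Integer.MAX_VALUE; // pretend the array is already at the int limit
        int minCapacity = length + 1;   // overflows to Integer.MIN_VALUE

        System.out.println(naiveNeedsGrow(minCapacity, length));             // false
        System.out.println(overflowConsciousNeedsGrow(minCapacity, length)); // true
    }
}
```

With the subtraction form the overflowed request still reaches grow, where hugeCapacity can detect the negative minCapacity and throw OutOfMemoryError instead of silently skipping the expansion.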
Let's take a look at the main method of capacity expansion.
Analysis of the grow method
private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
private void grow(int minCapacity) {

// oldCapacity is the capacity of the old array
int oldCapacity = elementData.length;
// newCapacity is the capacity of the new array (oldCap+oldCap/2: update to 1.5 times the old capacity)
int newCapacity = oldCapacity + (oldCapacity >> 1);
// Check whether the new capacity is less than the minimum required capacity. If it is, the minimum required capacity becomes the new capacity of the array
if (newCapacity - minCapacity < 0)
    newCapacity = minCapacity;
//If the new capacity is greater than MAX_ARRAY_SIZE, call hugeCapacity to choose between MAX_ARRAY_SIZE and Integer.MAX_VALUE
if (newCapacity - MAX_ARRAY_SIZE > 0)
    newCapacity = hugeCapacity(minCapacity);
// minCapacity is usually close to size, so this is a win:
// Copy the elements in the original array
elementData = Arrays.copyOf(elementData, newCapacity);

}
hugeCapacity method
Here's a brief look at the hugeCapacity method
private static int hugeCapacity(int minCapacity) {

if (minCapacity < 0) // overflow
    throw new OutOfMemoryError();
//Compare minCapacity with MAX_ARRAY_SIZE
//If minCapacity is larger, use Integer.MAX_VALUE as the size of the new array
//If MAX_ARRAY_SIZE is larger, use MAX_ARRAY_SIZE as the size of the new array
//MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
return (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;

}
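The arithmetic of grow and hugeCapacity can be tried out without allocating any arrays. The following is a sketch that copies only the capacity calculation shown above into a pure function; GrowthDemo, nextCapacity and capacities are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: reproduces only the capacity arithmetic, not the array copy.
class GrowthDemo {
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Mirrors the calculation in grow() and hugeCapacity().
    static int nextCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1); // 1.5 times, rounded down
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            if (minCapacity < 0) { // size + 1 overflowed
                throw new OutOfMemoryError();
            }
            newCapacity = (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
        }
        return newCapacity;
    }

    // Capacities reached when a full array is grown 'steps' times in a row.
    static List<Integer> capacities(int start, int steps) {
        List<Integer> result = new ArrayList<>();
        int capacity = start;
        for (int i = 0; i < steps; i++) {
            capacity = nextCapacity(capacity, capacity + 1);
            result.add(capacity);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(capacities(10, 5));  // [15, 22, 33, 49, 73]
        System.out.println(nextCapacity(0, 1)); // 1
    }
}
```

The second call shows why the minCapacity check is needed: for an empty array, 1.5 times 0 is still 0, so the minimum required capacity is used instead.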
Summary of the execution process of the add method
Let's briefly walk through what happens after the first call of the add method when the parameterless constructor is used.

On the first call to add, the array is expanded to a capacity of 10.

Continue to add the second element (first note that the parameter passed to the ensureCapacityInternal method is size + 1 = 1 + 1 = 2).

In the ensureCapacityInternal method, the condition elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA no longer holds, so the ensureExplicitCapacity method is executed directly.

minCapacity in the ensureExplicitCapacity method is the 2 just passed in, so the if judgment (2 - 10 = -8) does not hold and the grow method is not entered. The capacity of the array stays at 10, the add method returns true, and the size is increased to 2.

Adding the 3rd, 4th, ..., 10th elements follows the same process, and the grow method is not executed.

When the 11th element is added, the grow method is entered. The newCapacity calculated is 15, which is larger than minCapacity (10 + 1 = 11), so the first if judgment does not hold. The new capacity is not greater than MAX_ARRAY_SIZE, so the hugeCapacity method is not entered either. The array capacity is expanded to 15, the add method returns true, and the size is increased to 11.
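The walk-through above can be reproduced with a stripped-down list that exposes its capacity. This is only a sketch: MiniList and its DEFAULT_EMPTY marker are hypothetical stand-ins for ArrayList and DEFAULTCAPACITY_EMPTY_ELEMENTDATA, and the MAX_ARRAY_SIZE handling and modCount are left out for brevity.

```java
import java.util.Arrays;

// Hypothetical miniature of the add path described above (no MAX_ARRAY_SIZE check, no modCount).
class MiniList {
    private static final int DEFAULT_CAPACITY = 10;
    private static final Object[] DEFAULT_EMPTY = {}; // marker for "created without a capacity"

    private Object[] elementData = DEFAULT_EMPTY;
    private int size;

    void add(Object e) {
        int minCapacity = size + 1;
        if (elementData == DEFAULT_EMPTY) {
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        if (minCapacity - elementData.length > 0) {
            int newCapacity = elementData.length + (elementData.length >> 1);
            if (newCapacity - minCapacity < 0) {
                newCapacity = minCapacity;
            }
            elementData = Arrays.copyOf(elementData, newCapacity);
        }
        elementData[size++] = e;
    }

    int capacity() { return elementData.length; }

    int size() { return size; }

    public static void main(String[] args) {
        MiniList list = new MiniList();
        System.out.println("capacity before first add: " + list.capacity()); // 0
        for (int i = 1; i <= 11; i++) {
            list.add(i);
            if (i == 1 || i == 2 || i == 10 || i == 11) {
                System.out.println("after add #" + i + ": size=" + list.size()
                        + ", capacity=" + list.capacity());
            }
        }
        // after add #1:  size=1,  capacity=10
        // after add #2:  size=2,  capacity=10
        // after add #10: size=10, capacity=10
        // after add #11: size=11, capacity=15
    }
}
```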

add(int index,E element) method
//Insert at element sequence index position
public void add(int index, E element) {

rangeCheckForAdd(index); //Verify whether the passed index parameter is legal
// 1. Check whether capacity expansion is required
ensureCapacityInternal(size + 1);  // Increments modCount!!
// 2. Move the element at index and all subsequent elements back one position
System.arraycopy(elementData, index, elementData, index + 1,
                 size - index);
// 3. Insert the new element into the index
elementData[index] = element;
size++;

}
private void rangeCheckForAdd(int index) {

if (index > size || index < 0) //index > size would leave a gap in the array, and index < 0 is invalid
    throw new IndexOutOfBoundsException(outOfBoundsMsg(index));

}
The process of the add(int index, E element) method (inserting at a specified position of the element sequence, assuming the position is valid) is roughly as follows

Check whether the array has enough space (the implementation here is the same as above)
Move the element at index and all subsequent elements back one position
Insert the new element at index

To insert a new element at a specified position of the sequence, the element at that position and all subsequent elements have to be moved back one position to make room for the new element. The time complexity of this operation is O(N). Frequent movement of elements may lead to efficiency problems, especially when the collection holds a large number of elements. In daily development, unless it is required, we should try to avoid calling this second insertion method on large collections.
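The shifting step can be seen in isolation on a plain array. The sketch below assumes the array already has a free slot, so no capacity check is done; InsertShiftDemo and insertAt are illustrative names, while System.arraycopy is the same call used in the source above.

```java
import java.util.Arrays;

// Hypothetical demo class showing the shift that add(int index, E element) performs.
class InsertShiftDemo {
    // Inserts value at index in an array whose first 'size' slots are in use.
    // Assumes 0 <= index <= size and size < data.length (there is a free slot).
    static void insertAt(int[] data, int size, int index, int value) {
        // Move data[index .. size-1] one position to the right.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    public static void main(String[] args) {
        int[] data = {10, 20, 30, 40, 0}; // size 4, capacity 5
        insertAt(data, 4, 1, 15);
        System.out.println(Arrays.toString(data)); // [10, 15, 20, 30, 40]
    }
}
```

Inserting at index 1 moved three elements; inserting at index 0 of a large list moves all of them, which is where the O(N) cost comes from.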
remove method of ArrayList
ArrayList supports two ways to delete elements
1. remove(int index) delete by subscript
public E remove(int index) {

rangeCheck(index); //Verify whether the subscript is legal (if index >= size, IndexOutOfBoundsException is thrown)
modCount++;//To modify the list structure, you need to update this value
E oldValue = elementData(index); //Find this value directly in the array

int numMoved = size - index - 1;//The number of elements that need to be moved is calculated here
//If this value is greater than 0, the elements after index need to be moved to the left
//If it is 0 (size == index + 1), the last element is being removed and nothing needs to be moved
if (numMoved > 0)
    //All elements after index are shifted to the left by one position, overwriting the element at the index position
    System.arraycopy(elementData, index+1, elementData, index,
                     numMoved);
//After the move, the slot at the old last position is set to null
elementData[--size] = null; // clear to let GC do its work
//Return old value
return oldValue;

}
//src: source array
//srcPos: starting position in the source array
//dest: target array
//destPos: starting position in the target array; the elements copied from srcPos onward are written starting here
//length: the number of array elements to copy
public static native void arraycopy(Object src, int srcPos,
                                    Object dest, int destPos,
                                    int length);

The deletion process shifts every element after index one position to the left and then clears the last occupied slot.

2. remove(Object o) deletes the first element matching the parameter according to the element
public boolean remove(Object o) {

//If the element is null, traverse the array to remove the first null
if (o == null) {
    for (int index = 0; index < size; index++)
        if (elementData[index] == null) {
            //Traverse to find the subscript of the first null element and call the method of subscript removing the element
            fastRemove(index);
            return true;
        }
} else {
    //Find the subscript corresponding to the element, and call the method of subscript removing the element
    for (int index = 0; index < size; index++)
        if (o.equals(elementData[index])) {
            fastRemove(index);
            return true;
        }
}
return false;

}
//Remove elements according to subscripts (delete by moving the position of array elements)
private void fastRemove(int index) {

modCount++;
int numMoved = size - index - 1;
if (numMoved > 0)
    System.arraycopy(elementData, index+1, elementData, index,
                     numMoved);
elementData[--size] = null; // clear to let GC do its work

}
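Because there are two remove overloads, a list of Integer values needs some care: an int argument selects remove(int index), while an Integer object selects remove(Object o). A short sketch (RemoveOverloadDemo and sample are names made up for this example):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class contrasting the two remove overloads.
class RemoveOverloadDemo {
    static List<Integer> sample() {
        return new ArrayList<>(Arrays.asList(10, 20, 30, 1));
    }

    public static void main(String[] args) {
        List<Integer> byIndex = sample();
        Integer removed = byIndex.remove(1);            // remove(int index): removes the element at index 1
        System.out.println(removed + " " + byIndex);     // 20 [10, 30, 1]

        List<Integer> byValue = sample();
        boolean found = byValue.remove(Integer.valueOf(1)); // remove(Object o): removes the first element equal to 1
        System.out.println(found + " " + byValue);       // true [10, 20, 30]
    }
}
```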
Other methods of ArrayList
ensureCapacity method
It is best to call the ensureCapacity method before adding a large number of elements to reduce the number of incremental reallocations
public void ensureCapacity(int minCapacity) {

int minExpand = (elementData != DEFAULTCAPACITY_EMPTY_ELEMENTDATA)
    // any size if not default element table
    ? 0
    // larger than default for default empty table. It's already
    // supposed to be at default size.
    : DEFAULT_CAPACITY;

if (minCapacity > minExpand) {
    ensureExplicitCapacity(minCapacity);
}

}
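A typical way to use ensureCapacity before a bulk insert might look like the sketch below. EnsureCapacityDemo and fill are hypothetical names; note that ensureCapacity is declared on ArrayList, not on the List interface, so the variable has to be typed as ArrayList.

```java
import java.util.ArrayList;

// Hypothetical demo class: reserve space once, then add in a loop.
class EnsureCapacityDemo {
    static ArrayList<Integer> fill(ArrayList<Integer> list, int n) {
        // Reserve room for the whole batch so the loop triggers at most one reallocation.
        list.ensureCapacity(list.size() + n);
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = fill(new ArrayList<>(), 100_000);
        System.out.println(list.size()); // 100000
    }
}
```

The call changes only the capacity, never the size or the contents, so it is safe to use on a list that already holds elements.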
ArrayList summary
(1) ArrayList is a variable length collection class, which is implemented based on a fixed length array. The capacity initialized by the default constructor is 10 (initialization is delayed since JDK 1.7, that is, the elementData capacity is set to 10 when the add method is called for the first time to add an element).
(2) ArrayList allows null values and duplicate elements. When the number of elements added to ArrayList is greater than the capacity of its underlying array, it allocates a larger array through the capacity expansion mechanism. The expanded capacity of ArrayList is about 1.5 times the original capacity.
(3) Because ArrayList is backed by an array, random access by index completes in O(1) time.
(4) ArrayList is not thread safe. In a concurrent environment, multiple threads modifying the same ArrayList at the same time without external synchronization can cause unpredictable exceptions or errors.
(5) Appending elements at the end is very convenient.
(6) Deleting and inserting in the middle need to copy part of the array, so the performance is poor (you can consider LinkedList).
(7) Integer.MAX_VALUE - 8: this mainly takes different JVMs into account, since some JVMs store header words in an array. When the capacity after expansion is greater than MAX_ARRAY_SIZE, the minimum required capacity is compared with MAX_ARRAY_SIZE. If it is larger, Integer.MAX_VALUE is used; otherwise Integer.MAX_VALUE - 8 is used. This behavior starts from JDK 1.7.
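For point (4), one common option when several threads must add to the same list is to wrap it with Collections.synchronizedList. The following is only a sketch of that option, not something the original article prescribes; SynchronizedListDemo and concurrentAdds are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class: several threads append to one shared list.
class SynchronizedListDemo {
    // Adds perThread elements from each of threadCount threads and returns the final size.
    static int concurrentAdds(List<Integer> list, int threadCount, int perThread)
            throws InterruptedException {
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i);
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join(); // wait until every writer has finished
        }
        return list.size();
    }

    public static void main(String[] args) throws InterruptedException {
        // Every call on the wrapper is synchronized on the wrapper object.
        List<Integer> safe = Collections.synchronizedList(new ArrayList<>());
        System.out.println(concurrentAdds(safe, 4, 10_000)); // 40000
    }
}
```

Passing a plain ArrayList to the same method may lose elements or throw ArrayIndexOutOfBoundsException, and the outcome differs from run to run. Iterating over the synchronized wrapper still requires manually synchronizing on it.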
Fail-fast mechanism
Explanation of fail-fast:

In system design, a fail-fast system is one that immediately reports any condition that is likely to indicate a failure. Fail-fast systems are usually designed to stop normal operation rather than attempt to continue a possibly flawed process. Such designs often check the state of the system at several points in an operation, so that failures can be detected early. The responsibility of a fail-fast module is to detect errors and then let the next-highest level of the system handle them.

When designing a system, consider the abnormal situations first. As soon as an abnormality occurs, stop and report it directly, as in the following simple example
//This is a method that divides two integers. In fast_fail_method we do a simple check on the divisor: if its value is 0, we throw an exception directly and clearly state the reason. This is a practical application of the fail-fast concept.
public int fast_fail_method(int arg1,int arg2){

if(arg2 == 0){
    throw new RuntimeException("can't be zero");
}
return arg1/arg2;

}
This mechanism is used in many places in the design of the Java collection classes. If a collection is not used properly, the code behind the fail-fast mechanism is triggered and unexpected situations occur. The fail-fast mechanism we usually refer to in Java is, by default, an error detection mechanism of the Java collections. When multiple threads change the structure of some collections, this mechanism may be triggered, and the concurrent modification exception ConcurrentModificationException is thrown. Even outside a multithreaded environment, this exception may be thrown if you call the add/remove method of the collection during foreach traversal. Here is a brief summary of the fail-fast mechanism.
The reason why ConcurrentModificationException is thrown is that our code uses the enhanced for loop. In the enhanced for loop, the collection is traversed through an iterator, but elements are added or removed through the methods of the collection class itself. During traversal the iterator then finds that an element was added or deleted without its knowledge, and it throws an exception to indicate that a concurrent modification may have occurred. Therefore, when a ConcurrentModificationException occurs while using the Java collection classes, first consider whether fail-fast is the cause. Concurrency may not be involved at all; the iterator simply applies the fail-fast protection mechanism, and as long as it finds a modification that it did not make itself, it throws an exception.
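The single-threaded case described above is easy to reproduce. The sketch below first removes an element inside an enhanced for loop and then does the same removal through the iterator; FailFastDemo and its two helper methods are names invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class for the fail-fast behavior of ArrayList iterators.
class FailFastDemo {
    // Removes inside an enhanced for loop; returns true if the iterator detected it.
    static boolean removeInForEach(List<String> list, String target) {
        try {
            for (String s : list) {
                if (s.equals(target)) {
                    list.remove(s); // structural change made behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes through the iterator, which keeps its expected modCount in sync.
    static void removeWithIterator(List<String> list, String target) {
        Iterator<String> it = list.iterator();
        while (it.hasNext()) {
            if (it.next().equals(target)) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        System.out.println(removeInForEach(a, "a")); // true

        List<String> b = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        removeWithIterator(b, "a");
        System.out.println(b); // [b, c, d]
    }
}
```

The detection is best effort. For example, removing the second-to-last element inside the enhanced for loop ends the loop without an exception, because hasNext() already returns false, so the exception should be used to find bugs rather than be relied on for program logic.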

Author: RoadTrip
Link: https://juejin.cn/post/684490...
Source: Juejin (稀土掘金)
The copyright belongs to the author. For commercial reprint, please contact the author for authorization. For non-commercial reprint, please indicate the source.

Topics: Java