1. Collection overview
Collections in Java are mainly divided into three categories:
- List: ordered, and duplicates are allowed.
- Set: unordered, and duplicates are not allowed.
- Map: stores key-value pairs; unordered, and keys cannot be repeated.
In the List class collection, the most commonly used is ArrayList.
2. ArrayList overview
The name ArrayList combines two words: Array + List. Array stands for an array and List for a list, which indicates that ArrayList is implemented on top of an array.
The length of a traditional array must be defined during initialization, and the length cannot be changed.
int[] arr1 = new int[10];
// The length is determined by the initial elements
int[] arr2 = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
ArrayList is a dynamic array. Its length does not have to be specified at initialization. When an add operation finds that the capacity of the backing array is insufficient, ArrayList expands the array automatically.
ArrayList<Integer> arrayList = new ArrayList<>();
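The contrast between the two can be sketched as follows. This is a minimal illustration using only the standard library; the class name DynamicGrowthDemo and its helper methods are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

class DynamicGrowthDemo {
    // Adds `count` integers to a list that was created without any declared length.
    static List<Integer> fill(int count) {
        List<Integer> list = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            list.add(i); // the backing array is enlarged whenever it is full
        }
        return list;
    }

    // Returns true if writing at `index` into a fixed array of `length` fails.
    static boolean writeFails(int length, int index) {
        int[] arr = new int[length];
        try {
            arr[index] = 1;
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(fill(25).size());    // 25
        System.out.println(writeFails(10, 10)); // true: index 10 is outside a length-10 array
    }
}
```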
3. Array properties
ArrayList is backed by an array, so it has all the characteristics of an array. Before studying ArrayList, it helps to understand how arrays are laid out and accessed in the JVM.
Array properties:
- Array elements must be of the same data type.
- Array elements are stored continuously in memory.
- The random access efficiency of array elements is particularly high. Constant level random access can be realized, and the time complexity is O(1).
Because of the first property, every element in the array occupies the same amount of space. Combining the first and second properties yields the third.
Question: why is looking up an element by position faster in an array than in a linked list?
An array occupies one contiguous block of memory that starts at a base address. To read the (k+1)-th element, its address is computed directly as [base address + k * element size], so the data is reached with a single addressing operation.
To read the (k+1)-th element of a linked list, you generally start at the head node and follow the next pointer k times. This requires k addressing operations.
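The difference in cost can be sketched in a few lines. The Node class, the method names and the address values below are assumptions made only for this illustration; the JVM does not expose real memory addresses to Java code.

```java
class AccessCostDemo {
    // Hypothetical singly linked node, defined only for this illustration.
    static final class Node {
        final int value;
        Node next;
        Node(int value) { this.value = value; }
    }

    // Address arithmetic from the text: base + k * elementSize (made-up numbers).
    static long elementAddress(long base, int k, int elementSize) {
        return base + (long) k * elementSize;
    }

    // Builds a linked list holding 0, 1, ..., n-1 and returns its head.
    static Node build(int n) {
        Node head = null;
        for (int i = n - 1; i >= 0; i--) {
            Node node = new Node(i);
            node.next = head;
            head = node;
        }
        return head;
    }

    // Walks from the head until `target` is found and returns the number of hops taken.
    static int hopsToValue(Node head, int target) {
        int hops = 0;
        Node cur = head;
        while (cur != null && cur.value != target) {
            cur = cur.next;
            hops++;
        }
        return hops;
    }

    public static void main(String[] args) {
        int[] arr = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        System.out.println(arr[7]);                     // one indexed access
        System.out.println(elementAddress(1000, 3, 4)); // 1012
        System.out.println(hopsToValue(build(10), 7));  // 7 pointer hops
    }
}
```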
4. ArrayList source code analysis
4.1 inheritance structure
4.2 type structure
public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable {
    // Used during deserialization to verify that the sender's and the receiver's class versions are consistent
    private static final long serialVersionUID = 8683452581122892189L;
    // Default initial capacity of the array
    private static final int DEFAULT_CAPACITY = 10;
    // The number of elements currently contained in the list
    private int size;
    // Array that stores the elements
    transient Object[] elementData;
    // Shared empty array instance for empty instances
    private static final Object[] EMPTY_ELEMENTDATA = {};
    // Shared empty array instance for empty instances of default size
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
    // Collection version number (declared in AbstractList). It is incremented every time elements are added or deleted
    protected transient int modCount = 0;
    ...
}
- The ArrayList class extends the AbstractList abstract class and implements the List interface, so an ArrayList instance supports the basic positional operations such as add, remove, set and get.
- The ArrayList class implements the RandomAccess marker interface, which signals that an ArrayList instance supports fast random access.
- The ArrayList class implements the Cloneable marker interface, so an ArrayList instance can be cloned.
- The ArrayList class implements the Serializable marker interface, so an ArrayList instance supports serialization and can, for example, be transmitted over a network.
- The elementData member variable is declared transient, so default serialization skips it. In actual use elementData is often not full, and only the part that holds data needs to be serialized. ArrayList therefore defines its own writeObject and readObject methods to control how elementData is serialized.
- The modCount member variable is used to trigger the fail-fast mechanism, which is described in detail below.
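The effect of the custom serialization can be checked with a small round trip. This is a sketch using only standard java.io classes; the class name SerializationRoundTrip and its helper method are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.RandomAccess;

class SerializationRoundTrip {
    // Serializes the list to a byte array and reads it back as a new list.
    @SuppressWarnings("unchecked")
    static ArrayList<String> roundTrip(ArrayList<String> list) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(list); // invokes ArrayList's own writeObject
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ArrayList<String>) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> original = new ArrayList<>(Arrays.asList("a", "b", "c"));
        ArrayList<String> copy = roundTrip(original);
        System.out.println(copy.equals(original));            // true: the elements survive
        System.out.println(copy != original);                 // true: it is a different object
        System.out.println(original instanceof RandomAccess); // true: marker interface
    }
}
```

Although elementData is transient, the elements still arrive on the other side, because they are written one by one by the custom writeObject method.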
4.3 initialization
4.3.1 parameterless constructor
public ArrayList() {
    // Assign the shared empty array DEFAULTCAPACITY_EMPTY_ELEMENTDATA
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}
The shared empty array DEFAULTCAPACITY_EMPTY_ELEMENTDATA is assigned to elementData.
4.3.2 parameterized int constructor
public ArrayList(int initialCapacity) {
    // Check whether the initial capacity is positive, zero or negative
    if (initialCapacity > 0) {
        // Create an array of the specified capacity
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        // Assign the shared empty array EMPTY_ELEMENTDATA
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: "+ initialCapacity);
    }
}
When the initial capacity is greater than 0, an array with the specified capacity is created and assigned to elementData.
When the initial capacity is 0, the shared empty array EMPTY_ELEMENTDATA is assigned to elementData.
When the initial capacity is negative, an IllegalArgumentException is thrown.
4.3.3 parameterized Collection constructor
// Other collections can be converted to ArrayList
public ArrayList(Collection<? extends E> c) {
    // The collection is converted into an array through the toArray method and assigned to elementData
    elementData = c.toArray();
    if ((size = elementData.length) != 0) {
        // The array returned by c.toArray may not be of type Object[] (this is a bug, which has been fixed by jdk9)
        if (elementData.getClass() != Object[].class)
            // Create an Object[] and copy the contents of the original elementData into it
            elementData = Arrays.copyOf(elementData, size, Object[].class);
    } else {
        // Assign the shared empty array EMPTY_ELEMENTDATA
        this.elementData = EMPTY_ELEMENTDATA;
    }
}
When the Collection is not empty, it is converted into an array that is assigned to elementData (after being copied into an Object[] if toArray returned another array type).
When the Collection is empty, the shared empty array EMPTY_ELEMENTDATA is assigned to elementData.
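The visible behavior of the constructors can be sketched as follows. The class name ConstructorDemo and its helper methods are invented for the example; the internal arrays cannot be observed directly, so only the public effects are shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class ConstructorDemo {
    // Returns true if a negative initial capacity is rejected.
    static boolean rejectsNegativeCapacity() {
        try {
            new ArrayList<String>(-1);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    // Builds a new list from another collection via the Collection constructor.
    static List<String> copyOf(Collection<String> source) {
        return new ArrayList<>(source);
    }

    public static void main(String[] args) {
        System.out.println(rejectsNegativeCapacity()); // true

        List<String> source = new ArrayList<>(Arrays.asList("a", "b"));
        List<String> copy = copyOf(source);
        source.add("c"); // the copy has its own backing array and is not affected
        System.out.println(copy); // [a, b]
    }
}
```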
4.4 adding elements
public boolean add(E e) {
    // Check whether elementData needs to be expanded. After the add, the minimum capacity required is size + 1
    ensureCapacityInternal(size + 1);
    // Store the element
    elementData[size++] = e;
    return true;
}

private void ensureCapacityInternal(int minCapacity) {
    // Calculate the minimum capacity first, then expand if it is greater than the current capacity
    ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
}

// Calculate the minimum capacity of elementData
private static int calculateCapacity(Object[] elementData, int minCapacity) {
    // A list created by the parameterless constructor applies the default capacity:
    // return the larger of the minimum capacity and the default capacity
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        // DEFAULT_CAPACITY = 10
        return Math.max(DEFAULT_CAPACITY, minCapacity);
    }
    // A list created by a parameterized constructor returns the minimum capacity directly
    return minCapacity;
}

// Confirm the capacity of elementData
private void ensureExplicitCapacity(int minCapacity) {
    // Collection version number + 1
    modCount++;
    if (minCapacity - elementData.length > 0)
        // Capacity expansion
        grow(minCapacity);
}

private void grow(int minCapacity) {
    int oldCapacity = elementData.length;
    // General capacity expansion rule: newCapacity = 1.5 * oldCapacity
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    // If the capacity after normal expansion is still smaller than the minimum capacity, use the minimum capacity
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    // Check whether the capacity after expansion exceeds the maximum capacity
    // MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        // hugeCapacity returns Integer.MAX_VALUE if minCapacity exceeds MAX_ARRAY_SIZE, otherwise MAX_ARRAY_SIZE
        newCapacity = hugeCapacity(minCapacity);
    // Perform the expansion: create an array with capacity newCapacity and copy all elements of elementData into it
    elementData = Arrays.copyOf(elementData, newCapacity);
}
Main process:
- Before each add, the minimum capacity is calculated case by case.
  - If the list was created with the parameterless constructor and elementData is still the shared default empty array (that is, on the first add), the larger of 10 and size + 1 is the minimum capacity.
  - Otherwise, size + 1 is the minimum capacity.
- The calculated minimum capacity then decides whether to expand.
  - If the minimum capacity is greater than the current capacity, elementData is expanded first and the element is added afterwards.
  - If the minimum capacity is less than or equal to the current capacity, the element is added directly.
Note:
- The capacity is elementData.length.
- The elementData of an ArrayList instance created by the parameterless constructor is an empty array with capacity 0 before any element is added. When the first element is added, it is expanded to an array with capacity 10.
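The growth rule can be replayed with plain integers to see which capacities a default-constructed list passes through. This is a simplified model, not the JDK code: the class and method names are invented, and the MAX_ARRAY_SIZE overflow handling is left out.

```java
import java.util.ArrayList;
import java.util.List;

class GrowthModel {
    static final int DEFAULT_CAPACITY = 10;

    // Returns every capacity reached while `adds` elements are appended one at a time
    // to a list created with the parameterless constructor.
    static List<Integer> capacities(int adds) {
        List<Integer> reached = new ArrayList<>();
        int capacity = 0; // the shared default empty array has length 0
        for (int size = 0; size < adds; size++) {
            int minCapacity = size + 1;
            if (capacity == 0) {
                // models the check elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
                minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
            }
            if (minCapacity > capacity) {
                int newCapacity = capacity + (capacity >> 1); // 1.5x, integer arithmetic
                if (newCapacity < minCapacity) {
                    newCapacity = minCapacity;
                }
                capacity = newCapacity;
                reached.add(capacity);
            }
        }
        return reached;
    }

    public static void main(String[] args) {
        System.out.println(capacities(50)); // [10, 15, 22, 33, 49, 73]
    }
}
```

Because the shift discards the remainder, the factor is exactly 1.5 only for even capacities; 15 grows to 22, not 22.5.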
4.5 removing elements
The ArrayList class provides three ways to remove elements: removal by subscript, removal by element, and removal through an iterator.
Removal by subscript is essentially the same as removal by element. All elements after the target element are moved forward by one position, which overwrites the target element and thereby deletes it.
Removal through an iterator is explained separately below.
4.5.1 remove according to subscript
// Remove element by subscript
public E remove(int index) {
    // Throw an IndexOutOfBoundsException if index is greater than or equal to size
    rangeCheck(index);
    // Collection version number + 1
    modCount++;
    // Read elementData[index]
    E oldValue = elementData(index);
    // Number of elements to be moved
    int numMoved = size - index - 1;
    if (numMoved > 0)
        // Move all elements after the target element forward by one position
        System.arraycopy(elementData, index+1, elementData, index, numMoved);
    // Clear the last occupied slot of elementData and decrement size
    elementData[--size] = null;
    return oldValue;
}

// @SuppressWarnings("unchecked") tells the compiler to ignore unchecked warnings
@SuppressWarnings("unchecked")
E elementData(int index) {
    return (E) elementData[index];
}
Main process:
- Check whether the subscript is out of bounds. If it is, an IndexOutOfBoundsException is thrown.
- Determine the number of elements to move.
- Move all elements after the target element forward by one position, then clear the last slot and decrement size.
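The shift itself can be tried on a plain array. The sketch below mirrors the steps above outside of ArrayList; the class name ShiftRemoveDemo and the method removeAt are invented for the example.

```java
import java.util.Arrays;

class ShiftRemoveDemo {
    // Removes data[index] from the first `size` slots and returns the new size.
    static int removeAt(Object[] data, int size, int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            // Shift the tail one position to the left, overwriting the target
            System.arraycopy(data, index + 1, data, index, numMoved);
        }
        data[--size] = null; // clear the now unused last slot
        return size;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "c", "d", null};
        int size = removeAt(data, 4, 1);
        System.out.println(size);                  // 3
        System.out.println(Arrays.toString(data)); // [a, c, d, null, null]
    }
}
```

Clearing the last slot matters for object arrays: without it, the array would keep a stale reference and prevent the removed object from being garbage collected.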
4.5.2 remove according to element
public boolean remove(Object o) {
    // Traverse the array, match elements one by one, and delete the first match
    if (o == null) {
        for (int index = 0; index < size; index++)
            if (elementData[index] == null) {
                fastRemove(index);
                return true;
            }
    } else {
        for (int index = 0; index < size; index++)
            if (o.equals(elementData[index])) {
                fastRemove(index);
                return true;
            }
    }
    return false;
}

private void fastRemove(int index) {
    // Collection version number + 1
    modCount++;
    // Calculate the number of elements that need to be moved
    int numMoved = size - index - 1;
    if (numMoved > 0)
        // Move all elements after the target element forward by one position
        System.arraycopy(elementData, index+1, elementData, index, numMoved);
    // Clear the last occupied slot of elementData and decrement size
    elementData[--size] = null;
}
Main process:
- Traverse the array by subscript, matching element by element (== for null, equals otherwise).
- On the first match, take the subscript of that element as the removal target.
- Determine the number of elements to move.
- Move all elements after the target element forward by one position.
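The two overloads are easy to confuse on a list of Integer values, because an int argument selects remove(int index) while an Integer argument selects remove(Object o). A small sketch (the class and method names are invented for the example):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveOverloadDemo {
    // Calls remove(int index): deletes the element at the given position.
    static List<Integer> removeByIndex(List<Integer> source, int index) {
        List<Integer> copy = new ArrayList<>(source);
        copy.remove(index);
        return copy;
    }

    // Calls remove(Object o): deletes the first element equal to the given value.
    static List<Integer> removeByValue(List<Integer> source, int value) {
        List<Integer> copy = new ArrayList<>(source);
        copy.remove(Integer.valueOf(value));
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(1, 2, 3);
        System.out.println(removeByIndex(source, 1)); // [1, 3]
        System.out.println(removeByValue(source, 1)); // [2, 3]
        // Only the first match is removed
        System.out.println(removeByValue(Arrays.asList(1, 2, 1), 1)); // [2, 1]
    }
}
```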
4.6 traversing elements
ArrayList can be traversed in three ways: iterator traversal, for loop traversal, and enhanced for loop traversal.
Underneath, however, there are only two real traversal schemes: iterator traversal and array traversal.
- Iterator traversal: iterator traversal and enhanced for loop traversal.
- Array traversal: for loop traversal.
Iterator traversal relies on ArrayList's own iterator implementation, which is given below.
Array traversal relies on the backing array of ArrayList. Because the array has a base address and is contiguous in memory, its elements can be accessed in a loop by subscript.
// Iterator traversal
Iterator<Integer> iterator = arr.iterator();
while (iterator.hasNext()) {
    System.out.println(iterator.next());
}

// for loop traversal (not recommended)
for (int i = 0; i < arr.size(); i++) {
    System.out.println(arr.get(i));
}

// Enhanced for loop traversal
for (Integer cur : arr) {
    System.out.println(cur);
}

// Decompiled code for enhanced for loop traversal
Iterator var = arr.iterator();
while (var.hasNext()) {
    Integer cur = (Integer) var.next();
    System.out.println(cur);
}
Traversing an ArrayList by subscript is not recommended as a general habit. It works well here only because we know that ArrayList maintains an array internally, so get(i) is cheap. Such code is tightly coupled to the internal structure of the collection, and the access logic cannot be separated from the collection class and the client code. Different collections then require different traversal code (the same loop on a LinkedList has to walk the list on every get(i)), so the client code cannot be reused. Iterators were introduced to solve this problem.
4.7 iterators
4.7.1 iterator pattern
Java provides many kinds of collections, and the internal structure of each collection is different. For example, the bottom layer of ArrayList maintains an array, the bottom layer of LinkedList maintains a linked list, and the bottom layer of HashSet maintains a hash table. Because the internal structure of the container is different, you often don't know how to traverse a collection, so Java extracts the access logic from different types of collections and abstracts it into the iterator pattern.
Iterator pattern: provides a way to access individual elements in a container object without exposing the internal details of the object container.
4.7.2 type structure
ArrayList implements its iterator by defining a private inner class Itr in the ArrayList class and exposing a member method that creates iterators.
public Iterator<E> iterator() {
    return new Itr();
}

private class Itr implements Iterator<E> {
    // Index of the next element to return
    int cursor;
    // Index of the last element returned; -1 if there is none
    int lastRet = -1;
    // Iterator version number, initialized to the collection version number when the iterator is created
    int expectedModCount = modCount;
    ...
}
- The Itr class implements the Iterator interface, which defines the basic contract for iterating over a Collection.
- The difference between the Iterator interface and the Iterable interface is:
  - An Iterator is the object that actually traverses the Collection. It holds the traversal state, such as the current position.
  - An Iterable produces Iterators: its iterator() method returns a new Iterator on each call. A collection that implements Iterable can also offer further iterators through additional methods. For example, LinkedList provides listIterator and descendingIterator.
  - Only objects that implement the Iterable interface (and arrays) can be the target of an enhanced for loop.
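Source code of the two interfaces (simplified; since Java 8, Iterator.remove is a default method that throws UnsupportedOperationException):
public interface Iterable<T> {
    Iterator<T> iterator();
}

public interface Iterator<E> {
    boolean hasNext();
    E next();
    void remove();
}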
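To see how the two interfaces cooperate, here is a minimal sketch of a custom class that can be used in an enhanced for loop. The Countdown class is invented for this example and is not part of the JDK.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// A hypothetical Iterable that yields start, start - 1, ..., 1.
class Countdown implements Iterable<Integer> {
    private final int start;

    Countdown(int start) {
        this.start = start;
    }

    @Override
    public Iterator<Integer> iterator() {
        // Each call returns a fresh Iterator with its own position
        return new Iterator<Integer>() {
            private int next = start;

            @Override
            public boolean hasNext() {
                return next > 0;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return next--;
            }
        };
    }

    // Sums the values using an enhanced for loop, which calls iterator() behind the scenes.
    static int sum(int start) {
        int total = 0;
        for (int value : new Countdown(start)) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(4)); // 10
    }
}
```

The client code never learns how Countdown stores or computes its values, which is the point of the iterator pattern.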
4.7.3 iterator traversal
The ArrayList iterator uses the hasNext method and the next method together with the while loop to traverse the elements.
// Determine whether a next element exists
public boolean hasNext() {
    return cursor != size;
}

// Get the next element
@SuppressWarnings("unchecked")
public E next() {
    // Check whether the iterator version number equals the collection version number. If not, throw an exception
    checkForComodification();
    // The subscript of the next element is assigned to i
    int i = cursor;
    // If i is beyond the data area, throw an exception
    if (i >= size)
        throw new NoSuchElementException();
    Object[] elementData = ArrayList.this.elementData;
    // If i is beyond the array boundary, the array was modified in the meantime; throw an exception
    if (i >= elementData.length)
        throw new ConcurrentModificationException();
    // The cursor points to the next element
    cursor = i + 1;
    // Record i in lastRet and return the element at that subscript
    return (E) elementData[lastRet = i];
}

// Check that the iterator version number and the collection version number are equal
final void checkForComodification() {
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
}
Main process:
- Check whether the version numbers match. If not, a ConcurrentModificationException is thrown.
- Check whether the subscript held by cursor is beyond the data area (>= size). If so, a NoSuchElementException is thrown.
- Check whether the subscript held by cursor is beyond the array boundary (>= elementData.length). If so, a ConcurrentModificationException is thrown.
- cursor advances to the next element.
- lastRet is set to the subscript that cursor held before advancing.
- The element at lastRet is returned.
- When cursor equals size, hasNext returns false and the traversal ends.
4.7.4 iterator removal
It is recommended to use the remove method provided by the iterator to remove elements while traversing the ArrayList with the iterator.
public void remove() {
    // lastRet < 0 means next() has not been called yet, or the last returned element was already removed
    if (lastRet < 0)
        throw new IllegalStateException();
    // Check whether the iterator version number equals the collection version number. If not, throw an exception
    checkForComodification();
    try {
        // Use the remove method of the ArrayList class to remove the element
        ArrayList.this.remove(lastRet);
        // cursor steps back to the subscript of the removed element
        cursor = lastRet;
        // Reset lastRet
        lastRet = -1;
        // Synchronize the iterator version number
        expectedModCount = modCount;
    } catch (IndexOutOfBoundsException ex) {
        throw new ConcurrentModificationException();
    }
}

// Check that the iterator version number and the collection version number are equal
final void checkForComodification() {
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
}
Main process:
- Check whether there is a last returned element to remove (lastRet >= 0). If not, an IllegalStateException is thrown.
- Check whether the version numbers match. If not, a ConcurrentModificationException is thrown.
- Use the remove method defined by the ArrayList class to remove the element at lastRet, then move cursor back to lastRet and reset lastRet to -1.
- Synchronize the iterator version number with the collection version number.
4.7.5 fail-fast mechanism
The fail-fast mechanism is an error detection mechanism in the Java collections. If the structure of a collection changes while it is being traversed with an iterator, fail-fast may be triggered, that is, a ConcurrentModificationException is thrown. The mechanism does not guarantee that an exception is thrown under unsynchronized concurrent modification. It only makes a best effort, so it should generally be used only to detect bugs.
Note that fail-fast can only be triggered while traversing the collection with an iterator (this includes the enhanced for loop).
ArrayList also implements the fail-fast mechanism. The ArrayList class uses modCount as the collection version number, and modCount is increased by 1 on every structural modification, such as adding or removing elements (replacing an element with set does not change it). The iterator class in ArrayList keeps expectedModCount as the iterator version number. When the iterator is created, the current modCount is assigned to expectedModCount. Each time the iterator moves to the next element, it compares modCount with expectedModCount. If they differ, it immediately throws a ConcurrentModificationException and the traversal stops.
For example, execute the following code to trigger the fail fast mechanism:
public static void main(String[] args) {
    ArrayList<String> arr = new ArrayList<>();
    arr.add("1");
    arr.add("2");
    arr.add("2");
    arr.add("3");
    Iterator<String> iterator = arr.iterator();
    while (iterator.hasNext()) {
        String cur = iterator.next();
        if ("2".equals(cur)) {
            // This should be changed to iterator.remove();
            arr.remove(cur);
        }
    }
}
Therefore, when you want to delete elements while traversing an ArrayList with an iterator, you must use the remove method provided by the iterator instead of the remove method provided by the ArrayList class.
If the remove method defined by the ArrayList class is called directly, the collection version number modCount is increased by 1 while the iterator version number stays the same. The checkForComodification method called in the next method then finds that the two version numbers do not match and throws an exception.
The remove method provided by the iterator also delegates to the remove method defined by the ArrayList class, but after removing the element it additionally synchronizes the iterator version number expectedModCount. In this way modCount always stays consistent with expectedModCount, and the traversal continues normally.
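Both behaviors can be verified side by side. The following is a minimal sketch on the same data as the example above; the class name FailFastDemo and its helper methods are invented for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class FailFastDemo {
    static List<String> sample() {
        return new ArrayList<>(Arrays.asList("1", "2", "2", "3"));
    }

    // Removes through the list while iterating; returns true if fail-fast was triggered.
    static boolean listRemoveFails() {
        List<String> arr = sample();
        try {
            Iterator<String> iterator = arr.iterator();
            while (iterator.hasNext()) {
                String cur = iterator.next();
                if ("2".equals(cur)) {
                    arr.remove(cur); // modCount changes, expectedModCount does not
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes through the iterator; the version numbers stay in sync.
    static List<String> iteratorRemove() {
        List<String> arr = sample();
        Iterator<String> iterator = arr.iterator();
        while (iterator.hasNext()) {
            if ("2".equals(iterator.next())) {
                iterator.remove();
            }
        }
        return arr;
    }

    public static void main(String[] args) {
        System.out.println(listRemoveFails()); // true
        System.out.println(iteratorRemove());  // [1, 3]
    }
}
```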
5. Interview questions
5.1 topic 1
Question: DEFAULTCAPACITY_EMPTY_ELEMENTDATA and EMPTY_ELEMENTDATA are both empty arrays. What is the difference between the two?
A: Both are shared empty arrays. Two separate instances exist so that the two cases can be told apart.
An empty list built by the parameterless constructor assigns DEFAULTCAPACITY_EMPTY_ELEMENTDATA to elementData, while an empty list built by a parameterized constructor assigns EMPTY_ELEMENTDATA to elementData.
The capacity expansion strategy differs slightly between ArrayLists created by different constructors. During capacity expansion, ArrayList checks whether elementData was created by the parameterless constructor or by a parameterized constructor, and selects the corresponding strategy.
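The trick is that the two arrays are compared by identity (==), not by content. A simplified model of that check is sketched below; the class SentinelDemo and its members are invented for the illustration and only imitate the first add.

```java
class SentinelDemo {
    // Two distinct empty arrays that serve as markers, as in ArrayList
    private static final Object[] EMPTY = {};
    private static final Object[] DEFAULT_EMPTY = {};
    static final int DEFAULT_CAPACITY = 10;

    static Object[] forNoArgConstructor() {
        return DEFAULT_EMPTY;
    }

    static Object[] forZeroCapacity() {
        return EMPTY;
    }

    // Capacity chosen when the first element is added to an empty backing array.
    static int capacityAfterFirstAdd(Object[] elementData) {
        int minCapacity = 1; // size + 1 with size == 0
        if (elementData == DEFAULT_EMPTY) {
            // Identity check: only the parameterless constructor uses this instance
            return Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        // 0 + (0 >> 1) is still 0, so the minimum capacity wins
        return minCapacity;
    }

    public static void main(String[] args) {
        System.out.println(capacityAfterFirstAdd(forNoArgConstructor())); // 10
        System.out.println(capacityAfterFirstAdd(forZeroCapacity()));     // 1
    }
}
```

Both arrays have length 0, so their contents cannot distinguish the cases; only the reference identity can.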
5.2 topic 2
Question: how is ArrayList expanded?
A: the capacity expansion strategy of ArrayList is:
- Created with the parameterless constructor: the capacity becomes 10 when the first element is added, and each later expansion is 1.5 times the original capacity (the general capacity expansion strategy).
- Created with a parameterized constructor: each expansion is 1.5 times the original capacity, or the minimum required capacity if that is larger.