An analysis of ArrayList and LinkedList

Posted by kpegram on Sat, 18 Dec 2021 20:22:41 +0100

Preface

ArrayList and LinkedList are two data structures that we use often. Their main methods are add, get, and remove, plus the grow method (capacity expansion), which we rarely notice. The grow method is the core of the dynamic array: it is why we can keep calling add without managing the length of the array ourselves.

ArrayList

ArrayList is backed by an array and supports random access, that is, an element can be fetched directly by its index

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
    // Initial capacity size
    private static final int DEFAULT_CAPACITY = 10;
    // An array of values. All add values are stored in elementData
    transient Object[] elementData; 
    // Record the size of the current array
    private int size;
}

add method

    public boolean add(E e) {
    	// Ensure sufficient array capacity
        ensureCapacityInternal(size + 1);
        // Put values into an array
        elementData[size++] = e;
        return true;
    }

The add method does two things:

  1. Ensure sufficient array capacity
  2. Put the value into the array elementData

How to ensure that the array capacity is sufficient?

private void ensureCapacityInternal(int minCapacity) {
        ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
    }

First, execute the calculateCapacity method to calculate the required capacity

    private static int calculateCapacity(Object[] elementData, int minCapacity) {
       // If elementData is still the default empty array
       if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
           // Return the larger of the default capacity and the minimum required capacity
           return Math.max(DEFAULT_CAPACITY, minCapacity);
       }
       // Otherwise, return the minimum required capacity unchanged
       return minCapacity;
   }

Once the required capacity is known, the ensureExplicitCapacity method decides whether the array must be expanded. If the minimum required capacity is greater than the current array length, it calls grow

   private void ensureExplicitCapacity(int minCapacity) {
       // Modification count: incremented on every structural change to the list (adding or removing elements)
       modCount++;
       // Expand if the required capacity exceeds the current array length
       if (minCapacity - elementData.length > 0)
           grow(minCapacity);
   }

get method

    public E get(int index) {
    	// Check whether the array subscript is out of bounds
       rangeCheck(index);
       // Returns the value of the corresponding subscript in the elementData array
       return elementData(index);
   }

The get method does two things:

  1. Check whether the index is out of bounds; if it is, throw an IndexOutOfBoundsException
  2. Return the element at that index in the elementData array
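
One detail worth seeing in practice: the bounds check compares the index against size, not against the capacity of the underlying array. The sketch below uses the real java.util.ArrayList; the class name GetBoundsDemo and the helper getThrows are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class GetBoundsDemo {
    // Returns true if list.get(index) throws IndexOutOfBoundsException.
    static boolean getThrows(List<String> list, int index) {
        try {
            list.get(index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Capacity for 10 elements is reserved, but size is still 0.
        List<String> list = new ArrayList<>(10);
        System.out.println(getThrows(list, 0)); // true: index 0 is not below size 0

        list.add("a");
        System.out.println(getThrows(list, 0)); // false: index 0 is now valid
        System.out.println(getThrows(list, 1)); // true: index 1 equals size
    }
}
```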

grow method

The grow method is called from the add path, at the point where add checks whether the array capacity is sufficient. If the current array length is less than the required capacity, grow is called

private void grow(int minCapacity) {
       // Calculate old array length
       int oldCapacity = elementData.length;
       // Calculate new array length
       int newCapacity = oldCapacity + (oldCapacity >> 1);
       if (newCapacity - minCapacity < 0)
           newCapacity = minCapacity;
       // Determine whether the length of the new array is within the maximum array length
       if (newCapacity - MAX_ARRAY_SIZE > 0)
           newCapacity = hugeCapacity(minCapacity);
       // Use the Arrays.copyOf method provided by the JDK to get an array of the new length
       elementData = Arrays.copyOf(elementData, newCapacity);
   }

The grow method does four things:

  1. Calculate old array length
  2. Calculate new array length
  3. Determine whether the maximum array length is exceeded
  4. Assign a value to elementData with the new array
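
To see these steps work together, here is a minimal sketch of a dynamic array. It is a simplified illustration, not the JDK implementation: the class SimpleDynamicArray and its capacity() accessor are hypothetical, the initial array is allocated eagerly, and the overflow and MAX_ARRAY_SIZE handling is left out.

```java
import java.util.Arrays;

// Hypothetical, simplified dynamic array for illustration only.
class SimpleDynamicArray {
    private Object[] elementData = new Object[10]; // eager allocation, unlike the JDK
    private int size;

    void add(Object e) {
        // Ensure capacity for one more element, then store it.
        if (size + 1 > elementData.length) {
            grow(size + 1);
        }
        elementData[size++] = e;
    }

    private void grow(int minCapacity) {
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1); // about 1.5x
        if (newCapacity < minCapacity) {
            newCapacity = minCapacity;
        }
        // Overflow and MAX_ARRAY_SIZE checks are omitted in this sketch.
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

    Object get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", Size: " + size);
        }
        return elementData[index];
    }

    int size() { return size; }

    // Exposed only so the growth can be observed.
    int capacity() { return elementData.length; }
}
```

Adding an eleventh element to this sketch triggers one grow call and leaves the capacity at 15; the caller never deals with the array length.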

Calculating the new array length
When calculating the length of the new array, two values are compared and the larger one is chosen: oldCapacity + (oldCapacity >> 1) and minCapacity. The former is 1.5 times the length of the original array (rounded down). The latter is the minimum required capacity that was passed in, which for a single add on a full array is elementData.length + 1. The result is therefore usually 1.5 times the length of the original array
When is elementData given an array of exactly minCapacity length?
For example, suppose a list holds 10 elements in an array of capacity 10, and addAll is called with another ArrayList of 15 elements. To fit all 15 elements, the array is expanded to minCapacity = 10 + 15 = 25 instead of 10 + (10 >> 1) = 15
Why is there a maximum array length?
The maximum array length is Integer.MAX_VALUE - 8 (MAX_ARRAY_SIZE). Some virtual machines reserve header words in an array, so attempting to allocate a larger array may result in an OutOfMemoryError
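
The arithmetic can be checked with a small helper. This is a sketch under assumptions: GrowthMath and newCapacity are hypothetical names, and the helper mirrors only the comparison in grow, without the MAX_ARRAY_SIZE branch or overflow handling.

```java
class GrowthMath {
    // Picks the larger of the 1.5x candidate and the minimum required capacity.
    static int newCapacity(int oldCapacity, int minCapacity) {
        int candidate = oldCapacity + (oldCapacity >> 1);
        return Math.max(candidate, minCapacity);
    }

    public static void main(String[] args) {
        // Single add on a full array of 10: minCapacity is 11, the 1.5x candidate wins.
        System.out.println(newCapacity(10, 11)); // 15

        // Adding 15 elements at once to a full array of 10: minCapacity is 25 and wins.
        System.out.println(newCapacity(10, 25)); // 25

        // Repeated single adds from capacity 10 give 15, 22, 33, 49.
        int capacity = 10;
        for (int i = 0; i < 4; i++) {
            capacity = newCapacity(capacity, capacity + 1);
            System.out.println(capacity);
        }
    }
}
```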

remove method

The remove method deletes an element on a specified subscript in an ArrayList

public E remove(int index) {
        rangeCheck(index);
        modCount++;
        E oldValue = elementData(index);
        int numMoved = size - index - 1;
        if (numMoved > 0)
            System.arraycopy(elementData, index+1, elementData, index,
            numMoved);
        elementData[--size] = null;
        return oldValue;
    }

remove does several things:

  1. Judge whether the subscript is out of bounds
  2. Increment the modification count
  3. Save the element at that index so it can be returned
  4. Calculate how many elements need to be moved forward
  5. Use System.arraycopy to shift the elements starting at index + 1 forward by one position
  6. Set the now-unused last slot of elementData to null so that the GC can collect the object

The System.arraycopy shift is what makes deletion in ArrayList expensive: every element after index has to move forward by one position, so removing near the front of a large list costs O(n) time.
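
The shift can be tried on a plain array. In this sketch, ArrayShiftDemo and removeAt are hypothetical names; the helper follows the same steps as remove but skips the bounds check and modCount, and takes the size as a parameter because there is no enclosing list object.

```java
import java.util.Arrays;

class ArrayShiftDemo {
    // Removes the element at index from the first `size` slots of data, in place.
    // Returns the new size. No bounds check is performed in this sketch.
    static int removeAt(Object[] data, int size, int index) {
        int numMoved = size - index - 1;
        if (numMoved > 0) {
            // Shift everything after index forward by one position.
            System.arraycopy(data, index + 1, data, index, numMoved);
        }
        data[--size] = null; // clear the now-unused last slot
        return size;
    }

    public static void main(String[] args) {
        Object[] data = {"a", "b", "c", "d", null};
        int size = removeAt(data, 4, 1);
        System.out.println(size + " " + Arrays.toString(data)); // 3 [a, c, d, null, null]
    }
}
```

Removing the last element moves nothing (numMoved is 0), which is why removal at the end of an ArrayList is cheap while removal at the front is not.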

LinkedList

LinkedList is implemented as a linked list. Each element is wrapped in a Node, which holds the element's value (item), a next reference to the following Node, and a prev reference to the preceding Node. A singly linked list keeps only the next reference; the Node of LinkedList also keeps a prev reference, which makes it a doubly linked list

private static class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E element, Node<E> next) {
            this.item = element;
            this.next = next;
            this.prev = prev;
        }
    }

LinkedList:

public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable, java.io.Serializable
{
    transient int size = 0;
    transient Node<E> first;
    transient Node<E> last;
}

You can see that LinkedList itself stores only three fields: first (the head reference), last (the tail reference), and size (the length)

add method

By default, add appends to the tail of the LinkedList; addFirst inserts directly at the head. Because LinkedList is implemented as a linked list, structural modifications at either end are very fast: only a few references need to change

    void linkLast(E e) {
        final Node<E> l = last;
        final Node<E> newNode = new Node<>(l, e, null);
        last = newNode;
        if (l == null)
            first = newNode;
        else
            l.next = newNode;
        size++;
        modCount++;
    }

The add operation does several things:

  1. Get tail reference
  2. Create a new Node object with the e object passed in
  3. Assign the new Node object to last
  4. Determine whether the original tail reference is empty
    4.1 if it is null, the list was empty, so first is also set to the new Node
    4.2 if it is not null, the next of the old tail is set to the new Node
  5. Increment size and modCount
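
The steps above can be sketched in a tiny doubly linked list. TinyLinkedList is a hypothetical class written for illustration, with modCount left out. Its linkLast follows the steps listed here, and linkFirst is the mirror image for insertion at the head.

```java
// Hypothetical minimal doubly linked list; not the JDK class.
class TinyLinkedList<E> {
    static final class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E item, Node<E> next) {
            this.prev = prev;
            this.item = item;
            this.next = next;
        }
    }

    Node<E> first;
    Node<E> last;
    int size;

    // Append at the tail.
    void linkLast(E e) {
        Node<E> l = last;
        Node<E> newNode = new Node<>(l, e, null);
        last = newNode;
        if (l == null) {
            first = newNode;   // list was empty: new node is also the head
        } else {
            l.next = newNode;  // old tail now points at the new node
        }
        size++;
    }

    // Insert at the head: same idea with the roles of first and last swapped.
    void linkFirst(E e) {
        Node<E> f = first;
        Node<E> newNode = new Node<>(null, e, f);
        first = newNode;
        if (f == null) {
            last = newNode;    // list was empty: new node is also the tail
        } else {
            f.prev = newNode;  // old head now points back at the new node
        }
        size++;
    }
}
```

Neither method walks the list, so both run in constant time regardless of how many elements are stored.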

get method

LinkedList provides three different get methods: get(int index), getFirst(), and getLast()
getFirst and getLast both run in O(1) time, because LinkedList keeps the first and last references and can return their items directly.

    public E getFirst() {
        final Node<E> f = first;
        if (f == null)
            throw new NoSuchElementException();
        return f.item;
    }

Access by index takes O(n) time. Because LinkedList is implemented as a linked list, it cannot compute an address from the index the way an array can. It has to walk the nodes one at a time through Node.next (or Node.prev).

    public E get(int index) {
        checkElementIndex(index);
        return node(index).item;
    }
    Node<E> node(int index) {
        if (index < (size >> 1)) {
            Node<E> x = first;
            for (int i = 0; i < index; i++)
                x = x.next;
            return x;
        } else {
            Node<E> x = last;
            for (int i = size - 1; i > index; i--)
                x = x.prev;
            return x;
        }
    }

Taken together, the two methods show that LinkedList optimizes lookup by index. If the index is less than size/2, the node is in the first half of the list, so the search walks forward from the head node; otherwise it walks backward from the tail node
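
The effect of this optimization can be expressed as a count of link traversals. The helper below is a hypothetical sketch (NodeLookupCost and hops are not JDK names) that mirrors the two loops in node(index) and returns how many next or prev steps each one takes.

```java
class NodeLookupCost {
    // Number of link traversals node(index) performs in a list of the given size.
    static int hops(int index, int size) {
        if (index < (size >> 1)) {
            return index;            // walk forward from first
        } else {
            return size - 1 - index; // walk backward from last
        }
    }

    public static void main(String[] args) {
        int size = 10;
        for (int index = 0; index < size; index++) {
            System.out.println("index " + index + ": " + hops(index, size) + " hops");
        }
    }
}
```

For a list of 10 elements the count never exceeds 4, so the worst case is about size/2 steps. That halves the work but is still O(n).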

grow method

LinkedList has no grow method, because it is implemented as a linked list rather than an array. An array needs one contiguous block of memory, while a linked list creates Node objects that do not need to be physically contiguous and do not need to be allocated in advance. A Node is allocated only when an element is added, and is then linked into the list

remove method

Most of the time in the remove operation of LinkedList goes to finding the node to delete, because LinkedList does not support random access and has to traverse node by node. Once the node at the specified index is found, the unlink method is called

    E unlink(Node<E> x) {
        // assert x != null;
        final E element = x.item;
        final Node<E> next = x.next;
        final Node<E> prev = x.prev;

        if (prev == null) {
            first = next;
        } else {
            prev.next = next;
            x.prev = null;
        }

        if (next == null) {
            last = prev;
        } else {
            next.prev = prev;
            x.next = null;
        }

        x.item = null;
        size--;
        modCount++;
        return element;
    }

In the remove method, there are two special cases:

  1. The node to be deleted is the head node
  2. The node to be deleted is the tail node

If the head node is deleted, first moves to the next node
If the tail node is deleted, last moves to the previous node
Otherwise, the next of the prev node is pointed at next, and the prev of the next node is pointed at prev
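
These cases can be observed from the outside with the real java.util.LinkedList. The class name UnlinkCasesDemo and the helper sample are hypothetical; the comments describe what unlink does internally in each case, which the printed result only reflects indirectly.

```java
import java.util.Arrays;
import java.util.LinkedList;

class UnlinkCasesDemo {
    static LinkedList<String> sample() {
        return new LinkedList<>(Arrays.asList("a", "b", "c", "d"));
    }

    public static void main(String[] args) {
        LinkedList<String> head = sample();
        head.remove(0);                       // head node: first moves to "b"
        System.out.println(head);             // [b, c, d]

        LinkedList<String> tail = sample();
        tail.remove(tail.size() - 1);         // tail node: last moves to "c"
        System.out.println(tail);             // [a, b, c]

        LinkedList<String> middle = sample();
        middle.remove(1);                     // interior node: "a" and "c" are linked together
        System.out.println(middle);           // [a, c, d]

        LinkedList<String> single = new LinkedList<>(Arrays.asList("x"));
        single.remove(0);                     // both head and tail: the list becomes empty
        System.out.println(single.isEmpty()); // true
    }
}
```

The last case shows that the two special cases are not exclusive: in a one-element list the removed node is both the head and the tail, and both branches of unlink apply.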

The difference between ArrayList and LinkedList

  • ArrayList is backed by an array, so a contiguous block of memory has to be allocated in advance; LinkedList is backed by a linked list, so no space needs to be allocated in advance
  • Both ArrayList and LinkedList can grow dynamically, but in different ways; ArrayList uses the grow method to keep an array large enough to hold its data, while LinkedList needs no grow method. It only allocates a Node when an element is stored and links that Node into the list
  • ArrayList supports random access, so elements can be read directly by index; LinkedList does not support random access and has to traverse node by node
  • When deleting from an ArrayList, every element after the deleted one has to move forward by one position; when deleting from a LinkedList, only the next and prev references of the neighbouring nodes have to change
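
The random access difference is visible in the type system: ArrayList implements the RandomAccess marker interface (see its class declaration earlier), while LinkedList does not. The sketch below uses that marker to choose how to traverse a list; AccessStyleDemo and sum are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class AccessStyleDemo {
    // Sums by index when get(i) is cheap, and by iterator otherwise.
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            for (int i = 0; i < list.size(); i++) {
                total += list.get(i); // O(1) per call on an ArrayList
            }
        } else {
            for (int value : list) {  // the iterator follows next references
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3);
        System.out.println(sum(new ArrayList<>(values)));  // 6
        System.out.println(sum(new LinkedList<>(values))); // 6
    }
}
```

Calling get(i) in a loop over a LinkedList would repeat the node(index) walk for every element, so iterating is the better fit there.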

Topics: Java data structure