
A Brief Look at the ArrayList Source Code

Source: original, 2024/5/7 10:48:36

Class hierarchy

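ArrayList extends AbstractList and implements the List, RandomAccess, Cloneable and Serializable interfaces. List is a subinterface of Collection. RandomAccess is a marker interface indicating that ArrayList supports fast random access. Cloneable is also a marker interface; it is what allows ArrayList's overridden clone method, which makes a shallow copy, to work. Serializable indicates that an ArrayList can be serialized and deserialized.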

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
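To make these interfaces concrete, here is a minimal sketch that checks the marker interfaces at runtime and shows what "shallow copy" means for clone. The class name ArrayListTraits and its helper methods are made up for illustration; only the java.util and java.io types come from the JDK.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.RandomAccess;

class ArrayListTraits {
    /** True if the list advertises fast positional access via the marker interface. */
    static boolean isRandomAccess(List<?> list) {
        return list instanceof RandomAccess;
    }

    /** Clones the list: the copy gets its own backing array but shares the element objects. */
    @SuppressWarnings("unchecked")
    static <E> ArrayList<E> shallowCopy(ArrayList<E> list) {
        return (ArrayList<E>) list.clone();
    }

    public static void main(String[] args) {
        ArrayList<StringBuilder> original = new ArrayList<>();
        original.add(new StringBuilder("a"));

        ArrayList<StringBuilder> copy = shallowCopy(original);
        copy.add(new StringBuilder("b"));   // structural change: only the copy grows
        copy.get(0).append("!");            // element mutation: visible through both lists

        System.out.println(isRandomAccess(original));             // true
        System.out.println(original instanceof Serializable);     // true
        System.out.println(original.size() + " " + copy.size());  // 1 2
        System.out.println(original.get(0));                      // a!
    }
}
```

The last line is the point of the example: because the copy is shallow, mutating an element through one list is visible through the other, while adding or removing elements is not.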

A few member fields

    /**
     * Serialization version ID.
     */
    private static final long serialVersionUID = 8683452581122892189L;

    /**
     * Default initial capacity.
     *
     * Used when a list created by the no-arg constructor grows for the first time.
     */
    private static final int DEFAULT_CAPACITY = 10;

    /**
     * Shared empty array instance used for empty instances.
     *
     * Used only when the constructor is given a capacity of zero or an empty collection.
     */
    private static final Object[] EMPTY_ELEMENTDATA = {};

    /**
     * Shared empty array instance used for default sized empty instances. We
     * distinguish this from EMPTY_ELEMENTDATA to know how much to inflate when
     * first element is added.
     * 
     * Used only by the no-arg constructor, so that this case can be told apart
     * from the empty cases of the other two constructors.
     */
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

    /**
     * The array buffer into which the elements of the ArrayList are stored.
     * The capacity of the ArrayList is the length of this array buffer. Any
     * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
     * will be expanded to DEFAULT_CAPACITY when the first element is added.
     * 
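     * elementData is the backing Object array that actually stores the elements;
     * the capacity of the ArrayList is the length of this array.
     * A list created by the no-arg constructor starts with an empty array and is
     * expanded to the default capacity of 10 only when the first element is added.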
     */
    transient Object[] elementData; // non-private to simplify nested class access

    /**
     * The size of the ArrayList (the number of elements it contains).
     * 
     * The number of elements currently stored in the ArrayList.
     */
    private int size;

Three constructors

    /**
     * Constructs an empty list with the specified initial capacity.
     *
     * Constructor with an explicit initial capacity. If the capacity is 0, the
     * shared EMPTY_ELEMENTDATA array is used.
     */
    public ArrayList(int initialCapacity) {
        if (initialCapacity > 0) {
            this.elementData = new Object[initialCapacity];
        } else if (initialCapacity == 0) {
            this.elementData = EMPTY_ELEMENTDATA;
        } else {
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        }
    }

    /**
     * Constructs an empty list with an initial capacity of ten.
     * 
     * No-arg constructor; uses the shared DEFAULTCAPACITY_EMPTY_ELEMENTDATA array.
     */
    public ArrayList() {
        this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
    }

    /**
     * Constructs a list containing the elements of the specified
     * collection, in the order they are returned by the collection's
     * iterator.
     *
     * Constructor that takes a collection. If the collection is empty, the
     * shared EMPTY_ELEMENTDATA array is used.
     */
    public ArrayList(Collection<? extends E> c) {
        Object[] a = c.toArray();
        if ((size = a.length) != 0) {
            if (c.getClass() == ArrayList.class) {
                elementData = a;
            } else {
                elementData = Arrays.copyOf(a, size, Object[].class);
            }
        } else {
            // replace with empty array.
            elementData = EMPTY_ELEMENTDATA;
        }
    }
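As a usage-level illustration of the three constructors, the following sketch exercises each of them through the public API only. ConstructorDemo, rejectsCapacity and copyOf are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ConstructorDemo {
    /** Returns true if new ArrayList<>(capacity) rejects the given capacity. */
    static boolean rejectsCapacity(int capacity) {
        try {
            new ArrayList<String>(capacity);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    /** Copies a source collection; later changes to the copy do not touch the source. */
    static List<String> copyOf(List<String> source) {
        return new ArrayList<>(source);
    }

    public static void main(String[] args) {
        List<String> sized = new ArrayList<>(32);   // pre-sized backing array, size() is still 0
        List<String> plain = new ArrayList<>();     // backing array is allocated on first add
        List<String> fixed = Arrays.asList("a", "b");
        List<String> copy = copyOf(fixed);
        copy.add("c");                              // allowed: the copy is a real ArrayList

        System.out.println(sized.size() + " " + plain.size());  // 0 0
        System.out.println(fixed + " " + copy);                 // [a, b] [a, b, c]
        System.out.println(rejectsCapacity(-1));                // true
        System.out.println(copyOf(Collections.<String>emptyList()).isEmpty()); // true
    }
}
```

Note that the initial capacity only sizes the backing array; it never changes size(), which counts stored elements.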

Why are there two different empty arrays?

To answer that, we need to look at the source of the add method.

    public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e;
        return true;
    }

When add runs, it first checks the capacity: ensureCapacityInternal makes sure the Object array can hold size + 1 elements.

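Following ensureCapacityInternal, we see that it calls calculateCapacity to work out the capacity that is actually required. calculateCapacity checks whether elementData is the default-capacity empty array assigned by the no-arg constructor. If it is, it returns the larger of the default capacity 10 and the minimum capacity; otherwise it returns the minimum capacity unchanged.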

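So the two empty arrays matter mainly at growth time: they tell a list that was created with an initial capacity of 0 or from an empty collection apart from a list that was created by the no-arg constructor and is therefore entitled to the default capacity of 10.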

    private void ensureCapacityInternal(int minCapacity) {
        ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
    }
    
    private static int calculateCapacity(Object[] elementData, int minCapacity) {
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            return Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        return minCapacity;
    }
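The effect of the two sentinels can be reproduced outside the JDK. The class below is a hypothetical, stripped-down model (TinyList is not a JDK class) that follows the JDK 8 logic quoted above but leaves out overflow handling, modCount and everything else. It is meant only to show that the identity comparison against the sentinel decides the first capacity.

```java
import java.util.Arrays;

/** Hypothetical, simplified model of ArrayList's two empty-array sentinels. */
class TinyList {
    private static final int DEFAULT_CAPACITY = 10;
    private static final Object[] EMPTY_ELEMENTDATA = {};
    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

    private Object[] elementData;
    private int size;

    /** Mirrors new ArrayList(): grows lazily to DEFAULT_CAPACITY on the first add. */
    TinyList() {
        elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
    }

    /** Mirrors new ArrayList(int): a capacity of 0 shares EMPTY_ELEMENTDATA. */
    TinyList(int initialCapacity) {
        if (initialCapacity < 0) {
            throw new IllegalArgumentException("Illegal Capacity: " + initialCapacity);
        }
        elementData = initialCapacity == 0 ? EMPTY_ELEMENTDATA : new Object[initialCapacity];
    }

    void add(Object e) {
        int minCapacity = size + 1;
        // Identity comparison: only the no-arg sentinel is inflated to the default.
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }
        if (minCapacity > elementData.length) {
            // Simplified growth: 1.5x, or minCapacity if that is larger (no overflow checks).
            int grown = elementData.length + (elementData.length >> 1);
            elementData = Arrays.copyOf(elementData, Math.max(grown, minCapacity));
        }
        elementData[size++] = e;
    }

    int capacity() {
        return elementData.length;
    }

    public static void main(String[] args) {
        TinyList byDefault = new TinyList();
        TinyList zeroSized = new TinyList(0);
        byDefault.add("x");
        zeroSized.add("x");
        System.out.println(byDefault.capacity()); // 10
        System.out.println(zeroSized.capacity()); // 1
    }
}
```

Both lists start with an empty array of length 0, yet after one add they hold arrays of length 10 and 1 respectively. A caller who explicitly asks for capacity 0 gets a list that stays as small as possible.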

ArrayList growth mechanism
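Every time a new element is added, ArrayList makes sure the capacity is sufficient. In ensureExplicitCapacity, if the minimum capacity is greater than the current length of the array, the backing array has to grow, and grow is called.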

    private void ensureExplicitCapacity(int minCapacity) {
        modCount++;

        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
    }

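grow first reads the old capacity and then grows it to roughly 1.5 times that value, via oldCapacity + (oldCapacity >> 1). Shifting right by one bit is an integer division by 2 (so 15 grows to 22, not 22.5) and is traditionally regarded as cheaper than a division. The parentheses matter: + binds tighter than >>, so oldCapacity + oldCapacity >> 1 would evaluate to oldCapacity itself.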

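Note also that grow carries the comment overflow-conscious code, meaning this code was written with int overflow in mind: the comparisons are written as a - b < 0 rather than a < b, so that they still behave sensibly when a value has wrapped around to a negative number. Let's go through it: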

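newCapacity is the capacity after growth. If it is still smaller than minCapacity, minCapacity is used instead. This happens when the old capacity is 0 or 1, on the first add to a list created by the no-arg constructor (0 versus 10), or when addAll or ensureCapacity asks for a lot of room at once.
If newCapacity - MAX_ARRAY_SIZE > 0, then either newCapacity really exceeds the maximum array size, or the 1.5x computation overflowed int and wrapped to a negative number (in which case the subtraction wraps back to a positive value). Either way 1.5x growth is too much, so hugeCapacity is called to size the array from minCapacity instead. hugeCapacity first checks whether minCapacity itself has overflowed (is negative) and throws OutOfMemoryError if so; otherwise it returns Integer.MAX_VALUE when minCapacity is above MAX_ARRAY_SIZE, and MAX_ARRAY_SIZE in all other cases.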

    /**
     * The maximum size of array to allocate.
     * Some VMs reserve some header words in an array.
     * Attempts to allocate larger arrays may result in
     * OutOfMemoryError: Requested array size exceeds VM limit
     * 
     * Maximum array length to allocate; larger requests may cause an OutOfMemoryError on some VMs.
     */
    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    /**
     * Increases the capacity to ensure that it can hold at least the
     * number of elements specified by the minimum capacity argument.
     *
     * @param minCapacity the desired minimum capacity
     */
    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

    private static int hugeCapacity(int minCapacity) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        return (minCapacity > MAX_ARRAY_SIZE) ?
            Integer.MAX_VALUE :
            MAX_ARRAY_SIZE;
    }
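The capacity arithmetic can be tried out in isolation. The sketch below re-implements only the size computation of the JDK 8 grow and hugeCapacity methods shown above, with no array allocation, so the large cases can be run safely. GrowthSketch and newCapacity are names chosen for this example.

```java
class GrowthSketch {
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    /** Capacity arithmetic of grow/hugeCapacity (JDK 8), without the array copy. */
    static int newCapacity(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            if (minCapacity < 0) { // minCapacity itself overflowed
                throw new OutOfMemoryError();
            }
            newCapacity = (minCapacity > MAX_ARRAY_SIZE) ? Integer.MAX_VALUE : MAX_ARRAY_SIZE;
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        int cap = 10;
        StringBuilder steps = new StringBuilder("10");
        for (int i = 0; i < 5; i++) {
            cap = newCapacity(cap, cap + 1);
            steps.append(" -> ").append(cap);
        }
        System.out.println(steps);                 // 10 -> 15 -> 22 -> 33 -> 49 -> 73

        System.out.println(newCapacity(0, 1));     // 1   (1.5 * 0 is still 0)
        System.out.println(newCapacity(1, 2));     // 2   (1 + (1 >> 1) is still 1)
        System.out.println(newCapacity(10, 100));  // 100 (bulk request beats 1.5x)

        // 1.5 * 2_000_000_000 overflows int; the result is clamped to MAX_ARRAY_SIZE.
        System.out.println(newCapacity(2_000_000_000, 2_000_000_001) == MAX_ARRAY_SIZE); // true

        // Operator precedence: + binds tighter than >>.
        System.out.println(10 + 10 >> 1);          // 10
        System.out.println(10 + (10 >> 1));        // 15
    }
}
```

In the overflow case the first check does not fire, because the subtraction wraps to a positive number; it is the second check that catches the negative newCapacity and hands over to the hugeCapacity logic.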

Arrays.copyOf and System.arraycopy

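The source is shown below. Arrays.copyOf calls System.arraycopy internally, and System.arraycopy is a native method.
Functionally, Arrays.copyOf allocates a new array of the requested length (longer or shorter than the original) and copies the elements into it, whereas System.arraycopy copies a range of elements from one existing array into another existing array.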

    public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
        @SuppressWarnings("unchecked")
        T[] copy = ((Object)newType == (Object)Object[].class)
            ? (T[]) new Object[newLength]
            : (T[]) Array.newInstance(newType.getComponentType(), newLength);
        System.arraycopy(original, 0, copy, 0,
                         Math.min(original.length, newLength));
        return copy;
    }
    
    public static native void arraycopy(Object src,  int  srcPos,
                                        Object dest, int destPos,
                                        int length);
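The difference between the two calls is easiest to see side by side. In this small sketch, Arrays.copyOf produces a new, longer array, and System.arraycopy then shifts elements within that array to open a slot, which is the same pattern an array-backed list needs for inserting at an index. CopyDemo and insertAt are illustrative names, and insertAt assumes the caller has already made sure there is spare room at the end.

```java
import java.util.Arrays;

class CopyDemo {
    /** Inserts value at index by shifting the tail one slot to the right (assumes spare room). */
    static void insertAt(int[] data, int size, int index, int value) {
        // Source and destination are the same array; arraycopy handles the overlap correctly.
        System.arraycopy(data, index, data, index + 1, size - index);
        data[index] = value;
    }

    public static void main(String[] args) {
        int[] small = {1, 2, 4};
        int[] grown = Arrays.copyOf(small, 5);   // new array of length 5, padded with zeros
        insertAt(grown, 3, 2, 3);                // shift {4} right, then write 3 at index 2

        System.out.println(Arrays.toString(small));                    // [1, 2, 4]
        System.out.println(Arrays.toString(grown));                    // [1, 2, 3, 4, 0]
        System.out.println(Arrays.toString(Arrays.copyOf(small, 2)));  // [1, 2]
    }
}
```

The original array is untouched by Arrays.copyOf, and a shorter target length simply truncates.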

Why grow by 1.5x?

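My own understanding is that it is a trade-off between performance and memory: growing by 2x may waste more space, while a smaller factor would make the list grow more often. A factor of 1.5 can also be computed cheaply with a shift and an add.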
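A rough way to get a feel for this trade-off is to simulate both factors. The sketch below counts how many reallocations are needed to grow from capacity 10 to a target size, and what capacity the list ends up with. It is a back-of-the-envelope model with made-up names (GrowthFactorSketch, simulate), not a benchmark, and it ignores the cost of each copy.

```java
class GrowthFactorSketch {
    /** Returns {reallocations, finalCapacity} when growing from 10 until capacity >= target. */
    static long[] simulate(long target, boolean doubling) {
        long capacity = 10;
        long reallocations = 0;
        while (capacity < target) {
            capacity = doubling ? capacity * 2 : capacity + (capacity >> 1);
            reallocations++;
        }
        return new long[] {reallocations, capacity};
    }

    public static void main(String[] args) {
        long target = 1_000_000L;
        long[] oneAndHalf = simulate(target, false);
        long[] doubled = simulate(target, true);
        System.out.println("1.5x: " + oneAndHalf[0] + " reallocations, capacity " + oneAndHalf[1]);
        System.out.println("2x:   " + doubled[0] + " reallocations, capacity " + doubled[1]);
    }
}
```

For this target the 1.5x policy reallocates more often but ends with less unused capacity. The exact slack depends on where the target falls between two growth steps, so a single run illustrates the tendency rather than proving it.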

Why shouldn't we call remove / add on a collection inside a foreach loop?

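foreach is the enhanced for loop, a piece of syntactic sugar in Java that still uses an Iterator under the hood. Collections that are not fail-safe use a fail-fast mechanism: every structural modification of an ArrayList increments a modCount field, and the iterator keeps its own expectedModCount. The iterator's next method calls checkForComodification, which checks whether modCount and expectedModCount are equal and throws ConcurrentModificationException if they are not.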

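As for expectedModCount: when we obtain an iterator through list.iterator(), modCount is copied into expectedModCount. However, only modifications made through the iterator itself (Iterator.remove, or ListIterator.add) update expectedModCount. Any other modification leaves expectedModCount unchanged, so it no longer matches modCount.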

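In other words, a foreach loop traverses the collection through an Iterator, but the add / remove calls go directly to the collection's own methods. The Iterator then discovers that an element was added or removed without its knowledge, and throws an exception to warn the user that a concurrent modification may have happened. No second thread is needed for this; a single thread modifying the list mid-iteration is enough.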
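The behaviour described above can be reproduced in a few lines. The sketch below (FailFastDemo and its methods are names invented for this example) removes an element inside a foreach loop, then does the same removal through the iterator, and finally shows a well-known quirk where the check is never reached.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class FailFastDemo {
    /** Removes inside a foreach loop; returns true if the fail-fast check fired. */
    static boolean removeInForEach(List<String> list, String target) {
        try {
            for (String s : list) {
                if (s.equals(target)) {
                    list.remove(s);   // bumps modCount behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Removes through the iterator, which resyncs expectedModCount after each removal. */
    static void removeWithIterator(List<String> list, String target) {
        for (Iterator<String> it = list.iterator(); it.hasNext(); ) {
            if (it.next().equals(target)) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        System.out.println(removeInForEach(a, "a"));   // true

        List<String> b = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        removeWithIterator(b, "a");
        System.out.println(b);                         // [b, c, d]

        // Quirk: removing the second-to-last element makes hasNext() return false,
        // so next() is never called again and no exception is thrown ("d" is skipped).
        List<String> c = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        System.out.println(removeInForEach(c, "c"));   // false
    }
}
```

The third case is why fail-fast should be treated as a best-effort bug detector and not as a guarantee. For simple conditional removal, list.removeIf(s -> s.equals(target)) (available since Java 8) is usually the most direct alternative.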