The other day, I participated in Introduction to Scala & Functional Programming Learned from Implementation of List and the data of Scala List. I've been studying a little about structure. Let's review it along with the Java List data structure.

Java List

Speaking of Java List, I think there are ArrayList and LinkedList, so I would like to take a look at the contents. The following is an excerpt of only the part related to the addition and acquisition of elements (data).

ArrayList ArrayList internally holds element information as an array.

`ArrayList.class`


package java.util;

public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{

    private static final int DEFAULT_CAPACITY = 10;

    private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

    transient Object[] elementData;

    private int size;

    public ArrayList() {
        this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
    }

    /*Omission*/

    public boolean add(E e) {
        ensureCapacityInternal(size + 1);  // Increments modCount!!
        elementData[size++] = e;
        return true;
    }

    private void ensureCapacityInternal(int minCapacity) {
        if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
            minCapacity = Math.max(DEFAULT_CAPACITY, minCapacity);
        }

        ensureExplicitCapacity(minCapacity);
    }

    private void ensureExplicitCapacity(int minCapacity) {
        modCount++;

        // overflow-conscious code
        if (minCapacity - elementData.length > 0)
            grow(minCapacity);
    }

    private void grow(int minCapacity) {
        // overflow-conscious code
        int oldCapacity = elementData.length;
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0)
            newCapacity = minCapacity;
        if (newCapacity - MAX_ARRAY_SIZE > 0)
            newCapacity = hugeCapacity(minCapacity);
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }

    public E get(int index) {
        rangeCheck(index);

        return elementData(index);
    }

    E elementData(int index) {
        return (E) elementData[index];
    }

    /*Omission*/

}

elementData is an array of element information. Elements are added with the add method, and the input elements are stored in elementData. The interesting thing is how to extend the array, but when you store the first element in elementData, an array of 10 length is created by default. After that, if you try to store an element that exceeds the length of elementData, an array with a length 1.5 times that of elementData is created and the data is transferred to that array. You're reducing the cost of creating an array instance by creating some play in the array.

int newCapacity = oldCapacity + (oldCapacity >> 1) // 1.Get 5 times longer

To get the element, the index of the array is directly referenced using the index input by the get method, so high performance is achieved not only for sequential access but also for random access. I don't have a clue about the picture, but I made an image, so I'll post it. LinkedList LinkedList manages elements in an inner class called Node. The LinkedList itself has only the information of the first node and the last node, and the node itself has the information of the previous and next nodes. Every time you add an element with the add method, a new Node is created, and the new Node information is stored in the next of the trailing Node, and the trailing Node information is stored in the prev of the new Node.

`LinkedList`


package java.util;

public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable, java.io.Serializable
{

    transient int size = 0;

    transient Node<E> first;

    transient Node<E> last;

    void linkLast(E e) {
        final Node<E> l = last;
        final Node<E> newNode = new Node<>(l, e, null);
        last = newNode;
        if (l == null)
            first = newNode;
        else
            l.next = newNode;
        size++;
        modCount++;
    }

    public boolean add(E e) {
        linkLast(e);
        return true;
    }

    public E get(int index) {
        checkElementIndex(index);
        return node(index).item;
    }

      /*Omission*/

    private static class Node<E> {
        E item;
        Node<E> next;
        Node<E> prev;

        Node(Node<E> prev, E element, Node<E> next) {
            this.item = element;
            this.next = next;
            this.prev = prev;
        }
    }

    Node<E> node(int index) {
        // assert isElementIndex(index);

        if (index < (size >> 1)) {
            Node<E> x = first;
            for (int i = 0; i < index; i++)
                x = x.next;
            return x;
        } else {
            Node<E> x = last;
            for (int i = size - 1; i > index; i--)
                x = x.prev;
            return x;
        }
    }

      /*Omission*/

}

To get the element, follow the index entered by the get method from the beginning or the end, whichever is closer. This is the reason why it is vulnerable to random access unlike ArrayList. In the image below, when you want to get the element of node_3, you have to follow from node_1.

How to use ArrayList and LinkedList properly

This is my personal opinion, and I'd appreciate it if you could give me some advice from someone who is familiar with algorithms, but I wonder if ArrayList is the basis. This is because ArrayList allows quick access to elements and uses less memory than LinkedList. However, if you want to have a large amount of data in a List and repeat adding and deleting elements (for example, a role like a queue), you should use LinkedList. In the case of ArrayList, element movement processing occurs when there is space in the array due to element deletion or when adding between elements. In the case of LinkedList, it is efficient because it only rewrites the information of the previous and next Nodes.

Scala List

In Scala List, element information is represented by head and tail. Looking only at the name, it seems to be similar to Java's LinkedList, but in fact it has a completely different structure, head represents the information of the first element, tail expresses the element information other than the beginning in List type. .. In the first place, Scala's List has an immutable design, and instead of making changes to the List object itself as in Java, it is designed to create a new List after the change.

`List.scala`


package scala
package collection
package immutable

sealed abstract class List[+A] extends AbstractSeq[A]
                                  with LinearSeq[A]
                                  with Product
                                  with GenericTraversableTemplate[A, List]
                                  with LinearSeqOptimized[A, List[A]]
                                  with Serializable {
  override def companion: GenericCompanion[List] = List

  import scala.collection.{Iterable, Traversable, Seq, IndexedSeq}

  def isEmpty: Boolean
  def head: A
  def tail: List[A]

  def ::[B >: A](x: B): List[B] =
    new scala.collection.immutable.::(x, this)

    /*Omission*/

}

final case class ::[B](overridevalhead:B,private[scala]vartl:List[B]) extends List[B] {
  override def tail : List[B] = tl
  override def isEmpty: Boolean = false
}

object List extends SeqFactory[List] {
  /** $genericCanBuildFromInfo */
  implicit def canBuildFrom[A]: CanBuildFrom[Coll, A, List[A]] =
    ReusableCBF.asInstanceOf[GenericCanBuildFrom[A]]

  def newBuilder[A]: Builder[A, List[A]] = new ListBuffer[A]

  override def empty[A]: List[A] = Nil

  override def apply[A](xs:A*): List[A] = xs.toList
  
  private[collection] val partialNotApplied = new Function1[Any, Any] { def apply(x: Any): Any = this }

    /*Omission*/

}

In Scala, actually run the program and check the head and tail. First, add some elements to the List. Elements are added using the :: method.

//(1)List creation
val list1 = List("ABC")
println(list1) // List(ABC)
//(2)Add element
val list2 = "DEF"::list1
println(list2) // List(DEF, ABC)
//(3)Add element
val list3 = list2.::("GHI")
println(list3) // List(GHI, DEF, ABC)

(2) and (3) use the same method, although the notation is different. When adding normally without specifying index, unlike Java, it is an image of adding an element at the beginning, because Java adds an element at the end. Let's take a look at the head and tail of the objects (1) to (3), respectively.

List(list1, list2, list3).zip(Range(1, 4)).foreach { 
  case (x, i) => println(s"list${i} => head:${x.head} tail:${x.tail}") 
}
//  list1 => head:ABC tail:List()
//  list2 => head:DEF tail:List(ABC)
//  list3 => head:GHI tail:List(DEF, ABC)

After all, the head element returns the first element, and the tail returns the elements other than the beginning as a List. I also made an image of this, so please take a look if you like.

By the way, if you get the class name from the instance and take a look

val list4 = List("ABC")
println(list4.getClass) // class scala.collection.immutable.$colon$colon
val list5 = Array("ABC").toList
println(list5.getClass) // class scala.collection.immutable.$colon$colon

It has become. This seems to be a scala.collection.immutable. :: class, and it seems that this class is called cons (probably a constructor). If you look at the List. :: () method, you can see that the :: instance is created, the argument is set to head, and the List itself is tail.

Method

I will write a little about the method. That said, writing is just a list of things that I personally find useful.

val range = Range(0, 10).toList
println(range) // List(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

val fill = List.fill(10)(5)
println(fill) // List(5, 5, 5, 5, 5, 5, 5, 5, 5, 5)

val rangefill = range.zip(fill).map { case (x, y) => x * y }
println(rangefill) // List(0, 5, 10, 15, 20, 25, 30, 35, 40, 45)

val flatmap = List(List(1, 2, 3), List(3, 4, 5), List(5, 6, 7)).flatten.distinct
println(flatmap) // List(1, 2, 3, 4, 5, 6, 7)

val reduce = List("ABC", "DEF", "GHI").reduceLeft(_+_)
println(reduce) // ABCDEFGHI

Most of the common processes are defined as methods, which is very easy (there are many other useful methods). Scala also has type inference and tuples, so I think it's more intuitive and efficient to program than Java, and above all, it's fun to write.

Finally

Even a small glance at the standard library can be very educational. The standard library is packed with genius and artistic designs and techniques. It seems good to understand and learn object-oriented programming in Java and functional programming in Scala from these points.

List data structure [Java / Scala]