When executing in parallel with Java, I'm addicted to not paying attention to thread safety

Introduction

At work, I received the following consultation. "When parallel processing is performed using Completable Future, the value may not be returned correctly about once every tens to hundreds of times."

Make a note of the correspondence at that time.

problem

The execution environment is

is.

There are the following processes. Information is stored in summaryItem in parallel using CompletableFuture. xxxAPI.search (number); calls an external API to get Item information. Where are the bugs lurking?

  public List<Item> search(int length) {
    //Variables that hold search results
    List<Item> summaryItem = new ArrayList<>(length);

    List<CompletableFuture<Void>> futures = new ArrayList<>(length);
    for (int i = 0; i < length; i++) {
      final int number = i;
      CompletableFuture<Void> f = CompletableFuture.runAsync(() -> {
        //Get data by calling an external API
        Item item = xxxAPI.search(number);
        summaryItem.add(item);
      }, pool);
      futures.add(f);
    }

    //Wait until all parallel processing is finished
    CompletableFuture<Void> all =
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[length]));
    all.join();

    return summaryItem;
  }

Answer

It is the part of the following processing.

        summaryItem.add(item);

summaryItem is java.util.ArrayList. ArrayList is not ** thread safe. ** ** Let's take a look at JavaDoc.

https://docs.oracle.com/javase/jp/8/docs/api/java/util/ArrayList.html

** This implementation is not synchronized. ** If multiple threads access an ArrayList instance in parallel and at least one of those threads structurally modifies the list, it must be synchronized externally. Structural changes are the process of adding or removing one or more elements, or explicitly resizing the underlying array. The process of setting only the value of an element is not a structural change. This is usually achieved by synchronizing with some objects that naturally encapsulate the list. If no such object exists, be sure to use the Collections.synchronizedList method to "wrap" the list. It is recommended to do this at creation time to prevent the list from being accidentally accessed without being synchronized.

You wrote it firmly.

Correspondence

Use a thread-safe List. Or, change to thread-safe processing in the first place. We responded with an approach in the form of.

Use a thread-safe List

I have a nicely synchronized list in the java.util.concurrent package. Let's use this.

List<Item> summaryItem = new ArrayList<>(length);

Just change the above code as below and it will work normally.

List<Item> summaryItem = new CopyOnWriteArrayList<>(length);

In some cases, using Collections.synchronizedList will solve the problem, but if you use iterator, you will still need to synchronize yourself.

Change to thread-safe processing

In the first place, you should return the value in java.util.function.Supplier format.

  public List<Item> search(int length) {
    //Parallel execution processing of search
    List<CompletableFuture<Item>> futures = new ArrayList<>(length);
    for (int i = 0; i < length; i++) {
      final int number = i;
      CompletableFuture<Item> f = CompletableFuture.supplyAsync(() -> {
        //Get data by calling an external API
        return xxxAPI.search(number);
      }, pool);
      futures.add(f);
    }

    //Collection of search results
    try {
      List<Item> summaryItem = new ArrayList<>();
      for (CompletableFuture<Item> f : futures) {
        summaryItem.add(f.get());
      }
      return summaryItem;
    } catch (ExecutionException | InterruptedException ex) {
      throw new RuntimeException(ex);//I wrote it appropriately.
    }
  }

It happens tremendously accidentally, and the point is that it is difficult to find a problem without some knowledge of the Java API. The test code will also (roughly) go through.

I think it is useful to do code review for these parts rather than debugging.

What I want to do more

You probably can't find it with CheckStyle, but I feel like you can find the problem with a static code analysis tool. I will look for it in the future.

Recommended Posts

When executing in parallel with Java, I'm addicted to not paying attention to thread safety
A note when I'm addicted to using Docker Hub Vault in server mode
Two ways to start a thread in Java + @
Summary of how to use the proxy set in IE when connecting with Java
Code to use when you want to process Json with only standard library in Java
How to call functions in bulk with Java reflection
Make "I'm not a robot" in Java EE (Jakarta EE)
Notice multi thread problem when working with Java Servlet
I dealt with Azure Functions not working in Java
When you want to dynamically replace Annotation in Java8
When you want to implement Java library testing in Spock with multi-module in Gradle in Android Studio 3