[Java] Try to implement Eratosthenes sieve using Java standard library

6 minute read

Introduction

I usually develop package software, but recently I started a competition professional. I was confused because it was completely different from the usual way of writing code (I think that I will not be killed if I write such code), and I was wondering what would happen if I implemented this as usual.

In such a case, @nkojima’s [Java] Find the prime number with the Eratosthenes sieveand[Java]FindtheprimenumberwiththeEratosthenessieve(Part2), I thought it would be a good subject, and tried to improve readability by using the standard library.

At this time, I will try to use the collection properly.

Code

[Java] Find prime numbers with Eratosthenes sieve (Part 2), referencing the search list as ArrayList, HashSet, ArrayList<Boolean I changed it to > and implemented it.

ListImplement.java


package practise.algorithm.eratosthenes.qiita;

import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ListImplement {
private static final int MIN_PRIME_NUMBER = 2;

public static void main(String[] args) {
new ListImplement().execute();
}

private void execute() {
Scanner sc = new Scanner(System.in);
int maxNumber = sc.nextInt();

List<Integer> primeNumbers = enumeratePrimeNumbers(maxNumber);

System.out.println(primeNumbers.toString());
}

private List<Integer> enumeratePrimeNumbers(int maxNumber) {
List<Integer> targets = createSearchTargets(maxNumber);

sieve(targets, maxNumber);

return targets;
}

//Create a search list from 2 to maxNumber
private List<Integer> createSearchTargets(int maxNumber) {
List<Integer> targets = new ArrayList<>(maxNumber);
for(int i = MIN_PRIME_NUMBER; i <= maxNumber; i++) {
targets.add(i);
}
return targets;
}

// Sift a multiple of the first value of the search list from the search list
// Do the above until the top value of the search list reaches the square root of maxNumber.
private void sieve(List<Integer> targets, int maxNumber) {
int sqrt = (int) Math.sqrt(maxNumber);
for(int i = MIN_PRIME_NUMBER; i <= sqrt; i++) {
int firstNum = i;
if(targets.contains(firstNum)) {//★★★ The part that determines whether it is a sieve mesh
//Because it has already been screened up to the square of the first value,
for(int j = firstNum * firstNum; j <= maxNumber; j += firstNum) {
targets.remove(Integer.valueOf(j)); //★★★ Sieving part ★★★
}
}
}
}

}

SetImplement.java


package practise.algorithm.eratosthenes.qiita;

import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;

public class SetImplement {
private static final int MIN_PRIME_NUMBER = 2;

public static void main(String[] args) {
new SetImplement().execute();
}

private void execute() {
Scanner sc = new Scanner(System.in);
int maxNumber = sc.nextInt();

Set<Integer> primeNumbers = enumeratePrimeNumbers(maxNumber);

System.out.println(primeNumbers.toString());
}

private Set<Integer> enumeratePrimeNumbers(int maxNumber) {
Set<Integer> targets = createSearchTargets(maxNumber);

sieve(targets, maxNumber);

return targets;
}

//Create a list of numbers from 2 to maxNumber
private Set<Integer> createSearchTargets(int maxNumber) {
Set<Integer> targets = new HashSet<>(maxNumber);
for(int i = MIN_PRIME_NUMBER; i <= maxNumber; i++) {
targets.add(i);
}
return targets;
}

// Sift a multiple of the first value of the search list from the search list
// Do the above until the top value of the search list reaches the square root of maxNumber.
private void sieve(Set<Integer> targets, int maxNumber) {
int sqrt = (int) Math.sqrt(maxNumber);
for(int i = MIN_PRIME_NUMBER; i <= sqrt; i++) {
int firstNum = i;
if(targets.contains(firstNum)) {//★★★ The part that determines whether it is a sieve mesh ★★★
//Because it has already been screened up to the square of the first value,
for(int j = firstNum * firstNum; j <= maxNumber; j += firstNum) {
targets.remove(j);//★★★ Sieving part ★★★
}
}
}
}
}

BooleanArrayListImplement.java


package practise.algorithm.eratosthenes.qiita;

import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class BooleanArrayListImplement {
private static final int MIN_PRIME_NUMBER = 2;

public static void main(String[] args) {
new BooleanArrayListImplement().execute();
}

private void execute() {
Scanner sc = new Scanner(System.in);
int maxNumber = sc.nextInt();

List<Integer> primeNumbers = enumeratePrimeNumbers(maxNumber);

System.out.println(primeNumbers.toString());
}

private List<Integer> enumeratePrimeNumbers(int maxNumber) {
List<Boolean> targets = createSearchTargets(maxNumber);

List<Integer> primeNumbers = extracePrimeNumbers(targets);

return primeNumbers;
}

//Create a Boolean search list from 0 to maxNumber
//Start from 0 to make index the number of search targets
// Use boolean to determine if it is a prime number
private List<Boolean> createSearchTargets(int maxNumber) {
List<Boolean> targets = new ArrayList<>(maxNumber + 1);
targets.add(false); //0 is not a prime number, so
targets.add(false);//1 is not a prime number, so
for(int i = MIN_PRIME_NUMBER; i <= maxNumber; i++) {
targets.add(true);
}
return targets;
}

// Sift a multiple of the first value of the search list from the search list
// Do the above until the top value of the search list reaches the square root of maxNumber.
private List<Integer> extracePrimeNumbers(List<Boolean> targets) {
int maxNumber = targets.size()-1;

sieve(targets, maxNumber);

List<Integer> primeNumbers = convertToNumList(targets);

return primeNumbers;
}

private void sieve(List<Boolean> targets, int maxNumber) {
int sqrt = (int) Math.sqrt(maxNumber);
for(int i = MIN_PRIME_NUMBER; i <= sqrt; i++) {
int firstNum = i;if(targets.get(i)) {//★★★ The part that determines whether it is a sieve mesh ★★★
//Because it has already been screened up to the square of the first value,
for(int j = firstNum * firstNum; j <= maxNumber; j += firstNum) {
targets.set(j, false); //★★★ Sieving part ★★★
}
}
}
}

private List<Integer> convertToNumList(List<Boolean> targets) {
List<Integer> numbers = new ArrayList<>();
for(int i = 0; i <targets.size(); i++) {
if(targets.get(i)) {
numbers.add(i);
}
}
return numbers;
}
}

Speed summary

It is a summary of execution time (average executed 5 times). ArrayList<Boolean> is the fastest, and ArrayList is out of the question. None of the above implementations use arrays. .. ..

Implementation Upper limit is $10^5$ Upper limit is $10^6$ Upper limit is $10^7$  
Implemented in ArrayList (ListImplement.java) 3,221 ms Did not return -  
Implemented by HashSet (SetImplement.java) 24 ms 118 ms 1,120 ms  
Implemented by ArrayList(BooleanArrayListImplement.java) 16 ms 50 ms 1,247 ms
Reference 1 31 ms 276 ms 7,845 ms  
Reference 2 4 ms 21 ms 66 ms  

Cause of the difference in speed

I wrote the ★ mark in the code, but I think the following two places are the major factors.

  • The part that determines whether it is a sieve mesh
  • Sieving part

The element is accessed at each place, but the access method at this time is as follows.

  • Implementation of ArrayList: Search from the front of the array in order → Linear search
  • Implementation of HashSet: access target object with hash table → Via hash table, but eventually random access
  • Implementation of ArrayList<Boolean>: access to array by index → Random access

After all, the computational order is ArrayList is $O(n^2loglogn)$, and HashSet and ArrayList<Boolean> are $O(nloglogn)$ (looking at the execution results, this implementation does not seem to be like this. I think it is. (The part of loglogn is googled. This article

Arrays are faster than ArrayList<Boolean> and ArrayList<Boolean> than HashSet because there is no calculation of hash value when accessing elements, there is no array expansion operation, Probably because there are no other extra operations.

Summary

When I actually tried it, I felt the following.

  • When it is used in practice, it should be implemented as a Set because it puts more importance on readability although it will be slightly slower.
  • When speed is important, write a gutsly comment and make it an array implementation
  • Make an array if you use ArrayList<Boolean> (comment is necessary anyway)

It would be interesting to see a lot of algorithms that I have never seen before when doing a competitive pro. It is around this time that I want to make good use of my practical work.

Finally

  • When selecting a collection type data structure, it was implemented as if it were a set when eliminating duplicates, and a list when there were duplicates. It just happened that I didn’t want to affect speed so much that it became a problem, so I’m glad I noticed it now.
  • Thanks to @nkojima for her wonderful post, I was able to study hard. Thank you.
  • I think that there are many places where I don’t study well, but if there is something wrong, I’d like you to gently throw Masakari.