I wanted to take part in the Advent Calendar somehow, but I didn't have a topic, so I'll write about Stream, which I can actually write about. The code I tried this time is also on GitHub.
Stream was added in Java 8. As many of you may already know, when you benchmark it, Stream is slower than procedural code when the number of elements is small. Its advantage is that the method names are clear, which makes it easy to see what the code is trying to do.
I don't want to bring in a benchmarking tool or repeat the measurement a million times. nanoTime would also work, but I chose currentTimeMillis because I only need to see the difference.
Bench.java
public class Main {
    public static void main(String[] args) {
        System.out.println("Execution time:" + benchMark() + "[sec]");
    }

    // Measures one run of the task and returns the elapsed time in seconds
    private static double benchMark() {
        long start = System.currentTimeMillis();
        HogeSomeTask task = new HogeSomeTask();
        task.something_do();
        long end = System.currentTimeMillis();
        return ((end - start) / 1000.0);
    }
}
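For reference, here is a minimal sketch of the nanoTime variant mentioned above, averaged over several runs. The class name NanoBench, the method averageSeconds, and the warm-up runs are my own assumptions for illustration; the measurements in this article were taken with the currentTimeMillis code shown above.

```java
class NanoBench {
    // Hypothetical helper: runs the task `warmups` times untimed, then `runs` times timed,
    // and returns the mean elapsed time in seconds.
    static double averageSeconds(Runnable task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            task.run(); // untimed, so class loading and JIT compilation are less likely to skew the result
        }
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime(); // intended for measuring elapsed time
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / (double) runs / 1_000_000_000.0;
    }

    public static void main(String[] args) {
        // Placeholder workload; swap in the code you want to measure.
        double avg = averageSeconds(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1_000; i++) {
                sb.append(i);
            }
        }, 5, 10);
        System.out.println("Average execution time:" + avg + "[sec]");
    }
}
```

With very short tasks like the ones in the next section, a single cold run mostly measures startup cost, which is one reason a few warm-up runs can change the picture.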
OS: Ubuntu 16.04, CPU: Intel Core i7-2700K CPU @ 5.9GHz (please forgive the old one), JDK: OpenJDK 9
The code below benchmarks a process that prints only the data shorter than 5 characters. I also tried the opposite, printing data of 5 characters or more, but the results were almost the same, so I omitted it.
Procedural.java
public void use_for(){
    List<String> list = Arrays.asList("Java","Ruby","Csharp","Scala","Haskell");
    for(String lang : list){
        if(lang.length() < 5){
            System.out.println(lang);
        }
    }
}
The average over 10 runs is 0.001 [sec]. Pretty fast.
Stream.java
public void use_stream(){
    List<String> list = Arrays.asList("Java","Ruby","Csharp","Scala","Haskell");
    list.stream().filter(lang -> lang.length() < 5).forEach(System.out::println);
}
The average over 10 runs is 0.025 [sec]. It is slower than the procedural version.
It's a bit late for this, but let's take a quick look at the methods added to Stream in Java 9. With takeWhile and dropWhile, you can write code in a style similar to Scala.
takeWhile: A method that processes elements from the start of the stream only while the given condition holds, and stops at the first element that fails it. You only specify the condition, so fewer intermediate operations are needed.
takeWhileExample.java
List<String> list = Arrays.asList("Java","Ruby","Csharp","Scala","Haskell");
list.stream().takeWhile(lang -> lang.length() < 5).forEach(System.out::println);
dropWhile: A method that skips elements from the start of the stream while the given condition holds, and outputs everything from the first element that fails it onward. As with takeWhile, fewer intermediate operations are needed.
dropWhileExample.java
List<String> list = Arrays.asList("Java","Ruby","Csharp","Scala","Haskell");
list.stream().dropWhile(lang -> lang.length() < 5).forEach(System.out::println);
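With the list above, filter and takeWhile happen to print the same thing, because the short names come first. Here is a small sketch of how the three methods differ once the order changes. The class name WhileVsFilter and the three-element list are made up for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class WhileVsFilter {
    // filter checks every element.
    static List<String> withFilter(List<String> list) {
        return list.stream().filter(s -> s.length() < 5).collect(Collectors.toList());
    }

    // takeWhile stops at the first element that fails the condition.
    static List<String> withTakeWhile(List<String> list) {
        return list.stream().takeWhile(s -> s.length() < 5).collect(Collectors.toList());
    }

    // dropWhile skips leading matches, then keeps everything that follows.
    static List<String> withDropWhile(List<String> list) {
        return list.stream().dropWhile(s -> s.length() < 5).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> langs = Arrays.asList("Java", "Csharp", "Ruby");
        System.out.println(withFilter(langs));    // [Java, Ruby]
        System.out.println(withTakeWhile(langs)); // [Java]
        System.out.println(withDropWhile(langs)); // [Csharp, Ruby]
    }
}
```

So takeWhile and dropWhile are not drop-in replacements for filter; they only agree with it when the matching elements are all at the front of the stream.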
ofNullable: Stream.ofNullable returns a Stream containing the value if it is not null, and an empty Stream if it is null. Java 9 also added Optional.stream(), so you can now get a Stream directly from an Optional, as shown below.
optional.java
Optional.ofNullable(null).stream().forEach(System.out::println);
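To make the two null-safe entry points easier to compare, here is a short sketch. The class name NullableStreams and its helper methods are made up for this illustration; Stream.ofNullable and Optional.stream are the Java 9 methods described above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NullableStreams {
    // Stream.ofNullable: one element for a non-null value, empty stream for null.
    static long countOfNullable(String value) {
        return Stream.ofNullable(value).count();
    }

    // Optional.stream: the same idea, starting from an Optional.
    static long countViaOptional(String value) {
        return Optional.ofNullable(value).stream().count();
    }

    // flatMap with Stream.ofNullable removes null elements from a list.
    static List<String> dropNulls(List<String> values) {
        return values.stream().flatMap(Stream::ofNullable).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(countOfNullable("Java"));  // 1
        System.out.println(countOfNullable(null));    // 0
        System.out.println(countViaOptional(null));   // 0
        System.out.println(dropNulls(Arrays.asList("Java", null, "Scala"))); // [Java, Scala]
    }
}
```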
I tried each of these and benchmarked them, so I've summarized the results in tables. The main point of ofNullable seems to be handling null safely, and this article is about the performance of processing data, so I left it out of the measurements.
Method | Average run time (10 runs) |
---|---|
Procedural | 0.001[sec] |
Stream | 0.025[sec] |
parallelStream | 0.026[sec] |
takeWhile | 0.026[sec] |
takeWhile (with parallelStream) | 0.032[sec] |
There is no big difference compared to the methods available since Java 8. Intermediate operations seem to be the slow part.
Method | Average run time (10 runs) |
---|---|
Procedural | 0.001[sec] |
Stream | 0.023[sec] |
parallelStream | 0.031[sec] |
dropWhile | 0.024[sec] |
dropWhile (with parallelStream) | 0.028[sec] |
Same as above.
Roughly speaking, the procedural version is compiled almost exactly as written, so even with simple data the Stream version, which goes through intermediate operations, seems to be slower.
I wanted to pin down why it is slower, so I stepped through the processing inside the Stream methods using IntelliJ. The work is split into many small method calls, and mechanisms such as lazy evaluation appear to add overhead.
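The lazy evaluation mentioned above can be observed without a debugger. The following is a minimal sketch, with the class name LazyDemo and the call counter being my own additions: the filter condition is not evaluated when the pipeline is built, only when a terminal operation runs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyDemo {
    // Returns {condition calls before the terminal operation, calls after it}.
    static int[] predicateCalls(List<String> list) {
        AtomicInteger calls = new AtomicInteger();
        Stream<String> pipeline = list.stream().filter(s -> {
            calls.incrementAndGet(); // count each evaluation of the condition
            return s.length() < 5;
        });
        int before = calls.get();              // nothing has run yet
        pipeline.collect(Collectors.toList()); // terminal operation triggers the work
        int after = calls.get();
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("Java", "Ruby", "Csharp", "Scala", "Haskell");
        int[] result = predicateCalls(list);
        System.out.println("before: " + result[0] + ", after: " + result[1]); // before: 0, after: 5
    }
}
```

This shows when the work happens, not how much the extra layers cost; it is not a measurement of the overhead itself.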
When writing code, you can simply choose Stream when it is easier to read; even if it is slower, the difference is not that large. In terms of performance, though, Stream is supposed to pay off when there are many elements to handle, so I tried that as well.
As test data, I create 1 million random strings, each 20 characters long and made up of uppercase letters and digits.
BigData.java
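// Assumes static imports of IntStream.range, Collectors.joining and Collectors.toList
Random r = new Random(2111);
List<String> data = range(0, 1_000_000)
        .mapToObj(i ->
                r.ints().limit(20)
                        .map(n -> Math.abs(n) % 36)
                        .map(code -> (code < 10) ? '0' + code : 'A' + code - 10)
                        .mapToObj(ch -> String.valueOf((char) ch))
                        // Join the 20 characters; toString() on a Stream would not concatenate them
                        .collect(joining()))
        .collect(toList());
From each element, only the digits are extracted; if the sum of those digits is less than 30, the digits are parsed as a number and added to the total.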
Procedural.java
public static long use_for(List<String> data){
    long result = 0;
    for(String d : data){
        String numOnly = d.replaceAll("[^0-9]", "");
        if(numOnly.isEmpty()) continue;
        int total = 0;
        for(char ch : numOnly.toCharArray()){
            total += ch - '0';
        }
        if(total >= 30) continue;
        long value = Long.parseLong(numOnly);
        result += value;
    }
    return result;
}
Stream.java
public static long streamSum(List<String> data){
    return data.stream()
            .map(d -> d.replaceAll("[^0-9]", ""))
            .filter(d -> !d.isEmpty())
            .filter(d -> d.chars().map(ch -> ch - '0').sum() < 30)
            .mapToLong(d -> Long.parseLong(d)).sum();
}
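The code for the parallelStream rows in the results is not shown in this article. A minimal sketch of what such a version might look like is below, assuming it is the same pipeline with stream() swapped for parallelStream(). The class name ParallelSum, the method parallelSum, and the four sample strings are my own for this illustration.

```java
import java.util.Arrays;
import java.util.List;

class ParallelSum {
    // Sequential version, same pipeline as streamSum above.
    static long streamSum(List<String> data) {
        return data.stream()
                .map(d -> d.replaceAll("[^0-9]", ""))
                .filter(d -> !d.isEmpty())
                .filter(d -> d.chars().map(ch -> ch - '0').sum() < 30)
                .mapToLong(Long::parseLong).sum();
    }

    // Assumed parallel version: only the stream source changes.
    static long parallelSum(List<String> data) {
        return data.parallelStream()
                .map(d -> d.replaceAll("[^0-9]", ""))
                .filter(d -> !d.isEmpty())
                .filter(d -> d.chars().map(ch -> ch - '0').sum() < 30)
                .mapToLong(Long::parseLong).sum();
    }

    public static void main(String[] args) {
        // "A1B2" -> 12, "XYZ" -> skipped (no digits), "99999999" -> skipped (digit sum 72), "12" -> 12
        List<String> sample = Arrays.asList("A1B2", "XYZ", "99999999", "12");
        System.out.println(streamSum(sample));   // 24
        System.out.println(parallelSum(sample)); // 24
    }
}
```

Because map, filter and sum do not depend on the order of elements or on shared state here, the parallel version returns the same total as the sequential one.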
I'm also curious how takeWhile and dropWhile compare with plain stream and parallelStream, so I'll try them too.
takeWhileSample.java
public static long takeWhileSum(List<String> data){
    return data.stream()
            .map(d -> d.replaceAll("[^0-9]", "")) // Remove non-digits
            .takeWhile(d -> !d.isEmpty()) // Stops at the first element with no digits
            .takeWhile(d -> d.chars().map(ch -> ch - '0').sum() < 30) // Stops at the first digit sum of 30 or more
            .mapToLong(d -> Long.parseLong(d)).sum();
}
dropWhileSample.java
public static long dropWhileSum(List<String> data){
    return data.stream()
            .map(d -> d.replaceAll("[^0-9]", "")) // Remove non-digits
            .dropWhile(d -> d.isEmpty()) // Skips only the leading elements with no digits
            .dropWhile(d -> d.chars().map(ch -> ch - '0').sum() > 30) // Skips only the leading elements whose digit sum exceeds 30
            .mapToLong(d -> Long.parseLong(d)).sum();
}
Method | Average run time (10 runs) |
---|---|
Procedural | 2.132[sec] |
parallelStream | 1.321[sec] |
Stream | 2.107[sec] |
takeWhile | 0.457[sec] |
takeWhile (with parallelStream) | 1.325[sec] |
dropWhile | 2.175[sec] |
dropWhile (with parallelStream) | 1.377[sec] |
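I feel like I finally saw its true value. takeWhile is by far the fastest. dropWhile was about the same as the procedural version, but calling it from parallelStream made it a lot better. One caveat: takeWhile stops at the first element that fails its condition and dropWhile only skips leading elements, so unlike filter they do not process the same elements as the procedural version, and these times are not a like-for-like comparison.
That's all. It was a good opportunity to try something I was interested in.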