With all due respect, you may not really know Java, Part 6: Is Stream performance poor? Don't just repeat what others say

Posted by amites on Wed, 24 Jun 2020 02:51:30 +0200

1. Reader feedback

Q: A Stream is five times slower than a for loop. What is the reason for this?
A: The Internet is flooded with information, and a false claim repeated often enough starts to sound true. As a developer, you have to verify things yourself instead of echoing what everyone else says.

I did read the article this reader mentioned. I won't link to it, because it doesn't deserve the traffic. How should I put it? It is a performance test by a second-rate engineer who does not know how to test, capped with a sensational conclusion.

2. All performance test results are one-sided

Performance testing is necessary, but the results of performance testing should always be questioned. Why do I say that?

  • Performance testing divorced from the business scenario is one-sided. Can you cover all business scenarios?
  • Performance testing divorced from the hardware environment is one-sided. Can you cover all hardware environments?
  • Performance testing divorced from what developers actually write is one-sided. Can you cover every strange piece of code a developer might produce?

So I never trust performance test articles on the web. For every business scenario I work on, I run the tests myself on a machine close to the production environment. All performance test results are one-sided, and only the results from your production environment are real.

3. Manual testing of Stream performance

3.1. Environment

Windows 10 (64-bit), 16 GB memory, i7-7700HQ 2.8 GHz, JDK 1.8.0_171

3.2. Test cases and test conclusions

In the previous section, we talked about:

  • Streams perform differently on different data structures
  • Streams also perform differently on different data sources

So keep the author's words in mind: all performance test results are one-sided. You have to run the tests yourself and trust your own code and the results from your own environment! My test results represent only my own test cases and test data structures!
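For readers who want a quick first check in their own environment before setting up a proper benchmark, here is a minimal hand-rolled timing sketch. It is not the author's test code: the class name QuickBench, the helper bestNanos, the array size and the warm-up counts are all assumptions chosen for illustration, and a real benchmark tool will give more trustworthy numbers.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.IntSupplier;

// Hypothetical helper class, not part of the article's test code.
public class QuickBench {

    // Written on every run so the JIT cannot discard the measured work.
    static volatile int sink;

    // Runs the task `warmups` times untimed, then returns the fastest
    // of `runs` timed executions, in nanoseconds.
    static long bestNanos(IntSupplier task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            sink = task.getAsInt();
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            sink = task.getAsInt();
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    static int minFor(int[] arr) {
        int min = Integer.MAX_VALUE;
        for (int v : arr) {
            if (v < min) {
                min = v;
            }
        }
        return min;
    }

    static int minStream(int[] arr) {
        return Arrays.stream(arr).min().getAsInt();
    }

    static int minParallelStream(int[] arr) {
        return Arrays.stream(arr).parallel().min().getAsInt();
    }

    public static void main(String[] args) {
        // Assumed size: small enough to run with default heap settings.
        int[] data = new Random(42).ints(1_000_000).toArray();
        System.out.println("for loop:        " + bestNanos(() -> minFor(data), 20, 20) + " ns");
        System.out.println("serial stream:   " + bestNanos(() -> minStream(data), 20, 20) + " ns");
        System.out.println("parallel stream: " + bestNanos(() -> minParallelStream(data), 20, 20) + " ns");
    }
}
```

The printed numbers depend entirely on the machine, the JVM and the data size, which is exactly the point of running it yourself.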

3.2.1. Test Case 1

Test case: find the minimum of 500 million random int values. Test conclusions (see below for test code):

  • The ordinary for loop runs about twice as fast as the serial Stream, so the ordinary for loop performs better here.
  • The parallel Stream runs 4-5 times faster than the ordinary for loop.
  • Parallel Stream > ordinary for loop > serial Stream

3.2.2. Test Case 2

Test case: find the minimum of 10,000,000 random strings. Test conclusions (see below for test code):

  • The ordinary for loop performs about the same as the serial Stream.
  • The parallel Stream runs much faster than the ordinary for loop.
  • Parallel Stream > ordinary for loop = serial Stream

3.2.3. Test Case 3

Test case: 10 users with 200 orders each; compute the total order price per user. Test conclusions (see below for test code):

  • The parallel Stream runs much faster than the ordinary for loop.
  • The serial Stream is as fast as or faster than the ordinary for loop.
  • Parallel Stream > serial Stream >= ordinary for loop
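Before comparing the speed of two implementations, it helps to confirm that they compute the same thing. The sketch below sums order prices per user with a loop and with a Stream and checks that both maps agree. SimpleOrder is a hypothetical stand-in for the article's Order class, and the loop uses Map.merge rather than the get/put pattern in the test code; the sample data is made up.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration class, not part of the article's test code.
public class OrderTotals {

    // Stand-in for the article's Order class, reduced to two fields.
    static final class SimpleOrder {
        final String userName;
        final double price;

        SimpleOrder(String userName, double price) {
            this.userName = userName;
            this.price = price;
        }

        String getUserName() {
            return userName;
        }

        double getPrice() {
            return price;
        }
    }

    // Loop version: merge() inserts the price or adds it to the existing total.
    static Map<String, Double> totalsWithLoop(List<SimpleOrder> orders) {
        Map<String, Double> totals = new HashMap<>();
        for (SimpleOrder o : orders) {
            totals.merge(o.getUserName(), o.getPrice(), Double::sum);
        }
        return totals;
    }

    // Stream version: group by user name and sum the prices in each group.
    static Map<String, Double> totalsWithStream(List<SimpleOrder> orders) {
        return orders.stream().collect(
                Collectors.groupingBy(SimpleOrder::getUserName,
                        Collectors.summingDouble(SimpleOrder::getPrice)));
    }

    public static void main(String[] args) {
        List<SimpleOrder> orders = Arrays.asList(
                new SimpleOrder("alice", 10.0),
                new SimpleOrder("bob", 5.0),
                new SimpleOrder("alice", 20.0));
        System.out.println("loop:   " + totalsWithLoop(orders));
        System.out.println("stream: " + totalsWithStream(orders));
        System.out.println("same result: " + totalsWithLoop(orders).equals(totalsWithStream(orders)));
    }
}
```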

4. Final test conclusion

  • For simple traversal of int data (list-Int), ordinary for loops are indeed faster than serial Streams (1.5-2.5 times). However, Streams can run in parallel and take advantage of multiple CPU cores, so parallel Streams are faster than for loops.
  • For traversal of object data (list-Object), ordinary for loops have no advantage over serial Streams, let alone parallel Streams.
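One detail worth keeping in mind when testing int data is that the data source determines which kind of Stream you get. The sketch below (class and method names are hypothetical) finds the same minimum three ways: over an int[] as an IntStream, over a List<Integer> as a Stream<Integer> of boxed values, and over the same list unboxed with mapToInt. All three return the same answer; how much they differ in cost is something to measure in your own environment.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class, not part of the article's test code.
public class IntTraversal {

    // int[] source: Arrays.stream yields an IntStream of primitive values.
    static int minOfArray(int[] values) {
        return Arrays.stream(values).min().getAsInt();
    }

    // List<Integer> source: a Stream<Integer> compares boxed values.
    static int minOfBoxedList(List<Integer> values) {
        return values.stream().min(Integer::compare).get();
    }

    // Same list, converted to an IntStream before taking the minimum.
    static int minOfUnboxedList(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).min().getAsInt();
    }

    public static void main(String[] args) {
        int[] array = {7, -2, 15, 3};
        List<Integer> list = Arrays.asList(7, -2, 15, 3);
        System.out.println(minOfArray(array));
        System.out.println(minOfBoxedList(list));
        System.out.println(minOfUnboxedList(list));
    }
}
```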

In different scenarios, with different data structures and different hardware environments, the performance results for Streams and for loops differ considerably and can even be reversed. But overall:

  • Parallel Stream >> ordinary for loop ~= serial Stream (why two greater-than signs? Work that out for yourself)
  • The larger the data volume, the more efficient Streams become.
  • Parallel Streams can usually take advantage of multiple CPU cores. The more CPU cores there are, the more efficient parallel Streams become.
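To see how many cores a parallel Stream can draw on in your environment, a small check like the one below may help. It assumes the standard JDK behavior, where parallel Streams run on the common ForkJoinPool by default; the class and method names are hypothetical, and the values printed depend on your machine.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

// Hypothetical illustration class, not part of the article's test code.
public class ParallelismCheck {

    // Number of processors the JVM reports as available.
    static int cores() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Target parallelism of the common pool (assumed to back parallel Streams).
    static int commonPoolParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    // Sum of 1..n computed sequentially.
    static long sumSequential(int n) {
        return IntStream.rangeClosed(1, n).asLongStream().sum();
    }

    // Same sum computed with a parallel Stream; the result must not change.
    static long sumParallel(int n) {
        return IntStream.rangeClosed(1, n).parallel().asLongStream().sum();
    }

    public static void main(String[] args) {
        System.out.println("available processors:    " + cores());
        System.out.println("common pool parallelism: " + commonPoolParallelism());
        System.out.println("sequential sum: " + sumSequential(1_000_000));
        System.out.println("parallel sum:   " + sumParallel(1_000_000));
    }
}
```

On a machine that reports a single processor, the parallel version has nothing extra to work with, which is the single-core case discussed in this section.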

Is a Stream five times slower than a for loop? Perhaps, on a single-core CPU, with a serial Stream, traversing int data. I haven't tried that scenario, but I know it is not the core scenario of any application system. Having read more than a dozen test posts and run my own tests, my conclusion is that Streams perform better than for loops in most core business scenarios and on common data structures. After all, our business code usually deals with real entity objects. Who traverses a List<Integer> all day? Whose production server has a single core?

5. Test Code

<dependency>
    <groupId>com.github.houbb</groupId>
    <artifactId>junitperf</artifactId>
    <version>2.0.0</version>
</dependency>

Test Case 1:

import com.github.houbb.junitperf.core.annotation.JunitPerfConfig;
import com.github.houbb.junitperf.core.report.impl.HtmlReporter;
import org.junit.jupiter.api.BeforeAll;

import java.util.Arrays;
import java.util.Random;

public class StreamIntTest {

    public static int[] arr;

    @BeforeAll
    public static void init() {
        arr = new int[500000000];  // 500 million random ints
        randomInt(arr);
    }

    @JunitPerfConfig( warmUp = 1000, reporter = {HtmlReporter.class})
    public void testIntFor() {
        minIntFor(arr);
    }

    @JunitPerfConfig( warmUp = 1000, reporter = {HtmlReporter.class})
    public void testIntParallelStream() {
        minIntParallelStream(arr);
    }

    @JunitPerfConfig( warmUp = 1000, reporter = {HtmlReporter.class})
    public void testIntStream() {
        minIntStream(arr);
    }

    private int minIntStream(int[] arr) {
        return Arrays.stream(arr).min().getAsInt();
    }

    private int minIntParallelStream(int[] arr) {
        return Arrays.stream(arr).parallel().min().getAsInt();
    }

    private int minIntFor(int[] arr) {
        int min = Integer.MAX_VALUE;
        for (int anArr : arr) {
            if (anArr < min) {
                min = anArr;
            }
        }
        return min;
    }

    private static void randomInt(int[] arr) {
        Random r = new Random();
        for (int i = 0; i < arr.length; i++) {
            arr[i] = r.nextInt();
        }
    }
}

Test Case 2:

import com.github.houbb.junitperf.core.annotation.JunitPerfConfig;
import com.github.houbb.junitperf.core.report.impl.HtmlReporter;
import org.junit.jupiter.api.BeforeAll;

import java.util.ArrayList;
import java.util.Random;

public class StreamStringTest {

    public static ArrayList<String> list;

    @BeforeAll
    public static void init() {
        list = randomStringList(1000000);
    }

    @JunitPerfConfig(duration = 10000, warmUp = 1000, reporter = {HtmlReporter.class})
    public void testMinStringForLoop(){
        String minStr = null;
        boolean first = true;
        for(String str : list){
            if(first){
                first = false;
                minStr = str;
            }
            if(minStr.compareTo(str)>0){
                minStr = str;
            }
        }
    }

    @JunitPerfConfig(duration = 10000, warmUp = 1000, reporter = {HtmlReporter.class})
    public void testMinStringStream(){
        list.stream().min(String::compareTo).get();
    }

    @JunitPerfConfig(duration = 10000, warmUp = 1000, reporter = {HtmlReporter.class})
    public void testMinStringParallelStream(){
        list.stream().parallel().min(String::compareTo).get();
    }

    private static ArrayList<String> randomStringList(int listLength){
        ArrayList<String> list = new ArrayList<>(listLength);
        Random rand = new Random();
        int strLength = 10;
        StringBuilder buf = new StringBuilder(strLength);
        for(int i=0; i<listLength; i++){
            buf.delete(0, buf.length());
            for(int j=0; j<strLength; j++){
                buf.append((char)('a'+ rand.nextInt(26)));
            }
            list.add(buf.toString());
        }
        return list;
    }
}

Test Case 3:

import com.github.houbb.junitperf.core.annotation.JunitPerfConfig;
import com.github.houbb.junitperf.core.report.impl.HtmlReporter;
import org.junit.jupiter.api.BeforeAll;

import java.util.*;
import java.util.stream.Collectors;

public class StreamObjectTest {

    public static List<Order> orders;

    @BeforeAll
    public static void init() {
        orders = Order.genOrders(10);
    }

    @JunitPerfConfig(duration = 10000, warmUp = 1000, reporter = {HtmlReporter.class})
    public void testSumOrderForLoop(){
        Map<String, Double> map = new HashMap<>();
        for(Order od : orders){
            String userName = od.getUserName();
            Double v; 
            if((v=map.get(userName)) != null){
                map.put(userName, v+od.getPrice());
            }else{
                map.put(userName, od.getPrice());
            }
        }

    }

    @JunitPerfConfig(duration = 10000, warmUp = 1000, reporter = {HtmlReporter.class})
    public void testSumOrderStream(){
        orders.stream().collect(
                Collectors.groupingBy(Order::getUserName, 
                        Collectors.summingDouble(Order::getPrice)));
    }

    @JunitPerfConfig(duration = 10000, warmUp = 1000, reporter = {HtmlReporter.class})
    public void testSumOrderParallelStream(){
        orders.parallelStream().collect(
                Collectors.groupingBy(Order::getUserName, 
                        Collectors.summingDouble(Order::getPrice)));
    }
}


class Order{
    private String userName;
    private double price;
    private long timestamp;
    public Order(String userName, double price, long timestamp) {
        this.userName = userName;
        this.price = price;
        this.timestamp = timestamp;
    }
    public String getUserName() {
        return userName;
    }
    public double getPrice() {
        return price;
    }
    public long getTimestamp() {
        return timestamp;
    }

    public static List<Order> genOrders(int listLength){
        ArrayList<Order> list = new ArrayList<>(listLength);
        Random rand = new Random();
        int users = listLength/200;// 200 orders per user
        users = users==0 ? listLength : users;
        ArrayList<String> userNames = new ArrayList<>(users);
        for(int i=0; i<users; i++){
            userNames.add(UUID.randomUUID().toString());
        }
        for(int i=0; i<listLength; i++){
            double price = rand.nextInt(1000);
            String userName = userNames.get(rand.nextInt(users));
            list.add(new Order(userName, price, System.nanoTime()));
        }
        return list;
    }
    @Override
    public String toString(){
        return userName + "::" + price;
    }
}

Reprints must credit the source with a link, not text only: Alphabetic Blog.
