Java 8 Stream from beginner to advanced: working with collections like SQL

Posted by cyanblue on Tue, 08 Mar 2022 05:03:41 +0100


0. After reading this article, you will

  • Understand the definition and characteristics of Stream
  • Understand the basic and advanced usage of Stream

1. Preface

In everyday Java development we constantly work with collections. The operations we perform on them resemble SQL operations (adding, deleting, modifying, querying and aggregating), but they are far less convenient than SQL.

So is there a way to operate on a collection easily, without writing loop after loop?

Yes: the Stream API introduced in Java 8.

2. Definition of a stream

A stream is a sequence of elements from a source.

In short, a stream wraps a data source and lets us aggregate and batch-process its data quickly and conveniently.

In daily life we see water flowing through pipes. A Java stream likewise travels through a "pipeline", and its elements can be processed at each node of that pipeline, for example by filtering or sorting.

+--------------------+       +------+   +------+   +---+   +-------+
| stream of elements +-----> |filter+-> |sorted+-> |map+-> |collect|
+--------------------+       +------+   +------+   +---+   +-------+

The stream of elements is processed by intermediate operations in the pipeline. Finally, a terminal operation produces the result of all the preceding processing (each stream can have only one terminal operation).
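The pipeline in the diagram above can be sketched in code roughly as follows. The class name, method name and sample data are hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PipelineSketch {
    // filter -> sorted -> map -> collect, mirroring the diagram
    static List<String> shortNamesUpperCased(List<String> names) {
        return names.stream()
                .filter(name -> name.length() <= 4) // intermediate operation
                .sorted()                           // intermediate operation
                .map(String::toUpperCase)           // intermediate operation
                .collect(Collectors.toList());      // terminal operation
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Tom", "Jack", "Michael", "Ann");
        System.out.println(shortNamesUpperCased(names)); // [ANN, JACK, TOM]
    }
}
```

Nothing is computed until collect() is reached; the three intermediate calls only describe the pipeline.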

Intermediate operations are either stateless or stateful. A stateless operation processes each element independently of the elements that came before it; a stateful operation may need information about other elements, and some stateful operations (sorted(), for example) cannot emit anything until they have seen every element.

Terminal operations are either short-circuiting or non-short-circuiting. A short-circuiting operation can return its result as soon as it meets a qualifying element, while a non-short-circuiting operation must process every element to obtain the final result.
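The difference can be observed by tracing a sequential stream with peek(). This is a minimal sketch with hypothetical class and method names; it writes to a local log list purely for tracing, which is acceptable here only because the stream is sequential.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class OperationKindsSketch {
    // sorted() is stateful: it has to see every element before emitting the first one.
    static List<String> traceSorted() {
        List<String> log = new ArrayList<>();
        Stream.of("c", "a", "b")
                .peek(e -> log.add("peek " + e))
                .sorted()
                .forEach(e -> log.add("forEach " + e));
        return log;
    }

    // anyMatch() is short-circuiting: it stops pulling elements after the first match.
    static List<String> traceAnyMatch() {
        List<String> log = new ArrayList<>();
        Stream.of("a", "b", "c")
                .peek(e -> log.add("peek " + e))
                .anyMatch(e -> e.equals("b"));
        return log;
    }

    public static void main(String[] args) {
        // [peek c, peek a, peek b, forEach a, forEach b, forEach c]
        System.out.println(traceSorted());
        // [peek a, peek b]
        System.out.println(traceAnyMatch());
    }
}
```

In the first trace all three peek entries come before any forEach entry, because sorted() buffers the whole stream. In the second trace the element "c" is never visited.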

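The common intermediate and terminal operations fall into these categories:

  • Stateless intermediate operations: filter(), map(), flatMap(), peek()
  • Stateful intermediate operations: distinct(), sorted(), limit(), skip()
  • Short-circuiting terminal operations: anyMatch(), allMatch(), noneMatch(), findFirst(), findAny()
  • Non-short-circuiting terminal operations: forEach(), reduce(), collect(), count(), min(), max(), toArray()

How can we quickly tell an intermediate operation from a terminal one? Look at the method's return type. As a rule, a method that returns a Stream is an intermediate operation; otherwise it is a terminal operation.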

Let's look at the characteristics of streams:

  1. A stream does not store data, so it is not a data structure, and it does not modify the underlying data source. It was designed for functional programming.
  2. Execution is lazy. Intermediate operations such as filter() and map() are always deferred: they run only when a terminal operation needs their results.

Let's take an example to illustrate the second point:

List<String> list = Arrays.asList("a", "b", "c");
Stream<String> stream = list.stream().filter(element -> {
    System.out.println("filter() was invoked");
    return element.contains("b");
});

The stream has three elements, so we might expect filter() to be called three times and "filter() was invoked" to be printed three times.

In fact nothing is printed at all, which means filter() was never called. This is because the code has no terminal operation.

Let's change the code to add a map() call and a terminal operation.

List<String> list = Arrays.asList("a", "b", "c");
Optional<String> stream = list.stream().filter(element -> {
      System.out.println("filter() was invoked");
      return element.contains("b");
}).map(ele -> {
      System.out.println("map() was invoked");
      return ele.toUpperCase();
}).findFirst();

The output is as follows:

filter() was invoked
filter() was invoked
map() was invoked

The output shows that filter() was called twice and map() once.

In the first program filter() was never called; in the second it is called as soon as the terminal operation runs.

Because findFirst() only needs the first matching element, processing stops after "b", and filter() is never called for "c". This is exactly the lazy-invocation mechanism at work.

  3. A stream can be infinite. A collection has a finite size, but a stream does not need one. Short-circuiting operations such as limit(n) or findFirst() allow a computation on an infinite stream to complete in finite time.
  4. A stream is consumable. During the life of a stream its elements are visited only once. As with an iterator, a new stream must be generated to revisit the same elements of the source; a stream that has already been traversed is closed.

Let's look at an example of point 4.

IntStream intStream = IntStream.of(1, 2, 3);
OptionalInt anyElement = intStream.findAny();
OptionalInt firstElement = intStream.findFirst();

Executing this code throws a java.lang.IllegalStateException:

Exception in thread "main" java.lang.IllegalStateException: stream has already been operated upon or closed

Because intStream has already been through the terminal operation findAny(), it is closed, and a second terminal operation raises an error.

This design is logical and consistent with the nature of streams, because a stream does not store elements.

By changing the code as follows, so that each terminal operation gets its own stream, we can perform multiple terminal operations.

int[] intArray = {1, 2, 3};
OptionalInt anyElement = Arrays.stream(intArray).findAny();
OptionalInt firstElement = Arrays.stream(intArray).findFirst();

These characteristics distinguish a Stream from a Collection.

Note that "Stream" here is not the same thing as a Java I/O stream. The two have little to do with each other.
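The third characteristic, infinite streams made finite by short-circuiting operations, can be sketched like this. The class and method names are hypothetical, and the powers-of-two sequence is just an illustrative choice.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class InfiniteStreamSketch {
    // 1, 2, 4, 8, ... with no end; each call returns a fresh stream,
    // because a stream cannot be reused once it has been consumed.
    static Stream<Integer> powersOfTwo() {
        return Stream.iterate(1, n -> n * 2);
    }

    // limit() truncates the infinite stream.
    static List<Integer> firstFive() {
        return powersOfTwo().limit(5).collect(Collectors.toList());
    }

    // findFirst() stops as soon as one element passes the filter.
    static Optional<Integer> firstAbove(int threshold) {
        return powersOfTwo().filter(n -> n > threshold).findFirst();
    }

    public static void main(String[] args) {
        System.out.println(firstFive());     // [1, 2, 4, 8, 16]
        System.out.println(firstAbove(100)); // Optional[128]
    }
}
```

Without limit() or a short-circuiting terminal operation, a pipeline over powersOfTwo() would never finish.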

3. Creating a stream

There are many ways to create a Java stream. A stream never modifies its data source, so we can create multiple streams from one data source.

3.1 Creating an empty stream

We can use the empty() method to create an empty stream:

Stream<String> emptyStream = Stream.empty();

We can also use the empty() method to return an empty stream to avoid returning null:

public Stream<String> streamOf(List<String> list) {
    return list == null || list.isEmpty() ? Stream.empty() : list.stream();
}

3.2 Creating streams from arrays

We can create a stream from all or part of an array:

String[] arr = new String[]{"1", "2", "3", "4", "5"};
Stream<String> entireArrayStream = Arrays.stream(arr);
Stream<String> partArrayStream = Arrays.stream(arr, 1, 4); // "2", "3", "4"

3.3 Creating streams from collections

We can also create streams from collections:

Collection<String> collection = Arrays.asList("1", "2", "3");
Stream<String> collectionStream = collection.stream();

3.4 Creating a stream using Stream.builder()

When creating a stream this way, be sure to declare the element type you want; otherwise you will get a Stream<Object>:

Stream<String> streamBuilder = Stream.<String>builder().add("a").add("b").add("c").build(); // element values are illustrative

3.5 Creating a stream from a file

We can use the Files.lines() method to create a stream in which each line of the file becomes one element.

Path path = Paths.get("C:\\tmp\\file.txt");
Stream<String> fileStream = Files.lines(path);
Stream<String> fileStreamWithCharset = Files.lines(path, Charset.forName("UTF-8"));
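A stream returned by Files.lines() keeps the file open until the stream is closed, so in practice it is usually wrapped in try-with-resources. The following is a minimal sketch; the class name, the helper method and the temporary file are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

class FileLinesSketch {
    // Counts the lines that contain something other than whitespace.
    static long countNonBlankLines(Path path) throws IOException {
        // try-with-resources closes the stream, and with it the file handle
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines.filter(line -> !line.trim().isEmpty()).count();
        }
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for a real input file
        Path tmp = Files.createTempFile("stream-demo", ".txt");
        Files.write(tmp, Arrays.asList("first", "", "second"), StandardCharsets.UTF_8);
        System.out.println(countNonBlankLines(tmp)); // 2
        Files.delete(tmp);
    }
}
```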

3.6 Stream.iterate()

We can also use iterate() to create a stream:

Stream<Integer> iteratedStream = Stream.iterate(10, n -> n + 1).limit(10);

The code above creates a stream of consecutive integers.

The first element is 10, the second is 11, and so on, until limit(10) cuts the stream off after ten elements (the last one being 19).

3.7 Stream.generate()

The generate() method accepts a Supplier<T> that produces the elements.

Because the resulting stream is infinite, we need to limit its size.

The following code creates a stream containing five "ele" strings.

Stream<String> generatedStream =
  Stream.generate(() -> "ele").limit(5);

3.8 Streams of primitive types

1. range() and rangeClosed()

In Java 8, three primitive types (int, long and double) have their own stream types.

Because Stream<T> is a generic interface and a primitive type cannot be used as a type parameter, we use IntStream, LongStream and DoubleStream to create streams of primitives.

IntStream intStream = IntStream.range(1, 3);//1,2
LongStream longStream = LongStream.rangeClosed(1, 3);//1,2,3

The range(int start, int end) method creates an ordered stream that runs from start up to, but not including, end, in steps of 1.

rangeClosed(int start, int end) differs from range() only in that it includes end.

2. The of() method

Primitive streams can also be created with the of() method.

int[] intArray = {1,2,3};
IntStream intStream = IntStream.of(intArray);//1,2,3
IntStream intStream2 = IntStream.of(1, 2, 3);//1,2,3

long[] longArray = {1L, 2L, 3L};
LongStream longStream = LongStream.of(longArray);//1,2,3
LongStream longStream2 = LongStream.of(1L, 2L, 3L);//1,2,3

double[] doubleArray = {1.0, 2.0, 3.0};
DoubleStream doubleStream = DoubleStream.of(doubleArray);
DoubleStream doubleStream2 = DoubleStream.of(1.0, 2.0, 3.0);//1.0,2.0,3.0

3. The Random class

In addition, starting with Java 8 the Random class provides methods that generate streams of primitive values. For example:

Random random = new Random();
IntStream intStream = random.ints(3);
LongStream longStream = random.longs(3);
DoubleStream doubleStream = random.doubles(3);

3.9 Streams from strings

1. Stream of characters

Because Java has no CharStream, an IntStream is used to represent a stream of characters.

IntStream charStream = "abc".chars();
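Since chars() yields int values rather than characters, a conversion step is usually needed before the result is useful. A small sketch, with hypothetical class and method names:

```java
import java.util.List;
import java.util.stream.Collectors;

class CharsSketch {
    // Map each int code unit back to a Character and collect the result.
    static List<Character> toCharacters(String text) {
        return text.chars()
                .mapToObj(c -> (char) c)
                .collect(Collectors.toList());
    }

    // The int values can also be processed directly, without boxing.
    static long countVowels(String text) {
        return text.chars()
                .filter(c -> "aeiou".indexOf(c) >= 0)
                .count();
    }

    public static void main(String[] args) {
        System.out.println(toCharacters("abc"));   // [a, b, c]
        System.out.println(countVowels("stream")); // 2
    }
}
```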

2. Stream of strings

We can create a stream of strings by splitting text with a regular expression.

Stream<String> stringStream = Pattern.compile(",").splitAsStream("a,b,c");

4. Using streams

4.1 Basic usage

4.1.1 forEach() method

The forEach() method should be familiar, since collections offer it too (via Iterable). It performs the given action on each element; in other words, it traverses the elements.

Arrays.asList("Try", "It", "Now")
                .stream()
                .forEach(System.out::println);

Output result:

Try
It
Now


1. Method reference

Readers may wonder what System.out::println means. Shouldn't the normal form be the following?

Arrays.asList("Try", "It", "Now")
                .forEach(ele -> System.out.println(ele));

The two forms are equivalent; the former is simply shorthand for the latter. It is called a method reference and can replace lambda expressions of certain specific forms.

If the entire body of a lambda expression is a call to an existing method, the lambda can be replaced with a method reference.

Method references are worth studying in their own right.

As an aside, method references fall into four categories:

  • Reference to a static method
  • Reference to an instance method of a particular object
  • Reference to an instance method of an arbitrary object of a particular class
  • Reference to a constructor

System.out::println is a reference to an instance method of a particular object.
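The four categories can each be illustrated with a one-line example. The concrete methods chosen here (Integer::sum, String::length and so on) are illustrative picks and do not come from the article; the class and field names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

class MethodReferenceSketch {
    // 1. Static method: same as (a, b) -> Integer.sum(a, b)
    static final BinaryOperator<Integer> SUM = Integer::sum;

    // 3. Instance method of an arbitrary object of a class: same as s -> s.length()
    static final Function<String, Integer> LENGTH = String::length;

    // 4. Constructor: same as () -> new ArrayList<String>()
    static final Supplier<List<String>> NEW_LIST = ArrayList::new;

    static List<String> copy(List<String> source) {
        List<String> target = NEW_LIST.get();
        // 2. Instance method of a particular object: same as s -> target.add(s)
        Consumer<String> addToTarget = target::add;
        source.forEach(addToTarget);
        return target;
    }

    public static void main(String[] args) {
        System.out.println(SUM.apply(2, 3));                       // 5
        System.out.println(LENGTH.apply("Try"));                   // 3
        System.out.println(copy(Arrays.asList("Try", "It", "Now"))); // [Try, It, Now]
    }
}
```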

2. Side effects

In the example above we use forEach() to print the results, which is a common and acceptable use of side effects.

Beyond this scenario, however, we should avoid side effects in stream operations.

As I understand it, this means not modifying state outside the function, and not writing to variables or fields outside the lambda expression in intermediate operations.

This matters most in parallel streams, where such writes lead to unpredictable results because the order in which elements are processed is not deterministic.

// Wrong: the lambda mutates a list defined outside the pipeline
List<String> list = new ArrayList<>();
stream.filter(s -> pattern.matcher(s).matches())
      .forEach(s -> list.add(s));
// Right: collect() builds the list without side effects
List<String> list2 =
     stream.filter(s -> pattern.matcher(s).matches())
             .collect(Collectors.toList());
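To see why collect() is the safe choice even in a parallel stream, consider the following sketch. The class and method names are hypothetical. The unsafe variant (adding to a shared ArrayList from forEach()) is deliberately not shown as runnable code, since its behaviour is unpredictable.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SideEffectSketch {
    // collect() lets each worker fill its own container and then merges them,
    // so no shared list is mutated and encounter order is preserved.
    static List<Integer> evenNumbers(int upperExclusive) {
        return IntStream.range(0, upperExclusive)
                .parallel()
                .filter(n -> n % 2 == 0)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(evenNumbers(10)); // [0, 2, 4, 6, 8]
    }
}
```

The result is the same on every run, even though the elements may be processed on several threads.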

4.1.2 filter() method

The filter() method returns a stream containing only the elements that satisfy the given condition.

Arrays.asList("Try", "It", "Now")
                .stream()
                .filter(ele -> ele.length() == 3)
                .forEach(System.out::println);

Output result:

Try
Now


4.1.3 distinct() method

The distinct() method returns a stream with duplicate elements removed.

Arrays.asList("Try", "It", "Now", "Now")
                .stream()
                .distinct()
                .forEach(System.out::println);

4.1.4 sorted() method

There are two ways to sort: by natural order, or with a custom comparator.

Arrays.asList("Try", "It", "Now")
                .stream()
                .sorted((str1, str2) -> str1.length() - str2.length())
                .forEach(System.out::println);

Output result:

It
Try
Now
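For comparison, here is a sketch of both kinds of sorting side by side: natural order, and a comparator built with the Comparator helper methods. The class name, method names and sample words are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class SortedSketch {
    // Natural order: for strings, lexicographic order
    static List<String> naturalOrder(List<String> words) {
        return words.stream().sorted().collect(Collectors.toList());
    }

    // Custom comparator: shortest first, ties broken alphabetically
    static List<String> byLengthThenAlphabet(List<String> words) {
        return words.stream()
                .sorted(Comparator.comparingInt(String::length)
                        .thenComparing(Comparator.<String>naturalOrder()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "apple", "kiwi");
        System.out.println(naturalOrder(words));         // [apple, fig, kiwi, pear]
        System.out.println(byLengthThenAlphabet(words)); // [fig, kiwi, pear, apple]
    }
}
```

Comparator.comparingInt() expresses the same idea as subtracting two lengths, and it composes cleanly with thenComparing() when a tie-breaker is needed.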


4.1.5 map() method

The map() method transforms each element with the given function. The number of elements in the resulting stream does not change, but the element type is whatever the function returns.

Arrays.asList("Try", "It", "Now")
                .stream()
                .map(String::toUpperCase) // for example, upper-case each element
                .forEach(System.out::println);

Output result:

TRY
IT
NOW


4.1.6 flatMap() method

"Flat" means level or even, and flatMap() flattens the elements of a stream. The following example makes this easier to understand:

Stream.of(Arrays.asList("Try", "It"), Arrays.asList("Now"))
                .flatMap(list -> list.stream())
                .forEach(System.out::println);

Output result:

Try
It
Now

In the code above, the original stream has two elements, each of which is a List. flatMap() "flattens" each List into its individual elements, so the result is a stream of three strings.
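Another common use of flatMap() is turning each element into several elements, for instance splitting sentences into words. A minimal sketch with hypothetical names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FlatMapSketch {
    // Each sentence becomes a small stream of words; flatMap() joins those
    // streams into one. With map() the result would be a stream of arrays.
    static List<String> words(List<String> sentences) {
        return sentences.stream()
                .flatMap(sentence -> Arrays.stream(sentence.split(" ")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(words(Arrays.asList("Try It", "Now"))); // [Try, It, Now]
    }
}
```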

4.2 Reduction operations

The previous section covered the basic usage of Stream, but a tool this powerful does not stop there. Let's turn to the main event: reduction operations.

A reduction operation, also known as a fold, combines all the elements into a single result by repeatedly applying a combining operation. Summing the elements, finding the maximum or minimum, counting them, and collecting them all into a collection are all reduction operations.

The Stream library has two general-purpose reduction operations, reduce() and collect(), plus some special-purpose ones that exist to simplify common cases, such as sum(), max(), min() and count().

The special-purpose ones are easy to understand, so we will focus on reduce() and collect().

4.2.1 reduce()

The reduce operation produces a single value from a group of elements. sum(), max(), min() and count(), for example, are all reduce operations.

The reduce() method has three overloads:

Optional<T> reduce(BinaryOperator<T> accumulator)

T reduce(T identity, BinaryOperator<T> accumulator)

<U> U reduce(U identity, BiFunction<U,? super T,U> accumulator, BinaryOperator<U> combiner)

1. identity: the initial value

2. accumulator: the function that folds each element into the running result

3. combiner: the function that merges partial results; it is only invoked in parallel execution

Let's look at an example of each of the three forms:

Optional<Integer> reducedInt = Stream.of(1, 2, 3).reduce((a, b) -> a + b);

reducedInt = 1 + 2 + 3 = 6. The code above has no initial value, only an accumulator, so it simply adds up the elements. The result is wrapped in an Optional because the stream could be empty.

int reduceIntWithTwoParams = Stream.of(1, 2, 3).reduce(10, (a, b) -> a + b);

reduceIntWithTwoParams = 10 + 1 + 2 + 3 = 16

The code above has both an initial value and an accumulator, so the accumulation starts from the initial value and then adds each element in turn.

int reducedIntWithAllParams = Stream.of(1, 2, 3).reduce(10, (a, b) -> a + b, (a, b) -> {
    System.out.println("Combiner was invoked.");
    return a + b;
});

The result is the same as in the previous example (16), and nothing is printed, which shows that the combiner was not called. To make the combiner do its work, we need a parallel stream, for example via parallelStream().

int reducedIntWithAllParams = Arrays.asList(1, 2, 3).parallelStream().reduce(10, (a, b) -> a + b, (a, b) -> {
    System.out.println("Combiner was invoked");
    return a + b;
});

reducedIntWithAllParams = (10 + 1) + ((10 + 2) + (10 + 3)) = 36

Why 36? In a parallel stream the elements are split into chunks, each chunk is accumulated starting from the initial value, and the combiner then merges the partial results. Here each of the three elements ended up in its own chunk, so the initial value 10 was added three times. The exact figure depends on how the stream happens to be split, which is why the identity passed to reduce() should be a true identity for the accumulator (0 for addition).

Collection.stream() and Collection.parallelStream() produce a sequential (ordinary) stream and a parallel stream, respectively.

Parallelism is not the same as concurrency. Concurrency means one processor handling multiple tasks over the same period; parallelism means multiple processors or cores handling multiple tasks at the same instant. Concurrency is simultaneous logically, while parallelism is simultaneous physically. For example, concurrency is one person eating three steamed buns by taking bites from each in turn, while parallelism is three people eating three steamed buns at the same time.

Parallel is also not necessarily faster. Especially when the amount of data is small, a parallel stream may be slower than an ordinary one. Consider parallel streams only with large amounts of data and multiple cores. In parallel processing, the functions passed to reduce() must be associative and free of shared mutable state (and any shared objects they touch must be thread-safe); otherwise the result may differ from what is expected.
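One way to get a result that does not depend on how a parallel stream is split is to reduce with a true identity value and apply the offset afterwards. This is a sketch under that assumption; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class ReduceIdentitySketch {
    // Sequential: the initial value 10 is used exactly once.
    static int sequentialFromTen(List<Integer> numbers) {
        return numbers.stream().reduce(10, Integer::sum);
    }

    // Parallel: 0 is a true identity for addition, so it does not matter
    // how many chunks start from it. The offset 10 is added once at the end.
    static int parallelSumPlusTen(List<Integer> numbers) {
        int sum = numbers.parallelStream().reduce(0, Integer::sum, Integer::sum);
        return 10 + sum;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3);
        System.out.println(sequentialFromTen(numbers));  // 16
        System.out.println(parallelSumPlusTen(numbers)); // 16
    }
}
```

Both methods return 16 for the list 1, 2, 3, regardless of the number of cores.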

4.2.2 collect() method

collect() is the ultimate trump card of Stream. Almost any result you want can be obtained with it.

Learning it is also an excellent way to get started with functional programming in Java.

Let's start with a practical example!

List<Student> students = Arrays.asList(new Student("Jack", 90)
                , new Student("Tom", 85)
                , new Student("Mike", 80));

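1. Conventional reduction operations

Get the average

Double averagingScore = students.stream().collect(Collectors.averagingDouble(Student::getScore));

Get the sum

Double summingScore = students.stream().collect(Collectors.summingDouble(Student::getScore));

Get summary statistics

DoubleSummaryStatistics doubleSummaryStatistics = students.stream().collect(Collectors.summarizingDouble(Student::getScore));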

From doubleSummaryStatistics you can read common statistics such as the maximum, minimum and average.

These Collectors methods save an extra map() call. Of course, you can also call map() first and then aggregate.
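The examples in this section rely on a Student class that the article never shows. The following is a minimal sketch of what it might look like, assuming two fields (a String name and a double score) inferred from the printed output later in this section, together with a runnable version of the statistics example. The wrapper class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

class StudentStatsSketch {
    // Hypothetical Student class; the field types are an assumption
    static final class Student {
        private final String name;
        private final double score;

        Student(String name, double score) {
            this.name = name;
            this.score = score;
        }

        String getName() { return name; }

        double getScore() { return score; }

        @Override
        public String toString() {
            return "Student{name='" + name + "', score=" + score + "}";
        }
    }

    // Count, sum, min, max and average in a single pass
    static DoubleSummaryStatistics scoreStats(List<Student> students) {
        return students.stream().collect(Collectors.summarizingDouble(Student::getScore));
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("Jack", 90), new Student("Tom", 85), new Student("Mike", 80));
        DoubleSummaryStatistics stats = scoreStats(students);
        System.out.println(stats.getMax());     // 90.0
        System.out.println(stats.getMin());     // 80.0
        System.out.println(stats.getAverage()); // 85.0
    }
}
```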

2. Converting a stream to a Collection

The following code extracts the name of each Student and collects the names into a list of strings.

List<String> studentNameList = students.stream().map(Student::getName).collect(Collectors.toList());//[Jack, Tom, Mike]

You can also use Collectors.joining() to concatenate strings. The collector takes care of not appending a separator after the last element.

String studentNameList = students.stream().map(Student::getName).collect(Collectors.joining(",", "[", "]"));//Print out [Jack,Tom,Mike]

3. Converting a stream to a Map

A Map has no stream() method of its own, but producing a Map from a stream is straightforward. Before generating a Map, we must first decide what its keys and values represent.

The result of collect() is a Map in the following three cases:

  • Collectors.toMap(): the user specifies the key and value of the Map;
  • Collectors.groupingBy(): groups the elements;
  • Collectors.partitioningBy(): splits the elements into two partitions.


The following example converts the students list into a Map<Student, Double> whose keys are the students and whose values are their scores.

Map<Student, Double> collect = students.stream()
                .collect(Collectors.toMap(Function.identity(), Student::getScore));
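One caveat worth knowing: the two-argument toMap() throws an IllegalStateException when two elements produce the same key. The three-argument overload takes a merge function that decides which value wins. The sketch below uses plain strings instead of Student objects to stay self-contained; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapSketch {
    // word -> length; when a word occurs twice, keep the first value
    static Map<String, Integer> lengthsByWord(List<String> words) {
        return words.stream()
                .collect(Collectors.toMap(Function.identity(), String::length,
                        (first, second) -> first));
    }

    // Without a merge function, duplicate keys make toMap() throw
    static boolean failsOnDuplicates(List<String> words) {
        try {
            words.stream().collect(Collectors.toMap(Function.identity(), String::length));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Try", "It", "Now", "Now");
        System.out.println(lengthsByWord(words).get("Now")); // 3
        System.out.println(failsOnDuplicates(words));        // true
    }
}
```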


groupingBy() is similar to GROUP BY in SQL. The data is grouped by some attribute, and elements that share the same attribute value are assigned to the same key.

The following example groups the students by score:

Map<Double, List<Student>> nameStudentMap = students.stream().collect(Collectors.groupingBy(Student::getScore));
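groupingBy() also accepts a second, "downstream" collector that is applied to each group, much like an aggregate function next to GROUP BY in SQL. A small sketch using strings grouped by length; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingSketch {
    // Roughly: SELECT length, COUNT(*) ... GROUP BY length
    static Map<Integer, Long> countByLength(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(String::length, Collectors.counting()));
    }

    // Each group is joined into a single string instead of a List
    static Map<Integer, String> joinedByLength(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(String::length, Collectors.joining("/")));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Try", "It", "Now");
        System.out.println(countByLength(words).get(3));  // 2
        System.out.println(joinedByLength(words).get(3)); // Try/Now
    }
}
```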


partitioningBy() divides the elements of the stream into two parts according to a boolean condition. The following example divides the students into those who pass and those who fail.

Map<Boolean, List<Student>> map = students.stream().collect(Collectors.partitioningBy(ele -> ele.getScore() >= 85));

Printed result:

{false=[Student{name='Mike', score=80.0}], true=[Student{name='Jack', score=90.0}, Student{name='Tom', score=85.0}]}

5. Conclusion

Java 8 Stream is a powerful tool, but we must follow its rules when using it; otherwise it may give us some unpleasant surprises.
