Stream operations in Java 8: an introduction

Posted by houssam_ballout on Fri, 04 Mar 2022 16:15:57 +0100

Author: Tangyuan

Personal blog: javalover.cc

preface

I used to address readers as "friends", but that can come across as overly familiar, so I'm switching to a different form of address.

Since you are here to watch, I'll call you "officials" for the time being (inspired by Jin Ping Mei, popularly counted among the four great classic novels).

Hello, officials. I'm tangyuan. Today I bring you an introduction to Stream operations in Java 8. I hope it helps. Thank you.

This article is entirely original. It is a personal summary, so mistakes are inevitable; if you find any, please reply in the comment area or send me a private message. Thank you.

brief introduction

Stream operations, also described as functional-style operations, are a new feature of Java 8.

Stream operations are mainly used to process data (such as collections), just as generics are mostly used with collections (collections really are a key tool; they turn up everywhere).

We will mainly use examples to introduce the basic stream operations. It is recommended to read the earlier article on lambda expressions first; it introduces lambda expressions, functional interfaces, method references, etc., all of which are used below.

Let's look at the catalogue first

catalogue

  1. What is a stream

  2. Boss, put on the chestnuts

  3. Steps of a stream operation

  4. Characteristics of streams

  5. The difference between stream operations and collection operations

text

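1. What is a stream

Stream is an API that processes data declaratively.

What does declarative mean?

You only declare what should be done, not how it is done; the implementation is supplied by the library. This is similar to an abstract method, which declares behavior and leaves the implementation to others (polymorphism).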

2. Boss, put on the chestnuts

Let's take a chestnut (an example) to see what a stream operation looks like, and then introduce the related concepts based on it.

Requirement: filter the cats older than 1 year (1 year for a cat ≈ 5 years for a person), sort them by age, and finally extract their names into a separate list.

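import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class BasicDemo {
    public static void main(String[] args) {
        // The following cat names are real names and are not fictional
        List<Cat> list = Arrays.asList(new Cat(1, "tangyuan"), new Cat(3, "dangdang"), new Cat(2, "milu"));
        // === Old-style code, as written before Java 8 ===
        List<Cat> listTemp = new ArrayList<>();
        // 1. Filter
        for (Cat cat : list) {
            if (cat.getAge() > 1) {
                listTemp.add(cat);
            }
        }
        // 2. Sort (note: List.sort was itself added in Java 8; before that,
        //    the equivalent call was Collections.sort(listTemp, comparator))
        listTemp.sort(new Comparator<Cat>() {
            @Override
            public int compare(Cat o1, Cat o2) {
                // sort ascending by age
                return Integer.compare(o1.getAge(), o2.getAge());
            }
        });
        // 3. Extract names
        List<String> listName = new ArrayList<>();
        for (Cat cat : listTemp) {
            listName.add(cat.getName());
        }
        System.out.println(listName);

        // === New code with Java 8 streams ===
        List<String> listNameNew = list.stream()
                // filter takes a Predicate<T>, a functional interface whose abstract method is boolean test(T t)
                .filter(cat -> cat.getAge() > 1)
                // Cat::getAge is a method reference, a shorthand for the lambda cat -> cat.getAge()
                .sorted(Comparator.comparingInt(Cat::getAge))
                // map takes a Function<T, R>, a functional interface whose abstract method is R apply(T t)
                .map(cat -> cat.getName())
                // Collect the elements of the stream into a List
                .collect(Collectors.toList());
        System.out.println(listNameNew);
    }
}

class Cat {
    int age;
    String name;

    public Cat(int age, String name) {
        this.age = age;
        this.name = name;
    }

    // Getters written out so the example compiles (setters are still omitted)
    public int getAge() { return age; }
    public String getName() { return name; }

    // toString in the format of the output shown later in the article
    @Override
    public String toString() { return "Cat{age=" + age + "}"; }
}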

As you can see, with stream operations the code is much simpler (wow).

Q: Some officials may think this looks a bit like the combined lambda expression operations from the lambda expressions article.

A: You're right, it does look similar, but there is still a big difference: the combined lambda expression operations act directly on the collection.

At the end of the article we will talk about the difference between operating on a collection directly and using stream operations, so let's skip it here.

Let's introduce the concepts involved, based on this chestnut.

3. Steps of a stream operation

Let's set aside the old collection-based version for now (we'll talk about the difference between streams and collections later) and introduce stream operations first (after all, the stream is today's protagonist).

A stream operation is divided into three steps: creating the stream, intermediate operations, and the terminal operation.

The flow is as follows: create stream -> intermediate operations -> terminal operation

There is one very important point to note here:

Until the terminal operation starts, the intermediate operations do not process anything; they only declare what is to be done.

You can picture the process above as an assembly line (simplified here):

  1. Purpose: first state that we want to produce bottled water (create the stream, which says what data is to be processed)
  2. Then build an assembly line for the bottles and water: a fixture to hold the bottles, a pipe to fill in the water, a claw to screw on the lid, a packer to pack them (the intermediate operations, which declare the operations to perform)
  3. Finally, press the start button and the line starts to run (the terminal operation starts processing the data according to the intermediate operations)
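To watch the "declared but not yet executed" behavior directly, here is a minimal sketch. The class name LazyStreamDemo, the log list and the declare method are made up for this illustration, and it uses a plain list of integers instead of the cats from the chestnut.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the article's example
class LazyStreamDemo {
    // Records every call of the filter lambda so we can see when it really runs
    static final List<String> log = new ArrayList<>();

    // Builds the pipeline: creates the stream and declares one intermediate operation
    static Stream<Integer> declare(List<Integer> numbers) {
        return numbers.stream().filter(n -> {
            log.add("filter: " + n);
            return n > 1;
        });
    }

    public static void main(String[] args) {
        Stream<Integer> pipeline = declare(Arrays.asList(1, 3, 2));
        // No terminal operation yet, so the filter lambda has not been called
        System.out.println("before: " + log);   // before: []

        // The terminal operation presses the start button
        List<Integer> result = pipeline.collect(Collectors.toList());
        System.out.println("after: " + log);    // after: [filter: 1, filter: 3, filter: 2]
        System.out.println(result);             // [3, 2]
    }
}
```

The log stays empty until collect is called, which is the assembly line waiting for its start button.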

Because each intermediate operation returns a Stream, the operations can be chained one after another. They are not, however, executed stage by stage over the whole data set: the stream merges the declared operations into a single pass and sends each element through the chain in turn.

We can print something to see:

List<Cat> list = Arrays.asList(new Cat(1, "tangyuan"), new Cat(3, "dangdang"), new Cat(2, "milu"));
List<String> listNameNew = list.stream()
  .filter(cat -> {
    System.out.println("filter: " + cat);
    return cat.getAge() > 1;
  })
  .map(cat-> {
    System.out.println("map:" + cat);
    return cat.getName();
  })
  .collect(Collectors.toList());

The output is as follows:

filter: Cat{age=1}
filter: Cat{age=3}
map:Cat{age=3}
filter: Cat{age=2}
map:Cat{age=2}

As you can see, filter and map are merged and executed in an interleaved way, element by element, even though they are two independent intermediate operations (this technique is called loop fusion, also translated as circular merging).

The merging is done by the stream implementation itself: each element travels through the whole chain of operations before the next element is handled, so no intermediate list is built between filter and map.

Having covered loop fusion, let's talk about the short-circuit technique.

We are all familiar with the term short circuit (as in a brain short circuit). It means that a -> b -> c should have been executed, but a short circuit at b turns it into a -> c.

Here, short-circuiting applies to the intermediate operations: for some reason (such as the limit below), only part of the data is processed rather than all of it.

Let's modify the above example (add an intermediate operation limit):

List<Cat> list = Arrays.asList(new Cat(1, "tangyuan"), new Cat(3, "dangdang"), new Cat(2, "milu"));
List<String> listNameNew = list.stream()
  .filter(cat -> {
    System.out.println("filter: " + cat);
    return cat.getAge() > 1;
  })
  .map(cat-> {
    System.out.println("map:" + cat);
    return cat.getName();
  })
  // Only this line is added
  .limit(1)
  .collect(Collectors.toList());

The output is as follows:

filter: Cat{age=1}
filter: Cat{age=3}
map:Cat{age=3}

As you can see, because limit(1) needs only one element, the pipeline stops as soon as filter finds one element that meets the condition (the remaining elements are never examined). This technique is called short-circuiting.

This shows the main advantage of merging the intermediate operations: only as much data is processed as is actually needed, that is, on-demand processing.
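On-demand processing is easiest to appreciate with a source that never ends. The following is only a sketch: the class ShortCircuitDemo and the method firstMultiplesOfSeven are invented for this illustration, and it uses Stream.iterate to produce an infinite sequence of integers instead of the cat list.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the article's example
class ShortCircuitDemo {
    static List<Integer> firstMultiplesOfSeven(int count) {
        return Stream.iterate(1, n -> n + 1)   // infinite source: 1, 2, 3, ...
                .filter(n -> n % 7 == 0)       // keep multiples of 7
                .limit(count)                  // short-circuit after `count` elements
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstMultiplesOfSeven(3)); // [7, 14, 21]
    }
}
```

Without limit, the terminal operation would never finish; with it, the stream only pulls as many numbers from the source as are needed to produce three results.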

4. Characteristics of streams

There are three characteristics:

  • Declarative: concise and easy to read, greatly reducing the number of lines of code (unless your company requires a daily quota of lines)
  • Composable: more flexible; operations can be combined in all kinds of ways (whatever you need, as long as the stream offers it)
  • Parallelizable: potentially better performance (no need to write the multithreading ourselves, how nice)
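As a small sketch of the "parallelizable" point: the same declarative pipeline can be switched to parallel execution with a single call. The class ParallelSumDemo and its methods are made up for this illustration. Note that parallel execution is not automatically faster; for small data sets the coordination overhead can outweigh the gain, so treat this as a demonstration of the API rather than a performance claim.

```java
import java.util.stream.LongStream;

// Hypothetical demo class, not part of the article's example
class ParallelSumDemo {
    // Sums 1..n on the calling thread
    static long sumSequential(long n) {
        return LongStream.rangeClosed(1, n).sum();
    }

    // Same pipeline, but the work may be split across several threads
    static long sumParallel(long n) {
        return LongStream.rangeClosed(1, n).parallel().sum();
    }

    public static void main(String[] args) {
        System.out.println(sumSequential(1_000_000)); // 500000500000
        System.out.println(sumParallel(1_000_000));   // 500000500000
    }
}
```

Both versions give the same result; only the word parallel() differs, and no thread handling appears in our own code.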

5. The difference between stream operations and collection operations

Now let's review the collection operations in the initial example: filter -> sort -> extract

List<Cat> listTemp = new ArrayList<>();
// 1. Filter
for (Cat cat : list) {
  if (cat.getAge() > 1) {
    listTemp.add(cat);
  }
}
// 2. Sort
listTemp.sort(new Comparator<Cat>() {
  @Override
  public int compare(Cat o1, Cat o2) {
    /*
     * Q: Why not subtract, i.e. return o1.getAge() - o2.getAge()?
     * A: Because subtraction risks integer overflow.
     *    If o1.getAge() is 2 billion and o2.getAge() is -200 million, the difference
     *    is 2.2 billion, which exceeds the int maximum of about 2.1 billion.
     */
    // sort ascending by age
    return Integer.compare(o1.getAge(), o2.getAge());
  }
});
// 3. Extract names
List<String> listName = new ArrayList<>();
for (Cat cat : listTemp) {
  listName.add(cat.getName());
}
System.out.println(listName);
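As an aside on the overflow question in the comparator comment, the numbers from that comment can be checked with a short sketch. The class CompareOverflowDemo and its two helper methods are hypothetical names used only here.

```java
// Hypothetical demo class, not part of the article's example
class CompareOverflowDemo {
    // Comparison by subtraction: wraps around when the difference does not fit in an int
    static int bySubtraction(int a, int b) {
        return a - b;
    }

    // Overflow-safe comparison
    static int bySafeCompare(int a, int b) {
        return Integer.compare(a, b);
    }

    public static void main(String[] args) {
        int a = 2_000_000_000;   // 2 billion
        int b = -200_000_000;    // -200 million

        // The true difference is 2.2 billion, which wraps around to a negative int,
        // so the subtraction wrongly reports a < b
        System.out.println(bySubtraction(a, b));      // -2094967296
        // Integer.compare reports a > b with a positive value
        System.out.println(bySafeCompare(a, b) > 0);  // true
    }
}
```

With realistic cat ages the subtraction would never overflow, but Integer.compare costs nothing extra and stays correct for any int values.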

Comparing the collection code with the stream version, there are two differences:

  1. The collection version needs a temporary variable, listTemp (the stream version does not)
  2. The collection version executes filter -> sort -> extract immediately, step by step, whereas the stream version executes nothing until the last step, the terminal operation

Below is a table of the differences, which should be more intuitive:

Aspect | Stream operation | Collection operation
Function | Data processing | Mainly storing data
Iteration method | Internal iteration (can be traversed only once; you only declare the operations and the stream implements the iteration) | External iteration (can be traversed repeatedly; you write the for-each loop yourself)
Processing data | Processing starts only when the terminal operation is invoked (on-demand processing) | Data is always processed in full (all elements)

For a real-life comparison, think of watching movies:

A stream is like watching online (streaming); a collection is like watching locally (downloaded to your machine first).
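The "only one iteration" point for streams can be tried out with a small sketch. The class SingleUseDemo and the method secondTraversalFails are invented for this illustration; the behavior shown, an IllegalStateException on the second traversal, is what the standard Stream implementation does when a stream is reused.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the article's example
class SingleUseDemo {
    // Returns true if traversing the same stream a second time is rejected
    static boolean secondTraversalFails(List<String> names) {
        Stream<String> stream = names.stream();
        stream.forEach(name -> { });       // first traversal consumes the stream
        try {
            stream.forEach(name -> { });   // second traversal of the same stream
            return false;
        } catch (IllegalStateException e) {
            return true;                   // the stream has already been operated upon
        }
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("tangyuan", "dangdang", "milu");

        // A collection can be iterated externally as often as we like
        for (String name : names) { System.out.println(name); }
        for (String name : names) { System.out.println(name); }

        // A stream can be traversed only once
        System.out.println(secondTraversalFails(names)); // true
    }
}
```

To traverse the data again, simply ask the collection for a new stream with names.stream().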

summary

  1. What is a stream:
    • Stream is an API that processes data declaratively
    • A stream is a sequence of elements generated from a source that supports data processing operations
      • Source: where the data comes from, such as a collection or a file (this article only covers streams over collections, because they are used most; the others will be introduced later when there is time)
      • Data processing operations: the intermediate operations of the stream, such as filter and map
      • Element sequence: the result set returned through the terminal operation of the stream
  2. Steps of a stream operation:
    • Create stream -> intermediate operations -> terminal operation
    • Intermediate operations are only declarations and do not actually process data; they are not executed until the terminal operation starts
  3. Loop fusion (circular merging): the intermediate operations are merged into a single pass, and each element goes through the whole chain of operations in turn
  4. Short-circuiting: once the intermediate operations have produced as much data as is needed, processing stops immediately (for example, with limit(1), processing stops after one element has been produced)
  5. The difference between stream operations and collection operations:
    • Streams process on demand; collections process everything
    • Streams focus on processing data; collections focus on storing data
    • Stream code is concise; the collection code is not
    • Streams use internal iteration (a stream can be traversed only once and is used up afterwards); collections use external iteration (they can be traversed repeatedly)

Postscript

Finally, thank you for reading.

Writing original content isn't easy. I'm looking forward to the officials' triple support.
