Statement:
1. ***
2. This is a personal summary, so the article is written as concisely as possible.
3. If you find any mistakes or inaccuracies, please point them out.
Side output
A side output is a secondary stream (a "tributary") split off from the main stream. It can be used to receive late data or to split data into multiple streams.
With sliding windows, one element belongs to several overlapping windows. A late element enters the side output stream only when none of those windows can still accept it.
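The sliding-window rule above can be checked with a small plain-Java model. This is only a sketch, not Flink code: the class SlidingLateness and its methods are hypothetical names, allowed lateness is assumed to be zero, and a window [start, start + size) is assumed to be closed once the watermark reaches its last timestamp (start + size - 1).

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model (not the Flink API) of late data in sliding windows.
class SlidingLateness {

    // Start times of every sliding window [start, start + size) that contains ts.
    static List<Long> windowStarts(long ts, long size, long slide) {
        List<Long> starts = new ArrayList<>();
        long lastStart = ts - Math.floorMod(ts, slide);
        for (long start = lastStart; start > ts - size; start -= slide) {
            starts.add(start);
        }
        return starts;
    }

    // True only if every window containing ts is already closed at this watermark.
    // Assumption: allowed lateness is zero.
    static boolean goesToSideOutput(long ts, long size, long slide, long watermark) {
        for (long start : windowStarts(ts, size, slide)) {
            long maxTimestamp = start + size - 1;
            if (maxTimestamp > watermark) {
                return false; // at least one window can still accept the element
            }
        }
        return true;
    }
}
```

With size 10 and slide 5, an element with timestamp 7 belongs to the windows starting at 5 and 0. At watermark 12 the window starting at 5 is still open, so the element is not sent to the side output; at watermark 14 both windows are closed and it is.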
Only the process functions, the lowest-level API, can write to a side output stream; they do so through the Context object (ctx.output(...)).
Case: send readings with a temperature below 30 degrees to a side output

    // Define the side output tag. Note the trailing { }: the tag is created as an
    // anonymous subclass of OutputTag so that the element type is retained.
    // A plain new OutputTag<SensorReading>("lowTemp") cannot be used directly.
    final OutputTag<SensorReading> lowTempTag = new OutputTag<SensorReading>("lowTemp") { };

    SingleOutputStreamOperator<SensorReading> highTempStream =
            dataStream.process(new ProcessFunction<SensorReading, SensorReading>() {
                @Override
                public void processElement(SensorReading value, Context ctx, Collector<SensorReading> out) {
                    if (value.getTemperature() < 30) {
                        // Below 30 degrees: emit to the side output
                        ctx.output(lowTempTag, value);
                    } else {
                        // Otherwise: emit to the main stream
                        out.collect(value);
                    }
                }
            });

    // Retrieve the side output from the result of the process operator
    DataStream<SensorReading> lowTempStream = highTempStream.getSideOutput(lowTempTag);
    highTempStream.print("high");
    lowTempStream.print("low");
8 process APIs:
- ProcessFunction
- KeyedProcessFunction
  You must call keyBy first.
  Each element of the stream is processed individually and can emit any number of elements via out.collect(xxx).
  - processElement(I value, Context ctx, Collector<O> out)
    ctx can:
    - access the element's timestamp
    - access the element's key
    - access the TimerService (ctx.timerService())
      TimerService methods:
      - Event-time related
        - long currentWatermark() returns the current watermark, i.e. the current event time
        - void registerEventTimeTimer(long timestamp) registers an event-time timer for the current key
        - void deleteEventTimeTimer(long timestamp) deletes the timer with that timestamp; if no such timer exists, the call has no effect
      - Processing-time related
        - long currentProcessingTime() returns the current processing time
        - void registerProcessingTimeTimer(long timestamp) registers a processing-time timer for the current key
        - void deleteProcessingTimeTimer(long timestamp) deletes the timer with that timestamp; if no such timer exists, the call has no effect
      - When a timer fires, the callback function onTimer() is executed.
      - If you register a timer that is meant to fire when a window closes, it is better to set it about 1s after WindowEndTime.
        At exactly WindowEndTime, both the window computation and the timer are due;
        the timer's logic depends on the window being computed first, so a 1s delay is the safer choice.
Case requirement: raise an alarm if the temperature rises continuously for 10 seconds (processing time)

    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
    import org.apache.flink.util.Collector;

    public class TempIncreaseWarning extends KeyedProcessFunction<String, SensorReading, String> {
        // Length of the observation interval, in seconds
        private final Integer interval;

        public TempIncreaseWarning(Integer interval) {
            this.interval = interval;
        }

        // Last temperature seen for the current key
        private ValueState<Double> lastTempState;
        // Timestamp of the registered timer (null if there is none)
        private ValueState<Long> timerTsState;

        @Override
        public void open(Configuration parameters) throws Exception {
            // Default is -Double.MAX_VALUE: Double.MIN_VALUE is the smallest positive double,
            // not the most negative one, so it would misjudge a first reading at or below zero.
            lastTempState = getRuntimeContext().getState(
                    new ValueStateDescriptor<Double>("last-temp", Double.class, -Double.MAX_VALUE));
            timerTsState = getRuntimeContext().getState(
                    new ValueStateDescriptor<Long>("timer-ts", Long.class));
        }

        @Override
        public void processElement(SensorReading value, Context ctx, Collector<String> out) throws Exception {
            // Read the state
            Double lastTemp = lastTempState.value();
            Long timerTs = timerTsState.value();
            // Update the temperature state
            lastTempState.update(value.getTemperature());

            if (value.getTemperature() > lastTemp && timerTs == null) {
                // The temperature rose and no timer is registered: register one
                long ts = ctx.timerService().currentProcessingTime() + interval * 1000L;
                ctx.timerService().registerProcessingTimeTimer(ts);
                // Store the timestamp so the timer can be deleted later
                timerTsState.update(ts);
            } else if (value.getTemperature() <= lastTemp && timerTs != null) {
                // The temperature did not rise and a timer exists: delete it.
                // Use the stored timestamp of the registered timer, not a newly computed one.
                ctx.timerService().deleteProcessingTimeTimer(timerTs);
                timerTsState.clear();
            }
        }

        @Override
        public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) {
            // The timer survived the whole interval, so the temperature never dropped
            out.collect("Sensor " + ctx.getCurrentKey() + ": temperature rose continuously for " + interval + " seconds");
            timerTsState.clear();
        }
    }
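To see how the register-on-rise and cancel-on-drop logic behaves over time, the case can be replayed in a plain-Java model with a simulated clock. This is a sketch for a single key, not Flink code: RisingTempModel, advanceClock and onReading are hypothetical names, and the model assumes the first reading counts as a rise.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the alarm logic for a single key (not Flink code).
class RisingTempModel {
    private final long intervalMs;
    private double lastTemp = -Double.MAX_VALUE; // assumption: the first reading counts as a rise
    private Long timerTs = null;                 // null means no timer is registered
    final List<String> alarms = new ArrayList<>();

    RisingTempModel(long intervalMs) {
        this.intervalMs = intervalMs;
    }

    // Simulated clock tick: fire the pending timer once its time is reached.
    void advanceClock(long now) {
        if (timerTs != null && now >= timerTs) {
            alarms.add("alarm at " + timerTs);
            timerTs = null;
        }
    }

    // Mirrors processElement: register a timer on a rise, cancel it on a drop.
    void onReading(long now, double temp) {
        advanceClock(now);
        double previous = lastTemp;
        lastTemp = temp;
        if (temp > previous && timerTs == null) {
            timerTs = now + intervalMs;
        } else if (temp <= previous && timerTs != null) {
            timerTs = null;
        }
    }
}
```

With a 10000 ms interval, readings of 30, 31 and 32 at times 0, 3000 and 6000 leave the timer from time 0 untouched, so the alarm fires at 10000. If the second reading is 29 instead, the timer is cancelled, and a later rise at time 5000 schedules a new one for 15000.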
    - ctx can also output data to a side output stream (ctx.output(outputTag, value))
  - onTimer(long timestamp, OnTimerContext ctx, Collector<O> out) is a callback function that is called when a previously registered timer fires.
    timestamp is the time the timer was set to fire at.
    If you register a timer for a time that has already passed, it is not dropped: it fires the next time timers are checked, which for event time typically means when new data arrives and advances the watermark.
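The timer behaviour described above (one timer per key and timestamp, deleting a missing timer has no effect, a timer registered in the past fires on the next watermark advance) can be modeled in plain Java. This is a sketch, not the Flink TimerService: EventTimerModel is a hypothetical class, the key is passed explicitly whereas Flink uses the current key implicitly, and timers fire key by key here rather than in global timestamp order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Plain-Java model of per-key event-time timers (not the Flink TimerService).
class EventTimerModel {
    private final Map<String, TreeSet<Long>> timers = new TreeMap<>();
    private long watermark = Long.MIN_VALUE;
    final List<String> fired = new ArrayList<>();

    long currentWatermark() {
        return watermark;
    }

    // A set per key: registering the same timestamp twice keeps a single timer.
    void registerEventTimeTimer(String key, long timestamp) {
        timers.computeIfAbsent(key, k -> new TreeSet<>()).add(timestamp);
    }

    // Deleting a timer that does not exist has no effect.
    void deleteEventTimeTimer(String key, long timestamp) {
        TreeSet<Long> set = timers.get(key);
        if (set != null) {
            set.remove(timestamp);
        }
    }

    // Fire every timer whose timestamp is at or before the new watermark.
    void advanceWatermark(long newWatermark) {
        watermark = Math.max(watermark, newWatermark);
        for (Map.Entry<String, TreeSet<Long>> entry : timers.entrySet()) {
            TreeSet<Long> set = entry.getValue();
            while (!set.isEmpty() && set.first() <= watermark) {
                long timestamp = set.pollFirst();
                fired.add(entry.getKey() + "@" + timestamp); // stands in for onTimer(timestamp, ...)
            }
        }
    }
}
```

In this model, a timer registered for timestamp 50 while the watermark is already 100 does nothing at registration time; it is picked up by the next advanceWatermark call.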
- CoProcessFunction
  Processes the connected stream produced by connect().
  It has processElement1() and processElement2(), one for each input stream.
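The point of having two methods is that each input gets its own handler while both share the same state. A minimal plain-Java sketch of that idea follows; it is not Flink code. TwoInputModel and the blocking scenario are hypothetical, and the real methods also take a Context and a Collector.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model of a two-input function with shared state (not Flink code).
class TwoInputModel {
    private final Set<String> blockedIds = new HashSet<>(); // state shared by both inputs
    final List<String> out = new ArrayList<>();

    // Input 1 (a control stream in this example): updates the shared state.
    void processElement1(String blockedId) {
        blockedIds.add(blockedId);
    }

    // Input 2 (the data stream in this example): reads the shared state.
    void processElement2(String sensorId) {
        if (!blockedIds.contains(sensorId)) {
            out.add(sensorId);
        }
    }
}
```

Here an element arriving on the first input changes how later elements on the second input are handled, which is the typical reason to connect two streams.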
- ProcessJoinFunction
- BroadcastProcessFunction
  Processes a stream after it has been connected with a broadcast stream.
  Example: stream A has one partition and stream B has four. Stream B needs the data of stream A, so the data in stream A's single partition must be broadcast to all four partitions of stream B.
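The one-to-four example above can be sketched in plain Java: every element of stream A is copied into the local state of each partition of stream B, and each partition then handles its own elements against its local copy. This is a model, not Flink code. BroadcastModel and the threshold scenario are hypothetical; only the method names processBroadcastElement and processElement follow BroadcastProcessFunction.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of broadcasting stream A to all partitions of stream B (not Flink code).
class BroadcastModel {
    // One local copy of the broadcast data per partition of stream B
    private final List<Map<String, Double>> partitionState = new ArrayList<>();

    BroadcastModel(int partitions) {
        for (int i = 0; i < partitions; i++) {
            partitionState.add(new HashMap<>());
        }
    }

    // Element of stream A: copied to every partition.
    void processBroadcastElement(String sensorId, double threshold) {
        for (Map<String, Double> state : partitionState) {
            state.put(sensorId, threshold);
        }
    }

    // Element of stream B: handled by one partition using its local copy.
    // Returns true if the temperature exceeds the broadcast threshold.
    boolean processElement(int partition, String sensorId, double temperature) {
        Double threshold = partitionState.get(partition).get(sensorId);
        return threshold != null && temperature > threshold;
    }
}
```

Because every partition holds the same copy, it does not matter which of the four partitions an element of stream B lands on.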
- KeyedBroadcastProcessFunction
- ProcessWindowFunction
  For example, as the second argument of aggregate(AggregateFunction<IN, ACC, OUT> aggFunction, ProcessWindowFunction<IN, OUT, KEY, W> windowFunction)
- ProcessAllWindowFunction
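The aggregate(aggFunction, windowFunction) combination mentioned under ProcessWindowFunction splits the work in two: the aggregate function folds each element into an accumulator as it arrives, and the window function then receives only the final result together with window metadata. A plain-Java sketch of that division of labour follows. It is not Flink code: WindowAverageModel, its methods and the output format are hypothetical.

```java
// Plain-Java model of incremental aggregation followed by a window function (not Flink code).
class WindowAverageModel {
    // Accumulator: only a running sum and count are kept, not the elements themselves
    private double sum = 0.0;
    private long count = 0;

    // AggregateFunction role: called once per element as it arrives.
    void add(double temperature) {
        sum += temperature;
        count++;
    }

    // AggregateFunction role: final result when the window closes.
    double getResult() {
        return count == 0 ? 0.0 : sum / count;
    }

    // ProcessWindowFunction role: sees the aggregated value plus window metadata.
    static String process(String key, long windowEnd, double aggregated) {
        return key + " windowEnd=" + windowEnd + " avg=" + aggregated;
    }
}
```

The window function on its own would need all elements of the window buffered; pairing it with an aggregate function keeps only the accumulator per window.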