Flink DataStream API State and Fault Tolerance - The Broadcast State Pattern

Posted by josborne on Tue, 21 Dec 2021 14:15:08 +0100

The Broadcast State Pattern

In this section you will learn how to use broadcast state in practice. See Stateful Stream Processing to understand the concepts behind stateful stream processing.

Provided APIs

To demonstrate the provided APIs, we will start with an example before presenting their full functionality.

As our running example, we will use a scenario where we have a stream of objects of different colors and shapes, and we want to find pairs of objects of the same color that follow a specific pattern, e.g. a rectangle followed by a triangle. In this example, the first stream will contain elements of type Item with Color and Shape attributes. The other stream will contain the rules.
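
The article never shows the definitions of Item and Rule. A minimal sketch of what they might look like, inferred from the attributes used in the snippets below (getColor(), getShape(), rule.name, rule.first, rule.second), could be the following (each type in its own file; the names of the enum constants are illustrative assumptions):

// Hypothetical data model, inferred from the snippets in this article; the real classes are not shown here.
public enum Color { RED, GREEN, BLUE }

public enum Shape { RECTANGLE, TRIANGLE, CIRCLE }

public class Item {
    private Color color;
    private Shape shape;

    public Item() {}  // no-arg constructor so Flink can treat this as a POJO
    public Item(Color color, Shape shape) { this.color = color; this.shape = shape; }

    public Color getColor() { return color; }
    public void setColor(Color color) { this.color = color; }
    public Shape getShape() { return shape; }
    public void setShape(Shape shape) { this.shape = shape; }

    @Override
    public String toString() { return "Item(" + color + ", " + shape + ")"; }
}

public class Rule {
    public String name;   // used as the key in the broadcast state
    public Shape first;   // shape that starts the pattern
    public Shape second;  // shape that completes the pattern

    public Rule() {}      // no-arg constructor so Flink can treat this as a POJO
    public Rule(String name, Shape first, Shape second) {
        this.name = name;
        this.first = first;
        this.second = second;
    }
}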

Starting from the Item stream, we just need to key it by Color, because we want pairs of objects of the same color. This will ensure that elements of the same color end up on the same physical machine.

// key the items by color
KeyedStream<Item, Color> colorPartitionedStream = itemStream
                        .keyBy(new KeySelector<Item, Color>(){...});

The rule stream, on the other hand, should be broadcast to all downstream tasks, and these tasks should store the rules locally so that every incoming Item can be evaluated against them.

The following code snippet will:

  • create a broadcast state that will store the rules, using the provided MapStateDescriptor, and
  • broadcast the rule stream.
// a map descriptor to store the name of the rule (string) and the rule itself.
MapStateDescriptor<String, Rule> ruleStateDescriptor = new MapStateDescriptor<>(
			"RulesBroadcastState",
			BasicTypeInfo.STRING_TYPE_INFO,
			TypeInformation.of(new TypeHint<Rule>() {}));
		
// broadcast the rules and create the broadcast state
BroadcastStream<Rule> ruleBroadcastStream = ruleStream
                        .broadcast(ruleStateDescriptor);

Finally, in order to apply rules to the incoming elements from the Item stream, we need to:

  • connect the two streams, and
  • specify our match detection logic.

A stream is connected to a BroadcastStream by calling connect() on the non-broadcast stream, with the BroadcastStream as an argument. This returns a BroadcastConnectedStream, on which we can call process(). The function passed to process() will contain our matching logic:

  • If the non-broadcast stream is keyed, the function is a KeyedBroadcastProcessFunction.
  • If it is non-keyed, the function is a BroadcastProcessFunction (a non-keyed sketch follows the keyed example below).

Since our non-broadcast stream is keyed, the following code snippet contains the above call:

DataStream<String> output = colorPartitionedStream
                 .connect(ruleBroadcastStream)
                 .process(
                     
                     // type arguments in our KeyedBroadcastProcessFunction represent: 
                     //   1. the key of the keyed stream
                     //   2. the type of elements in the non-broadcast side
                     //   3. the type of elements in the broadcast side
                     //   4. the type of the result, here a string
                     
                     new KeyedBroadcastProcessFunction<Color, Item, Rule, String>() {
                         // my matching logic
                     }
                 );
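
For the non-keyed case mentioned above, a sketch could look as follows. It is not part of the running example (which keys by color); it assumes the itemStream and ruleStateDescriptor defined earlier, and it only reads the broadcast state, since no keyed state is available on a non-keyed stream:

// Non-keyed variant (sketch): connect the plain DataStream<Item> to the BroadcastStream
// and process it with a BroadcastProcessFunction instead of a KeyedBroadcastProcessFunction.
DataStream<String> nonKeyedOutput = itemStream
        .connect(ruleBroadcastStream)
        .process(
            // type arguments: 1. non-broadcast side, 2. broadcast side, 3. result type
            new BroadcastProcessFunction<Item, Rule, String>() {

                @Override
                public void processBroadcastElement(Rule value, Context ctx, Collector<String> out) throws Exception {
                    // read-write access: store/update the rule under its name
                    ctx.getBroadcastState(ruleStateDescriptor).put(value.name, value);
                }

                @Override
                public void processElement(Item value, ReadOnlyContext ctx, Collector<String> out) throws Exception {
                    // read-only access: check the item against the currently known rules
                    for (Map.Entry<String, Rule> entry :
                            ctx.getBroadcastState(ruleStateDescriptor).immutableEntries()) {
                        if (value.getShape() == entry.getValue().first) {
                            out.collect("Item " + value + " starts rule " + entry.getKey());
                        }
                    }
                }
            });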

BroadcastProcessFunction and KeyedBroadcastProcessFunction

As with a CoProcessFunction, these functions have two process methods to implement: processBroadcastElement(), which is responsible for processing incoming elements on the broadcast stream, and processElement(), which is used for the non-broadcast stream. The full signatures of these methods are as follows:

public abstract class BroadcastProcessFunction<IN1, IN2, OUT> extends BaseBroadcastProcessFunction {

    public abstract void processElement(IN1 value, ReadOnlyContext ctx, Collector<OUT> out) throws Exception;

    public abstract void processBroadcastElement(IN2 value, Context ctx, Collector<OUT> out) throws Exception;
}
public abstract class KeyedBroadcastProcessFunction<KS, IN1, IN2, OUT> {

    public abstract void processElement(IN1 value, ReadOnlyContext ctx, Collector<OUT> out) throws Exception;

    public abstract void processBroadcastElement(IN2 value, Context ctx, Collector<OUT> out) throws Exception;

    public void onTimer(long timestamp, OnTimerContext ctx, Collector<OUT> out) throws Exception;
}

The first thing to notice is that both functions require implementing processBroadcastElement() for handling elements on the broadcast side and processElement() for elements on the non-broadcast side.

The two methods differ in the context they are provided with. The non-broadcast side has a ReadOnlyContext, while the broadcast side has a Context.

Both of these contexts (ctx in the following list) allow you to:

  • access the broadcast state: ctx.getBroadcastState(MapStateDescriptor<K, V> stateDescriptor)
  • query the timestamp of the element: ctx.timestamp()
  • get the current watermark: ctx.currentWatermark()
  • get the current processing time: ctx.currentProcessingTime()
  • emit elements to side outputs: ctx.output(OutputTag<X> outputTag, X value)

The stateDescriptor passed to getBroadcastState() should be identical to the one used in the broadcast(ruleStateDescriptor) call above. The sketch below illustrates these context methods.
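
Here is a sketch of how the context methods might be used, assuming the hypothetical Item/Rule types above; the class name ContextDemoFunction, the LATE_ITEMS side-output tag, and the late-item handling are illustrative assumptions, not part of the original example:

import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.streaming.api.functions.co.KeyedBroadcastProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

import java.util.Map;

public class ContextDemoFunction extends KeyedBroadcastProcessFunction<Color, Item, Rule, String> {

    // elements routed here can later be fetched via getSideOutput(LATE_ITEMS) on the result stream
    public static final OutputTag<String> LATE_ITEMS = new OutputTag<String>("late-items") {};

    // identical to the descriptor used in broadcast(ruleStateDescriptor)
    private final MapStateDescriptor<String, Rule> ruleStateDescriptor =
        new MapStateDescriptor<>(
            "RulesBroadcastState",
            BasicTypeInfo.STRING_TYPE_INFO,
            TypeInformation.of(new TypeHint<Rule>() {}));

    @Override
    public void processBroadcastElement(Rule value, Context ctx, Collector<String> out) throws Exception {
        // broadcast side: read-write access to the broadcast state
        ctx.getBroadcastState(ruleStateDescriptor).put(value.name, value);
    }

    @Override
    public void processElement(Item value, ReadOnlyContext ctx, Collector<String> out) throws Exception {
        Long elementTimestamp = ctx.timestamp();             // timestamp of the current element (may be null)
        long watermark = ctx.currentWatermark();             // current event-time watermark
        long processingTime = ctx.currentProcessingTime();   // current processing time

        if (elementTimestamp != null && elementTimestamp < watermark) {
            // emit late items to a side output instead of the main output
            ctx.output(LATE_ITEMS, "late item at " + processingTime + ": " + value);
            return;
        }

        // non-broadcast side: read-only access to the broadcast state
        int numRules = 0;
        for (Map.Entry<String, Rule> ignored :
                ctx.getBroadcastState(ruleStateDescriptor).immutableEntries()) {
            numRules++;
        }
        out.collect("item " + value + " evaluated against " + numRules + " rules (watermark=" + watermark + ")");
    }
}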

Where the two contexts differ is in their access to the broadcast state: the broadcast side has read-write access to it, while the non-broadcast side has read-only access (hence the names). The reason is that there is no cross-task communication in Flink. So, to guarantee that the contents of the broadcast state are the same across all parallel instances of our operator, only the broadcast side, which sees the same elements in every task, is allowed to modify it.

Note: the logic implemented in processBroadcastElement() must be deterministic and behave the same across all parallel instances!

Finally, because the KeyedBroadcastProcessFunction runs on a keyed stream, it exposes some functionality that is not available to the BroadcastProcessFunction. Namely:

  • The ReadOnlyContext in the processElement() method gives access to Flink's underlying timer service, which allows registering event-time and/or processing-time timers. When a timer fires, onTimer() (shown above) is invoked with an OnTimerContext, which exposes the same functionality as the ReadOnlyContext plus

  • the ability to ask whether the timer that fired is an event-time or a processing-time one, and

  • the ability to query the key associated with the timer.

  • The Context in the processBroadcastElement() method contains the method applyToKeyedState(StateDescriptor<S, VS> stateDescriptor, KeyedStateFunction<KS, S> function). It allows registering a KeyedStateFunction to be applied to the state of all keys associated with the provided stateDescriptor. Both features are sketched after the note below.

Note: registering timers is only possible in processElement() of the KeyedBroadcastProcessFunction, and only there. It is not possible in the processBroadcastElement() method, because there is no key associated with the broadcast elements.
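
A sketch of how these two features might be used, in the same anonymous-class style as the example below; the pendingItems list state, the 60-second processing-time timer, and the clearing logic are illustrative assumptions, not part of the running example:

// Sketch: timers on the keyed (non-broadcast) side, applyToKeyedState on the broadcast side.
new KeyedBroadcastProcessFunction<Color, Item, Rule, String>() {

    // per-key list of items waiting to be matched (illustrative)
    private final ListStateDescriptor<Item> pendingItemsDesc =
        new ListStateDescriptor<>("pendingItems", Item.class);

    // identical to the ruleStateDescriptor used above
    private final MapStateDescriptor<String, Rule> ruleStateDescriptor =
        new MapStateDescriptor<>(
            "RulesBroadcastState",
            BasicTypeInfo.STRING_TYPE_INFO,
            TypeInformation.of(new TypeHint<Rule>() {}));

    @Override
    public void processElement(Item value, ReadOnlyContext ctx, Collector<String> out) throws Exception {
        getRuntimeContext().getListState(pendingItemsDesc).add(value);

        // timers can only be registered here, on the keyed side
        ctx.timerService().registerProcessingTimeTimer(ctx.currentProcessingTime() + 60_000L);
    }

    @Override
    public void processBroadcastElement(Rule value, Context ctx, Collector<String> out) throws Exception {
        ctx.getBroadcastState(ruleStateDescriptor).put(value.name, value);

        // apply a function to the state of *every* key registered under pendingItemsDesc,
        // e.g. drop all pending items when a new rule arrives (illustrative)
        ctx.applyToKeyedState(pendingItemsDesc, (key, state) -> state.clear());
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
        // fires on the keyed side; we can ask which key and which time domain triggered it
        Color key = ctx.getCurrentKey();
        TimeDomain domain = ctx.timeDomain();   // PROCESSING_TIME or EVENT_TIME
        out.collect("timer (" + domain + ") fired at " + timestamp + " for key " + key);

        // illustrative cleanup: give up on items that never saw their second shape
        getRuntimeContext().getListState(pendingItemsDesc).clear();
    }
}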

Returning to our original example, our KeyedBroadcastProcessFunction could look like the following:

new KeyedBroadcastProcessFunction<Color, Item, Rule, String>() {

    // store partial matches, i.e. first elements of the pair waiting for their second element
    // we keep a list as we may have many first elements waiting
    private final MapStateDescriptor<String, List<Item>> mapStateDesc =
        new MapStateDescriptor<>(
            "items",
            BasicTypeInfo.STRING_TYPE_INFO,
            new ListTypeInfo<>(Item.class));

    // identical to our ruleStateDescriptor above
    private final MapStateDescriptor<String, Rule> ruleStateDescriptor = 
        new MapStateDescriptor<>(
            "RulesBroadcastState",
            BasicTypeInfo.STRING_TYPE_INFO,
            TypeInformation.of(new TypeHint<Rule>() {}));

    @Override
    public void processBroadcastElement(Rule value,
                                        Context ctx,
                                        Collector<String> out) throws Exception {
        ctx.getBroadcastState(ruleStateDescriptor).put(value.name, value);
    }

    @Override
    public void processElement(Item value,
                               ReadOnlyContext ctx,
                               Collector<String> out) throws Exception {

        final MapState<String, List<Item>> state = getRuntimeContext().getMapState(mapStateDesc);
        final Shape shape = value.getShape();
    
        for (Map.Entry<String, Rule> entry :
                ctx.getBroadcastState(ruleStateDescriptor).immutableEntries()) {
            final String ruleName = entry.getKey();
            final Rule rule = entry.getValue();
    
            List<Item> stored = state.get(ruleName);
            if (stored == null) {
                stored = new ArrayList<>();
            }
    
            if (shape == rule.second && !stored.isEmpty()) {
                for (Item i : stored) {
                    out.collect("MATCH: " + i + " - " + value);
                }
                stored.clear();
            }
    
            // there is no else{} to cover if rule.first == rule.second
            if (shape.equals(rule.first)) {
                stored.add(value);
            }
    
            if (stored.isEmpty()) {
                state.remove(ruleName);
            } else {
                state.put(ruleName, stored);
            }
        }
    }
}

Important notes

After describing the provided APIs, this section focuses on the important things to remember when using broadcast state. These are:

  • **There is no cross-task communication:** As mentioned earlier, this is why only the broadcast side of a (Keyed)-BroadcastProcessFunction can modify the contents of the broadcast state. In addition, the user must ensure that all tasks modify the contents of the broadcast state in the same way for each incoming element. Otherwise, different tasks may end up with different contents, leading to inconsistent results (see the sketch after this list).

  • **The order of events in the broadcast state may differ across tasks:** Although broadcasting guarantees that all elements will (eventually) reach all downstream tasks, elements may arrive at each task in a different order. Therefore, the state update for each incoming element must not depend on the order of the incoming events.

  • **All tasks checkpoint their broadcast state:** Although the broadcast state has the same elements in all tasks when a checkpoint takes place, every task checkpoints its own copy of the broadcast state, not just one of them. This is a design decision to avoid having all tasks read from the same file during recovery (thus avoiding hotspots), although it comes at the cost of increasing the size of the checkpointed state by a factor of p (= parallelism). Flink guarantees that there will be no duplicates and no missing data upon restoring/rescaling. In case of recovery with the same or smaller parallelism, each task reads its checkpointed state.

  • **No RocksDB state backend:** Broadcast state is kept in memory at runtime, and memory provisioning should be done accordingly. This holds for all operator state.
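
To make the first two points concrete, here is a sketch of a fragment of processBroadcastElement() showing a deterministic broadcast-state update next to a commented-out, purely illustrative non-deterministic one that would violate them:

@Override
public void processBroadcastElement(Rule value, Context ctx, Collector<String> out) throws Exception {

    // Deterministic (good): the update depends only on the incoming rule itself, so every
    // parallel instance applies exactly the same change and all copies of the state stay identical.
    ctx.getBroadcastState(ruleStateDescriptor).put(value.name, value);

    // Non-deterministic (bad, illustrative only): the key depends on processing time,
    // which differs from task to task, so the tasks' broadcast states would diverge.
    // ctx.getBroadcastState(ruleStateDescriptor)
    //         .put(value.name + "@" + ctx.currentProcessingTime(), value);
}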

Topics: flink