Timer used in Flink program

Posted by LiamS94 on Thu, 23 Dec 2021 17:49:53 +0100

Sometimes a computing task needs timers to handle business logic, such as settling orders automatically, submitting a positive review automatically, or scheduled collection.

Note, however, that timers in a computing task cannot be configured with a flexible CRON expression; we can only specify the time at which a timer should fire.
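
Because a timer only accepts a concrete trigger time, any rule such as "two minutes after the order" or "at the next midnight" has to be converted into an absolute timestamp in milliseconds first. The following is a minimal sketch of that conversion using only java.time; the class TriggerTimes and its method names are hypothetical helpers for illustration and are not part of Flink.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;

/** Hypothetical helper: turns scheduling rules into absolute epoch-millisecond trigger times. */
final class TriggerTimes {
    private TriggerTimes() {
    }

    /** Trigger a fixed delay after the order was placed, e.g. orderTime + 2 minutes. */
    static long afterDelay(long orderTimeMillis, Duration delay) {
        return orderTimeMillis + delay.toMillis();
    }

    /** Trigger at the first midnight strictly after the given instant, in the given time zone. */
    static long nextMidnight(long nowMillis, ZoneId zone) {
        LocalDate today = Instant.ofEpochMilli(nowMillis).atZone(zone).toLocalDate();
        ZonedDateTime midnight = today.plusDays(1).atStartOfDay(zone);
        return midnight.toInstant().toEpochMilli();
    }
}
```

The returned value is the kind of number that would be handed to the timer registration call shown later in this article. A recurring schedule could be approximated by computing the next trigger time again each time the timer fires.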

1, What kind of Flink job can start the timer

A job that needs timers must process its data with the low-level KeyedProcessFunction (that is, on a keyed stream), not with a Window.

We can run our data processing logic and register the timer in the processElement method.

The onTimer method is the callback that is executed when a timer fires.

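2, Timer function demo

Result

Each order is printed with its order time and the time at which it will automatically receive a positive review.

When an order's review deadline arrives, the job checks whether the order has been reviewed and, if not, applies the automatic positive review.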

3, Logical implementation

(1) Custom data source

I hard-coded the data here. You can extend RichSourceFunction according to your own needs.

package com.leilei;

import cn.hutool.core.util.RandomUtil;
import org.apache.flink.streaming.api.functions.source.RichSourceFunction;

import java.util.UUID;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * @author lei
 * @version 1.0
 * @date 2021/3/21 22:00
 * @desc Simulated order source
 */
public class MyOrderSource extends RichSourceFunction<Order> {
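    // volatile so that cancel(), which is called from another thread, is visible to the run() loop
    private volatile boolean flag = true;
    private final String[] products = new String[]{"Braised chicken and rice", "Beijing Roast Duck", "Bridgehead spareribs"};
    private final String[] users = new String[]{"Ma bond", "Huang sirang", "Zhang Mazi"};
    // Counts emitted orders; initialised here so that incrementAndGet() does not throw a NullPointerException
    private final AtomicInteger num = new AtomicInteger();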

    @Override
    public void run(SourceContext<Order> ctx) throws Exception {
        while (flag) {
            Order order = Order.builder()
                    .product(products[RandomUtil.randomInt(3)])
                    .username(users[RandomUtil.randomInt(3)])
                    .orderId(UUID.randomUUID().toString().replace("-", "."))
                    .orderTime(System.currentTimeMillis())
                    .build();
            Thread.sleep(5000);
            // The commented-out code simulates a single timer per key, where a later trigger time overwrites the earlier one
            //if (num.get() < 4) {
            //    Thread.sleep(5000);
            //} else {
            //    Thread.sleep(500000);
            //}
            num.incrementAndGet();
            ctx.collect(order);
        }
    }

    @Override
    public void cancel() {
        flag = false;
    }
}

The Order object emitted by the source:

package com.leilei;

import lombok.AllArgsConstructor;
import lombok.Builder;
import lombok.Data;
import lombok.NoArgsConstructor;

/**
 * @author lei
 * @version 1.0
 * @date 2021/3/21 22:00
 * @desc Order simulation object
 */
@NoArgsConstructor
@AllArgsConstructor
@Data
@Builder
public class Order {
    private String product;
    private String username;
    private String orderId;
    private Long orderTime;
}

(2) Custom order KEY generator

package com.leilei;

import java.text.SimpleDateFormat;

/**
 * @author lei
 * @version 1.0
 * @desc
 * @date 2021-03-23 11:16
 */
public class KeyUtil {
    public static String buildKey(Order order) {
        String date = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(order.getOrderTime());
        return order.getUsername()+"--" + order.getProduct()+"--" + order.getOrderId() + "--Order time :" + date;
        // The commented-out code simulates a single timer per key, where a later trigger time overwrites the earlier one
        //return order.getUsername();
    }
}

(3) ProcessFunction logic processing and timer function

(1) Define the low-level processing function

We need to extend the abstract class KeyedProcessFunction, because the computation is grouped by KEY.

(2) Write calculation logic and timer registration in processing function

With a ProcessFunction, each element triggers the processElement method exactly once. This is true stream processing: elements can only be handled one at a time. A window, by contrast, lets us choose its size and process a group of elements together (the window is the bridge between stream and batch processing).

The processElement method receives three parameters:

value: the input element

ctx: the execution context, through which we can register timers, obtain the current KEY, emit to a side output (OutputTag), and so on

out: the collector for output elements

(1) Register timer

A timer can be triggered on either processing time or event time, so we register whichever type suits our needs.

(2) What is the difference between processing time timer and event time timer?

Processing time: the system (wall-clock) time of the machine on which the Flink operator runs. It advances with the machine clock, whether or not new elements arrive.

ex: three elements arrive in the order C, B, A.

C enters the operator and registers a processing-time timer for 12:02:00. The clock on the machine running the Flink job reads 12:00:00, so the job's processing time is 12:00:00.

B enters the operator. The clock on the machine running the Flink job reads 12:01:00, so the job's processing time is 12:01:00.

A enters the operator. The clock on the machine running the Flink job reads 12:02:00, so the job's processing time is 12:02:00.

Once the processing time reaches 12:02:00, which here coincides with the arrival of A, the timer registered by C fires (the processing time is now greater than or equal to 12:02:00).
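
The processing-time example above can be replayed with a small toy model. This is only a sketch in plain Java to make the rule "fire once the clock is greater than or equal to the trigger time" concrete; the class ProcessingTimeTimers and its methods are hypothetical and are not Flink's timer service. Times are expressed as seconds since midnight to keep the numbers readable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Toy model of a processing-time timer queue; a stand-in for illustration, not Flink's API. */
final class ProcessingTimeTimers {
    // Pending trigger times, smallest first
    private final PriorityQueue<Long> timers = new PriorityQueue<>();

    /** Registers a timer for an absolute time (here: seconds since midnight). */
    void register(long triggerTime) {
        timers.add(triggerTime);
    }

    /** Moves the wall clock forward and returns every timer whose trigger time has been reached. */
    List<Long> advanceClockTo(long now) {
        List<Long> fired = new ArrayList<>();
        while (!timers.isEmpty() && timers.peek() <= now) {
            fired.add(timers.poll());
        }
        return fired;
    }

    /** Converts HH:mm:ss into seconds since midnight. */
    static long at(int hours, int minutes, int seconds) {
        return hours * 3600L + minutes * 60L + seconds;
    }
}
```

Registering a timer for at(12, 2, 0) and then advancing the clock to 12:00:00 and 12:01:00 returns an empty list each time; advancing it to 12:02:00 returns the registered timer. Note that only the clock matters in this model: no element has to arrive for the timer to fire.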

Event time: the time attribute carried by the data itself. Whether it advances depends on the timestamps in the source data. Flink tracks event-time progress with watermarks; the example below assumes the simplest case, in which the job's event time equals the largest event time seen so far. (The selection of event time is also affected by whether there is a KEY or not. Please refer to the previous TimeWindow article for details.)

ex: three elements arrive in the order C, B, A. Assume they enter the Flink program a little late, but each carries its own time attribute: A (12:01:00), B (12:01:00), C (12:00:00).

C enters the operator and registers an event-time timer for 12:02:00. The timestamp of C (12:00:00) becomes the job's latest event time.

B enters the operator. Its event time is 12:01:00, which becomes the job's latest event time.

A enters the operator. Its event time is also 12:01:00, equal to the job's current event time, so the job's event time stays at 12:01:00.

After A has been processed, the timer registered by C (12:02:00) still does not fire, because the job's event time has only reached 12:01:00. It fires only once data with an event time greater than or equal to 12:02:00 arrives.
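
The event-time example can be replayed in the same way. The sketch below is plain Java and rests on a stated assumption: the watermark is simply the largest event time seen so far, with no allowance for out-of-order data. The class EventTimeTimers and its methods are hypothetical and only model the behaviour; they are not Flink's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Toy model of event-time timers driven by a watermark; a stand-in for illustration, not Flink's API. */
final class EventTimeTimers {
    // Pending trigger times, smallest first
    private final PriorityQueue<Long> timers = new PriorityQueue<>();
    // Assumption: watermark = largest event time seen so far
    private long watermark = Long.MIN_VALUE;

    /** Registers an event-time timer for an absolute time (here: seconds since midnight). */
    void register(long triggerTime) {
        timers.add(triggerTime);
    }

    /** Processes one element: advances the watermark if needed and returns the timers that fire. */
    List<Long> onElement(long eventTime) {
        watermark = Math.max(watermark, eventTime);
        List<Long> fired = new ArrayList<>();
        while (!timers.isEmpty() && timers.peek() <= watermark) {
            fired.add(timers.poll());
        }
        return fired;
    }

    long currentWatermark() {
        return watermark;
    }

    /** Converts HH:mm:ss into seconds since midnight. */
    static long at(int hours, int minutes, int seconds) {
        return hours * 3600L + minutes * 60L + seconds;
    }
}
```

Feeding in C (12:00:00), registering the 12:02:00 timer, and then feeding in B and A (both 12:01:00) fires nothing, and the watermark stays at 12:01:00. Only a later element with an event time of 12:02:00 or more makes the timer fire. Unlike the processing-time case, the machine clock plays no role here.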

(3) Complete code of processing function

package com.leilei;

import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

import java.text.SimpleDateFormat;
import java.util.Iterator;
import java.util.Map;

/**
 * @author lei
 * @version 1.0
 * @date 2021/3/21 22:17
 * @desc Simulates the automatic review of orders
 */
public class OrderSettlementProcess extends KeyedProcessFunction<String, Order, Object> {
    private final Long overTime;
    MapState<String, Long> productState;
    SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

    public OrderSettlementProcess(Long overTime) {
        this.overTime = overTime;
    }

    /**
     * data processing
     *
     * @param currentOrder
     * @param ctx
     * @param out
     * @throws Exception
     */
    @Override
    public void processElement(Order currentOrder, Context ctx, Collector<Object> out) throws Exception {
        long time = currentOrder.getOrderTime() + this.overTime;
        // Register a processing-time timer that fires at currentOrder.getOrderTime() + this.overTime
        ctx.timerService().registerProcessingTimeTimer(time);
        // The commented-out code simulates a single timer per key: registering a new trigger time removes the previous timer
        //if (productState.contains(ctx.getCurrentKey())) {
        //    ctx.timerService().deleteProcessingTimeTimer(productState.get(ctx.getCurrentKey()));
        //}
        productState.put(ctx.getCurrentKey(), time);
        System.out.println(KeyUtil.buildKey(currentOrder) + " Order expires on:" + time + " :" + df.format(time));
    }

    /**
     * Called when a timer fires
     *
     * @param timestamp the trigger time passed to ctx.timerService().registerProcessingTimeTimer(time) above
     * @param ctx       the context, which exposes the grouping key of the current element, the timer service, and so on
     * @param out       collector for output data
     * @throws Exception
     */
    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Object> out) throws Exception {
        super.onTimer(timestamp, ctx, out);
        System.out.println("Scheduled task execution:" + timestamp + ":" + ctx.getCurrentKey());
        Iterator<Map.Entry<String, Long>> orderIterator = productState.iterator();
        if (orderIterator.hasNext()) {
            Map.Entry<String, Long> orderEntry = orderIterator.next();
            String key = orderEntry.getKey();
            Long expire = orderEntry.getValue();
            //Simulate calling to query order status
            if (!isEvaluation(key) && expire == timestamp) {
                //todo data collection
                System.err.println(key + ">>>>> If the order is not evaluated and the maximum evaluation time is exceeded, the default setting is five-star praise!");
            } else {
                System.out.println(key + " Order has been evaluated!");
            }

        }
    }

    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        //Define map storage state
        MapStateDescriptor<String, Long> mapStateDescriptor = new MapStateDescriptor<>("productState",
                TypeInformation.of(String.class),
                TypeInformation.of(Long.class));
        productState = getRuntimeContext().getMapState(mapStateDescriptor);
    }

    public Boolean isEvaluation(String orderKey) {
        //todo query order status
        return false;
    }
}
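
The commented-out lines in processElement describe a variant in which each key keeps only one live timer: the trigger time stored in state is looked up, the old timer is removed, and the new one takes its place. The sketch below models just that bookkeeping in plain Java so the rule is easy to follow. The class SingleTimerPerKey and its methods are hypothetical; in the real job the map corresponds to productState and the removal corresponds to deleteProcessingTimeTimer.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy model of "one live timer per key": a new registration replaces the previous one. */
final class SingleTimerPerKey {
    // key -> trigger time of the only timer currently registered for that key
    private final Map<String, Long> timerByKey = new HashMap<>();

    /** Registers a timer for the key and returns the trigger time it replaced, if there was one. */
    Optional<Long> register(String key, long triggerTime) {
        return Optional.ofNullable(timerByKey.put(key, triggerTime));
    }

    /** True only when the fired timestamp is still the latest one registered for the key. */
    boolean shouldFire(String key, long firedTimestamp) {
        Long latest = timerByKey.get(key);
        return latest != null && latest == firedTimestamp;
    }
}
```

The shouldFire check plays the same role as the expire == timestamp comparison in onTimer: even if an outdated timer were to fire, it would be ignored because its timestamp no longer matches the one held in state.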

(4) Flink timer DEMO main startup class

package com.leilei;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

/**
 * @author lei
 * @version 1.0
 * @date 2021/3/21 22:00
 * @desc Flink timer demo (automatic positive review, overtime settlement, etc.)
 */
public class FlinkTimer {
    public static void main(String[] args) {

        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStreamSource<Order> streamSource = env.addSource(new MyOrderSource());
        DataStream<Object> stream = streamSource
                //Group by order
                .keyBy(KeyUtil::buildKey)
                .process(new OrderSettlementProcess(120 * 1000L));
        stream.printToErr();
        try {
            env.execute();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Topics: Java Big Data flink