Kafka series: Producer interceptor

Posted by dotMoe on Tue, 21 Dec 2021 19:40:29 +0100

Producer interceptor

The interceptor is a fairly new feature. It was introduced in Kafka 0.10 and is mainly used to implement custom control logic on the client side.

As a rather small feature, the Kafka interceptor has not been widely adopted since its introduction in version 0.10.

Kafka interceptors can be applied to scenarios such as client monitoring, end-to-end performance measurement, and message auditing. For monitoring, we actually use an interceptor to implement this kind of functionality.

Interceptors in Spring

If you have used Spring Interceptor or Apache Flume, the concept of an interceptor will be familiar. The basic idea is to let an application dynamically plug in a chain of event-processing logic without modifying its core logic.

An interceptor can insert "interception" logic at several points before and after the main business operation. The following figure shows how a Spring MVC Interceptor works:

Interceptor 1 and interceptor 2 each insert processing logic at three points: before the request is handled, after it is handled, and after the request completes.

The interceptors in Flume are the same. The logic they insert can be to modify the message to be sent, create a new message, or even discard the message. These functions are dynamically inserted into the application by configuring the interceptor class, so you can quickly switch different interceptors without affecting the logic of the main program.

Kafka interceptors draw on this design idea. You can dynamically insert different processing logic at several points before and after message processing, such as before a message is sent or after a message is consumed.

Interceptors in Kafka

Kafka interceptors are divided into producer interceptors and consumer interceptors. The producer interceptor lets you insert logic before a message is sent and after the send has been acknowledged or has failed; the consumer interceptor lets you run logic before messages are consumed and after offsets are committed.

It is worth mentioning that both kinds of interceptor support chaining: you can connect a group of interceptors in series into one larger interceptor, and Kafka executes them in the order in which they were added.
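The chaining behaviour can be pictured with a small, Kafka-free model. This is only a sketch: the class name `InterceptorChainSketch` and the use of plain strings instead of `ProducerRecord` are assumptions made for illustration, and it is not Kafka's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Simplified model of an interceptor chain (illustration only, no Kafka types).
class InterceptorChainSketch {
    private final List<UnaryOperator<String>> interceptors = new ArrayList<>();

    // Interceptors run in the order in which they are added.
    InterceptorChainSketch add(UnaryOperator<String> interceptor) {
        interceptors.add(interceptor);
        return this;
    }

    // Each interceptor receives the value returned by the previous one.
    String onSend(String message) {
        String current = message;
        for (UnaryOperator<String> interceptor : interceptors) {
            current = interceptor.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        InterceptorChainSketch chain = new InterceptorChainSketch()
                .add(m -> m.trim())              // first: normalize the message
                .add(m -> "[audited] " + m);     // second: tag the normalized message
        System.out.println(chain.onSend("  hello  ")); // prints "[audited] hello"
    }
}
```

Swapping the two `add` calls changes the result, which is why the order in which interceptors are registered matters.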

For the producer, the interceptor gives the user a chance to customize a message, for example by modifying it, before it is sent, and to run logic before the producer's callback is invoked.

The producer also allows users to specify several interceptors that act on the same message in order, forming an interceptor chain. The interface to implement is org.apache.kafka.clients.producer.ProducerInterceptor, which defines the following methods:

public interface ProducerInterceptor<K, V> extends Configurable {
    // This method is called before the message is sent
    public ProducerRecord<K, V> onSend(ProducerRecord<K, V> record);

    // This method is called after the message is successfully acknowledged or the send fails
    public void onAcknowledgement(RecordMetadata metadata, Exception exception);

    // Closes the interceptor; mainly used to clean up resources
    public void close();

    // Inherited from Configurable
    void configure(Map<String, ?> configs);
}

onAcknowledgement: this method is called after the message is successfully acknowledged or after the send fails. Asynchronous sends notify the caller through a callback, and onAcknowledgement is invoked before that callback.

It is worth noting that this method and onSend are not invoked in the same thread, so if both methods access a shared mutable object, you must ensure thread safety.
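The thread-safety point can be illustrated without Kafka. The sketch below assumes a hypothetical "in-flight" counter that is incremented where onSend would run and decremented where onAcknowledgement would run; the class and method names are made up for this example.

```java
import java.util.concurrent.atomic.AtomicLong;

// Kafka-free sketch: one counter updated from two different threads.
class InFlightCounterSketch {
    // AtomicLong makes each update atomic; a plain long could lose updates.
    private final AtomicLong inFlight = new AtomicLong();

    void onSend() { inFlight.incrementAndGet(); }            // caller's thread
    void onAcknowledgement() { inFlight.decrementAndGet(); } // I/O thread
    long inFlight() { return inFlight.get(); }

    // Runs both updates concurrently and returns the final counter value.
    static long runDemo(int messages) {
        InFlightCounterSketch counter = new InFlightCounterSketch();
        Thread sender = new Thread(() -> {
            for (int i = 0; i < messages; i++) counter.onSend();
        });
        Thread ioThread = new Thread(() -> {
            for (int i = 0; i < messages; i++) counter.onAcknowledgement();
        });
        sender.start();
        ioThread.start();
        try {
            sender.join();
            ioThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.inFlight();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(100_000)); // prints 0
    }
}
```

With a plain `long` field and `++`/`--`, the final value could differ from 0 because concurrent updates may overwrite each other.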

Another important point is that this method runs on the producer's main send path, so it is best not to put much logic in it; otherwise you will see the producer's TPS drop sharply.
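One common way to keep such a method cheap is to hand slow work to a background thread and return immediately. The following is a minimal sketch of that idea under stated assumptions: `AckOffloadSketch`, `onAck`, and `slowAudit` are hypothetical names, and no Kafka types are involved.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Kafka-free sketch: offload slow work from the acknowledgement path.
class AckOffloadSketch {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final AtomicInteger processed = new AtomicInteger();

    // Would be called from onAcknowledgement: only enqueue, then return.
    void onAck(String topic, boolean success) {
        worker.execute(() -> slowAudit(topic, success));
    }

    // Stand-in for slow work such as a remote call; runs on the worker thread.
    private void slowAudit(String topic, boolean success) {
        processed.incrementAndGet();
    }

    // Would be called from close(): drain pending tasks, then stop the worker.
    int shutdownAndCount() {
        worker.shutdown();
        try {
            worker.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.get();
    }

    public static void main(String[] args) {
        AckOffloadSketch sketch = new AckOffloadSketch();
        sketch.onAck("test", true);
        sketch.onAck("test", false);
        System.out.println(sketch.shutdownAndCount()); // prints 2
    }
}
```

The executor here uses an unbounded queue, so a real deployment would probably want a bounded queue or a drop policy to avoid unbounded memory growth when the worker falls behind.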

Define interceptor

Here we define an interceptor that counts messages, so that we can track the data volume of a business line (topic).

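import java.util.Map;
import org.apache.kafka.clients.producer.ProducerInterceptor;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;

public class CountRecordProducerInterceptor implements ProducerInterceptor<String, String> {
    // onSend and onAcknowledgement run on different threads, and a single Jedis
    // connection is not thread-safe, so each call borrows a connection from a pool.
    private final JedisPool pool = new JedisPool("localhost", 6379);

    private void increment(String key) {
        try (Jedis jedis = pool.getResource()) {
            jedis.incr(key);
        }
    }

    @Override
    public ProducerRecord<String, String> onSend(ProducerRecord<String, String> record) {
        increment("totalMessageCount");
        return record;
    }

    @Override
    public void onAcknowledgement(RecordMetadata metadata, Exception exception) {
        if (exception == null) {
            increment("totalSuccessMessageCount");
        } else {
            increment("totalFailedMessageCount");
        }
    }

    @Override
    public void close() {
        // Release the Redis connections when the producer is closed
        pool.close();
    }

    @Override
    public void configure(Map<String, ?> configs) {
        // No configuration is needed for this example
    }
}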
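The configure method is left empty in the interceptor above, which hard-codes the Redis host. As far as I know, the map passed to configure carries the producer's configuration, so extra keys placed in the producer Properties should be readable there; this is worth verifying against the client version in use. The sketch below shows only the parsing step, and the key names `count.interceptor.redis.host` and `count.interceptor.redis.port` are hypothetical, not defined by Kafka or Jedis.

```java
import java.util.HashMap;
import java.util.Map;

// Kafka-free sketch: read interceptor settings from a config map.
class InterceptorSettingsSketch {
    // Hypothetical property names chosen for this example.
    static final String HOST_KEY = "count.interceptor.redis.host";
    static final String PORT_KEY = "count.interceptor.redis.port";

    final String host;
    final int port;

    // Mirrors what configure(Map<String, ?> configs) could do with its argument.
    InterceptorSettingsSketch(Map<String, ?> configs) {
        Object h = configs.get(HOST_KEY);
        Object p = configs.get(PORT_KEY);
        // Fall back to defaults when a key is absent.
        this.host = (h == null) ? "localhost" : h.toString();
        this.port = (p == null) ? 6379 : Integer.parseInt(p.toString());
    }

    public static void main(String[] args) {
        Map<String, Object> configs = new HashMap<>();
        configs.put(HOST_KEY, "redis.internal");
        InterceptorSettingsSketch settings = new InterceptorSettingsSketch(configs);
        System.out.println(settings.host + ":" + settings.port); // prints "redis.internal:6379"
    }
}
```

Reading settings this way would let the same interceptor class be reused across environments without recompiling.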

Use interceptors
Using the interceptor is also simple: we just add it to the Properties passed to the KafkaProducer constructor.
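import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.Future;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;
import org.apache.kafka.common.serialization.StringSerializer;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

// Named ...Test so that it does not shadow Kafka's ProducerInterceptor interface
public class ProducerInterceptorTest {
    KafkaProducer<String, String> producer = null;

    @Before
    public void setup() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("acks", "all");
        props.put("retries", 0);
        props.put("batch.size", 16384);
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        List<String> interceptors = new ArrayList<>();
        // Interceptor 1
        interceptors.add("com.kingcall.clients.producer.interceptors.interceptorEntity.producer.CountRecordProducerInterceptor");
        props.put(ProducerConfig.INTERCEPTOR_CLASSES_CONFIG, interceptors);
        producer = new KafkaProducer<>(props);
    }

    @Test
    public void baseSend() {
        ProducerRecord<String, String> record = new ProducerRecord<>("test", "Precision Products", "France");
        try {
            Future<RecordMetadata> recordMetadataFuture = producer.send(record);
            // Block until the broker acknowledges the record, then print its metadata
            System.out.println(recordMetadataFuture.get().toString());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    @After
    public void tearDown() {
        // Closing the producer also closes its interceptors
        producer.close();
    }
}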
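The test above registers a single interceptor through a List. A chain can also be expressed as one comma-separated string under the `interceptor.classes` key, which is the property name behind `ProducerConfig.INTERCEPTOR_CLASSES_CONFIG`. The sketch below builds only the Properties object; the helper `withInterceptors` and the class name `com.example.AuditProducerInterceptor` are hypothetical.

```java
import java.util.Properties;

// Sketch: build producer properties that register several interceptors.
class InterceptorConfigSketch {
    // Joins the class names in the given order; that order is the chain order.
    static Properties withInterceptors(String... classNames) {
        Properties props = new Properties();
        props.put("interceptor.classes", String.join(",", classNames));
        return props;
    }

    public static void main(String[] args) {
        Properties props = withInterceptors(
                "com.kingcall.clients.producer.interceptors.interceptorEntity.producer.CountRecordProducerInterceptor",
                "com.example.AuditProducerInterceptor"); // hypothetical second interceptor
        System.out.println(props.getProperty("interceptor.classes"));
    }
}
```

The remaining producer settings (bootstrap servers, serializers, and so on) would be added to the same Properties object before it is passed to the KafkaProducer constructor.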

View statistics

Note: do not run the Redis command keys * in a production environment.

Summary

Although the interceptor is a relatively small feature, it can help us solve problems we run into. An interceptor may affect client performance, however, so it should be used judiciously and its impact should be monitored.