Kafka (V): Kafka & Java Advanced API

Posted by bharrison89 on Sat, 20 Nov 2021 22:40:42 +0100


1. Automatic offset control

When a consumer group has no committed offset for a topic partition, that is, Kafka holds no record of this consumer, the consumer falls back to its initial consumption strategy:

auto.offset.reset = latest

  • latest: start consuming from the latest offset (the default)
  • earliest: start consuming from the earliest available offset of each assigned partition
  • none: report an error to the consumer if no previous offset is found for the consumer group
// When the server has no committed offset for this consumer group, start from the earliest record of each assigned partition
properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
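For completeness, here is a minimal self-contained consumer sketch with this setting in place; the broker addresses, group id and topic name are placeholders borrowed from the examples later in this post:

Properties properties = new Properties();
properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "CentOSA:9092,CentOSB:9092,CentOSC:9092");
properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
properties.put(ConsumerConfig.GROUP_ID_CONFIG, "group1");
// With no committed offset for group1, start from the earliest record of each partition
properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

KafkaConsumer<String, String> kafkaConsumer = new KafkaConsumer<>(properties);
kafkaConsumer.subscribe(Arrays.asList("topic01"));
while (true) {
    ConsumerRecords<String, String> records = kafkaConsumer.poll(Duration.ofSeconds(1));
    for (ConsumerRecord<String, String> record : records) {
        System.out.println(record.key() + " -> " + record.value());
    }
}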

Automatic offset commit

By default, a Kafka consumer commits the offsets of consumed records periodically, which guarantees that every message is consumed at least once. The related parameters:

// Auto commit
enable.auto.commit = true // default
auto.commit.interval.ms = 5000 // default

The corresponding Java configuration (auto-commit on, commit interval 10000 ms):

properties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, true);
properties.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, 10000);

Test: if the consumer is shut down within 10000 ms of consuming a record, the offset has not yet been committed automatically. After the consumer restarts, it receives the record again and consumes it a second time. Only once the auto-commit interval has passed and the offset has been committed is the record considered consumed; after that the consumer no longer receives it.

Turning off auto-commit

        properties.put(ConsumerConfig.GROUP_ID_CONFIG,"group1");
        properties.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        properties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);

Every time the consumer starts, it consumes from the beginning again: because auto-commit is disabled and no offsets are ever committed, Kafka redelivers the records to the consumer in order to guarantee at-least-once consumption.

Manual commit

while (true) {
    ConsumerRecords<String, String> records = kafkaConsumer.poll(Duration.ofSeconds(1));
    Map<TopicPartition, OffsetAndMetadata> offsetInfo = new HashMap<>();
    if (!records.isEmpty()) {
        Iterator<ConsumerRecord<String, String>> iterator = records.iterator();
        while (iterator.hasNext()) {
            ConsumerRecord<String, String> next = iterator.next();
            // Note: commit next.offset() + 1, i.e. the position of the next record to read
            offsetInfo.put(new TopicPartition(next.topic(), next.partition()),
                    new OffsetAndMetadata(next.offset() + 1));

            // Asynchronous commit with a callback that logs the result
            kafkaConsumer.commitAsync(offsetInfo, new OffsetCommitCallback() {
                @Override
                public void onComplete(Map<TopicPartition, OffsetAndMetadata> offsets, Exception exception) {
                    System.out.println("offset:" + next.offset() + " ||| committed offsets:" + offsets + " key:" + next.key());
                    System.out.println("exception:" + exception);
                }
            });
        }
    }
}
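commitAsync does not block the poll loop; when you need to be sure the offsets were actually stored before polling again, the blocking commitSync variant can be used instead. Below is a minimal sketch, assuming it reuses the kafkaConsumer and the offsetInfo map built in the loop above and commits once per poll rather than once per record:

try {
    // Blocks until the broker acknowledges the commit (or it fails for good)
    kafkaConsumer.commitSync(offsetInfo);
} catch (CommitFailedException e) {
    // Unrecoverable commit failure (for example, the group has already rebalanced)
    System.out.println("offset commit failed: " + e.getMessage());
}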

2. Acks & Retries

After sending a record, the producer expects the broker to reply with an ACK within a specified time. If the broker does not respond within that time, the Kafka producer tries to resend the record up to n times. The default is acks = 1.

  • acks = 1: the leader writes the record to its local log but does not wait for all followers to acknowledge it. If the leader fails immediately after acknowledging the record, before the followers have replicated it, the record is lost.

  • acks = 0: the producer does not wait for any response from the server. The record is added to the socket buffer and immediately considered sent (i.e. handed off to the local network card); you cannot assume the server has received the data.

  • acks = all: the leader waits for the full set of in-sync replicas to acknowledge the record. As long as at least one in-sync replica stays alive, the record is not lost. This is the strongest guarantee and is equivalent to acks = -1.

If the producer does not receive an ACK from the Kafka leader within the specified period, Kafka's retries mechanism can take over. The relevant defaults:

request.timeout.ms = 30000 // default

retries = 2147483647 // default

Under these circumstances a message may be written to the partition log more than once (the write succeeded but the ACK was lost, so the producer resent it). The idempotent write described below handles this problem.

// Acknowledgment level
properties.put(ProducerConfig.ACKS_CONFIG,"all");
// Number of retries
properties.put(ProducerConfig.RETRIES_CONFIG,5);
// How long to wait for the leader's ACK (ms)
properties.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG,10);

3. Idempotent write

In HTTP/1.1, idempotency is defined as follows: one or more identical requests for a resource should have the same effect on the resource itself; that is, executing the request multiple times has the same impact on the resource as executing it once.

Kafka has supported idempotency since version 0.11.0.0. Idempotency is a producer-side property: it guarantees that data sent by the producer is neither lost nor duplicated. The key to implementing idempotent writes in Kafka is recognising whether a request is a duplicate and filtering duplicates out.

Two things are required: a unique ID carried in each request, so requests can be told apart, and a record of whether a request has already been processed. If a new request matches an already-processed record, it is a duplicate and is rejected.

Idempotence gives exactly-once persistence: a message is persisted to a Kafka topic only once. During initialisation Kafka assigns the producer a PID (producer ID), and messages carry a monotonically increasing sequence number, starting from 0, per PID and partition. When a new message arrives, the broker compares its sequence number with that of the last persisted message: if it is exactly one larger, the message is new; otherwise the broker concludes that the producer is resending the message.

enable.idempotence = false by default. Note: when enabling it, retries must be enabled (retries > 0) and acks must be set to all.

// Enable idempotent (exactly-once) writes
properties.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG,true);
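As a minimal sketch, the settings that idempotence depends on can be combined into one producer configuration. The broker addresses and topic name below simply reuse the placeholder values from the transaction examples later in this post; note that Kafka also requires max.in.flight.requests.per.connection to be at most 5 when idempotence is on:

Properties properties = new Properties();
properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "CentOSA:9092,CentOSB:9092,CentOSC:9092");
properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
// Idempotence requires acks=all, retries > 0 and at most 5 in-flight requests per connection
properties.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
properties.put(ProducerConfig.ACKS_CONFIG, "all");
properties.put(ProducerConfig.RETRIES_CONFIG, 5);
properties.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 5);

KafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(properties);
kafkaProducer.send(new ProducerRecord<>("topic01", "key", "value"));
kafkaProducer.close();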

4. Transaction control

Kafka's idempotent writes guarantee atomicity for a single record, but atomicity across multiple records (and multiple partitions) requires Kafka's transaction support.

Kafka introduced idempotence in 0.11.0.0 and, along with it, transactions. Kafka's transactions fall into two categories:

  • Producer-only transactions (when a producer fails while producing multiple records, the transaction is rolled back; the records are not deleted from the log, so consumers need read_committed to skip them)
  • Consumer-producer transactions (a consume-transform-produce flow, e.g. between microservices, where the consumer and the producer take part in one transaction)

By default the consumer's isolation level is read_uncommitted, so it may read records from failed (aborted) transactions. Once producer transactions are in use, the consumer's isolation level needs to be set accordingly:

isolation.level = read_uncommitted

This configuration has two options; the other is read_committed. If transaction control is enabled on the producer side, the consumer must set its isolation level to read_committed.

To use producer transactions you only need to specify the transactional.id property; once transactions are enabled, the producer's idempotent writes are enabled automatically. The value of transactional.id must be unique: only one producer with a given transactional.id can be active at a time, and any other producer using the same id is fenced off.

Example: producer transaction

    public static void main(String[] args) {
        // 1. Standard KafkaProducer configuration
        Properties properties = new Properties();
        properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "CentOSA:9092,CentOSB:9092,CentOSC:9092");
        properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // To open a transaction, the producer must be configured with a transactional.id
        properties.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "tx_id" + UUID.randomUUID().toString());
        // Configure kafka batch size
        properties.put(ProducerConfig.BATCH_SIZE_CONFIG, 1024);
        // If the batch does not reach BATCH_SIZE within this many milliseconds, send it anyway (linger.ms)
        properties.put(ProducerConfig.LINGER_MS_CONFIG, 5);
        // Idempotence, retries and acks
        properties.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
        properties.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 10);
        properties.put(ProducerConfig.RETRIES_CONFIG, 5);
        properties.put(ProducerConfig.ACKS_CONFIG, "all");

        KafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(properties);


        kafkaProducer.initTransactions();
        try {
            kafkaProducer.beginTransaction();
            for (int i = 0; i < 10; i++) {
                ProducerRecord<String, String> record = new ProducerRecord<>("topic02", "key" + i, "value" + i);
                // send the record
                kafkaProducer.send(record);

                // trigger an error partway through
                if (i == 5) {
                    int b = 1 / 0; // Simulate an error so the producer transaction rolls back
                    // A read_uncommitted consumer would still see the records already sent
                    // A read_committed consumer sees none of them, because the transaction is aborted
                }
            }
            // Flush the producer buffer so any pending records are sent to Kafka before committing
            kafkaProducer.flush();
            kafkaProducer.commitTransaction();
        } catch (Exception e) {
            System.out.println(" Transaction error ");
            kafkaProducer.abortTransaction();
        } finally {
            kafkaProducer.close();
        }
    }

Example: read_committed consumer

    public static void main(String[] args) {
        // 1. Standard KafkaConsumer configuration
        Properties properties = new Properties();
        properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,"CentOSA:9092,CentOSB:9092,CentOSC:9092");
        properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,StringDeserializer.class.getName());
        properties.put(ConsumerConfig.GROUP_ID_CONFIG,"g2");

        // Set the consumer isolation level; this is the key setting. The default is read_uncommitted
        properties.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG,"read_committed");

        KafkaConsumer<String,String> kafkaConsumer = new KafkaConsumer<>(properties);
        kafkaConsumer.subscribe(Pattern.compile("^topic02.*"));
        while (true){
            ConsumerRecords<String, String> records = kafkaConsumer.poll(Duration.ofSeconds(1));
            if(!records.isEmpty()){
                Iterator<ConsumerRecord<String, String>> iterator = records.iterator();
                while (iterator.hasNext()){
                    ConsumerRecord<String, String> next = iterator.next();
                    System.out.println(next.key());
                }
            }
        }
    }

Example: consumer & producer transactions

topic01 producer: produces the data

kafkaProducer.initTransactions();
        try {
            kafkaProducer.beginTransaction();
            for (int i = 0; i < 10; i++) {
                ProducerRecord<String, String> record = new ProducerRecord<>("topic01", "key" + i, "value" + i);
                // send the record
                kafkaProducer.send(record);

                // trigger an error partway through
                if (i == 5) {
                    int b = 1 / 0; // Simulate an error so the producer transaction rolls back
                    // A read_uncommitted consumer would still see the records already sent
                    // A read_committed consumer sees none of them, because the transaction is aborted
                }
            }
            // Flush the producer buffer so any pending records are sent to Kafka before committing
            kafkaProducer.flush();
            kafkaProducer.commitTransaction();
        } catch (Exception e) {
            System.out.println(" Transaction error ");
            kafkaProducer.abortTransaction();
        } finally {
            kafkaProducer.close();
        }

topic01 consumer & topic02 producer ==> the topic01 producer transaction above fails and rolls back ==> topic02 receives no data

 public static void main(String[] args) {
        KafkaConsumer kafkaTopic01Consumer = buildConsumer("g2");
        kafkaTopic01Consumer.subscribe(Arrays.asList("topic01"));
        KafkaProducer kafkaTopic02Producer = buildProducer();

        //1. Initialization
        kafkaTopic02Producer.initTransactions();
        while (true){
            ConsumerRecords<String, String> records = kafkaTopic01Consumer.poll(Duration.ofSeconds(1));
            if(!records.isEmpty()){
                Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
                Iterator<ConsumerRecord<String, String>> iterator = records.iterator();
                // Begin a transaction
                kafkaTopic02Producer.beginTransaction();
                try {
                    //Business code processing
                    while (iterator.hasNext()){
                        ConsumerRecord<String, String> record = iterator.next();
                        System.out.println(record.key());
                        offsets.put(new TopicPartition(record.topic(),record.partition()),new OffsetAndMetadata(record.offset()+1));
                        // Build the record for the downstream stage (topic02)
                        ProducerRecord<String,String> nextRecord = new ProducerRecord<>("topic02", record.key(), record.value() + " processed by business step 1");
                        kafkaTopic02Producer.send(nextRecord);
                    }
                    // Send the consumed offsets to the transaction, then commit
                    kafkaTopic02Producer.sendOffsetsToTransaction(offsets,"g2");
                    kafkaTopic02Producer.commitTransaction();
                } catch (ProducerFencedException e) {
                    System.out.println("error");
                    // Roll back: the downstream topic02 / business-processing consumer will not receive these records
                    kafkaTopic02Producer.abortTransaction();
                }
            }
        }
    }

    public static KafkaProducer buildProducer(){
        // 1. Standard KafkaProducer configuration
        Properties properties = new Properties();
        properties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "CentOSA:9092,CentOSB:9092,CentOSC:9092");
        properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // To open a transaction, the producer must be configured with a transactional.id
        properties.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "tx_id" + UUID.randomUUID().toString());
        // Configure kafka batch size
        properties.put(ProducerConfig.BATCH_SIZE_CONFIG, 1024);
        // If the batch does not reach BATCH_SIZE within this many milliseconds, send it anyway (linger.ms)
        properties.put(ProducerConfig.LINGER_MS_CONFIG, 5);
        // Idempotence, retries and acks
        properties.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
        properties.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 10);
        properties.put(ProducerConfig.RETRIES_CONFIG, 5);
        properties.put(ProducerConfig.ACKS_CONFIG, "all");

        KafkaProducer<String, String> kafkaProducer = new KafkaProducer<>(properties);
        return kafkaProducer;
    }

    public static KafkaConsumer buildConsumer(String groupId){
        // 1. Standard KafkaConsumer configuration
        Properties properties = new Properties();
        properties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG,"CentOSA:9092,CentOSB:9092,CentOSC:9092");
        properties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        properties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,StringDeserializer.class.getName());
        properties.put(ConsumerConfig.GROUP_ID_CONFIG,groupId);


        //Set consumption isolation level
        properties.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG,"read_committed");
        // Auto-commit of offsets must be disabled here; offsets are committed through the producer transaction
        properties.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG,false);

        KafkaConsumer<String,String> kafkaConsumer = new KafkaConsumer<>(properties);
        return kafkaConsumer;
    }

Topics: message queue