Kafka message sending and receiving: a Java example
The main producer-side objects are KafkaProducer and ProducerRecord.
KafkaProducer is the class used to send messages, and ProducerRecord is the class that encapsulates a Kafka message.
Parameters that must be specified when creating a KafkaProducer, and their meanings:
bootstrap.servers: configures how the producer establishes a connection with the ...
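As a minimal sketch of the configuration described above (the broker address and serializer class names here are illustrative placeholders, not taken from the original), the properties could be assembled like this; the actual KafkaProducer construction is shown only as a comment, since it requires the kafka-clients dependency on the classpath:

```java
import java.util.Properties;

public class ProducerConfigSketch {
    public static Properties buildProducerProps() {
        Properties props = new Properties();
        // Address used to establish the initial connection to the cluster (placeholder host)
        props.put("bootstrap.servers", "localhost:9092");
        // Serializers that turn message keys and values into bytes on the wire
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        Properties props = buildProducerProps();
        System.out.println(props.getProperty("bootstrap.servers")); // prints "localhost:9092"
        // With kafka-clients on the classpath, the producer would then be created as:
        // KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        // producer.send(new ProducerRecord<>("my-topic", "key", "value"));
    }
}
```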
Posted by shareaweb on Fri, 18 Feb 2022 09:06:03 +0100
A detailed explanation of Kafka API
Abstract: Kafka's APIs include the Producer API, the Consumer API, user-defined interceptors, the Streams API for stream processing, and the Kafka Connect API for building connectors.
This article is shared from the Huawei Cloud community: [Kafka notes] Kafka API analyzed in detail, Java version (Producer API, Consumer API, inter ...
Posted by Sianide on Fri, 11 Feb 2022 17:21:16 +0100
How to develop PyFlink API jobs from 0 to 1
Introduction: taking Flink 1.12 as an example, this article introduces how to develop Flink jobs in Python through the PyFlink API.
As the most popular unified stream-batch computing engine, Apache Flink is widely used in real-time ETL, event processing, data analysis, CEP, real-time machine learning, and other fields. Starting from Fl ...
Posted by LoganK on Thu, 10 Feb 2022 17:55:11 +0100
Kafka 2.6.0 installation and configuration
Kafka installation notes:
Official website: http://kafka.apache.org/downloads.html
1. Download
yum install -y wget
wget https://mirrors.bfsu.edu.cn/apache/kafka/2.6.0/kafka_2.12-2.6.0.tgz
2. Extract:
tar -zxvf kafka_2.12-2.6.0.tgz -C /opt/
3. Rename
mv kafka_2.12-2.6.0 kafka
4. Create files
cd kafka
Create un ...
Posted by juschillinnow on Thu, 10 Feb 2022 13:58:48 +0100
High-performance message-oriented middleware: Kafka
Kafka was originally developed at LinkedIn. It is a distributed, partitioned, multi-replica message system that relies on ZooKeeper for coordination. Its biggest feature is that it can process large amounts of data in real time to meet various demand scenarios, such as Hadoop-based batch processing systems and low-latency real-time sy ...
Posted by s.eardley on Mon, 07 Feb 2022 08:46:25 +0100
[Flink] FlinkSQL metadata verification
1. General
Reprint: FlinkSQL metadata validation
After Flink 1.9, the CatalogManager was introduced to manage the Catalog and CatalogBaseTable. When a DDL statement is executed, the table information is encapsulated as a CatalogBaseTable and stored in the CatalogManager. At the same time, Calcite's Schema interface is extended so that c ...
Posted by sticks464 on Thu, 03 Feb 2022 07:29:54 +0100
Spark BigData Program: big data real-time stream processing log
1. Project content
Write Python scripts that continuously generate user-behavior logs for a learning website. Start Flume to collect the generated logs. Start Kafka to receive the logs collected by Flume. Use Spark Streaming to consume the user logs from Kafka. Spark Streaming then cleans the data ...
Posted by iriedodge on Mon, 31 Jan 2022 13:43:09 +0100
Analysis of the Kafka controller
Preface
In a Kafka cluster, how are the addition and removal of a service node or topic handled, and what happens when a partition is added to a topic? Today we analyze one of Kafka's core components: the controller.
What is the controller?
The Controller Broker (KafkaController) is a Kafka service. It runs on every broker in the Kafka cluster, but only one can be active ...
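The "only one can be active" rule can be illustrated with a minimal in-process sketch. This is a conceptual analogy using a compare-and-set race, not Kafka's actual ZooKeeper/KRaft election code; the class and method names are invented for illustration:

```java
import java.util.concurrent.atomic.AtomicReference;

public class ControllerElectionSketch {
    // Holds the id of the broker currently acting as controller (null = no controller yet)
    private final AtomicReference<Integer> activeController = new AtomicReference<>(null);

    // A broker tries to become controller; only the first attempt succeeds,
    // mirroring how only one broker wins the election race in a real cluster.
    public boolean tryBecomeController(int brokerId) {
        return activeController.compareAndSet(null, brokerId);
    }

    public Integer currentController() {
        return activeController.get();
    }

    public static void main(String[] args) {
        ControllerElectionSketch cluster = new ControllerElectionSketch();
        System.out.println(cluster.tryBecomeController(1)); // true: broker 1 wins the race
        System.out.println(cluster.tryBecomeController(2)); // false: a controller already exists
        System.out.println(cluster.currentController());    // 1
    }
}
```

In the real system, the losing brokers watch for the active controller's failure and re-run the election when it goes away.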
Posted by doucie on Sat, 29 Jan 2022 00:54:22 +0100
Kafka quick learning II (producer and consumer development)
Partitioning strategy for producer message sending
1. Default strategy: DefaultPartitioner
A partition explicitly specified when sending a message has the highest priority.
When a key is specified when sending a message, the partition is chosen by taking the key's hash value modulo the number of partitions.
When neither a partition nor a key is specified when sending a message, the message will be ra ...
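The priority order above can be sketched in plain Java. This mirrors the default strategy conceptually only: the real DefaultPartitioner uses murmur2 hashing and (in recent versions) sticky partitioning, while this sketch substitutes `String.hashCode()` and a random choice:

```java
import java.util.Random;

public class PartitionChoiceSketch {
    private static final Random RANDOM = new Random();

    // Choose a partition following the priority order of the default strategy:
    // 1) an explicitly specified partition wins; 2) otherwise hash the key modulo
    // the partition count; 3) otherwise pick at random (stand-in for sticky/round-robin).
    public static int choosePartition(Integer explicitPartition, String key, int numPartitions) {
        if (explicitPartition != null) {
            return explicitPartition;
        }
        if (key != null) {
            // Math.floorMod avoids negative results when hashCode() is negative
            return Math.floorMod(key.hashCode(), numPartitions);
        }
        return RANDOM.nextInt(numPartitions);
    }

    public static void main(String[] args) {
        System.out.println(choosePartition(2, "user-42", 6));    // 2: explicit partition wins
        System.out.println(choosePartition(null, "user-42", 6)); // same partition for the same key
        System.out.println(choosePartition(null, null, 6));      // some partition in [0, 6)
    }
}
```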
Posted by Burns on Fri, 28 Jan 2022 12:58:58 +0100
Application of Alpakka Kafka in distributed computing
Kafka's distributed, high-throughput, high-availability features, together with its various message-consumption modes, can ensure safe message consumption in a multi-node cluster environment: that is, no message is missed or consumed twice. In particular, the exactly-once consumption strategy can ensure that each mes ...
Posted by coreyp_1 on Fri, 28 Jan 2022 11:52:18 +0100