I. Versions
CentOS 7.5
zookeeper-3.4.12
kafka_2.12-1.1.0
II. zookeeper Installation
1. Download and extract the zookeeper package
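For example, the 3.4.12 tarball can be pulled from the Apache archive and extracted under /usr/local (the download mirror and the target path here are assumptions; adjust them to your environment):

wget http://archive.apache.org/dist/zookeeper/zookeeper-3.4.12/zookeeper-3.4.12.tar.gz
tar -zvxf zookeeper-3.4.12.tar.gz -C /usr/local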
2. Create the data and log folders
mkdir /usr/local/zookeeper-3.4.12/data
mkdir /usr/local/zookeeper-3.4.12/logs
3. Copy configuration files
Go to the conf directory and copy zoo_sample.cfg
cp zoo_sample.cfg zoo.cfg
4. Enter the data directory and execute the command
echo 1 > myid
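The number written to myid must match this host's server.N entry in zoo.cfg (see step 5 below). Assuming the server.2/server.3 numbering used in this article, the command on the other two nodes would be:

echo 2 > myid    # on the host listed as server.2
echo 3 > myid    # on the host listed as server.3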
5. Modify the configuration file
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/usr/local/zookeeper-3.4.12/data
dataLogDir=/usr/local/zookeeper-3.4.12/logs
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
# Cluster server addresses
server.1=IP1:2888:3888
server.2=IP2:2888:3888
server.3=IP3:2888:3888
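The same zoo.cfg can then be pushed to the other two nodes, for example with scp (the root user and the IP placeholders are illustrative):

scp zoo.cfg root@IP2:/usr/local/zookeeper-3.4.12/conf/
scp zoo.cfg root@IP3:/usr/local/zookeeper-3.4.12/conf/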
6. Start zookeeper
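ZooKeeper ships a control script under bin; a minimal sketch of starting each node and checking its role, run from /usr/local/zookeeper-3.4.12 on every server:

bin/zkServer.sh start
bin/zkServer.sh status    # reports Mode: leader or Mode: follower once the quorum is formed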
III. Installation of kafka
1. Download and extract the kafka package
tar -zvxf kafka_2.12-1.1.0.tgz
2. Modify the configuration file
Open the kafka configuration file
vim config/server.properties
Modify the configuration
# The broker id. Set to a unique number; the three servers use 1, 2 and 3 respectively.
broker.id=1
# Listener address advertised to clients
advertised.listeners=PLAINTEXT://IP1:9092
# Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details
#listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
# Number of threads kafka uses for network communication
num.network.threads=3
# Number of threads kafka uses for IO operations
num.io.threads=8
# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=102400
# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=102400
# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600
# Data storage path
log.dirs=/tmp/kafka-logs
# Default number of partitions per topic
num.partitions=1
# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
num.recovery.threads.per.data.dir=1
# In a cluster, set these to greater than 1 to ensure availability; here they are set to 3
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=3
transaction.state.log.min.isr=3

############################# Log Flush Policy #############################

# Log retention time in hours
log.retention.hours=168
# A size-based retention policy for logs. Segments are pruned from the log unless the remaining
# segments drop below log.retention.bytes. Functions independently of log.retention.hours.
#log.retention.bytes=1073741824
# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=1073741824
# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=300000

############################# Zookeeper #############################

# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=IP1:2181,IP2:2181,IP3:2181
# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000

############################# Group Coordinator Settings #############################

# The following configuration specifies the time, in milliseconds, that the GroupCoordinator will delay the initial consumer rebalance.
# The rebalance will be further delayed by the value of group.initial.rebalance.delay.ms as new members join the group, up to a maximum of max.poll.interval.ms.
# The default value for this is 3 seconds.
# We override this to 0 here as it makes for a better out-of-the-box experience for development and testing.
# However, in production environments the default value of 3 seconds is more suitable as this will help to avoid unnecessary, and potentially expensive, rebalances during application startup.
#group.initial.rebalance.delay.ms=0
# Whether to create topics automatically (false = no, true = yes)
auto.create.topics.enable=false
# Allow topic deletion; the default is false
delete.topic.enable=true
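Only broker.id and advertised.listeners need to differ between the three brokers; on the other two servers the corresponding lines would look like this (IP2 and IP3 stand in for the real addresses):

# second broker
broker.id=2
advertised.listeners=PLAINTEXT://IP2:9092

# third broker
broker.id=3
advertised.listeners=PLAINTEXT://IP3:9092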
3. Start up kafka
bin/kafka-server-start.sh -daemon config/server.properties
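To verify the cluster, a test topic can be created and inspected from any broker (the topic name is illustrative; IP1 is one of the ZooKeeper hosts from zookeeper.connect):

bin/kafka-topics.sh --create --zookeeper IP1:2181 --replication-factor 3 --partitions 3 --topic test
bin/kafka-topics.sh --describe --zookeeper IP1:2181 --topic test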
IV. Relevant parameters
Broker configuration
message.max.bytes (default: ~1MB) - The maximum message size the broker will accept. It should be greater than or equal to the producer's max.request.size and less than or equal to the consumer's fetch.message.max.bytes; otherwise a consumer can get stuck on a message it is unable to fetch.
log.segment.bytes (default: 1GB) - The maximum size of a kafka log segment file. Make sure this value is larger than the largest single message; the default is usually fine, since a single message rarely approaches 1GB (kafka is a messaging system, not a file system).
replica.fetch.max.bytes (default: 1MB) - The maximum message size a broker can replicate. This should be at least as large as message.max.bytes; otherwise a broker may accept a message it cannot replicate, which can lead to data loss.
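As a sketch, these broker settings might be kept consistent in server.properties like this (the 10MB figure is illustrative, not a recommendation):

# largest message the broker will accept (bytes)
message.max.bytes=10485760
# keep this >= message.max.bytes so followers can replicate every accepted message
replica.fetch.max.bytes=10485760
# segment size stays well above the largest single message
log.segment.bytes=1073741824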
Consumer configuration
fetch.message.max.bytes (default: 1MB) - The largest message a consumer will read. This value should be greater than or equal to message.max.bytes. If you do choose kafka for large messages, there are still things to weigh: consider the impact of large messages on the cluster and its topics at design time, not after problems appear.
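This property belongs to the old (Scala) consumer; the newer Java consumer uses max.partition.fetch.bytes and fetch.max.bytes for the same purpose. A minimal consumer properties sketch matching the 10MB broker example above (value illustrative):

# must be >= the broker's message.max.bytes
fetch.message.max.bytes=10485760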
Producer configuration
buffer.memory (default: 32MB) - The size of the producer's buffer. A large enough buffer lets the producer keep writing, but a successful write into the buffer does not mean the message has actually been sent.
batch.size (default: 16384 bytes) - The size of each batch; a batch is sent once it reaches this size. The buffer can hold multiple batches.
linger.ms - The maximum time to wait for a batch that has not yet reached batch.size; once it elapses, the batch is sent anyway.
max.request.size (default: 1MB) - The maximum size of a single request sent by the producer; it should be larger than batch.size (see the sketch below).
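A sketch of how these four settings might appear together in a producer configuration (all values are illustrative, not tuned recommendations):

# 32MB of buffer for records waiting to be sent
buffer.memory=33554432
# a batch is sent once it reaches 16KB...
batch.size=16384
# ...or after 5ms, whichever comes first
linger.ms=5
# upper bound on a single produce request; keep it larger than batch.size
max.request.size=1048576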
Notes
1. To ensure that all partitions remain available, set offsets.topic.replication.factor to at least 3.
2. Turn off automatic topic creation, and make sure all brokers in the cluster are up before clients start consuming; otherwise partitions and their replicas may not be evenly distributed, which hurts high availability.
3. After the cluster is started, the distribution of partitions and their replicas can be viewed with the following command.
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic __consumer_offsets
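The describe action also accepts an --under-replicated-partitions filter, which lists only the partitions whose replicas are not fully in sync and makes replication problems easy to spot:

bin/kafka-topics.sh --describe --zookeeper localhost:2181 --under-replicated-partitions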