An Overview of ZooKeeper

Posted by neylitalo on Sun, 11 Aug 2019 17:12:54 +0200

A while back, business needs at my company led me to put together some notes on ZooKeeper, which I'm sharing here. If you're just getting started with ZooKeeper, hopefully you can pick up something useful from them.

1. Introduction to ZooKeeper

Brief introduction
ZooKeeper is dedicated to providing a distributed coordination service with high performance, high availability, and strictly ordered access control.

Design goals

  • Simple data structure: a shared, file-system-like tree of nodes, kept in memory;
  • Clusterable: to avoid a single point of failure, 3-5 machines can form a cluster, and the cluster keeps serving clients as long as more than half of its members are healthy (see the small example after this list);
  • Sequential access: zk assigns every write request a globally unique, monotonically increasing number, which can be used to build higher-level coordination services;
  • High performance: data lives in memory and zk serves non-transactional (read) requests directly, so it suits read-heavy workloads; a three-node zk cluster can reach roughly 130,000 QPS.
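
To make the "more than half" rule concrete, here is a tiny illustration in plain Java (not part of ZooKeeper itself): a cluster keeps serving requests as long as a majority (a quorum) of its servers is healthy.

public class QuorumDemo {
    public static void main(String[] args) {
        for (int servers : new int[]{3, 4, 5}) {
            int quorum = servers / 2 + 1;     // majority needed to keep serving
            int tolerated = servers - quorum; // failures the ensemble can survive
            System.out.println(servers + " servers -> quorum " + quorum
                    + ", tolerates " + tolerated + " failure(s)");
        }
    }
}

Note that 4 servers tolerate no more failures than 3, which is why odd-sized ensembles are preferred.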

Application scenarios

  1. Data Publishing and Subscription
  2. load balancing
  3. Naming Service
  4. Master election
  5. Cluster Management
  6. configuration management
  7. Distributed queue
  8. Distributed Lock

2. ZooKeeper Characteristics

Session: a session between the client and the server, which is essentially a long-lived TCP connection used for heartbeat detection and data transfer.

Data node (znode) types:

  1. Persistent node (PERSISTENT)
  2. Persistent sequential node (PERSISTENT_SEQUENTIAL)
  3. Ephemeral node (EPHEMERAL)
  4. Ephemeral sequential node (EPHEMERAL_SEQUENTIAL)

For persistent and ephemeral (non-sequential) nodes, the node name must be unique under the same parent znode.

Watcher event listener: a client can register a listener on a node, and when the corresponding event occurs, zk notifies the interested clients.
EventType: NodeCreated, NodeDeleted, NodeDataChanged, NodeChildrenChanged
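
As a small illustration of the watcher mechanism, here is a minimal sketch using the raw ZooKeeper Java API; the connect string and the /config path are placeholders, and a reachable zk server is assumed.

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class WatcherDemo {
    public static void main(String[] args) throws Exception {
        // The third constructor argument is the session's default watcher.
        ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 5000, event -> { });

        // Register a one-shot watcher on /config via exists(); it fires for the
        // next NodeCreated / NodeDeleted / NodeDataChanged event on that path.
        zk.exists("/config", new Watcher() {
            @Override
            public void process(WatchedEvent event) {
                System.out.println(event.getType() + " on " + event.getPath());
            }
        });

        Thread.sleep(30_000); // keep the session alive long enough to observe an event
        zk.close();
    }
}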

ACL: zk uses ACLs (access control lists) to control permissions on nodes.
Permission types: create, read, write, delete, admin

3. Common ZooKeeper Commands

  1. Start the ZK service: bin/zkServer.sh start
  2. View ZK service status: bin/zkServer.sh status
  3. Stop the ZK service: bin/zkServer.sh stop
  4. Restart the ZK service: bin/zkServer.sh restart
  5. Connect with the client: bin/zkCli.sh -server 127.0.0.1:2181
  6. List a directory: ls /
  7. Create a node: create /zk "test"
  8. Get a node's value: get /zk
  9. Set a node's value: set /zk "test"
  10. Delete a node: delete /zk
  11. ACL commands (a sample session follows):

    • getAcl / setAcl
    • addauth
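
As an illustration, an ACL session in zkCli might look like the following; the digest user, password, and path are made up for this example.

    addauth digest user1:pass1
    create /acl-demo "some-data"
    setAcl /acl-demo auth::cdrwa
    getAcl /acl-demo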

4. ZooKeeper's Java Client

<dependency>
    <groupId>org.apache.curator</groupId>
    <artifactId>curator-framework</artifactId>
    <version>2.12.0</version>
</dependency>
<dependency>
    <groupId>org.apache.curator</groupId>
    <artifactId>curator-recipes</artifactId>
    <version>2.12.0</version>
</dependency>

import org.apache.curator.RetryPolicy;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.cache.NodeCache;
import org.apache.curator.framework.recipes.cache.NodeCacheListener;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.apache.zookeeper.CreateMode;

public class App {
    public static void main(String[] args) throws Exception {
        String connectString = "211.159.174.226:2181";

        RetryPolicy retryPolicy = getRetryPolicy();
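        // newClient(connectString, sessionTimeoutMs, connectionTimeoutMs, retryPolicy)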
        CuratorFramework client = CuratorFrameworkFactory.newClient(connectString, 5000, 5000, retryPolicy);
        client.start();

        // CRUD: create nodes of each type, then read some data back
        client.create().withMode(CreateMode.PERSISTENT).forPath("/test-Curator-PERSISTENT-nodata");
        client.create().withMode(CreateMode.PERSISTENT).forPath("/test-Curator-PERSISTENT-data", "test-Curator-PERSISTENT-data".getBytes());
        client.create().withMode(CreateMode.EPHEMERAL).forPath("/test-Curator-EPHEMERAL-nodata");
        client.create().withMode(CreateMode.EPHEMERAL).forPath("/test-Curator-EPHEMERAL-data", "/test-Curator-EPHEMERAL-data".getBytes());

        for (int i = 0; i < 5; i++) {
            client.create().withMode(CreateMode.PERSISTENT_SEQUENTIAL).forPath("/test-Curator-PERSISTENT_SEQUENTIAL-nodata");
        }

        byte[] bytes = client.getData().forPath("/test-Curator-PERSISTENT-data");
        System.out.println("----------zk Node data:" + new String(bytes) + "------------");

        client.create().withMode(CreateMode.PERSISTENT).forPath("/test-listener", "test-listener".getBytes());
        final NodeCache nodeCache = new NodeCache(client, "/test-listener");
        nodeCache.start();
        NodeCacheListener listener = new NodeCacheListener() {

            @Override
            public void nodeChanged() throws Exception {
                System.out.println("node changed : " + nodeCache.getCurrentData());
            }
        };
        nodeCache.getListenable().addListener(listener);

        client.setData().forPath("/test-listener", "/test-listener-change".getBytes());

        // Give the NodeCache listener a moment to fire before the process exits.
        Thread.sleep(1000);
        client.close();
    }
    /**
     * RetryOneTime: retry the connection exactly once.
     * RetryNTimes: retry up to N times.
     * RetryUntilElapsed: retry at a fixed interval until a maximum elapsed time is reached (or the connection succeeds).
     * ExponentialBackoffRetry: like RetryUntilElapsed, but "backoff" based: the interval between retries grows dynamically.
     * BoundedExponentialBackoffRetry: ExponentialBackoffRetry with an upper bound on the sleep time between retries.
     */
    public static RetryPolicy getRetryPolicy() {
        return new ExponentialBackoffRetry(1000, 3);
    }
}

5. Distributed Locks

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.apache.curator.RetryPolicy;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessLock;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class ZookeeperLock {

    private final String lockPath = "/distributed-lock";
    private String connectString;
    private RetryPolicy retry;
    private CuratorFramework client;
    private InterProcessLock interProcessMutex;


    public void init() throws Exception {
        connectString = "211.159.174.226:2181";
        retry = new ExponentialBackoffRetry(1000, 3);
        client = CuratorFrameworkFactory.newClient(connectString, 60000, 15000, retry);
        client.start();

        // InterProcessMutex is a shared re-entrant lock rooted at lockPath
        interProcessMutex = new InterProcessMutex(client, lockPath);
    }

    public void lock(){
        try {
            interProcessMutex.acquire();
        } catch (Exception e) {
            System.out.println("The lock failed.,Miserable");
        }
    }

    public void unlock(){
        try {
            interProcessMutex.release();
        } catch (Exception e) {
            System.out.println("Release failed, even worse.");
        }
    }

    public static void main(String[] args) throws Exception {
        final ZookeeperLock zookeeperLock = new ZookeeperLock();
        zookeeperLock.init();

        ExecutorService executor = Executors.newFixedThreadPool(5);
        for (int i = 0; i < 50; i++) {
            executor.execute(new Runnable() {
                @Override
                public void run() {
                    zookeeperLock.lock();
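                    // The same timestamp is printed right after acquiring the lock and
                    // again just before releasing it, so output from different threads never interleaves.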
                    Long time = System.nanoTime();
                    System.out.println(time);
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                    System.out.println(time);
                    zookeeperLock.unlock();
                }
            });
        }

        // Wait for all tasks to finish instead of spinning in a busy loop.
        executor.shutdown();
        executor.awaitTermination(5, TimeUnit.MINUTES);
    }
}

6. ZAB Protocol

  1. Three node states defined by the ZAB protocol

    • Looking: election state (the node is looking for a Leader).
    • Following: the state of a Follower (slave) node.
    • Leading: the state of the Leader (primary) node.
  2. Zxid (a 64-bit value; see the sketch after this list)
    High 32 bits: the Leader epoch (cycle number).
    Low 32 bits: a monotonically increasing transaction counter, incremented by 1 for every client write request.
    When a new Leader is elected, it takes the largest zxid from its local log, extracts its epoch, and uses epoch + 1 as the new epoch, with the low 32 bits reset to 0 (guaranteeing that zxids always grow).
  3. Crash recovery (Leader election)

    • Each server starts with a vote <myid, zxid> for itself.
    • Each server collects the votes of the other servers.
    • Votes are compared: the larger zxid wins; if the zxids are equal, the larger myid wins.
    • Servers then change state; the cluster moves from crash recovery into data synchronization and then into message broadcasting.
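
A small sketch of how a zxid splits into epoch and counter; the sample value below is made up for illustration.

public class ZxidDemo {
    public static void main(String[] args) {
        long zxid = 0x0000000500000007L;   // hypothetical zxid: epoch 5, counter 7
        long epoch = zxid >>> 32;          // high 32 bits: Leader epoch
        long counter = zxid & 0xFFFFFFFFL; // low 32 bits: per-epoch transaction counter
        System.out.println("epoch=" + epoch + ", counter=" + counter);

        // A newly elected Leader bumps the epoch and resets the counter to 0, so every
        // zxid it issues is larger than anything issued in the previous epoch.
        long firstZxidOfNewEpoch = (epoch + 1) << 32;
        System.out.println("first zxid of new epoch: 0x" + Long.toHexString(firstZxidOfNewEpoch));
    }
}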

  4. Message broadcasting (similar to two-phase commit):

    • After the Leader receives a write request, it assigns the request a globally unique, monotonically increasing 64-bit id (the zxid).
    • The Leader sends the request to all Followers as a proposal identified by that zxid.
    • When a Follower accepts the proposal, it writes the proposal to disk and then immediately replies to the Leader with an ACK.
    • Once the Leader has received ACKs from a quorum (more than half), it sends a commit command to all Followers.
    • Each Follower executes the commit.
    • PS: at this point the ZK cluster formally serves clients and the Leader can broadcast messages; newly joining nodes first synchronize their data from the Leader.

More articles can be found on my blog: https://www.zplxjj.com and on my WeChat official account.

Topics: Java Zookeeper Session Apache