Directory navigation
- Preface
- data storage
- Using zookeeper based on Java API
- Event mechanism
- How to register event mechanism
- watcher event type
- What kind of operation will produce what kind of event?
- Implementation principle of transaction
- In depth analysis of the implementation principle of Watcher mechanism
- ClientCnxn initialization
- Clients register to listen through exists
- cnxn.submitRequest
- Sending process of SendThread
- Network interaction between client and server
- Processing flow of receiving request of server
- The client receives the response processed by the server
- Event triggering
- Epilogue
Preface
We will focus on four aspects of distributed coordination services
- Preliminary understanding of Zookeeper
- Understand the core principles of Zookeeper
- Practice and principle analysis of Zookeeper
- Zookeeper practice with registry to complete RPC handwriting
In this section, we cover the third part: ZooKeeper practice and principle analysis.
data storage
- Transaction log
Stored under the dataDir path specified in zoo.cfg (or under dataLogDir, if that option is set)
- Snapshot log
Stored under the dataDir path
- Runtime log
bin/zookeeper.out
Using zookeeper based on Java API
First, start the ZooKeeper cluster. This was covered in the previous section, so we will not repeat it here.
Next, add the ZooKeeper dependency to the project's pom.xml:
<dependency>
    <groupId>org.apache.zookeeper</groupId>
    <artifactId>zookeeper</artifactId>
    <version>3.4.8</version>
</dependency>
Alternatively, you can add the jar to the classpath directly.
Then we establish the connection:
public static void main(String[] args) {
    try {
        // Pass in the ip:port list of the ZooKeeper cluster
        ZooKeeper zookeeper = new ZooKeeper(
                "192.168.200.111:2181,192.168.200.112:2181,192.168.200.113:2181", 4000, null);
        System.out.println(zookeeper.getState());
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println(zookeeper.getState());
    } catch (IOException e) {
        e.printStackTrace();
    }
}
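Running this shows that the state only becomes CONNECTED after the thread has blocked for a while, because the connection is established asynchronously.
A fixed sleep is fragile, so we improve on it with CountDownLatch from java.util.concurrent (JUC) and wait for the connection event instead: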
public static void main(String[] args) {
    try {
        final CountDownLatch countDownLatch = new CountDownLatch(1);
        ZooKeeper zooKeeper = new ZooKeeper(
                "192.168.200.111:2181,192.168.200.112:2181,192.168.200.113:2181",
                4000, new Watcher() {
            @Override
            public void process(WatchedEvent event) {
                if (Event.KeeperState.SyncConnected == event.getState()) {
                    // A SyncConnected event from the server means the connection is established
                    countDownLatch.countDown();
                }
            }
        });
        countDownLatch.await();
        System.out.println(zooKeeper.getState()); // CONNECTED
        // Add a node
        zooKeeper.create("/zk-persis-mic", "0".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        Thread.sleep(1000);
        Stat stat = new Stat();
        // Get the value of the node
        byte[] bytes = zooKeeper.getData("/zk-persis-mic", null, stat);
        System.out.println(new String(bytes));
        // Modify the value of the node
        zooKeeper.setData("/zk-persis-mic", "1".getBytes(), stat.getVersion());
        // Get the value of the node again (this also refreshes stat)
        byte[] bytes1 = zooKeeper.getData("/zk-persis-mic", null, stat);
        System.out.println(new String(bytes1));
        zooKeeper.delete("/zk-persis-mic", stat.getVersion());
        zooKeeper.close();
        System.in.read();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (InterruptedException e) {
        e.printStackTrace();
    } catch (KeeperException e) {
        e.printStackTrace();
    }
}
In the previous section we used ZooKeeper's command-line client, much as one would with Redis. Here we added the ZooKeeper dependency in IDEA, called the ZooKeeper Java API, and implemented connection setup and CRUD operations.
Tips:
Learn to generalize from one example. Today it is ZooKeeper; tomorrow some other xxx.jar will be popular, and working with it will follow much the same steps.
Event mechanism
The Watcher mechanism is one of ZooKeeper's most important features. We can bind watch events to the nodes created on ZooKeeper, for example to be notified when a node's data changes, when a node is deleted, or when its child nodes change. Features such as distributed locks and cluster management can be built on ZooKeeper through this event mechanism.
Watcher characteristics: when the data changes, ZooKeeper generates a watcher event and sends it to the client. The client receives the notification only once: if the node changes again later, the client that previously set the Watcher is not notified again (a watcher is one-shot). Permanent monitoring can be achieved by registering the watcher again each time it fires.
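The one-shot behaviour and the re-registration trick can be illustrated without a running cluster. The following is only a minimal in-memory sketch: OneShotWatchModel, watch and change are hypothetical names invented for this illustration and are not part of the ZooKeeper API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical in-memory model of one-shot watches; not the ZooKeeper API.
class OneShotWatchModel {
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    // Register a watch on a path (plays the role of exists/getData with a watcher).
    void watch(String path, Consumer<String> watcher) {
        watches.computeIfAbsent(path, p -> new ArrayList<>()).add(watcher);
    }

    // Change a node: all watches on the path are removed first, then notified once.
    void change(String path) {
        List<Consumer<String>> fired = watches.remove(path);
        if (fired != null) {
            for (Consumer<String> w : fired) {
                w.accept(path);
            }
        }
    }

    // Returns {notifications without re-registration, notifications with re-registration}.
    static int[] demo() {
        OneShotWatchModel zk = new OneShotWatchModel();
        int[] counts = new int[2];

        zk.watch("/a", p -> counts[0]++);           // plain one-shot watch
        zk.watch("/b", new Consumer<String>() {     // watch that re-registers itself
            @Override
            public void accept(String p) {
                counts[1]++;
                zk.watch(p, this);
            }
        });

        for (int i = 0; i < 3; i++) {
            zk.change("/a");
            zk.change("/b");
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] c = demo();
        System.out.println("one-shot=" + c[0] + ", re-registered=" + c[1]);
    }
}
```

After three changes per node, the plain watch has fired once and the self-registering watch three times, which is the "cyclic monitoring" effect described above.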
How to register event mechanism
Bind events through these three operations:
- getData
- exists
- getChildren
How is an event triggered? Any transactional (write) operation triggers a watch event: create, delete, setData.
public static void main(String[] args) throws IOException, InterruptedException, KeeperException {
    final CountDownLatch countDownLatch = new CountDownLatch(1);
    final ZooKeeper zooKeeper = new ZooKeeper(
            "192.168.11.153:2181,192.168.11.154:2181,192.168.11.155:2181",
            4000, new Watcher() {
        @Override
        public void process(WatchedEvent event) {
            System.out.println("Default event: " + event.getType());
            if (Event.KeeperState.SyncConnected == event.getState()) {
                // A SyncConnected event from the server means the connection is established
                countDownLatch.countDown();
            }
        }
    });
    countDownLatch.await();
    // Create a persistent node
    zooKeeper.create("/zk-persis-mic", "1".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
    // Watches can be bound through exists, getData and getChildren; here we use exists
    Stat stat = zooKeeper.exists("/zk-persis-mic", new Watcher() {
        @Override
        public void process(WatchedEvent event) {
            System.out.println(event.getType() + "->" + event.getPath());
            try {
                // Bind the watch again; passing true registers the default watcher
                zooKeeper.exists(event.getPath(), true);
            } catch (KeeperException e) {
                e.printStackTrace();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    });
    // Trigger the watch event with a transactional operation
    stat = zooKeeper.setData("/zk-persis-mic", "2".getBytes(), stat.getVersion());
    Thread.sleep(1000);
    zooKeeper.delete("/zk-persis-mic", stat.getVersion());
    System.in.read();
}
watcher event type
public interface Watcher {
    void process(WatchedEvent var1);

    public interface Event {
        public static enum EventType {
            // Received when the client's connection state changes
            None(-1),
            // A node was created, e.g. /zk-persis-mic
            NodeCreated(1),
            // A node was deleted
            NodeDeleted(2),
            // A node's data changed
            NodeDataChanged(3),
            // A child node was created or deleted
            NodeChildrenChanged(4);
        }
    }
}
What kind of operation will produce what kind of event?
Operation | /zk-persis-mic (watch event) | /zk-persis-mic/children (watch event) |
---|---|---|
create(/zk-persis-mic) | NodeCreated (exists, getData) | nothing |
delete(/zk-persis-mic) | NodeDeleted (exists, getData) | nothing |
setData(/zk-persis-mic) | NodeDataChanged (exists, getData) | nothing |
create(/zk-persis-mic/children) | NodeChildrenChanged (getChildren) | NodeCreated |
delete(/zk-persis-mic/children) | NodeChildrenChanged (getChildren) | NodeDeleted |
setData(/zk-persis-mic/children) | nothing | NodeDataChanged |
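The same relationship between operations and events can be written as a small lookup function. This is only a sketch following ZooKeeper's usual watch semantics: WatchEventTable and eventFor are hypothetical names, the function ignores which call (exists, getData or getChildren) registered the watch, and it assumes non-root paths.

```java
// Hypothetical helper relating operations to watch events; not part of the ZooKeeper API.
class WatchEventTable {
    enum Op { CREATE, DELETE, SET_DATA }

    // Event seen by a watcher registered on watchedPath when op is applied to changedPath.
    // Returns "nothing" when that watcher is not notified.
    static String eventFor(Op op, String changedPath, String watchedPath) {
        if (changedPath.equals(watchedPath)) {
            switch (op) {
                case CREATE: return "NodeCreated";
                case DELETE: return "NodeDeleted";
                default:     return "NodeDataChanged";
            }
        }
        // Creating or deleting a direct child notifies child watches on the parent.
        String parent = changedPath.substring(0, changedPath.lastIndexOf('/'));
        if (parent.equals(watchedPath) && op != Op.SET_DATA) {
            return "NodeChildrenChanged"; // only for watches set through getChildren
        }
        return "nothing";
    }

    public static void main(String[] args) {
        String node = "/zk-persis-mic";
        String child = "/zk-persis-mic/children";
        for (Op op : Op.values()) {
            System.out.println(op + "(" + child + ") seen from " + node + ": "
                    + eventFor(op, child, node));
        }
    }
}
```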
Implementation principle of transaction
In depth analysis of the implementation principle of Watcher mechanism
ZooKeeper's Watcher mechanism can be generally divided into three processes:
- Client registration Watcher
- Server processing Watcher
- Client callback Watcher
There are three ways for the client to register the watcher
- getData
- exists
- getChildren
Take the following code as an example to analyze the principle of the whole trigger mechanism
final ZooKeeper zooKeeper = new ZooKeeper(
        "192.168.200.111:2181,192.168.200.112:2181,192.168.200.113:2181", 4000, new Watcher() {
    @Override
    public void process(WatchedEvent event) {
        System.out.println("Default event: " + event.getType());
    }
});
// Create the node
zooKeeper.create("/mic", "0".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
// Register the watch (true means: use the default watcher)
zooKeeper.exists("/mic", true);
// Modify the node's value to trigger the watch
zooKeeper.setData("/mic", "1".getBytes(), -1);
Initialization process of ZooKeeper API
When creating a ZooKeeper client instance, we pass a default watcher to the constructor via new Watcher(). This Watcher becomes the default watcher for the whole ZooKeeper session and is kept in the defaultWatcher field of the client-side ZKWatchManager. The code is as follows:
public ZooKeeper(String connectString, int sessionTimeout, Watcher watcher, long sessionId, byte[] sessionPasswd, boolean canBeReadOnly, HostProvider aHostProvider) throws IOException { LOG.info("Initiating client connection, connectString=" + connectString + " sessionTimeout=" + sessionTimeout + " watcher=" + watcher + " sessionId=" + Long.toHexString(sessionId) + " sessionPasswd=" + (sessionPasswd == null ? "<null>" : "<hidden>")); this.clientConfig = new ZKClientConfig(); watchManager = defaultWatchManager(); watchManager.defaultWatcher = watcher; //Set the watcher to ZKWatchManager here ConnectStringParser connectStringParser = new ConnectStringParser( connectString); hostProvider = aHostProvider; //Initializes ClientCnxn and calls the cnxn.start() method cnxn = new ClientCnxn(connectStringParser.getChrootPath(), hostProvider, sessionTimeout, this, watchManager, getClientCnxnSocket(), sessionId, sessionPasswd, canBeReadOnly); cnxn.seenRwServerBefore = true; // since user has provided sessionId cnxn.start(); }
ClientCnxn is the main class handling communication and event notification between the ZooKeeper client and the ZooKeeper server. It contains two inner thread classes:
- SendThread: responsible for data communication between client and server, including the transmission of event information
- EventThread: responsible for invoking the client's registered Watchers when notifications arrive
ClientCnxn initialization
public ClientCnxn(String chrootPath, HostProvider hostProvider, int sessionTimeout,
        ZooKeeper zooKeeper, ClientWatchManager watcher, ClientCnxnSocket clientCnxnSocket,
        long sessionId, byte[] sessionPasswd, boolean canBeReadOnly) {
    this.zooKeeper = zooKeeper;
    this.watcher = watcher;
    this.sessionId = sessionId;
    this.sessionPasswd = sessionPasswd;
    this.sessionTimeout = sessionTimeout;
    this.hostProvider = hostProvider;
    this.chrootPath = chrootPath;
    connectTimeout = sessionTimeout / hostProvider.size();
    readTimeout = sessionTimeout * 2 / 3;
    readOnly = canBeReadOnly;
    // Initialize sendThread
    sendThread = new SendThread(clientCnxnSocket);
    // Initialize eventThread
    eventThread = new EventThread();
    this.clientConfig = zooKeeper.getClientConfig();
}

// Start the two threads
public void start() {
    sendThread.start();
    eventThread.start();
}
Clients register to listen through exists
zooKeeper.exists("/mic", true); registers the watch through the exists method. The code is as follows:
public Stat exists(final String path, Watcher watcher)
        throws KeeperException, InterruptedException {
    final String clientPath = path;
    PathUtils.validatePath(clientPath);
    // the watch contains the un-chroot path
    WatchRegistration wcb = null;
    if (watcher != null) {
        // Build the ExistsWatchRegistration
        wcb = new ExistsWatchRegistration(watcher, clientPath);
    }
    final String serverPath = prependChroot(clientPath);
    RequestHeader h = new RequestHeader();
    // Set the operation type to exists
    h.setType(ZooDefs.OpCode.exists);
    // Construct the ExistsRequest
    ExistsRequest request = new ExistsRequest();
    request.setPath(serverPath);
    // Tell the server whether to register a watch
    request.setWatch(watcher != null);
    // Set the class that receives the server response
    SetDataResponse response = new SetDataResponse();
    // Wrap RequestHeader, ExistsRequest, SetDataResponse and WatchRegistration
    // and add them to the send queue
    ReplyHeader r = cnxn.submitRequest(h, request, response, wcb);
    if (r.getErr() != 0) {
        if (r.getErr() == KeeperException.Code.NONODE.intValue()) {
            return null;
        }
        throw KeeperException.create(KeeperException.Code.get(r.getErr()), clientPath);
    }
    // Return the result (Stat information) of exists
    return response.getStat().getCzxid() == -1 ? null : response.getStat();
}
cnxn.submitRequest
public ReplyHeader submitRequest(RequestHeader h, Record request, Record response,
        WatchRegistration watchRegistration, WatchDeregistration watchDeregistration)
        throws InterruptedException {
    ReplyHeader r = new ReplyHeader();
    // Construct a Packet transport object and add it to the queue
    Packet packet = queuePacket(h, r, request, response, null, null, null, null,
            watchRegistration, watchDeregistration);
    synchronized (packet) {
        while (!packet.finished) {
            // Block until the packet has been processed
            packet.wait();
        }
    }
    return r;
}
Call queuePacket
public Packet queuePacket(RequestHeader h, ReplyHeader r, Record request, Record response, AsyncCallback cb, String clientPath, String serverPath, Object ctx, WatchRegistration watchRegistration, WatchDeregistration watchDeregistration) { Packet packet = null; //Convert related transport objects to packets packet = new Packet(h, r, request, response, watchRegistration); packet.cb = cb; packet.ctx = ctx; packet.clientPath = clientPath; packet.serverPath = serverPath; packet.watchDeregistration = watchDeregistration; synchronized (state) { if (!state.isAlive() || closing) { conLossPacket(packet); } else { // If the client is asking to close the session then // mark as closing if (h.getType() == OpCode.closeSession) { closing = true; } //Add to outgoing queue outgoingQueue.add(packet); } } //This is the multiplexing mechanism. Wake up the Selector and tell him that a packet has been added sendThread.getClientCnxnSocket().packetAdded(); return packet; }
In ZooKeeper, Packet is the smallest unit of the communication protocol. It is used for network transmission between client and server, and any object to be transmitted must be wrapped in a Packet. In ClientCnxn, the WatchRegistration is also encapsulated in the Packet; the queuePacket method then puts the Packet on the outgoing queue, where it waits for SendThread to send it. This is another asynchronous step: the calling thread blocks in submitRequest until the packet is marked as finished. Asynchronous communication of this kind is very common in distributed systems.
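The hand-off between the calling thread and the sending thread can be sketched with plain JDK classes. This is a simplified model under stated assumptions, not ZooKeeper's implementation: PacketQueueModel and its Packet are hypothetical stand-ins, requests and replies are plain strings, and the "network" is replaced by a string concatenation.

```java
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical model of submitRequest/queuePacket; not ZooKeeper's ClientCnxn.
class PacketQueueModel {
    // Stand-in for ClientCnxn.Packet: a request, a reply slot and a finished flag.
    static final class Packet {
        final String request;
        String reply;
        boolean finished;
        Packet(String request) { this.request = request; }
    }

    private final LinkedBlockingQueue<Packet> outgoingQueue = new LinkedBlockingQueue<>();
    private final Thread sendThread = new Thread(this::drain, "send-thread");

    PacketQueueModel() {
        sendThread.setDaemon(true);
        sendThread.start();
    }

    // Caller side: enqueue the packet, then block until the send thread marks it finished.
    String submitRequest(String request) throws InterruptedException {
        Packet packet = new Packet(request);
        outgoingQueue.add(packet);
        synchronized (packet) {
            while (!packet.finished) {
                packet.wait();
            }
        }
        return packet.reply;
    }

    // Send-thread side: take packets off the queue, "process" them and wake the caller.
    private void drain() {
        try {
            while (true) {
                Packet packet = outgoingQueue.take();
                synchronized (packet) {
                    packet.reply = "reply-to-" + packet.request; // a real client does network I/O here
                    packet.finished = true;
                    packet.notifyAll();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Convenience wrapper that hides the checked exception.
    static String demo() {
        try {
            return new PacketQueueModel().submitRequest("exists /mic");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The API looks synchronous to the caller, yet all I/O happens on one dedicated thread; the wait/notify pair on the packet object is what bridges the two.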
Sending process of SendThread
When the connection is initialized, ZooKeeper creates two threads and starts them. Next, we analyze the sending process of SendThread. Because it is a thread, its run method is executed once it has been started:
@Override public void run() { clientCnxnSocket.introduce(this, sessionId, outgoingQueue); clientCnxnSocket.updateNow(); clientCnxnSocket.updateLastSendAndHeard(); int to; long lastPingRwServer = Time.currentElapsedTime(); final int MAX_SEND_PING_INTERVAL = 10000; //10 seconds while (state.isAlive()) { try { if (!clientCnxnSocket.isConnected()) { // don't re-establish connection if we are closing if (closing) { break; } //Initiate connection startConnect(); clientCnxnSocket.updateLastSendAndHeard(); } //In case of connection status, handle authentication authorization of sasl if (state.isConnected()) { // determine whether we need to send an AuthFailed event. if (zooKeeperSaslClient != null) { boolean sendAuthEvent = false; if (zooKeeperSaslClient.getSaslState() == ZooKeeperSaslClient.SaslState.INITIAL) { try { zooKeeperSaslClient.initialize(ClientCnxn.this); } catch (SaslException e) { LOG.error("SASL authentication with Zookeeper Quorum member failed: " + e); state = States.AUTH_FAILED; sendAuthEvent = true; } } KeeperState authState = zooKeeperSaslClient.getKeeperState(); if (authState != null) { if (authState == KeeperState.AuthFailed) { // An authentication error occurred during authentication with the Zookeeper Server. 
state = States.AUTH_FAILED; sendAuthEvent = true; } else { if (authState == KeeperState.SaslAuthenticated) { sendAuthEvent = true; } } } if (sendAuthEvent == true) { eventThread.queueEvent(new WatchedEvent( Watcher.Event.EventType.None, authState,null)); } } to = readTimeout - clientCnxnSocket.getIdleRecv(); } else { to = connectTimeout - clientCnxnSocket.getIdleRecv(); } //To, which indicates how much time the client has left before the timeout, and is ready to initiate a ping connection if (to <= 0) { //Indicates that it has timed out String warnInfo; warnInfo = "Client session timed out, have not heard from server in " + clientCnxnSocket.getIdleRecv() + "ms" + " for sessionid 0x" + Long.toHexString(sessionId); LOG.warn(warnInfo); throw new SessionTimeoutException(warnInfo); } if (state.isConnected()) { //Calculate the next ping request time int timeToNextPing = readTimeout / 2 - clientCnxnSocket.getIdleSend() - ((clientCnxnSocket.getIdleSend() > 1000) ? 1000 : 0); //send a ping request either time is due or no packet sent out within MAX_SEND_PING_INTERVAL if (timeToNextPing <= 0 || clientCnxnSocket.getIdleSend() > MAX_SEND_PING_INTERVAL) { //Send ping request sendPing(); clientCnxnSocket.updateLastSend(); } else { if (timeToNextPing < to) { to = timeToNextPing; } } } // If we are in read-only mode, seek for read/write server if (state == States.CONNECTEDREADONLY) { long now = Time.currentElapsedTime(); int idlePingRwServer = (int) (now - lastPingRwServer); if (idlePingRwServer >= pingRwTimeout) { lastPingRwServer = now; idlePingRwServer = 0; pingRwTimeout = Math.min(2*pingRwTimeout, maxPingRwTimeout); pingRwServer(); } to = Math.min(to, pingRwTimeout - idlePingRwServer); } //Call clientCnxnSocket to initiate transmission. pendingQueue is a Packet queue used to store sent and waiting responses. clientCnxnSocket defaults to ClientCnxnSocketNIO (ps: remember where to initialize? 
When instantiating zookeeper) clientCnxnSocket.doTransport(to, pendingQueue, ClientCnxn.this); } catch (Throwable e) { if (closing) { if (LOG.isDebugEnabled()) { // closing so this is expected LOG.debug("An exception was thrown while closing send thread for session 0x" + Long.toHexString(getSessionId()) + " : " + e.getMessage()); } break; } else { // this is ugly, you have a better way speak up if (e instanceof SessionExpiredException) { LOG.info(e.getMessage() + ", closing socket connection"); } else if (e instanceof SessionTimeoutException) { LOG.info(e.getMessage() + RETRY_CONN_MSG); } else if (e instanceof EndOfStreamException) { LOG.info(e.getMessage() + RETRY_CONN_MSG); } else if (e instanceof RWServerFoundException) { LOG.info(e.getMessage()); } else { LOG.warn( "Session 0x" + Long.toHexString(getSessionId()) + " for server " + clientCnxnSocket.getRemoteSocketAddress() + ", unexpected error" + RETRY_CONN_MSG, e); } // At this point, there might still be new packets appended to outgoingQueue. // they will be handled in next connection or cleared up if closed. cleanup(); if (state.isAlive()) { eventThread.queueEvent(new WatchedEvent( Event.EventType.None, Event.KeeperState.Disconnected, null)); } clientCnxnSocket.updateNow(); clientCnxnSocket.updateLastSendAndHeard(); } } } synchronized (state) { // When it comes to this point, it guarantees that later queued // packet to outgoingQueue will be notified of death. cleanup(); } clientCnxnSocket.close(); if (state.isAlive()) { eventThread.queueEvent(new WatchedEvent(Event.EventType.None, Event.KeeperState.Disconnected, null)); } ZooTrace.logTraceMessage(LOG, ZooTrace.getTextTraceLevel(), "SendThread exited loop for session: 0x" + Long.toHexString(getSessionId())); }
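The timeout arithmetic used in this loop is easier to follow with concrete numbers. The sketch below copies the three expressions from the ClientCnxn constructor and from SendThread.run() into a hypothetical TimeoutMath class; the inputs (a 4000 ms session timeout, three hosts, the idle times) are illustrative values only.

```java
// Worked arithmetic for the timeouts used by SendThread; TimeoutMath is a hypothetical helper.
class TimeoutMath {
    // connectTimeout = sessionTimeout / hostProvider.size()
    static int connectTimeout(int sessionTimeout, int hostCount) {
        return sessionTimeout / hostCount;
    }

    // readTimeout = sessionTimeout * 2 / 3
    static int readTimeout(int sessionTimeout) {
        return sessionTimeout * 2 / 3;
    }

    // Half the read timeout minus the time since the last send, minus one extra
    // second once the client has been idle for more than a second.
    static int timeToNextPing(int readTimeout, int idleSend) {
        return readTimeout / 2 - idleSend - ((idleSend > 1000) ? 1000 : 0);
    }

    public static void main(String[] args) {
        int sessionTimeout = 4000;
        int readTimeout = readTimeout(sessionTimeout);
        System.out.println("connectTimeout = " + connectTimeout(sessionTimeout, 3));
        System.out.println("readTimeout    = " + readTimeout);
        System.out.println("next ping (idle 500 ms)  = " + timeToNextPing(readTimeout, 500));
        System.out.println("next ping (idle 1200 ms) = " + timeToNextPing(readTimeout, 1200));
    }
}
```

With these inputs the connect timeout is 1333 ms and the read timeout 2666 ms. After 500 ms of idle time the next ping is due in 833 ms; after 1200 ms the result is negative, so the loop sends a ping immediately.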
Network interaction between client and server
In the process of sending, there is a code like this:
clientCnxnSocket.doTransport(to, pendingQueue, ClientCnxn.this);
Let's look at the doTransport method:
@Override void doTransport(int waitTimeOut, List<Packet> pendingQueue, ClientCnxn cnxn) throws IOException, InterruptedException { try { if (!firstConnect.await(waitTimeOut, TimeUnit.MILLISECONDS)) { return; } Packet head = null; if (needSasl.get()) { if (!waitSasl.tryAcquire(waitTimeOut, TimeUnit.MILLISECONDS)) { return; } } else { if ((head = outgoingQueue.poll(waitTimeOut, TimeUnit.MILLISECONDS)) == null) { return; } } // check if being waken up on closing. if (!sendThread.getZkState().isAlive()) { // adding back the patck to notify of failure in conLossPacket(). addBack(head); return; } // Abnormal process. The channel is closed. Add the current packet to addBack if (disconnected.get()) { addBack(head); throw new EndOfStreamException("channel for sessionid 0x" + Long.toHexString(sessionId) + " is lost"); } //If there are currently packets to be sent, the doWrite method is called, and pendingQueue indicates that the packets have been sent and waiting for response if (head != null) { doWrite(pendingQueue, head, cnxn); } } finally { updateNow(); } }
doWrite method
private void doWrite(List<Packet> pendingQueue, Packet p, ClientCnxn cnxn) {
    updateNow();
    while (true) {
        if (p != WakeupPacket.getInstance()) {
            // Only requests that have a header and are neither ping nor auth wait for a response
            if ((p.requestHeader != null)
                    && (p.requestHeader.getType() != ZooDefs.OpCode.ping)
                    && (p.requestHeader.getType() != ZooDefs.OpCode.auth)) {
                // Set the xid, the request sequence number used to match the response to this request
                p.requestHeader.setXid(cnxn.getXid());
                // Add the current packet to pendingQueue
                synchronized (pendingQueue) {
                    pendingQueue.add(p);
                }
            }
            // Send the packet
            sendPkt(p);
        }
        if (outgoingQueue.isEmpty()) {
            break;
        }
        p = outgoingQueue.remove();
    }
}
sendPkt:
private void sendPkt(Packet p) {
    // Serialize the request data
    p.createBB();
    // Update the last-send timestamp
    updateLastSend();
    // Update the count of sent packets
    sentCount++;
    // Send the byte buffer to the server through the channel
    channel.write(ChannelBuffers.wrappedBuffer(p.bb));
}
createBB:
public void createBB() {
    try {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        BinaryOutputArchive boa = BinaryOutputArchive.getArchive(baos);
        boa.writeInt(-1, "len"); // We'll fill this in later
        // Serialize the header (requestHeader)
        if (requestHeader != null) {
            requestHeader.serialize(boa, "header");
        }
        if (request instanceof ConnectRequest) {
            request.serialize(boa, "connect");
            // append "am-I-allowed-to-be-readonly" flag
            boa.writeBool(readOnly, "readOnly");
        } else if (request != null) {
            // Serialize the request body (request)
            request.serialize(boa, "request");
        }
        baos.close();
        this.bb = ByteBuffer.wrap(baos.toByteArray());
        this.bb.putInt(this.bb.capacity() - 4);
        this.bb.rewind();
    } catch (IOException e) {
        LOG.warn("Ignoring unexpected exception", e);
    }
}
From the createBB method we can see that, at the lowest layer of network serialization, ZooKeeper serializes only two fields of the Packet: requestHeader and request. Only these two are written to the byte array for network transmission; the watchRegistration information is not sent over the network.
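The length-prefix trick in createBB (write a placeholder, serialize, then patch in the real length) can be reproduced with a plain ByteBuffer. The following is a sketch only: FrameEncoderSketch and its simplified wire layout (an int type plus a length-prefixed path) are assumptions made for illustration and do not match ZooKeeper's jute format field for field.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical encoder illustrating the length-prefix pattern; not ZooKeeper's wire format.
class FrameEncoderSketch {
    // Only type and path go on the wire; the watcher stays on the client side.
    static ByteBuffer encode(int type, String path, Runnable localWatcher) {
        byte[] pathBytes = path.getBytes(StandardCharsets.UTF_8);
        ByteBuffer bb = ByteBuffer.allocate(4 + 4 + 4 + pathBytes.length);
        bb.putInt(-1);                    // placeholder for the length, filled in below
        bb.putInt(type);                  // "header"
        bb.putInt(pathBytes.length);      // "request": length-prefixed path
        bb.put(pathBytes);
        bb.putInt(0, bb.capacity() - 4);  // the length excludes the 4-byte prefix itself
        bb.rewind();
        return bb;                        // localWatcher is deliberately not serialized
    }

    public static void main(String[] args) {
        ByteBuffer bb = encode(3, "/mic", () -> System.out.println("fired"));
        System.out.println("frame bytes = " + bb.remaining() + ", length field = " + bb.getInt(0));
    }
}
```

The server therefore only learns that a watch was requested (a flag inside the request); the callback object itself never leaves the client.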
Tips:
After a user calls exists to register a watch, several things happen:
1. The request data is wrapped in a Packet and added to the outgoing queue.
2. SendThread performs the sending, that is, it sends the packets in the outgoing queue to the server.
3. The sending goes through clientCnxnSocket.doTransport(to, pendingQueue, ClientCnxn.this). ClientCnxnSocket has only two concrete implementation classes in ZooKeeper: ClientCnxnSocketNetty and ClientCnxnSocketNIO. Which one is used is decided during initialization, when ZooKeeper is instantiated. The code is as follows:
cnxn = new ClientCnxn(connectStringParser.getChrootPath(), hostProvider, sessionTimeout, this, watchMana getClientCnxnSocket(), canBeReadOnly); private ClientCnxnSocket getClientCnxnSocket() throws IOException { String clientCnxnSocketName = getClientConfig().getProperty( ZKClientConfig.ZOOKEEPER_CLIENT_CNXN_SOCKET); if (clientCnxnSocketName == null) { clientCnxnSocketName = ClientCnxnSocketNIO.class.getName(); } try { Constructor<?> clientCxnConstructor = Class.forName(clientCnxnSocketName).getDeclaredConstructor(ZKClient ClientCnxnSocket clientCxnSocket = (ClientCnxnSocket) clientCxnConstr return clientCxnSocket; } catch (Exception e) { IOException ioe = new IOException("Couldn't instantiate " + clientCnxnSocketName); ioe.initCause(e); throw ioe; } }
4. Based on step 3, sendPkt is then executed in the chosen implementation (ClientCnxnSocketNetty in the code shown above) to send the request packet to the server.
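The selection in step 3 follows a common pattern: read a class name from configuration, fall back to a default, and instantiate it reflectively. Below is a minimal sketch of that pattern; Transport, NioTransport, NettyTransport and the property name demo.transport are all hypothetical and only mirror the roles of ClientCnxnSocket and its two implementations.

```java
import java.util.Properties;

// Hypothetical sketch of "pick the implementation class from configuration".
class TransportSelectorSketch {
    interface Transport { String name(); }

    static class NioTransport implements Transport {
        public String name() { return "nio"; }
    }

    static class NettyTransport implements Transport {
        public String name() { return "netty"; }
    }

    static final String PROPERTY = "demo.transport"; // hypothetical property name

    static Transport create(Properties config) {
        String className = config.getProperty(PROPERTY);
        if (className == null) {
            // Default when nothing is configured, analogous to ClientCnxnSocketNIO
            className = NioTransport.class.getName();
        }
        try {
            return (Transport) Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Couldn't instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        System.out.println(create(config).name());
        config.setProperty(PROPERTY, NettyTransport.class.getName());
        System.out.println(create(config).name());
    }
}
```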
Processing flow of receiving request of server
The server has a NettyServerCnxn class to process the requests sent by the client
public void receiveMessage(ChannelBuffer message) { try { while(message.readable() && !throttled) { //ByteBuffer is not empty if (bb != null) { if (LOG.isTraceEnabled()) { LOG.trace("message readable " + message.readableBytes() + " bb len " + bb.remaining() + " " + bb); ByteBuffer dat = bb.duplicate(); dat.flip(); LOG.trace(Long.toHexString(sessionId) + " bb 0x" + ChannelBuffers.hexDump( ChannelBuffers.copiedBuffer(dat))); } //The remaining space of bb is larger than the size of readable bytes in message if (bb.remaining() > message.readableBytes()) { int newLimit = bb.position() + message.readableBytes(); bb.limit(newLimit); } // Write message to bb message.readBytes(bb); bb.limit(bb.capacity()); if (LOG.isTraceEnabled()) { LOG.trace("after readBytes message readable " + message.readableBytes() + " bb len " + bb.remaining() + " " + bb); ByteBuffer dat = bb.duplicate(); dat.flip(); LOG.trace("after readbytes " + Long.toHexString(sessionId) + " bb 0x" + ChannelBuffers.hexDump( ChannelBuffers.copiedBuffer(dat))); } // I have finished reading messag if (bb.remaining() == 0) { packetReceived(); // Statistics receiving information bb.flip(); ZooKeeperServer zks = this.zkServer; if (zks == null || !zks.isRunning()) { throw new IOException("ZK down"); } if (initialized) { //Process the packets from the client zks.processPacket(this, bb); if (zks.shouldThrottle(outstandingCount.incrementAndGet())) { disableRecvNoWait(); } } else { LOG.debug("got conn req request from " + getRemoteSocketAddress()); zks.processConnectRequest(this, bb); initialized = true; } bb = null; } } else { if (LOG.isTraceEnabled()) { LOG.trace("message readable " + message.readableBytes() + " bblenrem " + bbLen.remaining()); ByteBuffer dat = bbLen.duplicate(); dat.flip(); LOG.trace(Long.toHexString(sessionId) + " bbLen 0x" + ChannelBuffers.hexDump( ChannelBuffers.copiedBuffer(dat))); } if (message.readableBytes() < bbLen.remaining()) { bbLen.limit(bbLen.position() + message.readableBytes()); } 
message.readBytes(bbLen); bbLen.limit(bbLen.capacity()); if (bbLen.remaining() == 0) { bbLen.flip(); if (LOG.isTraceEnabled()) { LOG.trace(Long.toHexString(sessionId) + " bbLen 0x" + ChannelBuffers.hexDump( ChannelBuffers.copiedBuffer(bbLen))); } int len = bbLen.getInt(); if (LOG.isTraceEnabled()) { LOG.trace(Long.toHexString(sessionId) + " bbLen len is " + len); } bbLen.clear(); if (!initialized) { if (checkFourLetterWord(channel, message, len)) { return; } } if (len < 0 || len > BinaryInputArchive.maxBuffer) { throw new IOException("Len error " + len); } bb = ByteBuffer.allocate(len); } } } } catch(IOException e) { LOG.warn("Closing connection to " + getRemoteSocketAddress(), e); close(); } }
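Stripped of logging and throttling, receiveMessage is a length-prefixed frame decoder: first collect 4 bytes of length (bbLen), then allocate a buffer of that size (bb) and fill it, possibly across several network reads. The sketch below shows that core idea with JDK classes only; FrameDecoderSketch is a hypothetical name and the code is a simplification, not the server's implementation.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical decoder for [4-byte length][body] frames that may arrive in arbitrary chunks.
class FrameDecoderSketch {
    private final ByteBuffer lenBuf = ByteBuffer.allocate(4); // plays the role of bbLen
    private ByteBuffer body;                                  // plays the role of bb
    final List<byte[]> frames = new ArrayList<>();

    // Feed a chunk of bytes; every completed frame body is appended to frames.
    void receive(byte[] chunk) {
        ByteBuffer in = ByteBuffer.wrap(chunk);
        while (in.hasRemaining()) {
            if (body == null) {
                copy(in, lenBuf);
                if (lenBuf.hasRemaining()) {
                    return; // length prefix still incomplete, wait for more bytes
                }
                lenBuf.flip();
                int len = lenBuf.getInt();
                lenBuf.clear();
                if (len < 0) {
                    throw new IllegalArgumentException("Len error " + len);
                }
                body = ByteBuffer.allocate(len);
            }
            copy(in, body);
            if (!body.hasRemaining()) {
                frames.add(body.array()); // a complete packet, ready for processing
                body = null;
            }
        }
    }

    // Copy as many bytes as both buffers allow.
    private static void copy(ByteBuffer from, ByteBuffer to) {
        while (from.hasRemaining() && to.hasRemaining()) {
            to.put(from.get());
        }
    }

    public static void main(String[] args) {
        FrameDecoderSketch decoder = new FrameDecoderSketch();
        decoder.receive(new byte[] {0, 0});
        decoder.receive(new byte[] {0, 3, 'a'});
        decoder.receive(new byte[] {'b', 'c'});
        System.out.println("frames decoded: " + decoder.frames.size());
    }
}
```

Keeping the partially filled buffers as fields is what lets the decoder resume where it left off when a frame is split across reads.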
ZookeeperServer-zks.processPacket(this, bb);
Handle the packets sent by the client
public void processPacket(ServerCnxn cnxn, ByteBuffer incomingBuffer) throws IOException { // We have the request, now process and setup for next InputStream bais = new ByteBufferInputStream(incomingBuffer); BinaryInputArchive bia = BinaryInputArchive.getArchive(bais); RequestHeader h = new RequestHeader(); h.deserialize(bia, "header"); //Deserialize client header header incomingBuffer = incomingBuffer.slice(); //Judge the current operation type if (h.getType() == OpCode.auth) { LOG.info("got auth packet " + cnxn.getRemoteSocketAddress()); AuthPacket authPacket = new AuthPacket(); ByteBufferInputStream.byteBuffer2Record(incomingBuffer, authPacket); String scheme = authPacket.getScheme(); ServerAuthenticationProvider ap = ProviderRegistry.getServerProvider(scheme); Code authReturn = KeeperException.Code.AUTHFAILED; if(ap != null) { try { authReturn = ap.handleAuthentication(new ServerAuthenticationProvider.ServerObjs(this, cnxn), authPacket.getAuth()); } catch(RuntimeException e) { LOG.warn("Caught runtime exception from AuthenticationProvider: " + scheme + " due to " + e); authReturn = KeeperException.Code.AUTHFAILED; } } if (authReturn == KeeperException.Code.OK) { if (LOG.isDebugEnabled()) { LOG.debug("Authentication succeeded for scheme: " + scheme); } LOG.info("auth success " + cnxn.getRemoteSocketAddress()); ReplyHeader rh = new ReplyHeader(h.getXid(), 0, KeeperException.Code.OK.intValue()); cnxn.sendResponse(rh, null, null); //If it is not an authorized operation, judge whether it is a sasl operation } else { if (ap == null) { LOG.warn("No authentication provider for scheme: " + scheme + " has " + ProviderRegistry.listProviders()); } else { {//Finally enter this code block for processing //Encapsulate request object LOG.warn("Authentication failed for scheme: " + scheme); } ReplyHeader rh = new ReplyHeader(h.getXid(), 0, KeeperException.Code.AUTHFAILED.intValue()); cnxn.sendResponse(rh, null, null); cnxn.sendBuffer(ServerCnxnFactory.closeConn); 
cnxn.disableRecv(); } return; } else { if (h.getType() == OpCode.sasl) { Record rsp = processSasl(incomingBuffer,cnxn); ReplyHeader rh = new ReplyHeader(h.getXid(), 0, KeeperException.Code.OK.intValue()); cnxn.sendResponse(rh,rsp, "response"); return; } else { Request si = new Request(cnxn, cnxn.getSessionId(), h.getXid(), h.getType(), incomingBuffer, cnxn.getAuthInfo()); si.setOwner(ServerCnxn.me); setLocalSessionFlag(si); submitRequest(si); //Submit request } } cnxn.incrOutstandingRequests(h); }
submitRequest
public void submitRequest(Request si) { //Processor processor if (firstProcessor == null) { synchronized (this) { try { // Since all requests are passed to the request // processor it should wait for setting up the request // processor chain. The state will be updated to RUNNING // after the setup. while (state == State.INITIAL) { wait(1000); } } catch (InterruptedException e) { LOG.warn("Unexpected interruption", e); } if (firstProcessor == null || state != State.RUNNING) { throw new RuntimeException("Not started"); } } } try { touch(si.cnxn); boolean validpacket = Request.isValid(si.type); if (validpacket) { firstProcessor.processRequest(si); if (si.cnxn != null) { incInProcess(); } } else { LOG.warn("Received packet at server of unknown type " + si.type); new UnimplementedRequestProcessor().processRequest(si); } } catch (MissingSessionException e) { if (LOG.isDebugEnabled()) { LOG.debug("Dropping request: " + e.getMessage()); } } catch (RequestProcessorException e) { LOG.error("Unable to process request:" + e.getMessage(), e); } }
Composition of the firstProcessor request chain
1. firstProcessor is initialized in the setupRequestProcessors method of ZooKeeperServer. The code is as follows:
protected void setupRequestProcessors() { RequestProcessor finalProcessor = new FinalReques RequestProcessor syncProcessor = new SyncReque ((SyncRequestProcessor)syncProcessor).start(); firstProcessor = new PrepRequestProcessor(this, syn ((PrepRequestProcessor)firstProcessor).start(); }
From the above we can see that the instance of firstProcessor is a PrepRequestProcessor, and a Processor is passed in this constructor to form a call chain.
RequestProcessor syncProcessor = new SyncRequestProcessor(this, finalProcessor);
The construction method of syncProcessor passes another Processor, corresponding to FinalRequestProcessor
2. So the whole call chain is PrepRequestProcessor -> SyncRequestProcessor -> FinalRequestProcessor
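The way the chain is assembled, from the last processor to the first with each constructor receiving its successor, is a classic chain of responsibility. Here is a minimal synchronous sketch of that construction; ProcessorChainSketch and Stage are hypothetical, and unlike the real processors these stages do not run on their own threads.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a processor chain built back to front.
class ProcessorChainSketch {
    interface RequestProcessor {
        void processRequest(String request);
    }

    // Each stage records its name and hands the request to the next stage, if any.
    static class Stage implements RequestProcessor {
        private final String name;
        private final RequestProcessor next;
        private final List<String> trace;

        Stage(String name, RequestProcessor next, List<String> trace) {
            this.name = name;
            this.next = next;
            this.trace = trace;
        }

        @Override
        public void processRequest(String request) {
            trace.add(name);
            if (next != null) {
                next.processRequest(request);
            }
        }
    }

    // Builds Prep -> Sync -> Final and returns the order in which the stages ran.
    static List<String> demo() {
        List<String> trace = new ArrayList<>();
        RequestProcessor finalProcessor = new Stage("Final", null, trace);
        RequestProcessor syncProcessor = new Stage("Sync", finalProcessor, trace);
        RequestProcessor firstProcessor = new Stage("Prep", syncProcessor, trace);
        firstProcessor.processRequest("exists /mic");
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```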
PrepRequestProcessor.processRequest(si);
After learning about the call chain relationship from the above, let's continue to see
firstProcessor.processRequest(si); will call PrepRequestProcessor
public void processRequest(Request request) { submittedRequests.add(request); }
Strangely, processRequest only adds the request to submittedRequests. Based on what we have seen so far, this suggests another asynchronous operation. submittedRequests is a blocking queue:
LinkedBlockingQueue<Request> submittedRequests = new LinkedBlockingQueue<Request>();
The PrepRequestProcessor class extends Thread, so we can go straight to the run method of the same class:
public void run() {
    try {
        while (true) {
            // Take the next request from the queue
            Request request = submittedRequests.take();
            long traceMask = ZooTrace.CLIENT_REQUEST_TRACE_MASK;
            if (request.type == OpCode.ping) {
                traceMask = ZooTrace.CLIENT_PING_TRACE_MASK;
            }
            if (LOG.isTraceEnabled()) {
                ZooTrace.logRequest(LOG, traceMask, 'P', request, "");
            }
            if (Request.requestOfDeath == request) {
                break;
            }
            // Preprocess the request
            pRequest(request);
        }
    } catch (RequestProcessorException e) {
        if (e.getCause() instanceof XidRolloverException) {
            LOG.info(e.getCause().getMessage());
        }
        handleException(this.getName(), e);
    } catch (Exception e) {
        handleException(this.getName(), e);
    }
    LOG.info("PrepRequestProcessor exited loop!");
}
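The "enqueue now, process on a worker thread" pattern used here can be sketched in a few lines. This is only an illustration under assumptions: QueueingProcessorSketch is a hypothetical class, requests are strings, and REQUEST_OF_DEATH plays the role of Request.requestOfDeath as a sentinel compared by identity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical miniature of a processor that queues requests for its own thread.
class QueueingProcessorSketch extends Thread {
    // Sentinel that tells the worker loop to stop; compared by identity, not by value.
    static final String REQUEST_OF_DEATH = new String("death");

    private final LinkedBlockingQueue<String> submitted = new LinkedBlockingQueue<String>();
    final List<String> handled = Collections.synchronizedList(new ArrayList<String>());

    // Called by the submitting thread; returns immediately.
    void processRequest(String request) {
        submitted.add(request);
    }

    @Override
    public void run() {
        try {
            while (true) {
                String request = submitted.take(); // blocks until a request arrives
                if (request == REQUEST_OF_DEATH) {
                    break;
                }
                handled.add("prepared " + request);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static List<String> demo() {
        QueueingProcessorSketch p = new QueueingProcessorSketch();
        p.start();
        p.processRequest("exists /mic");
        p.processRequest("setData /mic");
        p.processRequest(REQUEST_OF_DEATH);
        try {
            p.join(); // wait until the worker has drained the queue
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return p.handled;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because a single thread drains the queue, requests are handled in the order they were submitted, and the submitting thread never waits for the processing itself.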
pRequest
The preprocessing code is too long to paste in full. Most of it branches on the OP type of the request and handles each case. The last line of the method is the following
nextProcessor.processRequest(request); Here nextProcessor is the SyncRequestProcessor
SyncRequestProcessor.processRequest
public void processRequest(Request request) { // request.addRQRec(">sync"); queuedRequests.add(request); }
This method follows the same pattern: it adds the request to queuedRequests for asynchronous processing. So we again look at the run method of the same class
public void run() { try { int logCount = 0; // we do this in an attempt to ensure that not all of the servers // in the ensemble take a snapshot at the same time int randRoll = r.nextInt(snapCount/2); while (true) { Request si = null; //Get request from blocking queue if (toFlush.isEmpty()) { si = queuedRequests.take(); } else { si = queuedRequests.poll(); if (si == null) { flush(toFlush); continue; } } if (si == requestOfDeath) { break; } if (si != null) { // track the number of records written to the log //The following code, roughly speaking, triggers the snapshot operation and starts a thread processing the snapshot if (zks.getZKDatabase().append(si)) { logCount++; if (logCount > (snapCount / 2 + randRoll)) { randRoll = r.nextInt(snapCount/2); // roll the log zks.getZKDatabase().rollLog(); // take a snapshot if (snapInProcess != null && snapInProcess.isAlive()) { LOG.warn("Too busy to snap, skipping"); } else { snapInProcess = new ZooKeeperThread("Snapshot Thread") { public void run() { try { zks.takeSnapshot(); } catch(Exception e) { LOG.warn("Unexpected exception", e); } } }; snapInProcess.start(); } logCount = 0; } } else if (toFlush.isEmpty()) { // optimization for read heavy workloads // iff this is a read, and there are no pending // flushes (writes), then just pass this to the next // processor if (nextProcessor != null) { nextProcessor.processRequest(si); //Continue to call the next processor to process the request if (nextProcessor instanceof Flushable) { ((Flushable)nextProcessor).flush(); } } continue; } toFlush.add(si); if (toFlush.size() > 1000) { flush(toFlush); } } } } catch (Throwable t) { handleException(this.getName(), t); } finally{ running = false; } LOG.info("SyncRequestProcessor exited!"); }
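The batching part of this loop is easier to see in isolation. The sketch below keeps only the take/poll/flush logic and leaves out the transaction log and snapshots. BatchFlushSketch, drain and maxBatch are hypothetical names, and flushing the leftovers when the sentinel arrives is a simplification chosen for the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

class BatchFlushSketch {
    // Sentinel that ends the loop; compared by identity.
    static final String DEATH = new String("death");

    // Returns the batches in the order they were flushed.
    static List<List<String>> drain(LinkedBlockingQueue<String> queue, int maxBatch) {
        List<List<String>> flushed = new ArrayList<List<String>>();
        List<String> toFlush = new ArrayList<String>();
        try {
            while (true) {
                String si;
                if (toFlush.isEmpty()) {
                    si = queue.take();   // nothing pending: block until work arrives
                } else {
                    si = queue.poll();   // something pending: do not block
                    if (si == null) {    // queue is idle, so flush what we have
                        flushed.add(new ArrayList<String>(toFlush));
                        toFlush.clear();
                        continue;
                    }
                }
                if (si == DEATH) {
                    break;
                }
                toFlush.add(si);
                if (toFlush.size() > maxBatch) { // batch is large enough: flush now
                    flushed.add(new ArrayList<String>(toFlush));
                    toFlush.clear();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        if (!toFlush.isEmpty()) { // sketch-only: flush leftovers on exit
            flushed.add(new ArrayList<String>(toFlush));
        }
        return flushed;
    }

    static List<List<String>> demo() {
        LinkedBlockingQueue<String> q =
                new LinkedBlockingQueue<String>(Arrays.asList("a", "b", "c", "d"));
        q.add(DEATH);
        return drain(q, 2);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [[a, b, c], [d]]
    }
}
```

The point of the design is that a busy server groups many writes into one flush, while an idle server flushes right away so that a single request is not left waiting.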
FinalRequestProcessor.processRequest
FinalRequestProcessor.processRequest updates the in-memory session information or znode data according to the operation in the Request object.
The method is more than 300 lines long, so we will not paste all of it. We go straight to the key code for the OP type sent by the client
case OpCode.exists: {
    lastOp = "EXIS";
    // TODO we need to figure out the security requirement for this!
    ExistsRequest existsRequest = new ExistsRequest();
    // Deserialize the ByteBuffer into an ExistsRequest. This is the request
    // object the client built when it initiated the call
    ByteBufferInputStream.byteBuffer2Record(request.request, existsRequest);
    String path = existsRequest.getPath(); // get the requested path
    if (path.indexOf('\0') != -1) {
        throw new KeeperException.BadArgumentsException();
    }
    // Key code: if the watch flag of the request is set, pass cnxn (the ServerCnxn)
    // as the watcher, so that changes to this node are watched
    Stat stat = zks.getZKDatabase().statNode(path, existsRequest.getWatch() ? cnxn : null);
    // Build the ExistsResponse from what the in-memory database returned for the path
    rsp = new ExistsResponse(stat);
    break;
}
What does statNode do?
public Stat statNode(String path, ServerCnxn serverCnxn) throws KeeperException.NoNodeException { return dataTree.statNode(path, serverCnxn); }
Following the call down, we reach the method below, where the ServerCnxn is passed as a Watcher. This works because ServerCnxn implements the Watcher interface
public Stat statNode(String path, Watcher watcher) throws KeeperException.NoNodeException {
    Stat stat = new Stat();
    DataNode n = nodes.get(path); // look up the node by path
    if (watcher != null) {
        // If a watcher was passed in, bind it to the path
        dataWatches.addWatch(path, watcher);
    }
    if (n == null) {
        throw new KeeperException.NoNodeException();
    }
    synchronized (n) {
        n.copyStat(stat);
        return stat;
    }
}
WatchManager.addWatch(path, watcher);
synchronized void addWatch(String path, Watcher watcher) {
    // Check whether watchTable already has a watcher set for this path
    HashSet<Watcher> list = watchTable.get(path);
    if (list == null) {
        // No set yet, so create one
        // don't waste memory if there are few watches on a node
        // rehash when the 4th entry is added, doubling size thereafter
        // seems like a good compromise
        list = new HashSet<Watcher>(4);
        watchTable.put(path, list);
    }
    list.add(watcher); // add the watcher to the set for this path
    HashSet<String> paths = watch2Paths.get(watcher);
    if (paths == null) {
        // cnxns typically have many watches, so use default cap here
        paths = new HashSet<String>();
        watch2Paths.put(watcher, paths); // map the watcher to its node paths
    }
    paths.add(path); // add the path to the path set of this watcher
}
The general process is as follows
① Look up the watcher set for the incoming path (node path) in watchTable, then go to ②
② Check whether the watcher set from ① is null. If it is, go to ③; otherwise go to ④
③ Create a new watcher set, put the path and the set into watchTable, then go to ④
④ Add the incoming watcher to the watcher set. This completes adding the path and watcher to watchTable. Go to ⑤
⑤ Look up the path set for the incoming watcher in watch2Paths, then go to ⑥
⑥ Check whether the path set is null. If it is, go to ⑦; otherwise go to ⑧
⑦ Create a new path set, put the watcher and the set into watch2Paths, then go to ⑧
⑧ Add the incoming path (node path) to the path set. This completes adding the path and watcher to watch2Paths
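The eight steps above boil down to keeping two maps in sync. Here is a small sketch of that bookkeeping, with the step numbers as comments. WatchTableSketch is a hypothetical class, and watchers are represented by plain strings (think of them as connection names) rather than real Watcher objects.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class WatchTableSketch {
    final Map<String, Set<String>> watchTable = new HashMap<String, Set<String>>();  // path -> watchers
    final Map<String, Set<String>> watch2Paths = new HashMap<String, Set<String>>(); // watcher -> paths

    synchronized void addWatch(String path, String watcher) {
        Set<String> watchers = watchTable.get(path);   // step 1
        if (watchers == null) {                        // step 2
            watchers = new HashSet<String>();          // step 3
            watchTable.put(path, watchers);
        }
        watchers.add(watcher);                         // step 4

        Set<String> paths = watch2Paths.get(watcher);  // step 5
        if (paths == null) {                           // step 6
            paths = new HashSet<String>();             // step 7
            watch2Paths.put(watcher, paths);
        }
        paths.add(path);                               // step 8
    }

    static WatchTableSketch demo() {
        WatchTableSketch t = new WatchTableSketch();
        t.addWatch("/mic", "cnxn-1");
        t.addWatch("/mic", "cnxn-2");
        t.addWatch("/other", "cnxn-1");
        return t;
    }

    public static void main(String[] args) {
        WatchTableSketch t = demo();
        System.out.println(t.watchTable.get("/mic").size());    // 2 watchers on /mic
        System.out.println(t.watch2Paths.get("cnxn-1").size()); // cnxn-1 watches 2 paths
    }
}
```

Keeping the reverse map means that when a connection goes away, all of its watches can be found without scanning every path.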
The client receives the response processed by the server
ClientCnxnSocketNetty.messageReceived
After the server has finished processing, it sends the response back through
NettyServerCnxn.sendResponse. The client receives the response from the server in ClientCnxnSocketNetty.messageReceived
public void messageReceived(ChannelHandlerContext ctx, MessageEvent e) throws Exception {
    updateNow();
    ChannelBuffer buf = (ChannelBuffer) e.getMessage();
    while (buf.readable()) {
        if (incomingBuffer.remaining() > buf.readableBytes()) {
            int newLimit = incomingBuffer.position() + buf.readableBytes();
            incomingBuffer.limit(newLimit);
        }
        buf.readBytes(incomingBuffer);
        incomingBuffer.limit(incomingBuffer.capacity());
        if (!incomingBuffer.hasRemaining()) {
            incomingBuffer.flip();
            if (incomingBuffer == lenBuffer) {
                recvCount++;
                readLength();
            } else if (!initialized) {
                readConnectResult();
                lenBuffer.clear();
                incomingBuffer = lenBuffer;
                initialized = true;
                updateLastHeard();
            } else {
                // A complete message has arrived: hand it to SendThread.readResponse
                sendThread.readResponse(incomingBuffer);
                lenBuffer.clear();
                incomingBuffer = lenBuffer;
                updateLastHeard();
            }
        }
    }
    wakeupCnxn();
}
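The switching between lenBuffer and incomingBuffer is a length-prefixed framing scheme: first read a 4-byte length, then read exactly that many bytes of body. The sketch below shows the idea with java.nio only. FrameDecoderSketch, receive and frame are hypothetical names, bodies are treated as UTF-8 strings for readability, and the connect-result handling (the initialized flag) is left out.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FrameDecoderSketch {
    private final ByteBuffer lenBuffer = ByteBuffer.allocate(4);
    private ByteBuffer incoming = lenBuffer; // buffer currently being filled
    final List<String> frames = new ArrayList<String>();

    // Feed one chunk of bytes, split wherever the network happened to split it.
    void receive(byte[] chunk) {
        ByteBuffer buf = ByteBuffer.wrap(chunk);
        while (buf.hasRemaining()) {
            // Copy only as many bytes as the current target buffer still needs.
            int n = Math.min(buf.remaining(), incoming.remaining());
            for (int i = 0; i < n; i++) {
                incoming.put(buf.get());
            }
            if (!incoming.hasRemaining()) {
                incoming.flip();
                if (incoming == lenBuffer) {
                    int len = incoming.getInt();         // length prefix is complete
                    lenBuffer.clear();
                    incoming = ByteBuffer.allocate(len); // now collect the body
                } else {
                    frames.add(new String(incoming.array(), StandardCharsets.UTF_8));
                    incoming = lenBuffer;                // back to reading a length
                }
            }
        }
    }

    // Encodes one frame: 4-byte big-endian length followed by the body.
    static byte[] frame(String s) {
        byte[] body = s.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + body.length).putInt(body.length).put(body).array();
    }

    static List<String> demo() {
        byte[] a = frame("ping");
        byte[] b = frame("exists");
        byte[] all = new byte[a.length + b.length];
        System.arraycopy(a, 0, all, 0, a.length);
        System.arraycopy(b, 0, all, a.length, b.length);
        FrameDecoderSketch d = new FrameDecoderSketch();
        // Deliver the stream in chunks that cut through both a length and a body.
        d.receive(Arrays.copyOfRange(all, 0, 3));
        d.receive(Arrays.copyOfRange(all, 3, 10));
        d.receive(Arrays.copyOfRange(all, 10, all.length));
        return d.frames;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [ping, exists]
    }
}
```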
SendThread.readResponse
The main flow of this method is as follows
First, read the header. If xid is -2, this is a ping response; return
If xid is -4, this is the response to an AuthPacket; return
If xid is -1, this is a notification. Continue reading, construct an event, hand it over through EventThread.queueEvent, and return
In all other cases:
Take a Packet from pendingQueue, verify it, and update the Packet with the response information
void readResponse(ByteBuffer incomingBuffer) throws IOException {
    ByteBufferInputStream bbis = new ByteBufferInputStream(incomingBuffer);
    BinaryInputArchive bbia = BinaryInputArchive.getArchive(bbis);
    ReplyHeader replyHdr = new ReplyHeader();
    replyHdr.deserialize(bbia, "header"); // deserialize the header
    if (replyHdr.getXid() == -2) {
        // -2 is the xid for pings
        if (LOG.isDebugEnabled()) {
            LOG.debug("Got ping response for sessionid: 0x"
                    + Long.toHexString(sessionId)
                    + " after "
                    + ((System.nanoTime() - lastPingSentNs) / 1000000)
                    + "ms");
        }
        return;
    }
    if (replyHdr.getXid() == -4) {
        // -4 is the xid for AuthPacket
        if (replyHdr.getErr() == KeeperException.Code.AUTHFAILED.intValue()) {
            state = States.AUTH_FAILED;
            eventThread.queueEvent(new WatchedEvent(Watcher.Event.EventType.None,
                    Watcher.Event.KeeperState.AuthFailed, null));
        }
        if (LOG.isDebugEnabled()) {
            LOG.debug("Got auth sessionid:0x" + Long.toHexString(sessionId));
        }
        return;
    }
    if (replyHdr.getXid() == -1) {
        // -1 means notification: the message is an event pushed by the server
        if (LOG.isDebugEnabled()) {
            LOG.debug("Got notification sessionid:0x" + Long.toHexString(sessionId));
        }
        WatcherEvent event = new WatcherEvent();
        event.deserialize(bbia, "response"); // deserialize the event payload
        // convert from a server path to a client path
        if (chrootPath != null) {
            String serverPath = event.getPath();
            if (serverPath.compareTo(chrootPath) == 0)
                event.setPath("/");
            else if (serverPath.length() > chrootPath.length())
                event.setPath(serverPath.substring(chrootPath.length()));
            else {
                LOG.warn("Got server path " + event.getPath()
                        + " which is too short for chroot path " + chrootPath);
            }
        }
        WatchedEvent we = new WatchedEvent(event);
        if (LOG.isDebugEnabled()) {
            LOG.debug("Got " + we + " for sessionid 0x" + Long.toHexString(sessionId));
        }
        eventThread.queueEvent(we);
        return;
    }
    // If SASL authentication is currently in progress, construct and
    // send a response packet immediately, rather than queuing a
    // response as with other packets.
    if (tunnelAuthInProgress()) {
        GetSASLRequest request = new GetSASLRequest();
        request.deserialize(bbia, "token");
        zooKeeperSaslClient.respondToServer(request.getToken(), ClientCnxn.this);
        return;
    }
    Packet packet;
    synchronized (pendingQueue) {
        if (pendingQueue.size() == 0) {
            throw new IOException("Nothing in the queue, but got " + replyHdr.getXid());
        }
        // This packet has now received its response, so remove it from pendingQueue
        packet = pendingQueue.remove();
    }
    /*
     * Since requests are processed in order, we better get a response
     * to the first request!
     */
    try {
        // Verify the packet. If it matches, update it with the reply from the server
        if (packet.requestHeader.getXid() != replyHdr.getXid()) {
            packet.replyHeader.setErr(KeeperException.Code.CONNECTIONLOSS.intValue());
            throw new IOException("Xid out of order. Got Xid "
                    + replyHdr.getXid() + " with err " + +replyHdr.getErr()
                    + " expected Xid " + packet.requestHeader.getXid()
                    + " for a packet with details: " + packet);
        }
        packet.replyHeader.setXid(replyHdr.getXid());
        packet.replyHeader.setErr(replyHdr.getErr());
        packet.replyHeader.setZxid(replyHdr.getZxid());
        if (replyHdr.getZxid() > 0) {
            lastZxid = replyHdr.getZxid();
        }
        if (packet.response != null && replyHdr.getErr() == 0) {
            // Deserialize the reply into packet.response. This is how the last line
            // of the exists method can read the result through packet.response
            packet.response.deserialize(bbia, "response");
        }
        if (LOG.isDebugEnabled()) {
            LOG.debug("Reading reply sessionid:0x" + Long.toHexString(sessionId) + ", packet:: " + packet);
        }
    } finally {
        finishPacket(packet); // finally, call finishPacket to complete the processing
    }
}
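Stripped of logging and deserialization, readResponse is a dispatch on the xid of the reply. The sketch below shows only that dispatch. XidDispatchSketch and classify are hypothetical names, the pending queue holds bare xids instead of Packets, the SASL branch is omitted, and IllegalStateException stands in for the IOException thrown by the real method.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class XidDispatchSketch {
    static final int NOTIFICATION_XID = -1; // event pushed by the server
    static final int PING_XID = -2;         // reply to a ping
    static final int AUTH_XID = -4;         // reply to an AuthPacket

    // Decides what a reply is, consuming one pending xid for ordinary responses.
    static String classify(int replyXid, Deque<Integer> pendingXids) {
        if (replyXid == PING_XID) {
            return "ping";
        }
        if (replyXid == AUTH_XID) {
            return "auth";
        }
        if (replyXid == NOTIFICATION_XID) {
            return "notification";
        }
        // Ordinary response: it must answer the oldest request still pending.
        Integer expected = pendingXids.poll();
        if (expected == null) {
            throw new IllegalStateException("Nothing in the queue, but got " + replyXid);
        }
        if (expected != replyXid) {
            throw new IllegalStateException(
                    "Xid out of order. Got " + replyXid + " expected " + expected);
        }
        return "response to request " + expected;
    }

    public static void main(String[] args) {
        Deque<Integer> pending = new ArrayDeque<Integer>(Arrays.asList(1, 2));
        System.out.println(classify(-2, pending)); // ping
        System.out.println(classify(1, pending));  // response to request 1
        System.out.println(classify(-1, pending)); // notification
    }
}
```

Negative xids never touch the pending queue, which is why a notification can arrive at any time without disturbing the strict request/response ordering.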
finishPacket method
The main function is to take out the corresponding Watcher from the Packet and register it in ZKWatchManager
private void finishPacket(Packet p) {
    int err = p.replyHeader.getErr();
    if (p.watchRegistration != null) {
        // Register the watcher in ZKWatchManager. The watchRegistration object was
        // initialized when the request was assembled. Its subclass stores the
        // Watcher instance in the matching ZKWatchManager map (existWatches here)
        p.watchRegistration.register(err);
    }
    // Queue an event for every watcher that was removed, so that the client
    // receives the "data/child watch removed" event types
    if (p.watchDeregistration != null) {
        Map<EventType, Set<Watcher>> materializedWatchers = null;
        try {
            materializedWatchers = p.watchDeregistration.unregister(err);
            for (Entry<EventType, Set<Watcher>> entry : materializedWatchers.entrySet()) {
                Set<Watcher> watchers = entry.getValue();
                if (watchers.size() > 0) {
                    queueEvent(p.watchDeregistration.getClientPath(), err, watchers, entry.getKey());
                    // ignore connectionloss when removing from local
                    // session
                    p.replyHeader.setErr(Code.OK.intValue());
                }
            }
        } catch (KeeperException.NoWatcherException nwe) {
            p.replyHeader.setErr(nwe.code().intValue());
        } catch (KeeperException ke) {
            p.replyHeader.setErr(ke.code().intValue());
        }
    }
    // cb is the AsyncCallback. If it is null, this is a synchronous call and no
    // callback needs to be dispatched, so just notifyAll the waiting caller
    if (p.cb == null) {
        synchronized (p) {
            p.finished = true;
            p.notifyAll();
        }
    } else {
        p.finished = true;
        eventThread.queuePacket(p);
    }
}
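The last few lines of finishPacket decide how the caller learns that its request is done. A minimal sketch of the two paths follows, assuming a hypothetical PacketSketch with only the fields needed here and a plain list standing in for the event thread's queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical, trimmed-down packet: just enough state to show the two paths.
class PacketSketch {
    boolean finished;
    String response;
    Consumer<String> cb; // null means the caller is blocked in a synchronous call
}

class FinishPacketSketch {
    static void finishPacket(PacketSketch p, String response, List<PacketSketch> eventQueue) {
        p.response = response;
        if (p.cb == null) {
            synchronized (p) {       // synchronous call: wake up the waiting caller
                p.finished = true;
                p.notifyAll();
            }
        } else {
            p.finished = true;       // asynchronous call: queue it for callback dispatch
            eventQueue.add(p);
        }
    }

    // Synchronous path: the caller waits on the packet until it is marked finished.
    static String syncCall() {
        final PacketSketch p = new PacketSketch();
        final List<PacketSketch> eventQueue = new ArrayList<PacketSketch>();
        new Thread(() -> finishPacket(p, "stat", eventQueue)).start();
        try {
            synchronized (p) {
                while (!p.finished) {
                    p.wait();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return p.response;
    }

    // Asynchronous path: nothing blocks, the packet just lands in the queue.
    static int asyncCall() {
        PacketSketch p = new PacketSketch();
        p.cb = r -> { };
        List<PacketSketch> eventQueue = new ArrayList<PacketSketch>();
        finishPacket(p, "stat", eventQueue);
        return eventQueue.size();
    }

    public static void main(String[] args) {
        System.out.println(syncCall());  // stat
        System.out.println(asyncCall()); // 1
    }
}
```

The while loop around wait() matters: it keeps the caller correct whether the response arrives before or after it starts waiting.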
watchRegistration
public void register(int rc) {
    if (shouldAddWatch(rc)) {
        // The subclass implementation returns the matching map in ZKWatchManager (existWatches here)
        Map<String, Set<Watcher>> watches = getWatches(rc);
        synchronized (watches) {
            Set<Watcher> watchers = watches.get(clientPath);
            if (watchers == null) {
                watchers = new HashSet<Watcher>();
                watches.put(clientPath, watchers);
            }
            watchers.add(watcher); // store the Watcher object in that map
        }
    }
}
The following maps are where the client stores its watchers. They correspond to the three kinds of watch registration
static class ZKWatchManager implements ClientWatchManager { private final Map<String, Set<Watcher>> dataWatches = new HashMap<String, Set<Watcher>>(); private final Map<String, Set<Watcher>> existWatches = new HashMap<String, Set<Watcher>>(); private final Map<String, Set<Watcher>> childWatches = new HashMap<String, Set<Watcher>>();
In general, when a Watcher is registered with the ZooKeeper server through the ZooKeeper constructor or through one of the three interfaces getData, exists and getChildren, the request is first sent to the server. Once the server reports success, the client stores the mapping from path to Watcher for later use.
EventThread.queuePacket()
For asynchronous calls (p.cb is not null), finishPacket ends by calling eventThread.queuePacket, which adds the current packet to the queue of events waiting to be dispatched
public void queuePacket(Packet packet) { if (wasKilled) { synchronized (waitingEvents) { if (isRunning) waitingEvents.add(packet); else processEvent(packet); } } else { waitingEvents.add(packet); } }
Event triggering
The long walkthrough above only clarifies how an event is registered. The final trigger has to come from a transactional operation
In our initial case, the following code is used to trigger the event
zookeeper.setData("/mic", "1".getBytes(), -1); //Modify the value of a node to trigger the watch
The client-server interaction is the same as before, so we will not repeat it. The only difference is that this time an event is triggered
Server event response DataTree.setData()
public Stat setData(String path, byte data[], int version, long zxid, long time) throws KeeperException.NoNodeException { Stat s = new Stat(); DataNode n = nodes.get(path); if (n == null) { throw new KeeperException.NoNodeException(); } byte lastdata[] = null; synchronized (n) { lastdata = n.data; n.data = data; n.stat.setMtime(time); n.stat.setMzxid(zxid); n.stat.setVersion(version); n.copyStat(s); } // now update if the path is in a quota subtree. String lastPrefix = getMaxPrefixWithQuota(path); if(lastPrefix != null) { this.updateBytes(lastPrefix, (data == null ? 0 : data.length) - (lastdata == null ? 0 : lastdata.length)); } dataWatches.triggerWatch(path, EventType.NodeDataChanged); // Trigger the NodeDataChanged event of the corresponding node return s; }
WatchManager.triggerWatch
Set<Watcher> triggerWatch(String path, EventType type, Set<Watcher> supress) {
    // Create the WatchedEvent from the event type, connection state and node path
    WatchedEvent e = new WatchedEvent(type, KeeperState.SyncConnected, path);
    HashSet<Watcher> watchers;
    synchronized (this) {
        // Remove the path from watchTable and get its watcher set
        watchers = watchTable.remove(path);
        if (watchers == null || watchers.isEmpty()) {
            if (LOG.isTraceEnabled()) {
                ZooTrace.logTraceMessage(LOG, ZooTrace.EVENT_DELIVERY_TRACE_MASK, "No watchers for " + path);
            }
            return null;
        }
        for (Watcher w : watchers) { // iterate over the watcher set
            // Look up the path set of this watcher in watch2Paths
            HashSet<String> paths = watch2Paths.get(w);
            if (paths != null) {
                paths.remove(path); // remove the path
            }
        }
    }
    for (Watcher w : watchers) { // iterate over the watcher set again
        if (supress != null && supress.contains(w)) {
            continue;
        }
        w.process(e); // the key call: what does w.process do?
    }
    return watchers;
}
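The important consequence of triggerWatch is that a server-side watch is one-shot: it is removed from both maps before it is delivered. The sketch below demonstrates this with hypothetical names (OneShotWatchSketch, string watchers, a delivered list in place of w.process).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class OneShotWatchSketch {
    private final Map<String, Set<String>> watchTable = new HashMap<String, Set<String>>();
    private final Map<String, Set<String>> watch2Paths = new HashMap<String, Set<String>>();
    final List<String> delivered = new ArrayList<String>();

    synchronized void addWatch(String path, String watcher) {
        watchTable.computeIfAbsent(path, k -> new HashSet<String>()).add(watcher);
        watch2Paths.computeIfAbsent(watcher, k -> new HashSet<String>()).add(path);
    }

    // Removes the watches for the path first, then notifies the removed watchers.
    Set<String> triggerWatch(String path, String eventType) {
        Set<String> watchers;
        synchronized (this) {
            watchers = watchTable.remove(path);
            if (watchers == null || watchers.isEmpty()) {
                return null; // nobody is watching this path
            }
            for (String w : watchers) {
                Set<String> paths = watch2Paths.get(w);
                if (paths != null) {
                    paths.remove(path);
                }
            }
        }
        for (String w : watchers) {
            delivered.add(w + " <- " + eventType + " " + path); // stands in for w.process(e)
        }
        return watchers;
    }

    static String demo() {
        OneShotWatchSketch m = new OneShotWatchSketch();
        m.addWatch("/mic", "cnxn-1");
        Set<String> first = m.triggerWatch("/mic", "NodeDataChanged");
        Set<String> second = m.triggerWatch("/mic", "NodeDataChanged");
        return first + " " + second + " " + m.delivered.size();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [cnxn-1] null 1
    }
}
```

The second change to the same node notifies nobody, which is why a client that wants continuous notifications has to register the watch again after each event.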
w.process(e);
Remember what the watcher was when we bound the event on the server? It was the ServerCnxn, so w.process(e) actually calls the process method of ServerCnxn. ServerCnxn is an abstract class with two implementations: NIOServerCnxn and NettyServerCnxn. Let's take a look at the process method of NettyServerCnxn
public void process(WatchedEvent event) {
    ReplyHeader h = new ReplyHeader(-1, -1L, 0);
    if (LOG.isTraceEnabled()) {
        ZooTrace.logTraceMessage(LOG, ZooTrace.EVENT_DELIVERY_TRACE_MASK,
                "Deliver event " + event + " to 0x"
                + Long.toHexString(this.sessionId)
                + " through " + this);
    }
    // Convert WatchedEvent to a type that can be sent over the wire
    WatcherEvent e = event.getWrapper();
    try {
        // This is where the event is sent to the client, as a WatcherEvent
        sendResponse(h, e, "notification");
    } catch (IOException e1) {
        if (LOG.isDebugEnabled()) {
            LOG.debug("Problem sending to " + getRemoteSocketAddress(), e1);
        }
        close();
    }
}
Then, the client will receive the response and trigger the SendThread.readResponse method
Client processing event response
SendThread.readResponse
This code has been pasted above, so we only show the part that matters for the current flow. As noted earlier, the xid of a notification message is -1, so we go straight to the -1 branch
void readResponse(ByteBuffer incomingBuffer) throws IOException {
    // ... header deserialization and the ping (-2) and auth (-4) branches are the same as above
    if (replyHdr.getXid() == -1) {
        // -1 means notification
        if (LOG.isDebugEnabled()) {
            LOG.debug("Got notification sessionid:0x" + Long.toHexString(sessionId));
        }
        WatcherEvent event = new WatcherEvent();
        // Deserialize the WatcherEvent sent by the server
        event.deserialize(bbia, "response");
        // convert from a server path to a client path
        if (chrootPath != null) {
            String serverPath = event.getPath();
            if (serverPath.compareTo(chrootPath) == 0)
                event.setPath("/");
            else if (serverPath.length() > chrootPath.length())
                event.setPath(serverPath.substring(chrootPath.length()));
            else {
                LOG.warn("Got server path " + event.getPath()
                        + " which is too short for chroot path " + chrootPath);
            }
        }
        WatchedEvent we = new WatchedEvent(event); // assemble the WatchedEvent object
        if (LOG.isDebugEnabled()) {
            LOG.debug("Got " + we + " for sessionid 0x" + Long.toHexString(sessionId));
        }
        eventThread.queueEvent(we); // hand the event to the EventThread
        return;
    }
    // ... the rest of the method is the same as above
}
eventThread.queueEvent
After SendThread receives a notification event from the server, it passes the event to the EventThread by calling the queueEvent method of the EventThread class. Based on the notification, queueEvent takes all matching Watchers out of ZKWatchManager. Taking a Watcher out also removes it, so the Watcher fires once and is then no longer registered.
private void queueEvent(WatchedEvent event, Set<Watcher> materializedWatchers) {
    if (event.getType() == EventType.None && sessionState == event.getState()) {
        // A session event that does not change the state: nothing to do
        return;
    }
    sessionState = event.getState();
    final Set<Watcher> watchers;
    if (materializedWatchers == null) {
        // materialize the watchers based on the event
        watchers = watcher.materialize(event.getState(), event.getType(), event.getPath());
    } else {
        watchers = new HashSet<Watcher>();
        watchers.addAll(materializedWatchers);
    }
    // Wrap the watchers and the event in a WatcherSetEventPair and add it to waitingEvents
    WatcherSetEventPair pair = new WatcherSetEventPair(watchers, event);
    // queue the pair (watch set & event) for later processing
    waitingEvents.add(pair);
}
materialize method
The matching watchers are obtained by calling remove on dataWatches, existWatches or childWatches, which means a client-side watch is also removed as soon as it fires
The method returns the set of Watchers that should be notified, based on keeperState, eventType and path
public Set<Watcher> materialize(Watcher.Event.KeeperState state, Watcher.Event.EventType type, String clientPath) { Set<Watcher> result = new HashSet<Watcher>(); switch (type) { case None: result.add(defaultWatcher); boolean clear = disableAutoWatchReset && state != Watcher.Event.KeeperState.SyncConnected; synchronized(dataWatches) { for(Set<Watcher> ws: dataWatches.values()) { result.addAll(ws); } if (clear) { dataWatches.clear(); } } synchronized(existWatches) { for(Set<Watcher> ws: existWatches.values()) { result.addAll(ws); } if (clear) { existWatches.clear(); } } synchronized(childWatches) { for(Set<Watcher> ws: childWatches.values()) { result.addAll(ws); } if (clear) { childWatches.clear(); } } return result; case NodeDataChanged: case NodeCreated: synchronized (dataWatches) { addTo(dataWatches.remove(clientPath), result); } synchronized (existWatches) { addTo(existWatches.remove(clientPath), result); } break; case NodeChildrenChanged: synchronized (childWatches) { addTo(childWatches.remove(clientPath), result); } break; case NodeDeleted: synchronized (dataWatches) { addTo(dataWatches.remove(clientPath), result); } // XXX This shouldn't be needed, but just in case synchronized (existWatches) { Set<Watcher> list = existWatches.remove(clientPath); if (list != null) { addTo(existWatches.remove(clientPath), result); LOG.warn("We are triggering an exists watch for delete! Shouldn't happen!"); } } synchronized (childWatches) { addTo(childWatches.remove(clientPath), result); } break; default: String msg = "Unhandled watch event type " + type + " with state " + state + " on path " + clientPath; LOG.error(msg); throw new RuntimeException(msg); } return result; } }
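Which map is consulted for which event type is easy to lose in the long switch. Here is a condensed sketch of the node-event cases only, following the branches of the code above. MaterializeSketch is a hypothetical class, watchers are strings, and the session-event case (None) and the synchronization are left out.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class MaterializeSketch {
    enum EventType { NodeCreated, NodeDeleted, NodeDataChanged, NodeChildrenChanged }

    final Map<String, Set<String>> dataWatches = new HashMap<String, Set<String>>();
    final Map<String, Set<String>> existWatches = new HashMap<String, Set<String>>();
    final Map<String, Set<String>> childWatches = new HashMap<String, Set<String>>();

    private static void addTo(Set<String> from, Set<String> to) {
        if (from != null) {
            to.addAll(from);
        }
    }

    // Removes and returns every watcher that should hear about this event.
    Set<String> materialize(EventType type, String path) {
        Set<String> result = new HashSet<String>();
        switch (type) {
            case NodeDataChanged:
            case NodeCreated:
                addTo(dataWatches.remove(path), result);
                addTo(existWatches.remove(path), result);
                break;
            case NodeChildrenChanged:
                addTo(childWatches.remove(path), result);
                break;
            case NodeDeleted:
                addTo(dataWatches.remove(path), result);
                addTo(existWatches.remove(path), result);
                addTo(childWatches.remove(path), result);
                break;
        }
        return result;
    }

    static String demo() {
        MaterializeSketch m = new MaterializeSketch();
        m.existWatches.put("/mic", new HashSet<String>(Arrays.asList("existsWatcher")));
        m.childWatches.put("/mic", new HashSet<String>(Arrays.asList("childWatcher")));
        Set<String> first = m.materialize(EventType.NodeDataChanged, "/mic");
        Set<String> second = m.materialize(EventType.NodeDataChanged, "/mic");
        Set<String> third = m.materialize(EventType.NodeChildrenChanged, "/mic");
        return first + " " + second + " " + third;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [existsWatcher] [] [childWatcher]
    }
}
```

A data change does not touch the child watch, and the second data change finds nothing because the first one already removed the exists watch.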
waitingEvents.add
This last step brings us to the end of the chain
waitingEvents is the blocking queue inside the EventThread thread, which was also instantiated in the first step, when the client was initialized. As the name suggests, waitingEvents holds the events waiting to be processed. The run() method of EventThread keeps taking entries from the queue and passes them to the processEvent method:
public void run() {
    try {
        isRunning = true;
        while (true) { // loop forever
            Object event = waitingEvents.take(); // take the next event from the waiting queue
            if (event == eventOfDeath) {
                wasKilled = true;
            } else {
                processEvent(event); // process the event
            }
            if (wasKilled)
                synchronized (waitingEvents) {
                    if (waitingEvents.isEmpty()) {
                        isRunning = false;
                        break;
                    }
                }
        }
    } catch (InterruptedException e) {
        LOG.error("Event thread exiting due to interruption", e);
    }
    LOG.info("EventThread shut down for session: 0x{}", Long.toHexString(getSessionId()));
}
processEvent
This method is long, so only the core code that handles event triggering is shown
private void processEvent(Object event) {
    try {
        if (event instanceof WatcherSetEventPair) {
            // each watcher will process the event
            WatcherSetEventPair pair = (WatcherSetEventPair) event;
            // Loop over every watcher that matched the event
            for (Watcher watcher : pair.watchers) {
                try {
                    watcher.process(pair.event); // invoke the callback of the client
                } catch (Throwable t) {
                    LOG.error("Error while calling watcher ", t);
                }
            }
        }
        // ... (the remaining branches and the end of the method are omitted)
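One detail worth noticing in this loop is the try/catch around each individual watcher.process call: a watcher that throws does not prevent the remaining watchers from being called. A small sketch of that behaviour, with hypothetical names (EventDispatchSketch, WatcherSketch) and a list in place of the logger:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class EventDispatchSketch {
    // Hypothetical stand-in for the Watcher callback interface.
    interface WatcherSketch {
        void process(String event);
    }

    // Calls every watcher in turn; a failing watcher is recorded and skipped.
    static void processEvent(List<WatcherSketch> watchers, String event, List<String> log) {
        for (WatcherSketch watcher : watchers) {
            try {
                watcher.process(event);
            } catch (Throwable t) {
                log.add("error: " + t.getMessage());
            }
        }
    }

    static List<String> demo() {
        final List<String> log = new ArrayList<String>();
        WatcherSketch failing = event -> { throw new IllegalStateException("boom"); };
        WatcherSketch recording = event -> log.add("second saw " + event);
        processEvent(Arrays.asList(failing, recording), "NodeDataChanged:/mic", log);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [error: boom, second saw NodeDataChanged:/mic]
    }
}
```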
Epilogue
No time