Use and Analysis of [Curator] Distributed Queue

Posted by chombone on Sun, 16 Jun 2019 20:54:15 +0200

Distributed Queue

A distributed queue implementation based on ZK.

Messages put into the queue are guaranteed to be ordered (they are stored as persistent sequential nodes in zk).

For a single consumer, the queue is FIFO (first in, first out). If you need more control over the ordering, you can use a LeaderSelector to nominate consumers and customize the consumption strategy.

1. Notes

In practice, zk is seldom used as a queue.

Curator itself recommends against using zk as a queue:

> IMPORTANT - We recommend that you do NOT use ZooKeeper for Queues. Please see Tech Note 4 for details.

> Tech Note 4: ZooKeeper makes a very bad Queue source.

> The ZooKeeper recipes page lists Queues as a possible use-case for ZooKeeper. Curator includes several Queue recipes. In our experience, however, it is a bad idea to use ZooKeeper as a Queue:

> 1. ZooKeeper has a 1MB transport limitation. In practice this means that ZNodes must be relatively small. Typically, queues can contain many thousands of messages.

> 2. ZooKeeper can slow down considerably on startup if there are many large ZNodes. This will be common if you are using ZooKeeper for queues. You will need to significantly increase initLimit and syncLimit.

> 3. If a ZNode gets too big it can be extremely difficult to clean. getChildren() will fail on the node. At Netflix we had to create a special-purpose program that had a huge value for jute.maxbuffer in order to get the nodes and delete them.

> 4. ZooKeeper can start to perform badly if there are many nodes with thousands of children.

> 5. The ZooKeeper database is kept entirely in memory. So, you can never have more messages than can fit in memory.

In other words, although Curator provides several queue recipes for zk, using zk as a queue is not recommended because:

  1. zk has a 1MB limit on data transmission.
    • In practice this means ZNodes must be kept very small.
    • Queues, however, usually hold thousands of messages.
  2. Many large ZNodes seriously slow down the zk start-up process.
    • This includes synchronization between zk nodes.
    • If you do use zk as a queue, it is best to adjust initLimit and syncLimit.
  3. A ZNode that grows too large is also difficult to clean up.
    • It causes the getChildren() method to fail.
    • Netflix had to write a special-purpose program to handle such oversized nodes.
  4. Thousands of child nodes under a single node seriously degrade zk performance.
  5. All zk data is kept in memory.

Although zk is not naturally suitable for queues, let's look at the implementation of Curator and learn about its design.

2. Key APIs

org.apache.curator.framework.recipes.queue.QueueBuilder

org.apache.curator.framework.recipes.queue.QueueConsumer

org.apache.curator.framework.recipes.queue.QueueSerializer

org.apache.curator.framework.recipes.queue.DistributedQueue

3. Usage

3.1 Creation

public static <T> QueueBuilder<T> builder(CuratorFramework client,
                                          QueueConsumer<T> consumer,
                                          QueueSerializer<T> serializer,
                                          java.lang.String queuePath)

QueueBuilder<MessageType> builder = QueueBuilder.builder(client, consumer, serializer, path);
// ... more builder method calls as needed ...
DistributedQueue<MessageType> queue = builder.buildQueue();
  • The queue is created through the Builder pattern

3.2 Use

The start() method must be called before the queue can be used, and close() must be called when you are finished with it.

Producing messages

queue.put(aMessage);

The message will then be delivered to the QueueConsumer.consumeMessage() method on the consumer side.

3.3 Safe Consumption

By default, a message is removed from the queue as soon as it is delivered, without waiting for the consumer call to complete.

Curator therefore also provides a more atomic mode, in which a message is removed only after the consumer returns successfully. To enable this mode, call lockPath() on the builder.

  • Locking makes messages recoverable
  • The lock ensures that a message being delivered is not picked up by another consumer
  • The queue waits for the consumer to return before removing the message
  • If processing fails or the process is interrupted abnormally, the message is delivered again
  • Processing with locks inevitably costs more performance

3.4 Data Format

Messages written by distributed queues use the following format:

Offset  Size  Description
0       4     Format version, currently 0x00010001
4       1     Opcode: 0x01 = message, 0x02 = end of data
5       4     Message length in bytes
9       n     Serialized message bytes
9 + n   ...   Next set of (opcode, length, serialized bytes), repeated until end of data
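
To make the layout concrete, here is a minimal sketch of an encoder and decoder for it. It is not Curator's own serializer: the class name QueueItemCodec is made up for illustration, and big-endian byte order is an assumption, since the table does not state it.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Illustrative codec for the layout in the table; a sketch, not Curator's implementation.
final class QueueItemCodec {
    static final int FORMAT_VERSION = 0x00010001;
    static final byte OP_ITEM = 0x01; // a message follows
    static final byte OP_END = 0x02;  // end of data

    static byte[] encode(List<byte[]> items) {
        int size = 4 + 1; // version header plus the end-of-data opcode
        for (byte[] item : items) {
            size += 1 + 4 + item.length; // opcode + length + payload
        }
        ByteBuffer buf = ByteBuffer.allocate(size); // big-endian by default (assumed)
        buf.putInt(FORMAT_VERSION);
        for (byte[] item : items) {
            buf.put(OP_ITEM).putInt(item.length).put(item);
        }
        buf.put(OP_END);
        return buf.array();
    }

    static List<byte[]> decode(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int version = buf.getInt();
        if (version != FORMAT_VERSION) {
            throw new IllegalArgumentException("Unexpected format version: 0x" + Integer.toHexString(version));
        }
        List<byte[]> items = new ArrayList<>();
        while (true) {
            byte opcode = buf.get();
            if (opcode == OP_END) {
                return items;
            }
            if (opcode != OP_ITEM) {
                throw new IllegalArgumentException("Unknown opcode: " + opcode);
            }
            byte[] item = new byte[buf.getInt()];
            buf.get(item);
            items.add(item);
        }
    }
}
```

Encoding a 1-byte and a 2-byte message gives 4 + (1 + 4 + 1) + (1 + 4 + 2) + 1 = 18 bytes, which matches the offsets in the table.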

4. Error handling

The QueueConsumer interface extends ConnectionStateListener. When the queue is started, the listener is added automatically. Either way, anyone using DistributedQueue must pay attention to changes in the zk connection state.

If the SUSPENDED state is reported, the instance must assume that the queue is no longer being updated until it reconnects successfully (RECONNECTED). If the connection is lost (LOST), the queue should be considered completely unavailable.
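
The rule above can be sketched as a small state tracker. The ConnState enum is a hypothetical stand-in for Curator's connection states so that the example runs on the JDK alone, and treating RECONNECTED after LOST as usable again is an assumption of this sketch.

```java
// Hypothetical stand-in for the connection states named above; not Curator's own type.
enum ConnState { CONNECTED, SUSPENDED, RECONNECTED, LOST }

// Tracks how much the application should trust the queue, following the rule in the text.
final class QueueHealth {
    enum Status { LIVE, STALE, UNAVAILABLE }

    private volatile Status status = Status.LIVE;

    // Would be called from the consumer's stateChanged callback.
    void stateChanged(ConnState newState) {
        switch (newState) {
            case SUSPENDED:
                status = Status.STALE;       // queue is not being updated until we reconnect
                break;
            case LOST:
                status = Status.UNAVAILABLE; // queue must be considered down
                break;
            case CONNECTED:
            case RECONNECTED:
                status = Status.LIVE;        // assumption: a reconnect makes the queue usable again
                break;
        }
    }

    Status status() {
        return status;
    }
}
```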

5. Source code analysis

5.1 Class Diagram

Let's first look at the relationships among the core classes of the Curator queue recipes:

It can be seen that:

  • All queues implement the org.apache.curator.framework.recipes.queue.QueueBase interface
  • The core queue implementation is org.apache.curator.framework.recipes.queue.DistributedQueue

5.2 Class Definition

public class DistributedQueue<T> implements QueueBase<T> {...}
  • Implements the org.apache.curator.framework.recipes.queue.QueueBase interface
public interface QueueBase<T> extends Closeable{
    void     start() throws Exception;
    ListenerContainer<QueuePutListener<T>> getPutListenerContainer();
    void     setErrorMode(ErrorMode newErrorMode);
    boolean flushPuts(long waitTime, TimeUnit timeUnit) throws InterruptedException;
    int getLastMessageCount();
}
  • Extends the java.io.Closeable interface
  • Defines some methods common to all queues

5.3 Member Variables

public class DistributedQueue<T> implements QueueBase<T>
{
    private final Logger log = LoggerFactory.getLogger(getClass());
    private final CuratorFramework client;
    private final QueueSerializer<T> serializer;
    private final String queuePath;
    private final Executor executor;
    private final ExecutorService service;
    private final AtomicReference<State> state = new AtomicReference<State>(State.LATENT);
    private final QueueConsumer<T> consumer;
    private final int minItemsBeforeRefresh;
    private final boolean refreshOnWatch;
    private final boolean isProducerOnly;
    private final String lockPath;
    private final AtomicReference<ErrorMode> errorMode = new AtomicReference<ErrorMode>(ErrorMode.REQUEUE);
    private final ListenerContainer<QueuePutListener<T>> putListenerContainer = new ListenerContainer<QueuePutListener<T>>();
    private final AtomicInteger lastChildCount = new AtomicInteger(0);
    private final int maxItems;
    private final int finalFlushMs;
    private final boolean putInBackground;
    private final ChildrenCache childrenCache;

    private final AtomicInteger     putCount = new AtomicInteger(0);

    private enum State
    {
        LATENT,
        STARTED,
        STOPPED
    }

    @VisibleForTesting
    protected enum ProcessType
    {
        NORMAL,
        REMOVE
    }

    private static final String     QUEUE_ITEM_NAME = "queue-";
  • log
  • client
  • serializer
    • org.apache.curator.framework.recipes.queue.QueueSerializer
    • Serializes and deserializes queue elements
  • queuePath
    • zk path of the queue
  • executor
    • java.util.concurrent.Executor
    • Thread pool on which messages are consumed
  • service
    • java.util.concurrent.ExecutorService
    • Thread pool
    • Used for pulling messages from the queue and for internal asynchronous tasks
  • state
    • org.apache.curator.framework.recipes.queue.DistributedQueue.State
    • Internal enumeration of the queue's lifecycle state
    • Held in an AtomicReference
  • consumer
    • org.apache.curator.framework.recipes.queue.QueueConsumer
    • The queue consumer
  • minItemsBeforeRefresh
    • Minimum number of messages to schedule before a refresh of the child list is considered
  • refreshOnWatch
    • Whether to abandon the current batch and pull again when the child list has changed
  • isProducerOnly
    • When no consumer is specified, the queue works in producer-only mode
    • In this mode no messages are pulled or consumed
  • lockPath
    • zk path under which the consumption lock nodes are created
  • errorMode
    • org.apache.curator.framework.recipes.queue.ErrorMode
    • How errors are handled when consuming a message
    • Enumeration
      • REQUEUE: put the message back in the queue
      • DELETE: delete the message
    • Held in an AtomicReference
  • putListenerContainer
    • org.apache.curator.framework.listen.ListenerContainer
    • Container of listeners that are notified when a message has been put
  • lastChildCount
    • Number of child nodes
    • AtomicInteger
  • maxItems
    • Maximum number of messages in the queue
  • finalFlushMs
    • How long to wait on close
    • Gives messages that have not yet been delivered time to complete when the queue is closed
  • putInBackground
    • Whether puts are sent asynchronously
    • Uses a Curator callback
  • childrenCache
    • org.apache.curator.framework.recipes.queue.ChildrenCache
    • Child node cache
    • Caches the list of message nodes
  • putCount
    • Counter of puts that are still in progress
    • AtomicInteger

5.4 Constructor

DistributedQueue
        (
            CuratorFramework client,
            QueueConsumer<T> consumer,
            QueueSerializer<T> serializer,
            String queuePath,
            ThreadFactory threadFactory,
            Executor executor,
            int minItemsBeforeRefresh,
            boolean refreshOnWatch,
            String lockPath,
            int maxItems,
            boolean putInBackground,
            int finalFlushMs
        )
    {
        Preconditions.checkNotNull(client, "client cannot be null");
        Preconditions.checkNotNull(serializer, "serializer cannot be null");
        Preconditions.checkNotNull(threadFactory, "threadFactory cannot be null");
        Preconditions.checkNotNull(executor, "executor cannot be null");
        Preconditions.checkArgument(maxItems > 0, "maxItems must be a positive number");

        isProducerOnly = (consumer == null);
        this.lockPath = (lockPath == null) ? null : PathUtils.validatePath(lockPath);
        this.putInBackground = putInBackground;
        this.consumer = consumer;
        this.minItemsBeforeRefresh = minItemsBeforeRefresh;
        this.refreshOnWatch = refreshOnWatch;
        this.client = client;
        this.serializer = serializer;
        this.queuePath = PathUtils.validatePath(queuePath);
        this.executor = executor;
        this.maxItems = maxItems;
        this.finalFlushMs = finalFlushMs;
        service = Executors.newFixedThreadPool(2, threadFactory);
        childrenCache = new ChildrenCache(client, queuePath);

        if ( (maxItems != QueueBuilder.NOT_SET) && putInBackground )
        {
            log.warn("Bounded queues should set putInBackground(false) in the builder. Putting in the background will result in spotty maxItem consistency.");
        }
    }
}
  • The constructor is package-private
  • The constructor takes many parameters
    • So a builder is provided: org.apache.curator.framework.recipes.queue.QueueBuilder

As you can see, most assignments are straightforward, but a few attributes involve some processing:

  • service
    • Created with Executors.newFixedThreadPool, with a pool size of 2
      • Its task queue is a java.util.concurrent.LinkedBlockingQueue
        • FIFO
        • Unbounded, so in theory it can exhaust memory
  • childrenCache
    • Caches the children of queuePath

There is another point to note:

if ( (maxItems != QueueBuilder.NOT_SET) && putInBackground )
        {
            log.warn("Bounded queues should set putInBackground(false) in the builder. Putting in the background will result in spotty maxItem consistency.");
        }

When maxItems is set (capping the number of messages stored in the queue) and asynchronous (callback) puts are used, Curator logs a warning:

> Bounded queues should set putInBackground(false) in the builder. Putting in the background will result in spotty maxItem consistency.

In other words, bounded queues should use synchronous puts. With asynchronous puts, the maxItems limit is not enforced consistently.

5.5 Start-up

You need to call the start() method before using it:

public void start() throws Exception
{
    if ( !state.compareAndSet(State.LATENT, State.STARTED) )
    {
        throw new IllegalStateException();
    }

    try
    {
        client.create().creatingParentContainersIfNeeded().forPath(queuePath);
    }
    catch ( KeeperException.NodeExistsException ignore )
    {
        // this is OK
    }
    if ( lockPath != null )
    {
        try
        {
            client.create().creatingParentContainersIfNeeded().forPath(lockPath);
        }
        catch ( KeeperException.NodeExistsException ignore )
        {
            // this is OK
        }
    }

    if ( !isProducerOnly || (maxItems != QueueBuilder.NOT_SET) )
    {
        childrenCache.start();
    }

    if ( !isProducerOnly )
    {
        service.submit
            (
                new Callable<Object>()
                {
                    @Override
                    public Object call()
                    {
                        runLoop();
                        return null;
                    }
                }
            );
    }
}
  1. First update the state with a CAS operation
  2. Create the zk node for the queue
  3. If necessary, create the zk node for the consumption locks
  4. If there is a consumer, or the queue is bounded
    • Start the child node cache
  5. If the queue is not producer-only
    • Submit an asynchronous task that calls the runLoop() method
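
Step 1 is worth a closer look: the compareAndSet guard makes start() succeed exactly once, even under concurrent calls. A minimal sketch of the pattern follows; the Lifecycle class is illustrative and only mirrors the State enum shown earlier.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative lifecycle guard using the same LATENT -> STARTED -> STOPPED transitions.
final class Lifecycle {
    enum State { LATENT, STARTED, STOPPED }

    private final AtomicReference<State> state = new AtomicReference<>(State.LATENT);

    // Only the first caller moves LATENT -> STARTED; everyone else gets an exception.
    void start() {
        if (!state.compareAndSet(State.LATENT, State.STARTED)) {
            throw new IllegalStateException("Cannot start from state " + state.get());
        }
    }

    // Returns true only for the caller that actually performed STARTED -> STOPPED.
    boolean close() {
        return state.compareAndSet(State.STARTED, State.STOPPED);
    }

    State current() {
        return state.get();
    }
}
```

Because the transition is one-way, a closed instance cannot be restarted; a new one has to be built.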

The logic of the startup process is not complicated, but there are two methods that need further analysis:

5.5.1 Starting the Child Node Cache

This calls the start method of org.apache.curator.framework.recipes.queue.ChildrenCache:

void start() throws Exception
{
    sync(true);
}

private synchronized void sync(boolean watched) throws Exception
{
    if ( watched )
    {
        client.getChildren().usingWatcher(watcher).inBackground(callback).forPath(path);
    }
    else
    {
        client.getChildren().inBackground(callback).forPath(path);
    }
}

private final CuratorWatcher watcher = new CuratorWatcher()
{
    @Override
    public void process(WatchedEvent event) throws Exception
    {
        if ( !isClosed.get() )
        {
            sync(true);
        }
    }
};

private final BackgroundCallback  callback = new BackgroundCallback()
{
    @Override
    public void processResult(CuratorFramework client, CuratorEvent event) throws Exception
    {
        if ( event.getResultCode() == KeeperException.Code.OK.intValue() )
        {
            setNewChildren(event.getChildren());
        }
    }
};

private synchronized void setNewChildren(List<String> newChildren)
{
    if ( newChildren != null )
    {
        Data currentData = children.get();

        children.set(new Data(newChildren, currentData.version + 1));
        notifyFromCallback();
    }
}
  1. A watcher re-syncs the child node information whenever it changes
  2. A background callback processes the queue's child node information
  3. The list of child nodes is cached locally together with a version number
    • The list of child nodes is wrapped in an immutable List
    • The version number is tracked to avoid ABA problems
    • The snapshot is held in an AtomicReference
    • Child node information is encapsulated in an inner class
      • org.apache.curator.framework.recipes.queue.ChildrenCache.Data
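
The versioned-snapshot idea can be sketched with the JDK alone. The class below is a simplified model, not the real ChildrenCache: the names VersionedChildren and awaitNewerThan are invented here, and the zk watcher and callback are replaced by a direct call to setNewChildren.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Simplified model of a versioned child-list cache (illustrative names).
final class VersionedChildren {
    // Immutable snapshot: the child names plus the version they belong to.
    static final class Data {
        final List<String> children;
        final long version;

        Data(List<String> children, long version) {
            this.children = Collections.unmodifiableList(new ArrayList<>(children));
            this.version = version;
        }
    }

    private final AtomicReference<Data> current =
        new AtomicReference<>(new Data(Collections.<String>emptyList(), 0));

    // In the real recipe this is driven by the background callback.
    synchronized void setNewChildren(List<String> newChildren) {
        if (newChildren != null) {
            current.set(new Data(newChildren, current.get().version + 1));
            notifyAll(); // wake up readers waiting for a newer version
        }
    }

    // Blocks until the version differs from knownVersion or maxWaitMs elapses.
    synchronized Data awaitNewerThan(long knownVersion, long maxWaitMs) {
        long deadline = System.currentTimeMillis() + maxWaitMs;
        while (current.get().version == knownVersion) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                break;
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return current.get();
    }
}
```

A reader that remembers the last version it processed never handles the same snapshot twice, and a new snapshot with the same contents is still seen as a change.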

5.5.2 The Asynchronous runLoop() Method

During queue start-up, unless the queue is producer-only, an asynchronous task is submitted that runs the org.apache.curator.framework.recipes.queue.DistributedQueue#runLoop method:

private void runLoop()
{
    long         currentVersion = -1;
    long         maxWaitMs = -1;
    try
    {
        while ( state.get() == State.STARTED  )
        {
            try
            {
                ChildrenCache.Data      data = (maxWaitMs > 0) ? childrenCache.blockingNextGetData(currentVersion, maxWaitMs, TimeUnit.MILLISECONDS) : childrenCache.blockingNextGetData(currentVersion);
                currentVersion = data.version;

                List<String>        children = Lists.newArrayList(data.children);
                sortChildren(children); // makes sure items are processed in the correct order

                if ( children.size() > 0 )
                {
                    maxWaitMs = getDelay(children.get(0));
                    if ( maxWaitMs > 0 )
                    {
                        continue;
                    }
                }
                else
                {
                    continue;
                }

                processChildren(children, currentVersion);
            }
            catch ( InterruptedException e )
            {
                // swallow the interrupt as it's only possible from either a background
                // operation and, thus, doesn't apply to this loop or the instance
                // is being closed in which case the while test will get it
            }
        }
    }
    catch ( Exception e )
    {
        log.error("Exception caught in background handler", e);
    }
}

//----------------------------------------
//org.apache.curator.framework.recipes.queue.ChildrenCache
//----------------------------------------
Data blockingNextGetData(long startVersion) throws InterruptedException
{
    return blockingNextGetData(startVersion, 0, null);
}

synchronized Data blockingNextGetData(long startVersion, long maxWait, TimeUnit unit) throws InterruptedException
{
    long            startMs = System.currentTimeMillis();
    boolean         hasMaxWait = (unit != null);
    long            maxWaitMs = hasMaxWait ? unit.toMillis(maxWait) : -1;
    while ( startVersion == children.get().version )
    {
        if ( hasMaxWait )
        {
            long        elapsedMs = System.currentTimeMillis() - startMs;
            long        thisWaitMs = maxWaitMs - elapsedMs;
            if ( thisWaitMs <= 0 )
            {
                break;
            }
            wait(thisWaitMs);
        }
        else
        {
            wait();
        }
    }
    return children.get();
}

After the queue starts, the loop keeps trying to pull messages:

  1. Get the child node cache
    1. If the version is the same, the caller has already seen this snapshot, so the call waits for a sync before returning new data.
      • At startup the loop passes version -1
      • The initial version of ChildrenCache.Data is 0.
      • So at startup, the initial ChildrenCache.Data is returned immediately.
  2. Sort the child nodes lexicographically
  3. If there are child nodes
    • Use getDelay to work out, from the first child node, how long to wait before the next pull
      • Subclasses can override this method to extend the queue
    • If the delay is greater than zero, or there are no child nodes, go back and wait for the next change
  4. Call processChildren(children, currentVersion) to process the child node data
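
Step 2 works because ZooKeeper appends a zero-padded, 10-digit counter to the names of sequential nodes, so lexicographic order matches creation order. A small sketch follows; the ChildOrdering helper is hypothetical, and only the "queue-" prefix comes from the source.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative helper showing why a plain string sort yields FIFO order.
final class ChildOrdering {
    // Mimics the name of a sequential node created with the "queue-" prefix.
    static String itemName(int sequence) {
        return String.format("queue-%010d", sequence);
    }

    // Returns a sorted copy, leaving the cached list untouched.
    static List<String> inConsumptionOrder(List<String> children) {
        List<String> sorted = new ArrayList<>(children);
        Collections.sort(sorted);
        return sorted;
    }
}
```

Without the zero padding, "queue-10" would sort before "queue-9" and the order would break.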

5.5.3 Child Node Data Processing

The org.apache.curator.framework.recipes.queue.DistributedQueue#processChildren method:

private void processChildren(List<String> children, long currentVersion) throws Exception
{
    final Semaphore processedLatch = new Semaphore(0);
    final boolean   isUsingLockSafety = (lockPath != null);
    int             min = minItemsBeforeRefresh;
    for ( final String itemNode : children )
    {
        if ( Thread.currentThread().isInterrupted() )
        {
            processedLatch.release(children.size());
            break;
        }

        if ( !itemNode.startsWith(QUEUE_ITEM_NAME) )
        {
            log.warn("Foreign node in queue path: " + itemNode);
            processedLatch.release();
            continue;
        }

        if ( min-- <= 0 )
        {
            if ( refreshOnWatch && (currentVersion != childrenCache.getData().version) )
            {
                processedLatch.release(children.size());
                break;
            }
        }

        if ( getDelay(itemNode) > 0 )
        {
            processedLatch.release();
            continue;
        }

        executor.execute
        (
            new Runnable()
            {
                @Override
                public void run()
                {
                    try
                    {
                        if ( isUsingLockSafety )
                        {
                            processWithLockSafety(itemNode, ProcessType.NORMAL);
                        }
                        else
                        {
                            processNormally(itemNode, ProcessType.NORMAL);
                        }
                    }
                    catch ( Exception e )
                    {
                        ThreadUtils.checkInterrupted(e);
                        log.error("Error processing message at " + itemNode, e);
                    }
                    finally
                    {
                        processedLatch.release();
                    }
                }
            }
        );
    }

    processedLatch.acquire(children.size());
}
  • A semaphore is created with zero permits
    • java.util.concurrent.Semaphore
    • It is used as a latch

An asynchronous task is scheduled for each child node:

  1. If lockPath is specified, the message node is processed safely, under a lock
  2. Otherwise, the message node is processed in the normal way
  3. When processing completes, a permit is released

Finally, the method blocks on the semaphore until all child nodes of the current version have been processed.
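
The semaphore-as-latch pattern can be tried in isolation. In this sketch the BatchProcessor class and its fixed two-thread pool are illustrative choices; the structure (release one permit per item, whether skipped or processed, then acquire as many permits as there are items) follows the listing above.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative batch processor that waits for all items using a zero-permit semaphore.
final class BatchProcessor {
    // Returns how many items with the required prefix were handled.
    static int processAll(List<String> children, String requiredPrefix) {
        final Semaphore processed = new Semaphore(0);
        final AtomicInteger handled = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            for (final String child : children) {
                if (!child.startsWith(requiredPrefix)) {
                    processed.release(); // skipped items still count towards the total
                    continue;
                }
                executor.execute(() -> {
                    try {
                        handled.incrementAndGet(); // stand-in for the real message handling
                    } finally {
                        processed.release();       // always signal completion
                    }
                });
            }
            processed.acquire(children.size());    // blocks until every item has reported in
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            executor.shutdown();
        }
        return handled.get();
    }
}
```

Unlike a CountDownLatch, the semaphore lets the early-exit branches release many permits at once, which is what release(children.size()) does in the original loop.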

5.5.3.1 Processing a Message Node with a Lock

The org.apache.curator.framework.recipes.queue.DistributedQueue#processWithLockSafety method:

protected boolean processWithLockSafety(String itemNode, ProcessType type) throws Exception
{
    String      lockNodePath = ZKPaths.makePath(lockPath, itemNode);
    boolean     lockCreated = false;
    try
    {
        client.create().withMode(CreateMode.EPHEMERAL).forPath(lockNodePath);
        lockCreated = true;

        String  itemPath = ZKPaths.makePath(queuePath, itemNode);
        boolean requeue = false;
        byte[]  bytes = null;
        if ( type == ProcessType.NORMAL )
        {
            bytes = client.getData().forPath(itemPath);
            requeue = (processMessageBytes(itemNode, bytes) == ProcessMessageBytesCode.REQUEUE);
        }

        if ( requeue )
        {
            client.inTransaction()
                .delete().forPath(itemPath)
                .and()
                .create().withMode(CreateMode.PERSISTENT_SEQUENTIAL).forPath(makeRequeueItemPath(itemPath), bytes)
                .and()
                .commit();
        }
        else
        {
            client.delete().forPath(itemPath);
        }

        return true;
    }
    catch ( KeeperException.NodeExistsException ignore )
    {
        // another process got it
    }
    catch ( KeeperException.NoNodeException ignore )
    {
        // another process got it
    }
    catch ( KeeperException.BadVersionException ignore )
    {
        // another process got it
    }
    finally
    {
        if ( lockCreated )
        {
            client.delete().guaranteed().forPath(lockNodePath);
        }
    }

    return false;
}
  • The distributed reentrant lock org.apache.curator.framework.recipes.locks.InterProcessMutex is not used here
    • Instead, a single zk ephemeral node is used as a placeholder
    • It marks whether a message is currently being processed by a consumer
  • If the processing result is a requeue, the following is executed atomically (in a Curator zk transaction):
    1. Delete the message node from the original queue
    2. Write the message data to a new node
      • The target path is determined by the makeRequeueItemPath method
        • Subclasses can override this method to customize the requeue strategy
  • If the processing result does not require a requeue, the message node is simply deleted from the queue
  • The ephemeral placeholder node is cleaned up with a guaranteed delete
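
The placeholder-node idea can be modelled in memory to see how it behaves. In this sketch ConcurrentHashMap.putIfAbsent stands in for creating the ephemeral node (it fails if the key already exists), and the requeue is simplified to leaving the item in place. The LockSafeStore class is hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Predicate;

// In-memory model of lock-safe consumption (illustrative, no ZooKeeper involved).
final class LockSafeStore {
    private final ConcurrentMap<String, byte[]> items = new ConcurrentHashMap<>();
    private final ConcurrentMap<String, Boolean> locks = new ConcurrentHashMap<>();

    void put(String name, byte[] data) { items.put(name, data); }

    boolean contains(String name) { return items.containsKey(name); }

    // Stand-in for creating the ephemeral lock node: succeeds for one caller only.
    boolean tryLock(String name) { return locks.putIfAbsent(name, Boolean.TRUE) == null; }

    void unlock(String name) { locks.remove(name); }

    // Returns true if this caller handled the item; the consumer returns true on success.
    boolean process(String name, Predicate<byte[]> consumer) {
        if (!tryLock(name)) {
            return false;               // another consumer is working on it
        }
        try {
            byte[] data = items.get(name);
            if (data == null) {
                return false;           // already consumed by someone else
            }
            if (consumer.test(data)) {
                items.remove(name);     // remove only after the consumer succeeded
            }
            // On failure the item stays in the store and will be delivered again.
            return true;
        } finally {
            unlock(name);               // always clean up the placeholder
        }
    }
}
```

If the consumer throws, the finally block still removes the placeholder and the item remains available, which is the recoverability the lock mode is meant to provide.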

5.5.3.2 Processing a Message Node Normally

The org.apache.curator.framework.recipes.queue.DistributedQueue#processNormally method:

private boolean processNormally(String itemNode, ProcessType type) throws Exception
{
    try
    {
        String  itemPath = ZKPaths.makePath(queuePath, itemNode);
        Stat    stat = new Stat();

        byte[]  bytes = null;
        if ( type == ProcessType.NORMAL )
        {
            bytes = client.getData().storingStatIn(stat).forPath(itemPath);
        }
        if ( client.getState() == CuratorFrameworkState.STARTED )
        {
            client.delete().withVersion(stat.getVersion()).forPath(itemPath);
        }

        if ( type == ProcessType.NORMAL )
        {
            processMessageBytes(itemNode, bytes);
        }

        return true;
    }
    catch ( KeeperException.NodeExistsException ignore )
    {
        // another process got it
    }
    catch ( KeeperException.NoNodeException ignore )
    {
        // another process got it
    }
    catch ( KeeperException.BadVersionException ignore )
    {
        // another process got it
    }

    return false;
}
  • After the node data is fetched, the node is deleted immediately.
  • Only then is the local consumer invoked to process the message.
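
The consequence of this order is that a failing consumer loses the message. A minimal in-memory sketch shows it; the DeleteThenConsume class is hypothetical, and a single atomic remove stands in for the getData plus versioned delete pair.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Consumer;

// In-memory model of the normal (non-locking) mode: delete first, consume afterwards.
final class DeleteThenConsume {
    private final ConcurrentMap<String, byte[]> items = new ConcurrentHashMap<>();

    void put(String name, byte[] data) { items.put(name, data); }

    boolean contains(String name) { return items.containsKey(name); }

    // Returns true if this caller won the item, whether or not the consumer succeeded.
    boolean process(String name, Consumer<byte[]> consumer) {
        byte[] data = items.remove(name); // atomic read-and-delete: only one caller wins
        if (data == null) {
            return false;                 // another process got it
        }
        try {
            consumer.accept(data);
        } catch (RuntimeException e) {
            // The item is already gone, so a failure here means the message is lost.
        }
        return true;
    }
}
```

This is the cheaper path, since no lock node is needed, at the price of at-most-once delivery.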

5.5.4 Summary

To recap the start-up process:

  1. Update the state to started
  2. Create the queue path
  3. Create the path for the consumption locks if necessary
  4. If a consumer is specified, or the queue is bounded, start the child node cache.
    1. Add a watcher
      • Used to re-sync the cache automatically whenever the children change
      • Pulls the list of message nodes
      • notifyAll() wakes up the waiting consumer loop
  5. If a consumer is specified, start the polling task asynchronously to pull message nodes
    1. Block until the list of message nodes is available
    2. Delay according to the getDelay method (custom consumption strategy)
    3. Process the messages
      1. Filter and schedule the messages to consume
        • Minimum number of items before refresh
        • Delay
      2. Create an asynchronous processing task for each message node that can be processed
        1. Fetch the message node data
          • If safe consumption is required, an ephemeral node is used as a placeholder.
        2. Call the consumer to consume the message
        3. Finish the message according to the processing result
          • Requeue
            • The requeue strategy can be customized through the makeRequeueItemPath method
          • Delete

5.6 Close

When you are finished with the queue, call the close() method:

public void close() throws IOException
{
    if ( state.compareAndSet(State.STARTED, State.STOPPED) )
    {
        if ( finalFlushMs > 0 )
        {
            try
            {
                flushPuts(finalFlushMs, TimeUnit.MILLISECONDS);
            }
            catch ( InterruptedException e )
            {
                Thread.currentThread().interrupt();
            }
        }

        CloseableUtils.closeQuietly(childrenCache);
        putListenerContainer.clear();
        service.shutdownNow();
    }
}

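If the CAS update to the STOPPED state succeeds:

  1. If a final flush time (finalFlushMs) is configured
    • The flushPuts method is called to wait for pending puts to complete
    • This amounts to a graceful shutdown
  2. The child node cache is closed, the put listeners are cleared, and the thread pool is shut down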

5.6.1 flushPuts Method

Blocks until all pending messages have been delivered to zk, or until the wait time runs out:

public boolean flushPuts(long waitTime, TimeUnit timeUnit) throws InterruptedException
{
    long    msWaitRemaining = TimeUnit.MILLISECONDS.convert(waitTime, timeUnit);
    synchronized(putCount)
    {
        while ( putCount.get() > 0 )
        {
            if ( msWaitRemaining <= 0 )
            {
                return false;
            }

            long        startMs = System.currentTimeMillis();

            putCount.wait(msWaitRemaining);

            long        elapsedMs = System.currentTimeMillis() - startMs;
            msWaitRemaining -= elapsedMs;
        }
    }
    return true;
}
  • While the put counter is not zero, wait: some messages are still being delivered.
  • On close, the maximum wait is finalFlushMs, in milliseconds
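
The wait-with-a-shrinking-budget loop is a general pattern and can be sketched on its own. The PendingPuts class below is illustrative: it guards a plain int with the object's monitor instead of synchronizing on an AtomicInteger, which is a simplification of this sketch rather than what the listing does.

```java
import java.util.concurrent.TimeUnit;

// Illustrative counter of in-flight puts with a bounded flush.
final class PendingPuts {
    private int pending; // guarded by this

    synchronized void started() {
        pending++;
    }

    synchronized void completed() {
        pending--;
        notifyAll(); // wake up anyone blocked in flush()
    }

    // Returns true if nothing is pending before the time budget is used up.
    synchronized boolean flush(long waitTime, TimeUnit unit) {
        long remainingMs = unit.toMillis(waitTime);
        while (pending > 0) {
            if (remainingMs <= 0) {
                return false;
            }
            long startMs = System.currentTimeMillis();
            try {
                wait(remainingMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            // Subtract the time actually spent, so spurious wakeups do not extend the budget.
            remainingMs -= System.currentTimeMillis() - startMs;
        }
        return true;
    }
}
```

Re-checking the condition in a loop and shrinking the remaining time after every wakeup is what keeps the total wait within the requested limit.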

5.7 Producing Messages

Once the producer has created a message, it calls the put method to enqueue it:

public void     put(T item) throws Exception
{
    put(item, 0, null);
}

public boolean     put(T item, int maxWait, TimeUnit unit) throws Exception
{
    checkState();

    String      path = makeItemPath();
    return internalPut(item, null, path, maxWait, unit);
}
  • As long as the number of queue elements does not exceed maxItems, put returns immediately without waiting
  • Otherwise, it waits for messages to be consumed before adding the new one
    • The maximum wait can be given through parameters

The actual queuing operation is implemented by the org.apache.curator.framework.recipes.queue.DistributedQueue#internalPut method:

boolean internalPut(final T item, MultiItem<T> multiItem, String path, int maxWait, TimeUnit unit) throws Exception
{
    if ( !blockIfMaxed(maxWait, unit) )
    {
        return false;
    }

    final MultiItem<T> givenMultiItem = multiItem;
    if ( item != null )
    {
        final AtomicReference<T>    ref = new AtomicReference<T>(item);
        multiItem = new MultiItem<T>()
        {
            @Override
            public T nextItem() throws Exception
            {
                return ref.getAndSet(null);
            }
        };
    }

    putCount.incrementAndGet();
    byte[]              bytes = ItemSerializer.serialize(multiItem, serializer);
    if ( putInBackground )
    {
        doPutInBackground(item, path, givenMultiItem, bytes);
    }
    else
    {
        doPutInForeground(item, path, givenMultiItem, bytes);
    }
    return true;
}
  1. Decide whether the call must block because the queue is full; a maximum wait can be specified.
    • If the maximum number is exceeded, a sync (pulling the message nodes from zk again) is triggered.
    • If the version (of the tracked message data) is still unchanged when the wait ends, the put fails
  2. Wrap the single item in an org.apache.curator.framework.recipes.queue.MultiItem
  3. Increment the in-flight counter putCount
  4. Serialize the message
  5. Send to zk asynchronously or synchronously, depending on the configuration
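
Step 2 uses a neat trick: getAndSet(null) hands out the item on the first call and null on every later call, which marks the end of the sequence. Here is a reduced sketch; the ItemSource interface is a simplified stand-in for MultiItem (it drops the throws clause), and SingleItem is an invented helper.

```java
import java.util.concurrent.atomic.AtomicReference;

// Simplified stand-in for the MultiItem interface: returns null when exhausted.
interface ItemSource<T> {
    T nextItem();
}

// Wraps one item so that it can be consumed through the multi-item interface.
final class SingleItem {
    static <T> ItemSource<T> of(T item) {
        final AtomicReference<T> ref = new AtomicReference<>(item);
        // First call returns the item and clears the reference; later calls return null.
        return () -> ref.getAndSet(null);
    }
}
```

This lets the serializer treat single puts and multi-item puts through one code path.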

5.7.1 Asynchronously Send Messages to ZK

If it is sent asynchronously, the org.apache.curator.framework.recipes.queue.DistributedQueue#doPutInBackground method is called:

private void doPutInBackground(final T item, String path, final MultiItem<T> givenMultiItem, byte[] bytes) throws Exception
{
    BackgroundCallback callback = new BackgroundCallback()
    {
        @Override
        public void processResult(CuratorFramework client, CuratorEvent event) throws Exception
        {
            if ( event.getResultCode() != KeeperException.Code.OK.intValue() )
            {
                return;
            }

            if ( event.getType() == CuratorEventType.CREATE )
            {
                synchronized(putCount)
                {
                    putCount.decrementAndGet();
                    putCount.notifyAll();
                }
            }

            putListenerContainer.forEach
            (
                new Function<QueuePutListener<T>, Void>()
                {
                    @Override
                    public Void apply(QueuePutListener<T> listener)
                    {
                        if ( item != null )
                        {
                            listener.putCompleted(item);
                        }
                        else
                        {
                            listener.putMultiCompleted(givenMultiItem);
                        }
                        return null;
                    }
                }
            );
        }
    };
    internalCreateNode(path, bytes, callback);
}

void internalCreateNode(String path, byte[] bytes, BackgroundCallback callback) throws Exception
{
    client.create().withMode(CreateMode.PERSISTENT_SEQUENTIAL).inBackground(callback).forPath(path, bytes);
}
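  • This method mainly assembles a callback task:
    1. If the result code is not OK, the callback returns without doing anything
    2. When the new message node has been created (a CREATE event)
      • Decrement the put counter putCount and wake up any threads waiting on it
    3. Trigger the local listeners (putCompleted for a single item, putMultiCompleted otherwise)
  • The actual creation is done by the internalCreateNode method
    • Message nodes are created as persistent sequential nodes (PERSISTENT_SEQUENTIAL)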
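
The bookkeeping around the callback can be sketched without ZooKeeper. This is only an illustration under stated assumptions: BackgroundPutSketch, inFlight and onCreated are made-up names, the empty runAsync task stands in for the background create call, and plain Consumer<String> callbacks stand in for QueuePutListener. The counter is incremented before the work is handed off and decremented in the completion callback, where waiting threads are also notified.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

final class BackgroundPutSketch {
    // Plays the role of putCount: puts that were started but not yet confirmed.
    private final AtomicInteger inFlight = new AtomicInteger(0);
    private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

    void addListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    int inFlight() {
        return inFlight.get();
    }

    // Counts the put, then hands the (simulated) node creation to another thread.
    CompletableFuture<Void> put(String item) {
        inFlight.incrementAndGet();
        return CompletableFuture
            .runAsync(() -> { /* assumption: stands in for the background create in zk */ })
            .thenRun(() -> onCreated(item));
    }

    // Completion callback: update the counter, wake up waiters, then notify listeners.
    private void onCreated(String item) {
        synchronized (inFlight) {
            inFlight.decrementAndGet();
            inFlight.notifyAll();
        }
        for (Consumer<String> listener : listeners) {
            listener.accept(item);
        }
    }

    public static void main(String[] args) {
        BackgroundPutSketch queue = new BackgroundPutSketch();
        queue.addListener(item -> System.out.println("put completed: " + item));
        queue.put("order-1").join();
        System.out.println("in flight: " + queue.inFlight());
    }
}
```

Note that the sketch always runs the callback, whereas the method shown above returns early on a non-OK result code and leaves the counter untouched in that case.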

5.7.2 Synchronously Send Messages to ZK

If it is sent synchronously, the org.apache.curator.framework.recipes.queue.DistributedQueue#doPutInForeground method is called:

private void doPutInForeground(final T item, String path, final MultiItem<T> givenMultiItem, byte[] bytes) throws Exception
{
    client.create().withMode(CreateMode.PERSISTENT_SEQUENTIAL).forPath(path, bytes);
    synchronized(putCount)
    {
        putCount.decrementAndGet();
        putCount.notifyAll();
    }
    putListenerContainer.forEach
    (
        new Function<QueuePutListener<T>, Void>()
        {
            @Override
            public Void apply(QueuePutListener<T> listener)
            {
                if ( item != null )
                {
                    listener.putCompleted(item);
                }
                else
                {
                    listener.putMultiCompleted(givenMultiItem);
                }
                return null;
            }
        }
    );
}

The synchronous path blocks the caller and performs the following steps in order:

  1. Create the node
  2. Decrement the put counter putCount and wake up any waiting threads
  3. Trigger the local listeners in turn

5.8 Consuming Messages

To consume messages, a consumer (org.apache.curator.framework.recipes.queue.QueueConsumer) must be specified at initialization time.

6. Builder

Curator provides a builder for the queue: org.apache.curator.framework.recipes.queue.QueueBuilder

public class QueueBuilder<T>
{
    private final CuratorFramework client;
    private final QueueConsumer<T> consumer;
    private final QueueSerializer<T> serializer;
    private final String queuePath;

    private ThreadFactory factory;
    private Executor executor;
    private String lockPath;
    private int maxItems = NOT_SET;
    private boolean putInBackground = true;
    private int finalFlushMs = 5000;

    static final ThreadFactory  defaultThreadFactory = ThreadUtils.newThreadFactory("QueueBuilder");

    static final int NOT_SET = Integer.MAX_VALUE;

6.1 Building a Common Queue

public DistributedQueue<T> buildQueue()
{
    return new DistributedQueue<T>
    (
        client,
        consumer,
        serializer,
        queuePath,
        factory,
        executor,
        Integer.MAX_VALUE,  // minItemsBeforeRefresh
        false,              // refreshOnWatch
        lockPath,
        maxItems,
        putInBackground,
        finalFlushMs
    );
}
  • Defaults
    • putInBackground is enabled
    • The queue is unbounded: maxItems = NOT_SET (Integer.MAX_VALUE)
    • On close, wait up to 5 seconds (finalFlushMs = 5000) for pending messages to be delivered.
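
The last default relies on the put counter: a closing queue can wait until the number of pending puts drops to zero or the flush time runs out. The following is a minimal sketch of such a bounded wait, assuming the same wait/notifyAll protocol that the put methods use on putCount. FlushSketch and awaitDrained are hypothetical names, not Curator API.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class FlushSketch {
    // Waits until 'pending' reaches zero or 'timeoutMs' elapses.
    // Returns true if the counter drained in time, false on timeout or interruption.
    static boolean awaitDrained(AtomicInteger pending, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        synchronized (pending) {
            while (pending.get() > 0) {
                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMs <= 0) {
                    return false; // time is up, some puts are still pending
                }
                try {
                    // Woken by notifyAll() from a completed put, or by the timeout.
                    pending.wait(remainingMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        AtomicInteger pending = new AtomicInteger(1);
        // Simulates a put that completes on another thread.
        new Thread(() -> {
            synchronized (pending) {
                pending.decrementAndGet();
                pending.notifyAll();
            }
        }).start();
        System.out.println("drained: " + awaitDrained(pending, 5000));
    }
}
```

The loop re-checks the counter after every wake-up, so spurious wake-ups and notifications from unrelated puts do not end the wait early.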

7. Summary

How Curator uses zk as a queue:

  • Messages are consumed by client-side pull
  • The pull thread pool and the message consumption thread pool are separate
    • Resource isolation
    • Separates io threads from task threads
  • Graceful shutdown
    • On close, a wait time can be set so that pending message delivery can complete.
  • A local cache acts as a buffer
    • A version number is used for message tracking
    • Reduces io
  • Some protected methods are provided so that subclasses can customize the scheduling strategy

Topics: Apache Zookeeper Java Database