Do you know the underlying implementation of thread pools? Thread Pool Module Summary [Part 1]

Posted by keefy on Mon, 09 Mar 2020 17:14:14 +0100

Hello, this is a blog about continuing to learn in pursuit of your dreams. This is the fourth article, in which I share my understanding of thread pools. My articles are always written in a question-and-answer style, which is one of my personal favorites and works like a simulated interview.

What is the core idea of thread pooling?

As mentioned in my first article about threads, a server's thread resources are valuable and limited, so in some scenarios we need to reuse threads as much as possible rather than create and destroy them frequently. This leads to the first core idea of the thread pool: thread resource reuse.
In addition to reuse, we also want some control over how thread resources are used, such as how many threads can be created at most and how many threads must be kept alive. That brings us to the second core idea of the thread pool: thread resource control.
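To make the two ideas concrete, here is a minimal sketch of my own (the class name ReuseDemo and the method distinctThreadsUsed are made up for illustration). It pushes many small tasks through a pool that is limited to a few threads and counts how many distinct threads actually did the work.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, used only for illustration.
final class ReuseDemo {
    // Runs taskCount small tasks on a pool limited to poolSize threads and
    // returns how many distinct threads executed them.
    static int distinctThreadsUsed(int poolSize, int taskCount) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            for (int i = 0; i < taskCount; i++) {
                pool.execute(() -> names.add(Thread.currentThread().getName()));
            }
        } finally {
            pool.shutdown(); // already submitted tasks still run after shutdown()
        }
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return names.size();
    }

    public static void main(String[] args) {
        int used = distinctThreadsUsed(2, 10);
        System.out.println("10 tasks ran on " + used + " thread(s)");
    }
}
```

The number of threads used never exceeds the pool size (control), while each thread runs several tasks (reuse).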

How is the core functionality of the thread pool implemented?

Implementation principle of thread resource reuse

To get a thorough understanding of how thread pools work, you must start with the source code. The core class selected in this article is ThreadPoolExecutor, which is also the class used in the book The Beauty of Java Concurrent Programming. Let's start with the core function void execute(Runnable command).
The source code below looks rather dull, so I'll start with an overview of the principles to give the reader a general impression, then use a diagram to illustrate the specific implementation process. You can read the code blocks below alongside the explanations here.
The thread pool reuses threads mainly through internal Worker threads plus the producer-consumer pattern: incoming tasks are executed by an internal Worker thread calling their run function directly. Because the internal Worker threads keep consuming tasks and survive for long periods, threads are reused.
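Before reading the real source, a stripped-down sketch may help. The MiniPool class below is my own teaching example, not the JDK implementation: it has a fixed number of workers, no core/max distinction and no shutdown. It shows only the idea described above: callers produce tasks into a queue, and long-lived worker threads consume them by calling run directly.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical teaching class; it is not part of the JDK.
final class MiniPool {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();

    MiniPool(int workers) {
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(this::workerLoop, "mini-worker-" + i);
            t.setDaemon(true); // workers never exit on their own, so do not block JVM exit
            t.start();
        }
    }

    // Producer side: callers only put the task into the queue.
    void execute(Runnable task) {
        queue.add(task);
    }

    // Consumer side: the worker blocks in take() while idle and calls run()
    // directly, so one long-lived thread executes many tasks.
    private void workerLoop() {
        try {
            while (true) {
                Runnable task = queue.take();
                try {
                    task.run();
                } catch (RuntimeException e) {
                    // Simplification: this sketch keeps the worker alive after a failure.
                    e.printStackTrace();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // leave the loop when interrupted
        }
    }

    // Submits n tasks to a new pool and returns how many finished within 10 seconds.
    static int runTasks(int workers, int n) {
        MiniPool pool = new MiniPool(workers);
        CountDownLatch latch = new CountDownLatch(n);
        for (int i = 0; i < n; i++) {
            pool.execute(latch::countDown);
        }
        try {
            latch.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return n - (int) latch.getCount();
    }

    public static void main(String[] args) {
        System.out.println("completed tasks: " + runTasks(2, 20));
    }
}
```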
How do internal Worker threads survive for so long? Mainly by blocking: the while loop inside the Worker thread blocks on the queue's take function, and at each stage the pool compares the number of Worker threads with the configured number of core threads. The specific code execution flow diagram is as follows:
The source code is explained below:

    // The JDK documents this function in detail, and readers can consult that. The comments here mainly record my own understanding, line by line.
    public void execute(Runnable command) {
        if (command == null)
            throw new NullPointerException();
        // Get the current ctl value, which packs the pool's run state and worker count
        int c = ctl.get();
        // If the number of worker threads is less than the core pool size, create a new core thread
        if (workerCountOf(c) < corePoolSize) {
            if (addWorker(command, true))
                return;
            c = ctl.get();
        }
        // If the core thread count has been reached (or adding a core worker failed), try to queue the new task
        if (isRunning(c) && workQueue.offer(command)) {
            int recheck = ctl.get();
            if (! isRunning(recheck) && remove(command))
                reject(command);
            else if (workerCountOf(recheck) == 0)
                addWorker(null, false);
        }
        // If the task cannot be queued (for example, the queue is full), try to create a non-core Worker thread, up to maximumPoolSize
        else if (!addWorker(command, false))
            // If the maximum has been reached and the task still cannot be accepted, run the rejection policy
            reject(command);
    }
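The order of these branches (core thread, then queue, then non-core thread, then rejection) can be observed from the outside. The following is a small experiment of my own, with a made-up class name and deliberately tiny parameters: corePoolSize 1, maximumPoolSize 2 and a queue of capacity 1. All four tasks block until released, so each submission has to take the next branch.

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, used only for illustration.
final class ExecuteFlowDemo {
    // Submits 4 blocking tasks and records what happened to each of them.
    static String[] submitFour() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 1, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocking = () -> {
            try {
                release.await(); // keep the worker busy until the experiment ends
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        String[] outcome = new String[4];
        try {
            for (int i = 0; i < 4; i++) {
                int threadsBefore = pool.getPoolSize();
                int queuedBefore = pool.getQueue().size();
                try {
                    pool.execute(blocking);
                    if (pool.getPoolSize() > threadsBefore) {
                        outcome[i] = "new thread";
                    } else if (pool.getQueue().size() > queuedBefore) {
                        outcome[i] = "queued";
                    } else {
                        outcome[i] = "unknown";
                    }
                } catch (RejectedExecutionException e) {
                    outcome[i] = "rejected"; // the default AbortPolicy throws
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return outcome;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(submitFour()));
    }
}
```

With these parameters the four outcomes should be: new thread, queued, new thread, rejected. This matches the branch order in execute.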

The execute function above is the thread pool execution flow that interviewers often ask about. As its code shows, one of the most important functions it calls is addWorker, which, as the name implies, creates new internal Worker threads. The code of addWorker is as follows:

    // addWorker is divided into two parts:
    // First half: increase the worker count stored in ctl through a CAS operation
    // Second half: bind the incoming Task to a Worker through its constructor, add the Worker to workers, and finally start the Worker's internal thread.
    private boolean addWorker(Runnable firstTask, boolean core) {
        // First half: increase the worker count stored in ctl through a CAS operation
        retry:
        for (;;) {
            int c = ctl.get();
            int rs = runStateOf(c);

            // Check if queue empty only if necessary.
            if (rs >= SHUTDOWN &&
                ! (rs == SHUTDOWN &&
                   firstTask == null &&
                   ! workQueue.isEmpty()))
                return false;

            for (;;) {
                int wc = workerCountOf(c);
                // Compare this with the last branch of execute above: once the maximum limit is reached, false is returned here.
                if (wc >= CAPACITY ||
                    wc >= (core ? corePoolSize : maximumPoolSize))
                    return false;
                if (compareAndIncrementWorkerCount(c))
                    break retry;
                c = ctl.get();  // Re-read ctl
                if (runStateOf(c) != rs)
                    continue retry;
                // else CAS failed due to workerCount change; retry inner loop
            }
        }
        // Second half: bind the incoming Task to a Worker through its constructor, add the Worker to workers, and finally start the Worker's internal thread.
        boolean workerStarted = false;
        boolean workerAdded = false;
        Worker w = null;
        try {
            // Bind the task to a new Worker
            w = new Worker(firstTask);
            // This thread was created by the thread factory with the Worker itself as its Runnable, because Worker implements Runnable
            final Thread t = w.thread;
            if (t != null) {
                final ReentrantLock mainLock = this.mainLock;
                mainLock.lock();
                try {
                    // Recheck while holding lock.
                    // Back out on ThreadFactory failure or if
                    // shut down before lock acquired.
                    int rs = runStateOf(ctl.get());

                    if (rs < SHUTDOWN ||
                        (rs == SHUTDOWN && firstTask == null)) {
                        if (t.isAlive()) // precheck that t is startable
                            throw new IllegalThreadStateException();
                        // Add to workers
                        workers.add(w);
                        int s = workers.size();
                        if (s > largestPoolSize)
                            largestPoolSize = s;
                        workerAdded = true;
                    }
                } finally {
                    mainLock.unlock();
                }
                if (workerAdded) {
                    // Start the worker thread
                    t.start();
                    workerStarted = true;
                }
            }
        } finally {
            if (! workerStarted)
                addWorkerFailed(w);
        }
        return workerStarted;
    }
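The first half of addWorker relies on ctl packing two values into one int and on a CAS retry loop. Here is a simplified sketch of that idea. The constants follow the bit layout used in the JDK 8 source (run state in the high 3 bits, worker count in the low 29 bits), but the class CtlSketch and the methods tryReserveWorker and raceForSlots are names I made up, and the run-state checks are left out.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; it only models the worker-count half of ctl handling.
final class CtlSketch {
    static final int COUNT_BITS = Integer.SIZE - 3;     // low 29 bits: worker count
    static final int CAPACITY = (1 << COUNT_BITS) - 1;  // mask for the worker count
    static final int RUNNING = -1 << COUNT_BITS;        // high 3 bits: run state

    private final AtomicInteger ctl = new AtomicInteger(RUNNING); // RUNNING, 0 workers
    private final int limit; // stands in for corePoolSize or maximumPoolSize

    CtlSketch(int limit) {
        this.limit = limit;
    }

    static int runStateOf(int c) {
        return c & ~CAPACITY;
    }

    static int workerCountOf(int c) {
        return c & CAPACITY;
    }

    // Reserves one worker slot with a CAS retry loop; returns false once the limit is reached.
    boolean tryReserveWorker() {
        for (;;) {
            int c = ctl.get();
            if (workerCountOf(c) >= limit) {
                return false;
            }
            if (ctl.compareAndSet(c, c + 1)) {
                return true;
            }
            // The CAS lost a race with another thread: re-read ctl and retry.
        }
    }

    // Lets several threads race for slots and returns how many of them succeeded.
    static int raceForSlots(int limit, int threads) {
        CtlSketch sketch = new CtlSketch(limit);
        AtomicInteger successes = new AtomicInteger();
        Thread[] racers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            racers[i] = new Thread(() -> {
                if (sketch.tryReserveWorker()) {
                    successes.incrementAndGet();
                }
            });
            racers[i].start();
        }
        for (Thread t : racers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return successes.get();
    }

    public static void main(String[] args) {
        System.out.println("slots granted: " + raceForSlots(3, 8));
    }
}
```

Because the limit check and the increment are tied together by the CAS, no more than limit reservations can succeed, however many threads compete.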

addWorker is a long piece of code. It contains many thread-safety details that we can set aside for now; we just need to understand what the function does. First, it modifies the ctl status value through CAS. Second, it binds the incoming Task to the newly created Worker through the Worker's constructor, adds the Worker to the workers set, and finally starts the Worker thread. Let's look at the Worker constructor:

    Worker(Runnable firstTask) {
        setState(-1); // inhibit interrupts until runWorker
        // Assign the incoming Task to the Worker's own member variable
        this.firstTask = firstTask;
        // Create a thread whose Runnable is this Worker and assign it to the member variable
        this.thread = getThreadFactory().newThread(this);
    }
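The binding trick in this constructor is that the Worker hands itself to the thread factory, so starting the thread runs the Worker's own run function. A tiny sketch of the same pattern follows. BoundWorker is my own illustrative class; the real Worker also extends AbstractQueuedSynchronizer, which is omitted here.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical class that imitates only the Worker-to-thread binding.
final class BoundWorker implements Runnable {
    final Thread thread;
    private final Runnable firstTask;

    BoundWorker(Runnable firstTask, ThreadFactory factory) {
        this.firstTask = firstTask;
        this.thread = factory.newThread(this); // the thread's target is this worker
    }

    @Override
    public void run() {
        firstTask.run(); // executed on this.thread once it has been started
    }

    // Returns true if the task ran on the thread that the worker created for itself.
    static boolean taskRunsOnWorkerThread() {
        AtomicReference<Thread> seen = new AtomicReference<>();
        BoundWorker w = new BoundWorker(
                () -> seen.set(Thread.currentThread()),
                Executors.defaultThreadFactory());
        w.thread.start();
        try {
            w.thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get() == w.thread;
    }

    public static void main(String[] args) {
        System.out.println("task ran on worker thread: " + taskRunsOnWorkerThread());
    }
}
```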

Once the binding is complete, t.start() is finally executed, which starts the internal Worker thread. Execution then enters the run function inside the Worker, which delegates to runWorker. The code is as follows:

    final void runWorker(Worker w) {
        Thread wt = Thread.currentThread();
        Runnable task = w.firstTask;
        w.firstTask = null;
        w.unlock(); // allow interrupts
        boolean completedAbruptly = true;
        try {
            // Keeps the internal Worker thread alive (not recycled) as long as tasks are available: the producer-consumer pattern
            while (task != null || (task = getTask()) != null) {
                w.lock();
                if ((runStateAtLeast(ctl.get(), STOP) ||
                     (Thread.interrupted() &&
                      runStateAtLeast(ctl.get(), STOP))) &&
                    !wt.isInterrupted())
                    wt.interrupt();
                try {
                    beforeExecute(wt, task);
                    Throwable thrown = null;
                    try {
                        // Calling run directly (not start) consumes the task on this Worker thread
                        task.run();
                    } catch (RuntimeException x) {
                        thrown = x; throw x;
                    } catch (Error x) {
                        thrown = x; throw x;
                    } catch (Throwable x) {
                        thrown = x; throw new Error(x);
                    } finally {
                        afterExecute(task, thrown);
                    }
                } finally {
                    task = null;
                    w.completedTasks++;
                    w.unlock();
                }
            }
            completedAbruptly = false;
        } finally {
            // After exiting the loop, the thread is about to finish, so call the cleanup function
            processWorkerExit(w, completedAbruptly);
        }
    }
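runWorker wraps every task.run() call between beforeExecute and afterExecute, and both are protected hooks of ThreadPoolExecutor that a subclass may override. The sketch below (HookDemo is a made-up name) records the order in which the hooks and the task run on the worker thread.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, used only for illustration.
final class HookDemo {
    // Runs one task on a single-thread pool and returns the recorded event order.
    static List<String> runOneTask() {
        List<String> events = new CopyOnWriteArrayList<>();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>()) {
            @Override
            protected void beforeExecute(Thread t, Runnable r) {
                super.beforeExecute(t, r);
                events.add("before");
            }

            @Override
            protected void afterExecute(Runnable r, Throwable thrown) {
                events.add("after");
                super.afterExecute(r, thrown);
            }
        };
        pool.execute(() -> events.add("run"));
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(runOneTask());
    }
}
```

The recorded order should be before, run, after, which mirrors the nesting of the try blocks in runWorker.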

As you can see in runWorker above, the while loop and the getTask function together determine whether the internal Worker thread is recycled. The getTask function is as follows:

    private Runnable getTask() {
        boolean timedOut = false; // Did the last poll() time out?

        for (;;) {
            int c = ctl.get();
            int rs = runStateOf(c);

            // Check if queue empty only if necessary.
            if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
                decrementWorkerCount();
                return null;
            }

            int wc = workerCountOf(c);

            // 1. timed is true when the number of workers is greater than corePoolSize (or allowCoreThreadTimeOut is set)
            boolean timed = allowCoreThreadTimeOut || wc > corePoolSize;
            // 3. On the next iteration timed && timedOut is true, so if the queue is empty
            // (or more than one worker remains) null is returned, the loop inside the worker thread exits, and the worker is eventually recycled.
            if ((wc > maximumPoolSize || (timed && timedOut))
                && (wc > 1 || workQueue.isEmpty())) {
                if (compareAndDecrementWorkerCount(c))
                    return null;
                continue;
            }

            try {
                // 2. When timed is true, the worker waits at most keepAliveTime; if it stays idle that long, timedOut = true is set below
                Runnable r = timed ?
                    workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                    workQueue.take();
                if (r != null)
                    return r;
                timedOut = true;
            } catch (InterruptedException retry) {
                timedOut = false;
            }
        }
    }
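The effect of the timed poll versus the untimed take can be watched through getPoolSize. This is a small experiment of my own (KeepAliveDemo is a made-up name, and the 100 ms keepAliveTime and 5 second deadline are arbitrary choices). It forces the pool to grow to two threads, lets both go idle, and waits for the extra thread to be recycled.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, used only for illustration.
final class KeepAliveDemo {
    // Returns {pool size while both tasks are running, pool size after idling}.
    static int[] observePoolSizes() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 100, TimeUnit.MILLISECONDS, new SynchronousQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocking = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            pool.execute(blocking); // runs on the core worker
            pool.execute(blocking); // no idle consumer, so offer() fails and a second worker is added
            int busy = pool.getPoolSize();
            release.countDown();    // both workers become idle
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (pool.getPoolSize() > 1 && System.nanoTime() < deadline) {
                Thread.sleep(20);   // wait for the extra worker to time out in poll()
            }
            return new int[] {busy, pool.getPoolSize()};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] sizes = observePoolSizes();
        System.out.println("busy=" + sizes[0] + ", after idle=" + sizes[1]);
    }
}
```

The pool should shrink from 2 back to 1: one worker times out and is recycled, while the remaining one falls under the core limit and blocks in take().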

In getTask, we can see that if the current number of Worker threads does not exceed the number of core threads (and allowCoreThreadTimeOut is false), workQueue.take() is called. It blocks, so the Worker thread is not cleaned up, which achieves the goal of keeping it alive. The last piece is the post-processing function in runWorker, processWorkerExit, which cleans up the Worker after it finishes and rechecks the number of worker threads against the core count. The source code is as follows:

    private void processWorkerExit(Worker w, boolean completedAbruptly) {
        if (completedAbruptly) // If abrupt, then workerCount wasn't adjusted
            decrementWorkerCount();
        // Remove the worker from the workers set
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            completedTaskCount += w.completedTasks;
            workers.remove(w);
        } finally {
            mainLock.unlock();
        }

        tryTerminate();

        int c = ctl.get();
        if (runStateLessThan(c, STOP)) {
            if (!completedAbruptly) {
                int min = allowCoreThreadTimeOut ? 0 : corePoolSize;
                if (min == 0 && ! workQueue.isEmpty())
                    min = 1;
                if (workerCountOf(c) >= min)
                    return; // replacement not needed
            }
            // If the worker exited abruptly, or the worker count is below the minimum computed above, addWorker is called here to create a replacement.
            // That Worker eventually blocks in the take function and persists.
            addWorker(null, false);
        }
    }
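One visible consequence of processWorkerExit is that a worker killed by a task's exception gets replaced. The sketch below (ReplacementDemo is a made-up name) submits a failing task and then a normal one to a single-thread pool through execute, and checks that the second task ran on a different thread. The custom thread factory exists only to silence the stack trace of the simulated failure.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class, used only for illustration.
final class ReplacementDemo {
    // Returns true if the task after a failing task ran on a new thread.
    static boolean failedWorkerIsReplaced() {
        ThreadFactory quietFactory = r -> {
            Thread t = new Thread(r);
            t.setUncaughtExceptionHandler((thread, ex) -> { /* ignore the simulated failure */ });
            return t;
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>(), quietFactory);
        AtomicReference<Thread> first = new AtomicReference<>();
        AtomicReference<Thread> second = new AtomicReference<>();
        pool.execute(() -> {
            first.set(Thread.currentThread());
            throw new IllegalStateException("simulated failure"); // kills this worker
        });
        pool.execute(() -> second.set(Thread.currentThread()));
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return first.get() != null && second.get() != null && first.get() != second.get();
    }

    public static void main(String[] args) {
        System.out.println("failed worker replaced: " + failedWorkerIsReplaced());
    }
}
```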

With the flowchart and source code explanations above, I think readers should have a clear understanding of the underlying execution process of the thread pool. We also now know how the thread pool achieves thread reuse, why a certain number of core threads is kept, how incoming tasks are called (consumed), and so on.

Implementing Thread Resource Control

If you were patient with the section above, I think you already have a concrete understanding of how this resource control works. Recall how we normally use thread pools: we set a bunch of parameters, such as the minimum number of core threads the pool must keep; the maximum number of worker threads it can have; how long a thread may stay idle; what type of queue tasks are placed in; and which rejection policy to apply when a task cannot be accepted.
So the parameters we set are really quantitative controls over the thread pool's resources, and when some of them take effect, such as corePoolSize, maximumPoolSize, and keepAliveTime, was shown above. These parameters are instance variables, which we specify when we call the constructor: when a thread pool is created, the corresponding parameters are set through the constructor, and the pool then continuously checks its own running state against them while running, which is how thread resource control is achieved. Finally, here is the code for one of the constructors:

    public ThreadPoolExecutor(int corePoolSize,
                              int maximumPoolSize,
                              long keepAliveTime,
                              TimeUnit unit,
                              BlockingQueue<Runnable> workQueue,
                              ThreadFactory threadFactory,
                              RejectedExecutionHandler handler) {
        if (corePoolSize < 0 ||
            maximumPoolSize <= 0 ||
            maximumPoolSize < corePoolSize ||
            keepAliveTime < 0)
            throw new IllegalArgumentException();
        if (workQueue == null || threadFactory == null || handler == null)
            throw new NullPointerException();
        this.acc = System.getSecurityManager() == null ?
                null :
                AccessController.getContext();
        this.corePoolSize = corePoolSize;
        this.maximumPoolSize = maximumPoolSize;
        this.workQueue = workQueue;
        this.keepAliveTime = unit.toNanos(keepAliveTime);
        this.threadFactory = threadFactory;
        this.handler = handler;
    }
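The checks at the top of this constructor are easy to try out. Here is a small sketch of my own (the class ConstructorChecks and its helper names are made up) that feeds it valid and invalid combinations, and reads keepAliveTime back in nanoseconds through getKeepAliveTime.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, used only for illustration.
final class ConstructorChecks {
    // Returns true if the constructor rejects the combination with IllegalArgumentException.
    static boolean rejects(int core, int max, long keepAliveSeconds) {
        try {
            ThreadPoolExecutor pool = new ThreadPoolExecutor(
                    core, max, keepAliveSeconds, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
            pool.shutdown(); // no tasks were submitted, so no threads were started
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    // Shows that keepAliveTime is stored in nanoseconds, whatever unit was passed in.
    static long keepAliveNanos(long keepAliveSeconds) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, keepAliveSeconds, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        try {
            return pool.getKeepAliveTime(TimeUnit.NANOSECONDS);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("max < core rejected: " + rejects(2, 1, 0));
        System.out.println("max == 0 rejected: " + rejects(0, 0, 0));
        System.out.println("negative keepAlive rejected: " + rejects(1, 2, -1));
        System.out.println("valid combination rejected: " + rejects(1, 2, 60));
        System.out.println("60 s in nanos: " + keepAliveNanos(60));
    }
}
```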

The constructor needs little explanation: it validates its arguments and then performs a series of assignments. Because the code snippets are really long and laborious to read, the ThreadPool module summary is split into several parts. The following parts will analyze how a thread pool is shut down and how the shutdown methods differ, how rejection policies are set, how thread pool parameters should be set in theory, and so on.
