Performance analysis of Quartz distributed scheduling

Posted by JCF22Lyoko on Tue, 21 Dec 2021 09:36:30 +0100

Quartz's distributed scheduling is decentralized: it relies on the database to synchronize scheduling state between cluster nodes and uses database-based distributed locks to achieve consistent scheduling. The distributed scheduling of the current version of XXL-Job (1.9.x) is built on Quartz, so the poor scheduling performance commonly attributed to XXL-Job is essentially the poor scheduling performance of Quartz.

With only a small number of tasks and no second-level (per-second) schedules, Quartz's performance defects are invisible. As the number of tasks grows significantly, scheduling delay increases noticeably, especially for second-level tasks. Even if we scale out horizontally, the delay of second-level tasks does not drop and overall scheduling performance does not improve significantly; it can even get worse.

With these questions in mind, this article analyzes Quartz's distributed scheduling mechanism and the implementation of misfire recovery and failure recovery.

Core scheduling process analysis

The core scheduling logic lives in the run method of the org.quartz.core.QuartzSchedulerThread class. Starting the scheduler launches a single scheduling thread, which executes the run method of the QuartzSchedulerThread instance. The (abridged) source code is as follows.

class QuartzSchedulerThread{
    @Override
    public void run() {
        int acquiresFailed = 0;
        while (!halted.get()) {
            // .....
                // (1) check that the worker thread pool has available threads
                int availThreadCount = qsRsrcs.getThreadPool().blockForAvailableThreads();
                if(availThreadCount > 0) {

                    // ....
                    // (2) under LOCK_TRIGGER_ACCESS, acquire the next batch of triggers to fire
                    try {
                        triggers = qsRsrcs.getJobStore().acquireNextTriggers(
                                now + idleWaitTime, Math.min(availThreadCount, qsRsrcs.getMaxBatchSize()), qsRsrcs.getBatchTimeWindow());
                       
                    } //.....

                    if (triggers != null && !triggers.isEmpty()) {
                        // (3) wait until the earliest trigger in the batch is due to fire
                        now = System.currentTimeMillis();
                        long triggerTime = triggers.get(0).getNextFireTime().getTime();
                        long timeUntilTrigger = triggerTime - now;
                        // wait until just before the fire time (the real code waits on
                        // sigLock and recomputes timeUntilTrigger on each iteration)
                        while(timeUntilTrigger > 2) {
                            // wait......
                            timeUntilTrigger = triggerTime - System.currentTimeMillis();
                        }

                        // .....
                        if(goAhead) {
                            // (4) under LOCK_TRIGGER_ACCESS, mark the triggers as fired (ACQUIRED -> EXECUTING)
                            try {
                                List<TriggerFiredResult> res = qsRsrcs.getJobStore().triggersFired(triggers);
                                if(res != null)
                                    bndles = res;
                            } //....
                        }
                        // (5) hand each fired bundle to the worker thread pool
                        for (int i = 0; i < bndles.size(); i++) {
                           // ..........
                            JobRunShell shell = null;
                            try {
                                shell = qsRsrcs.getJobRunShellFactory().createJobRunShell(bndle);
                                shell.initialize(qs);
                            } catch (SchedulerException se) {
                                qsRsrcs.getJobStore().triggeredJobComplete(triggers.get(i), bndle.getJobDetail(), CompletedExecutionInstruction.SET_ALL_JOB_TRIGGERS_ERROR);
                                continue;
                            }
        
                            if (qsRsrcs.getThreadPool().runInThread(shell) == false) {
                                // (6) the job could not be handed to the pool: mark the trigger as ERROR
                                qsRsrcs.getJobStore().triggeredJobComplete(triggers.get(i), bndle.getJobDetail(), CompletedExecutionInstruction.SET_ALL_JOB_TRIGGERS_ERROR);
                            }

                        }

                        continue; // while (!halted)
                    }
                } else { // if(availThreadCount > 0)
                    // should never happen, if threadPool.blockForAvailableThreads() follows contract
                    continue; // while (!halted)
                }

                //....
            
        } // while (!halted)

        // drop references to scheduler stuff to aid garbage collection...
        qs = null;
        qsRsrcs = null;
    }
}

Process analysis:

  • 1. Check whether the number of available threads in the worker thread pool (the pool that runs the jobs) is greater than 0; the scheduling logic proceeds only when it is;
  • 2. Acquire the distributed lock (LOCK_TRIGGER_ACCESS), fetch up to maxBatchSize triggers (the default maxBatchSize is 1) whose next fire time falls within the next 30 seconds, ordered by fire time in the database query, and finally release the distributed lock (LOCK_TRIGGER_ACCESS) (see the lock sketch after this list);

When a trigger is acquired, its state is updated from WAITING to ACQUIRED; a successful update means this node has won the right to schedule it at that time;

  • 3. Iterate over this batch of triggers and sleep until the fire time of the earliest one, i.e. until at least one trigger in the batch is due to fire;
  • 4. Acquire the distributed lock (LOCK_TRIGGER_ACCESS) and fire the batch: for each trigger, change its state from ACQUIRED to EXECUTING; if the update succeeds, compute the cron's next fire time and set the state back to WAITING; finally release the distributed lock (LOCK_TRIGGER_ACCESS);
  • 5. Iterate over the fired bundles: for each trigger whose state was changed from ACQUIRED to EXECUTING in step (4), create a JobRunShell and hand it to the worker thread pool for execution;
  • 6. If the job cannot be put into the thread pool, acquire the distributed lock (LOCK_TRIGGER_ACCESS), set the trigger state to ERROR, and then release the lock;
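
For reference, the "distributed lock" used in steps (2), (4) and (6) is, with the default JDBC job store, simply a row lock on the QRTZ_LOCKS table held for the duration of one database transaction. Below is a minimal sketch of that mechanism, assuming the standard Quartz 2.x table layout; the exact SQL issued by a given Quartz version may differ slightly.

// Minimal sketch of the LOCK_TRIGGER_ACCESS row lock: other nodes issuing the same
// SELECT ... FOR UPDATE block until this transaction commits or rolls back.
import java.sql.Connection;
import java.sql.PreparedStatement;

public class RowLockSketch {

    // Blocks until the row lock is granted (or the database lock wait times out).
    public static void obtainTriggerAccessLock(Connection conn, String schedName) throws Exception {
        String sql = "SELECT * FROM QRTZ_LOCKS WHERE SCHED_NAME = ? AND LOCK_NAME = ? FOR UPDATE";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, schedName);
            ps.setString(2, "TRIGGER_ACCESS"); // the value behind the LOCK_TRIGGER_ACCESS constant
            ps.executeQuery();
        }
    }

    public static void scheduleUnderLock(Connection conn) throws Exception {
        conn.setAutoCommit(false);
        try {
            obtainTriggerAccessLock(conn, "MyClusteredScheduler"); // scheduler name is an example
            // ... acquire triggers / fire triggers / update trigger states here ...
            conn.commit();   // commit releases the row lock
        } catch (Exception e) {
            conn.rollback(); // rollback also releases the row lock
            throw e;
        }
    }
}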

Assuming that only one node is deployed and maxBatchSize is 1, i.e. ignoring distributed locks and batching, Quartz's scheduling process can be simplified as follows:

  • 1. Query the database for the trigger to be scheduled and update its state from WAITING to ACQUIRED;
  • 2. Update the trigger state from ACQUIRED to EXECUTING; if the update succeeds, compute the cron's next fire time and set the state back to WAITING;
  • 3. If step (2) succeeds, hand the Job to the thread pool; a worker thread calls the Job's execute method (see the minimal example after this list);
  • 4. Go back to step (1) and repeat;
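
For comparison, the user-facing side of this loop is the ordinary Quartz API: we define a Job whose execute method is what step (3) ultimately invokes on a worker thread, and register it with a cron trigger. A minimal, self-contained example (job name, group and cron expression are arbitrary):

import org.quartz.*;
import org.quartz.impl.StdSchedulerFactory;

public class QuartzQuickStart {

    // The callback that a worker thread runs in step (3).
    public static class HelloJob implements Job {
        @Override
        public void execute(JobExecutionContext ctx) {
            System.out.println("fired at " + ctx.getFireTime());
        }
    }

    public static void main(String[] args) throws SchedulerException {
        Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();
        JobDetail job = JobBuilder.newJob(HelloJob.class)
                .withIdentity("helloJob", "demo")
                .build();
        Trigger trigger = TriggerBuilder.newTrigger()
                .withIdentity("helloTrigger", "demo")
                .withSchedule(CronScheduleBuilder.cronSchedule("0/5 * * * * ?")) // every 5 seconds
                .build();
        scheduler.scheduleJob(job, trigger);
        scheduler.start(); // this is where QuartzSchedulerThread begins the loop above
    }
}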

Failure recovery process analysis

The entry point for failure recovery and misfire recovery is the JobStoreSupport#schedulerStarted method, which is called from QuartzScheduler#start. The (abridged) source code is as follows.

class JobStoreSupport {
    public void schedulerStarted() throws SchedulerException {
        // (1)
        if (isClustered()) {
            // ...
            clusterManagementThread.initialize();
        } else {
            // (2)
            try {
                recoverJobs();
            } // ....
        }
        // (3)
        misfireHandler = new MisfireHandler();
        // ...
        misfireHandler.initialize();
        schedulerRunning = true;
        // ...
    }
}
  • (1) Check whether the scheduler is running in cluster mode; if so, start the cluster management thread (see the configuration sketch after this list);
  • (2) In non-cluster mode, the current thread calls the recoverJobs method to recover jobs;
  • (3) Start the misfire handling thread;
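
For context, whether branch (1) is taken and how often the cluster management thread runs are controlled by the JDBC job store configuration. A minimal sketch using StdSchedulerFactory with a Properties object (the values are illustrative, and a real setup also needs the org.quartz.dataSource.* properties for the named data source):

import java.util.Properties;
import org.quartz.Scheduler;
import org.quartz.impl.StdSchedulerFactory;

public class ClusteredSchedulerConfig {
    public static Scheduler build() throws Exception {
        Properties props = new Properties();
        props.put("org.quartz.scheduler.instanceName", "MyClusteredScheduler");
        props.put("org.quartz.scheduler.instanceId", "AUTO");
        props.put("org.quartz.threadPool.threadCount", "10");
        props.put("org.quartz.jobStore.class", "org.quartz.impl.jdbcjobstore.JobStoreTX");
        props.put("org.quartz.jobStore.driverDelegateClass", "org.quartz.impl.jdbcjobstore.StdJDBCDelegate");
        props.put("org.quartz.jobStore.tablePrefix", "QRTZ_");
        props.put("org.quartz.jobStore.dataSource", "myDS");               // data source settings omitted
        props.put("org.quartz.jobStore.isClustered", "true");              // take branch (1): start ClusterManager
        props.put("org.quartz.jobStore.clusterCheckinInterval", "20000");  // check in every 20 s
        props.put("org.quartz.jobStore.misfireThreshold", "60000");        // used by the MisfireHandler in (3)
        return new StdSchedulerFactory(props).getScheduler();
    }
}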

Failure recovery process under distributed scheduling:

  • Every clusterCheckinInterval milliseconds, each node checks in and looks for nodes that have gone down (a simplified detection sketch follows this list);
  • If a node has gone down, acquire the LOCK_TRIGGER_ACCESS lock and read the scheduling records of the failed node;
  • For records whose trigger state is ACQUIRED, reset the state to WAITING so that another node can pick them up;
  • Release the LOCK_TRIGGER_ACCESS lock;
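
The detection rule itself is simple. The following is a sketch of the idea only (not the actual ClusterManager code); the tolerance value is an assumption for illustration:

// Each node periodically writes its LAST_CHECKIN_TIME to QRTZ_SCHEDULER_STATE; a node whose
// last check-in is older than its check-in interval plus a safety margin is treated as down,
// and its ACQUIRED triggers are reset to WAITING so that other nodes can recover them.
public class CheckinTimeoutSketch {

    static final long TOLERANCE_MS = 7500L; // assumed safety margin, not a Quartz constant

    static boolean looksFailed(long lastCheckinTime, long checkinIntervalMs, long nowMs) {
        return lastCheckinTime + checkinIntervalMs + TOLERANCE_MS < nowMs;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long lastCheckin = now - 60_000; // this node last checked in a minute ago
        System.out.println(looksFailed(lastCheckin, 20_000, now)); // true -> recover its triggers
    }
}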

Misfire recovery process analysis

The main causes of misfires are as follows:

  • 1. Scheduling is resumed after having been paused;
  • 2. The worker thread pool is full, so triggers that are due cannot be fired in time;
  • 3. A single node restarts;

Misfire recovery process:

  • Every misfireThreshold milliseconds (the misfire threshold), acquire the LOCK_TRIGGER_ACCESS lock;
  • Query triggers in the WAITING state whose next fire time is earlier than (current time - misfireThreshold), taking at most maxMisfiresToHandleAtATime records;
  • According to the trigger's misfire instruction (e.g. MISFIRE_INSTRUCTION_DO_NOTHING or MISFIRE_INSTRUCTION_FIRE_ONCE_NOW), update the next fire time of these misfired triggers. With MISFIRE_INSTRUCTION_FIRE_ONCE_NOW the next fire time becomes the current time, i.e. the trigger fires immediately, so the QuartzSchedulerThread can pick it up right away (see the trigger example after this list);
  • Release the LOCK_TRIGGER_ACCESS lock;
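
On the trigger side, the misfire instruction is chosen when the trigger is built. A small example using the standard CronScheduleBuilder API (job/trigger names and cron expressions are made up): withMisfireHandlingInstructionDoNothing() corresponds to MISFIRE_INSTRUCTION_DO_NOTHING, and withMisfireHandlingInstructionFireAndProceed() corresponds to MISFIRE_INSTRUCTION_FIRE_ONCE_NOW.

import org.quartz.CronScheduleBuilder;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;

public class MisfirePolicyExample {

    // Run once immediately after a misfire, then return to the normal schedule.
    public static Trigger fireOnceNowTrigger() {
        return TriggerBuilder.newTrigger()
                .withIdentity("reportTrigger", "demo")
                .withSchedule(CronScheduleBuilder.cronSchedule("0 0/5 * * * ?")
                        .withMisfireHandlingInstructionFireAndProceed())
                .build();
    }

    // Skip the misfired executions and simply wait for the next scheduled fire time.
    public static Trigger doNothingTrigger() {
        return TriggerBuilder.newTrigger()
                .withIdentity("reportTriggerSkip", "demo")
                .withSchedule(CronScheduleBuilder.cronSchedule("0 0/5 * * * ?")
                        .withMisfireHandlingInstructionDoNothing())
                .build();
    }
}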

Causes affecting performance

Question: why is the performance of distributed consistent scheduling based on MySQL-implemented distributed locks so poor?

Cause analysis: each process has only one QuartzSchedulerThread, and because maxBatchSize defaults to 1, multiple nodes compete for the distributed lock at the same time. Executing a job once involves at least three lock acquisitions (roughly: acquiring the trigger, firing it, and marking it complete or in error). As the number of nodes grows, lock contention intensifies; a node that fails to obtain the lock blocks waiting for it to be released, unless the database's lock wait timeout is set very short.

Can I optimize by adjusting parameters?

Scheme 1: increase maxBatchSize. If there are 300 jobs in total and three nodes are deployed, maxBatchSize can be set to 100.
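
Assuming Quartz 2.x, maxBatchSize corresponds to the org.quartz.scheduler.batchTriggerAcquisitionMaxCount property (with batchTriggerAcquisitionFireAheadTimeWindow controlling how far ahead triggers may be batched together). A sketch of the configuration for the 300-jobs / 3-nodes example above:

import java.util.Properties;
import org.quartz.Scheduler;
import org.quartz.impl.StdSchedulerFactory;

public class BatchSizeTuning {
    public static Scheduler build(Properties clusteredProps) throws Exception {
        Properties props = new Properties();
        props.putAll(clusteredProps); // the clustered job store / data source settings from before
        props.put("org.quartz.scheduler.batchTriggerAcquisitionMaxCount", "100");             // maxBatchSize
        props.put("org.quartz.scheduler.batchTriggerAcquisitionFireAheadTimeWindow", "1000"); // ms, example value
        return new StdSchedulerFactory(props).getScheduler();
    }
}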

Even so, lock contention remains fierce: the state updates before a task executes still have to be made under the global lock, and there is no independent distributed lock per Job, only the single shared one.

How about increasing the maxBatchSize to 300?

Then effectively only one node processes all the jobs, and adding nodes no longer improves performance. Scheduling is single-threaded, so with such a large batch the lock contention disappears, but the number of database operations does not decrease; the time per scheduling cycle grows, and single-node scheduling of that many jobs increases the delay.

Scheme 2: implement the distributed lock on Redis.

This still does not address the root of the problem, which is the granularity of the lock.

Apart from the database dependency, if each application used Quartz directly instead of XXL-Job, the setup would resemble the ElasticJob framework, which is also decentralized, yet its performance would still fall short of ElasticJob. The reason is that ElasticJob, although built on Quartz, narrows the lock granularity from the application level to the Job level, and there is not just one scheduling thread but one per Job.
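
To make the difference in lock granularity concrete, here is a conceptual sketch (not ElasticJob's actual code) of the "one scheduler per Job" idea: each job gets its own in-memory Quartz scheduler and therefore its own scheduling thread, so jobs no longer contend on a single global lock; coordination across nodes is then handled elsewhere (ElasticJob uses ZooKeeper for that).

import java.util.Properties;
import org.quartz.*;
import org.quartz.impl.StdSchedulerFactory;

public class PerJobSchedulerSketch {

    // Builds an isolated scheduler for a single job; the caller starts it with scheduler.start().
    public static Scheduler schedulerFor(String jobName, Class<? extends Job> jobClass, String cron)
            throws SchedulerException {
        Properties props = new Properties();
        props.put("org.quartz.scheduler.instanceName", jobName + "-scheduler"); // one instance per job
        props.put("org.quartz.threadPool.class", "org.quartz.simpl.SimpleThreadPool");
        props.put("org.quartz.threadPool.threadCount", "1");
        props.put("org.quartz.jobStore.class", "org.quartz.simpl.RAMJobStore"); // no database lock involved
        Scheduler scheduler = new StdSchedulerFactory(props).getScheduler();
        scheduler.scheduleJob(
                JobBuilder.newJob(jobClass).withIdentity(jobName).build(),
                TriggerBuilder.newTrigger()
                        .withIdentity(jobName + "-trigger")
                        .withSchedule(CronScheduleBuilder.cronSchedule(cron))
                        .build());
        return scheduler;
    }
}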