Abstract
This article introduces the most basic use of APScheduler, running a job every few seconds. It explains the difference between the two schedulers BackgroundScheduler and BlockingScheduler, explains how to make a job run as soon as start() is called, and details the problem that arises, and its solution, in the special case where a job's execution time is greater than its scheduling interval. It also shows that each job is run in a thread.
Basic timing scheduling
APScheduler is a task scheduling framework for Python. It can run tasks much as crontab does on Linux and is convenient to use. It supports scheduling by fixed time interval, by date, and by crontab-style configuration, and it can persist tasks or run them in daemon mode.
The following is a basic example:
from apscheduler.schedulers.blocking import BlockingScheduler

def job():
    print('job 3s')

if __name__ == '__main__':
    sched = BlockingScheduler(timezone='MST')
    # Run job() every 3 seconds.
    sched.add_job(job, 'interval', id='3_second_job', seconds=3)
    sched.start()
This schedules job() to run every 3s, so the program prints 'job 3s' every 3s. Changing the seconds parameter of add_job() changes the scheduling interval.
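The interval trigger used above is one of the three scheduling styles mentioned earlier (interval, date, crontab). The following is a minimal sketch of registering one job of each kind, assuming APScheduler 3.x. The helper name register_examples, the job ids, and the chosen times are illustrative and do not come from the original example.

```python
from datetime import datetime, timedelta


def job():
    print('job ran')


def register_examples(sched):
    """Register one job per trigger type (illustrative helper)."""
    # interval: run every 3 seconds
    sched.add_job(job, 'interval', id='every_3s', seconds=3)
    # date: run exactly once, 10 seconds from now
    sched.add_job(job, 'date', id='once',
                  run_date=datetime.now() + timedelta(seconds=10))
    # cron: run at 08:30 on weekdays, like a crontab entry
    sched.add_job(job, 'cron', id='weekday_morning',
                  day_of_week='mon-fri', hour=8, minute=30)


def main():
    # Imported here so the helper above can be read without APScheduler installed.
    from apscheduler.schedulers.blocking import BlockingScheduler
    # No timezone is passed, so the scheduler uses the local one,
    # which matches the naive datetime.now() used for run_date.
    sched = BlockingScheduler()
    register_examples(sched)
    sched.start()
```

Calling main() starts the scheduler and blocks until the program is interrupted.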
What is the difference between BlockingScheduler and BackgroundScheduler
APScheduler offers several types of scheduler. BlockingScheduler and BackgroundScheduler are the two most commonly used. What's the difference between them? In short, BlockingScheduler blocks the main thread, while BackgroundScheduler does not. We therefore choose between them according to the situation:
- BlockingScheduler: calling the start function blocks the current thread. Use it when the scheduler is the only thing running in your application (as in the example above).
- BackgroundScheduler: the main thread is not blocked after calling start. Use it when you are not running inside another framework and want the scheduler to execute in the background of your application.
Here are two examples to more intuitively illustrate the difference between the two.
- A real example of BlockingScheduler
from apscheduler.schedulers.blocking import BlockingScheduler
import time

def job():
    print('job 3s')

if __name__ == '__main__':
    sched = BlockingScheduler(timezone='MST')
    sched.add_job(job, 'interval', id='3_second_job', seconds=3)
    sched.start()

    # Never reached: start() blocks the current thread.
    while True:
        print('main 1s')
        time.sleep(1)
Running this program, we get the following output:
job 3s
job 3s
job 3s
job 3s
It can be seen that the BlockingScheduler will block the current thread after calling the start function, so that the while loop in the main program will not be executed.
- A real example of BackgroundScheduler
from apscheduler.schedulers.background import BackgroundScheduler
import time

def job():
    print('job 3s')

if __name__ == '__main__':
    sched = BackgroundScheduler(timezone='MST')
    sched.add_job(job, 'interval', id='3_second_job', seconds=3)
    sched.start()

    # start() returns immediately, so this loop runs alongside the scheduler.
    while True:
        print('main 1s')
        time.sleep(1)
It can be seen that the BackgroundScheduler will not block the current thread after calling the start function, so it can continue to execute the logic of the while loop in the main program.
main 1s
main 1s
main 1s
job 3s
main 1s
main 1s
main 1s
job 3s
This output also shows that job() does not run immediately after the start function is called. Instead, it waits 3s before it is first scheduled for execution.
How to make a job run immediately after start()
How can we make job() start executing as soon as the scheduler's start function is called?
APScheduler has no dedicated switch for this, but the simplest way is to call job() once before the scheduler starts, as follows:
from apscheduler.schedulers.background import BackgroundScheduler
import time

def job():
    print('job 3s')

if __name__ == '__main__':
    # Run the job once by hand before the scheduler takes over.
    job()
    sched = BackgroundScheduler(timezone='MST')
    sched.add_job(job, 'interval', id='3_second_job', seconds=3)
    sched.start()

    while True:
        print('main 1s')
        time.sleep(1)
In this way, the following output can be obtained
job 3s
main 1s
main 1s
main 1s
job 3s
main 1s
main 1s
main 1s
Although this does not strictly make the job start running after start(), it does run the job right at the beginning without waiting for the first scheduled time.
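An alternative keeps the first run inside the scheduler. In APScheduler 3.x, add_job() accepts a next_run_time argument that sets when the job first fires, regardless of the trigger. The following is a minimal sketch under that assumption. The helper name add_immediate_interval_job is hypothetical, and no timezone is passed so that the scheduler's local time matches datetime.now().

```python
from datetime import datetime


def job():
    print('job 3s')


def add_immediate_interval_job(sched, func, seconds):
    """Add an interval job whose first run is due right away (hypothetical helper)."""
    return sched.add_job(func, 'interval', seconds=seconds,
                         next_run_time=datetime.now())


def main():
    import time
    # Imported here so the helper above can be read without APScheduler installed.
    from apscheduler.schedulers.background import BackgroundScheduler
    sched = BackgroundScheduler()  # local timezone, matching datetime.now()
    add_immediate_interval_job(sched, job, 3)
    sched.start()
    try:
        while True:
            print('main 1s')
            time.sleep(1)
    except KeyboardInterrupt:
        sched.shutdown()
```

Call start() promptly after adding the job. If more than about a second passes, the first run may be treated as missed, because the default misfire_grace_time in APScheduler 3.x is 1 second.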
What happens if the job takes too long to execute
What happens if job() takes 5s to execute, but the scheduler is configured to call job() every 3s? We wrote the following example:
from apscheduler.schedulers.background import BackgroundScheduler
import time

def job():
    print('job 3s')
    # Simulate a job that runs longer than the 3s interval.
    time.sleep(5)

if __name__ == '__main__':
    sched = BackgroundScheduler(timezone='MST')
    sched.add_job(job, 'interval', id='3_second_job', seconds=3)
    sched.start()

    while True:
        print('main 1s')
        time.sleep(1)
Running this program, we get the following output:
main 1s
main 1s
main 1s
job 3s
main 1s
main 1s
main 1s
Execution of job "job (trigger: interval[0:00:03], next run at: 2018-05-07 02:44:29 MST)" skipped: maximum number of running instances reached (1)
main 1s
main 1s
main 1s
job 3s
main 1s
It can be seen that when the 3s interval elapses while the previous job() is still running, the scheduler does not start another job thread. It skips that run, waits for the next cycle (another 3s), and then schedules job() again.
To allow several instances of job() to run at the same time, we can configure the max_instances parameter. In the following example, we allow two instances of job() to run simultaneously:
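The skipping can be pictured as a count of running instances that is checked each time the job comes due. The following is a stdlib-only simulation of that idea, not APScheduler's actual code. The function simulate and its parameters are hypothetical names, and every run is assumed to take exactly the same time.

```python
def simulate(interval, duration, max_instances, ticks):
    """Report which scheduled runs start and which are skipped.

    A run that starts at time t occupies one instance slot until t + duration.
    """
    running_until = []  # end times of runs still in progress
    log = []
    for n in range(1, ticks + 1):
        now = n * interval
        # Drop runs that have finished by now.
        running_until = [end for end in running_until if end > now]
        if len(running_until) >= max_instances:
            log.append((now, 'skipped'))
        else:
            running_until.append(now + duration)
            log.append((now, 'started'))
    return log


for t, outcome in simulate(interval=3, duration=5, max_instances=1, ticks=4):
    print(f't={t}s: {outcome}')
# t=3s: started
# t=6s: skipped
# t=9s: started
# t=12s: skipped
```

With max_instances=2 and the same timings, every run in this simulation starts.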
from apscheduler.schedulers.background import BackgroundScheduler
import time

def job():
    print('job 3s')
    time.sleep(5)

if __name__ == '__main__':
    # Allow up to two instances of the same job to run concurrently.
    job_defaults = {'max_instances': 2}
    sched = BackgroundScheduler(timezone='MST', job_defaults=job_defaults)
    sched.add_job(job, 'interval', id='3_second_job', seconds=3)
    sched.start()

    while True:
        print('main 1s')
        time.sleep(1)
After running the program, we get the following output:
main 1s
main 1s
main 1s
job 3s
main 1s
main 1s
main 1s
job 3s
main 1s
main 1s
main 1s
job 3s
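Two instances are enough here because a 5s run overlaps at most one later run when runs start 3s apart. As a rough rule, if every run takes about the same time, the number of instances needed is the run time divided by the interval, rounded up. The following is a small worked calculation. The helper instances_needed is hypothetical and assumes a fixed run time and that a finished run frees its slot before the next one is due.

```python
import math


def instances_needed(duration, interval):
    """Estimate how many runs of a fixed-length job can overlap."""
    return max(1, math.ceil(duration / interval))


print(instances_needed(5, 3))   # a 5s job every 3s  → 2
print(instances_needed(50, 3))  # a 50s job every 3s → 17
```

In APScheduler 3.x the limit can also be set for a single job by passing max_instances to add_job(), instead of setting it for all jobs through job_defaults.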
How is each job scheduled
From the examples above, we can see that the scheduler works by calling the job() function at regular intervals.
Is the job() function run as a process or as a thread?
To answer this question, we wrote the following program:
from apscheduler.schedulers.background import BackgroundScheduler
import os
import threading
import time

def job():
    # Print the thread and process that this run executes in.
    print('job thread_id-{0}, process_id-{1}'.format(threading.get_ident(), os.getpid()))
    time.sleep(50)

if __name__ == '__main__':
    job_defaults = {'max_instances': 20}
    sched = BackgroundScheduler(timezone='MST', job_defaults=job_defaults)
    sched.add_job(job, 'interval', id='3_second_job', seconds=3)
    sched.start()

    while True:
        print('main 1s')
        time.sleep(1)
After running the program, we get the following output:
main 1s
main 1s
main 1s
job thread_id-10644, process_id-8872
main 1s
main 1s
main 1s
job thread_id-3024, process_id-8872
main 1s
main 1s
main 1s
job thread_id-6728, process_id-8872
main 1s
main 1s
main 1s
job thread_id-11716, process_id-8872
It can be seen that every run of job() has the same process ID but a different thread ID. Therefore, each job() is executed in a thread, not in a separate process.
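This is consistent with APScheduler 3.x using a thread pool as its default executor. The same observation can be reproduced with the standard library alone. The following sketch uses concurrent.futures.ThreadPoolExecutor directly rather than APScheduler, and run_in_pool is a hypothetical helper.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def run_in_pool(workers):
    """Run one task per worker thread and return (thread_id, process_id) pairs."""
    # The barrier makes all tasks overlap in time, so each needs its own thread.
    barrier = threading.Barrier(workers)

    def task():
        barrier.wait(timeout=5)
        return threading.get_ident(), os.getpid()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(task) for _ in range(workers)]
        return [f.result() for f in futures]


results = run_in_pool(3)
for thread_id, process_id in results:
    print('job thread_id-{0}, process_id-{1}'.format(thread_id, process_id))
```

Each line printed shows a different thread ID and the same process ID. One consequence, assuming the default pool size of 10 worker threads in APScheduler 3.x, is that no more than 10 runs can execute at once, even with max_instances set to 20 as in the program above.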