apscheduler scheduled task

Posted by darkshine on Tue, 18 Jan 2022 19:10:45 +0100

1 Introduction

The full name of APScheduler is Advanced Python Scheduler. It is a lightweight Python timing task scheduling framework. APScheduler supports three scheduling tasks: fixed time interval, fixed time point (date), and cronab command under Linux. At the same time, it also supports asynchronous execution and background execution of scheduling tasks.

2 installation

Using pip package management tool to install APScheduler is the most convenient and fast.

pip install APScheduler
# If the installation fails due to download failure, it is recommended to use an agent
pip --proxy http://Agent ip port: install APScheduler

3 use steps

APScheduler is relatively simple to use. Running a scheduling task requires only the following trilogy.

Create a new scheduler.
Add a scheduling task (job stores).
Run the scheduled task.

The following is a simple example code to execute time reporting every 2 seconds:

import datetime
import time
from apscheduler.schedulers.background import BackgroundScheduler

def timedTask():
    print(datetime.datetime.utcnow().strftime("%Y-%m-%d %H:%M:%S.%f")[:-3])

if __name__ == '__main__':
    # Create schedulers for background execution
    scheduler = BackgroundScheduler()  
    # Add scheduled task
    # The scheduling method is timedTask, the trigger selects interval, and the interval length is 2 seconds
    scheduler.add_job(timedTask, 'interval', seconds=2)
    # Start scheduling task
    scheduler.start()
    
    while True:
        print(time.time())
        time.sleep(5)

4 basic components

APScheduler has four components: scheduler, job store, trigger and executor.

schedulers
It is a task scheduler and belongs to the controller role. It configures job storage and executors that can be completed in the scheduler, such as adding, modifying, and removing jobs.
triggers
Describes the conditions under which a scheduled task is triggered. However, the trigger is completely stateless.
job stores
The task persistence warehouse saves the task in memory by default, and can also save the task in various databases. The data in the task is serialized, saved to the persistence database, and then deserialized after loading from the database.
executors
It is responsible for processing the operation of jobs. They are usually carried out by submitting the specified callable object to a thread or entering the city pool in the job. When the job is completed, the executor notifies the scheduler.

4.1 schedulers

I personally think APScheduler is very easy to use. It provides seven kinds of schedulers, which can meet the needs of various scenarios. For example, perform an operation in the background, perform an operation asynchronously, etc. The schedulers are:

Blocking scheduler: the scheduler runs in the main thread of the current process, that is, it will block the current thread.
BackgroundScheduler: the scheduler runs in the background thread and will not block the current thread.
Asyncio scheduler: used in conjunction with asyncio module (an asynchronous framework).
GeventScheduler: the program uses gevent (high-performance Python concurrency framework) as the IO model, which is used in conjunction with GeventExecutor.
Tornado scheduler: the IO model of Tornado (a web framework) is used in the program, using ioloop add_ Timeout completes the timed wake-up.
TwistedScheduler: cooperate with twistexecutor and use reactor Calllater completes the timed wake-up.
QtScheduler: your application is a Qt application. You need to use QTimer to complete timed wake-up.

4.2 triggers

The APScheduler has three built-in trigger s:

1) date trigger

date is the most basic scheduling, and the job task will be executed only once. It represents a specific point in time trigger. Its parameters are as follows:

parameter	explain
run_date (datetime or str)	The date or time the job was run
timezone (datetime.tzinfo or str)	Specify time zone

The usage example of date trigger is as follows:

from datetime import datetime
from datetime import date
from apscheduler.schedulers.background import BackgroundScheduler

def job_func(text):
    print(text)

scheduler = BackgroundScheduler()
# Run a job on December 13, 2017_ Func method
scheduler .add_job(job_func, 'date', run_date=date(2017, 12, 13), args=['text'])
# Run a job at 14:00:00 on December 13, 2017_ Func method
scheduler .add_job(job_func, 'date', run_date=datetime(2017, 12, 13, 14, 0, 0), args=['text'])
# Run a job at 14:00:01 on December 13, 2017_ Func method
scheduler .add_job(job_func, 'date', run_date='2017-12-13 14:00:01', args=['text'])

scheduler.start()

2) interval trigger

Triggered at fixed time intervals. interval scheduling. The parameters are as follows:

parameter	explain
weeks (int)	A few weeks apart
days (int)	Every few days
hours (int)	A few hours apart
minutes (int)	Every few minutes
seconds (int)	How many seconds apart
start_date (datetime or str)	Start date
end_date (datetime or str)	End date
timezone (datetime.tzinfo or str)	time zone

The use example of interval trigger is as follows:

import datetime
from apscheduler.schedulers.background import BackgroundScheduler

def job_func(text):
    print(datetime.datetime.utcnow().strftime("%Y-%m-%d %H:%M:%S.%f")[:-3])

scheduler = BackgroundScheduler()
# The job is executed every two minutes_ Func method
scheduler .add_job(job_func, 'interval', minutes=2)
# Execute the job every two minutes from 14:00:01 on December 13, 2017 to 14:00:10 on December 13, 2017_ Func method
scheduler .add_job(job_func, 'interval', minutes=2, start_date='2017-12-13 14:00:01' , end_date='2017-12-13 14:00:10')

scheduler.start()

3) cron trigger

Triggered periodically at a specific time, compatible with the Linux crontab format. It is the most powerful trigger.
Let's first understand the cron parameters:

parameter	explain
year (int or str)	Year, 4 digits
month (int or str)	Month (range 1-12)
day (int or str)	Day (range 1-31)
week (int or str)	Weeks (range 1-53)
day_of_week (int or str)	Day or day of the week (range 0-6 or mon,tue,wed,thu,fri,sat,sun)
hour (int or str)	When (range 0-23)
minute (int or str)	Minutes (range 0-59)
second (int or str)	Seconds (range 0-59)
start_date (datetime or str)	Earliest start date (inclusive)
end_date (datetime or str)	Latest end time (inclusive)
timezone (datetime.tzinfo or str)	Specify time zone

These parameters support arithmetic expressions. The value formats are as follows:

Examples of cron triggers are as follows:

import datetime
from apscheduler.schedulers.background import BackgroundScheduler

def job_func(text):
    print("Current time:", datetime.datetime.utcnow().strftime("%Y-%m-%d %H:%M:%S.%f")[:-3])

scheduler = BackgroundScheduler()
# Execute jobs at 00:00, 01:00, 02:00 and 03:00 on Mondays and Tuesdays from January to March and July to September every year_ Func task
scheduler .add_job(job_func, 'cron', month='1-3,7-9',day='0, tue', hour='0-3')

scheduler.start()

4.3 job store

This component is used to manage scheduling tasks.

1) Add job

There are two ways to add, one of which is add used by the above code_ Job (), the other is scheduled_ The job () modifier modifies the function.

The difference between the two methods is that the first method returns an apscheduler job. An instance of a job that can be used to change or remove a job. The second method applies only to jobs that will not change during the application run.

Example of the second way to add a task:

import datetime
from apscheduler.schedulers.background import BackgroundScheduler

@scheduler.scheduled_job(job_func, 'interval', minutes=2)
def job_func(text):
    print(datetime.datetime.utcnow().strftime("%Y-%m-%d %H:%M:%S.%f")[:-3])

scheduler = BackgroundScheduler()
scheduler.start()

2) Remove job

There are also two ways to remove a job: remove_job() and job remove().
remove_ The job () is removed according to the id of the job, so an id should be specified when the job is created.
job.remove() is to execute the remove method on the job

scheduler.add_job(job_func, 'interval', minutes=2, id='job_one')
scheduler.remove_job(job_one)

job = add_job(job_func, 'interval', minutes=2, id='job_one')
job.remvoe()

3) Get job list

Via scheduler get_ The jobs () method can get a list of all job in the current scheduler.

4) Modify job

If you want to modify a job due to a plan change, you can use job Modify () or modify_ The job () method to modify the properties of the job. However, it is worth noting that the job id cannot be modified.

scheduler.add_job(job_func, 'interval', minutes=2, id='job_one')
scheduler.start()
# Modify the trigger interval to 5 minutes
scheduler.modify_job('job_one', minutes=5)

job = scheduler.add_job(job_func, 'interval', minutes=2)
# Modify the trigger interval to 5 minutes
job.modify(minutes=5)

5) Close job

By default, the scheduler will close all schedulers and job stores after all running jobs are completed. If you don't want to wait, you can set the wait option to False.

scheduler.shutdown()
scheduler.shutdown(wait=false)

4.4 actuator

As the name suggests, an actuator is a module that performs scheduling tasks. There are two most commonly used executors: ProcessPoolExecutor and ThreadPoolExecutor

The following is an example of code that explicitly sets the job store (using mongo storage) and the executor.

from pymongo import MongoClient
from apscheduler.schedulers.blocking import BlockingScheduler
from apscheduler.jobstores.mongodb import MongoDBJobStore
from apscheduler.jobstores.memory import MemoryJobStore
from apscheduler.executors.pool import ThreadPoolExecutor, ProcessPoolExecutor
 
 
def my_job():
    print 'hello world'
host = '127.0.0.1'
port = 27017
client = MongoClient(host, port)
 
jobstores = {
    'mongo': MongoDBJobStore(collection='job', database='test', client=client),
    'default': MemoryJobStore()
}
executors = {
    'default': ThreadPoolExecutor(10),
    'processpool': ProcessPoolExecutor(3)
}
job_defaults = {
    'coalesce': False,
    'max_instances': 3
}
scheduler = BlockingScheduler(jobstores=jobstores, executors=executors, job_defaults=job_defaults)
scheduler.add_job(my_job, 'interval', seconds=5)
 
try:
    scheduler.start()
except SystemExit:
    client.close()

Programmer Think