[source code analysis] sending Task & AMQP of distributed task queue Celery

Posted by feldon23 on Fri, 04 Mar 2022 07:41:19 +0100

0x00 summary

Celery is a simple, flexible, and reliable distributed system for processing large volumes of messages. It is an asynchronous task queue focused on real-time processing, and it also supports task scheduling.

In the previous article, we analyzed what a Task is. In this article, we focus on how the client sends a task and how Celery's amqp object is used.

Before reading, let's ask a few guiding questions:

  • When the client starts, how are the Celery application and the user-defined Task generated?
  • What does the task decorator do?
  • How is the message assembled when a Task is sent?
  • Which medium (module) is used to send a Task? Is it amqp?
  • How are tasks stored in Redis after they are sent?

Note: while organizing this series, I found that I had skipped one article, which may disrupt your reading flow. I am adding it here; thank you for your understanding.

[Source code analysis] Mailbox of message queue Kombu

[Source code analysis] Hub of message queue Kombu

[Source code analysis] Consumer of message queue Kombu

[Source code analysis] Producer of message queue Kombu

[Source code analysis] Startup process of message queue Kombu

[Source code analysis] Basic architecture of message queue Kombu

[Source code analysis] Architecture of parallel distributed framework Celery (1)

Distributed task queue architecture (Celery 2)

[Source code analysis] Worker startup of parallel distributed framework Celery (1)

[Source code analysis] Worker startup of parallel distributed framework Celery (2)

[Source code analysis] Startup of the Consumer of parallel distributed task queue Celery

[Source code analysis] What is the Task of parallel distributed task queue Celery

[Design from source code] Sending task & AMQP of Celery (this article, which explains sending tasks from the perspective of the client)

[Source code analysis] Dynamic consumption process of parallel distributed task queue Celery (the next article, which explains how a received Task is consumed from the perspective of the server)

Multi-process model of parallel distributed task queue Celery

0x01 example code

We first give the sample code.

1.1 server

The server side of the sample code is as follows. Here, a decorator is used to wrap the task to be executed.

from celery import Celery

app = Celery('myTest', broker='redis://localhost:6379')

@app.task
def add(x,y):
    return x+y

if __name__ == '__main__':
    app.worker_main(argv=['worker'])

1.2 client

The client code is as follows. It calls the add task to perform an addition:

from myTest import add
re = add.apply_async((2,17))

Let's now go through the details, following the client's execution sequence.

0x02 system startup

We will first introduce how to start the Celery system and task (instance) on the client.

2.1 generate Celery

The following line runs first and creates the Celery application named myTest.

app = Celery('myTest', broker='redis://localhost:6379')

2.2 task decorator

Celery uses a decorator to wrap the task to be executed. (Because different languages have similar concepts, the terms "decorator" and "annotation" may be used interchangeably in this article.)

@app.task
def add(x,y):
    return x+y

The task decorator actually returns the result of executing the internal function _create_task_cls.

This function returns a proxy, which runs _task_from_fun when it is actually used.

The job of _task_from_fun is to add the task to a global variable: when _task_from_fun is called, the task is added to the app's task list, so that all tasks are shared. This is how the client learns about the task.

    def task(self, *args, **opts):
        """Decorator to create a task class out of any callable. """
        if USING_EXECV and opts.get('lazy', True):
            from . import shared_task
            return shared_task(*args, lazy=False, **opts)

        def inner_create_task_cls(shared=True, filter=None, lazy=True, **opts):
            _filt = filter

            def _create_task_cls(fun):
                if shared:
                    def cons(app):
                        return app._task_from_fun(fun, **opts) # Add the task to the global registry: calling _task_from_fun adds the task to the app's task list, so that all tasks are shared
                    cons.__name__ = fun.__name__
                    connect_on_app_finalize(cons)
                if not lazy or self.finalized:
                    ret = self._task_from_fun(fun, **opts)
                else:
                    # return a proxy object that evaluates on first use
                    ret = PromiseProxy(self._task_from_fun, (fun,), opts,
                                       __doc__=fun.__doc__)
                    self._pending.append(ret)
                if _filt:
                    return _filt(ret)
                return ret

            return _create_task_cls

        if len(args) == 1:
            if callable(args[0]):
                return inner_create_task_cls(**opts)(*args) #Execution here
        return inner_create_task_cls(**opts)
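
The lazy registration described above can be illustrated with a small stand-alone sketch. It does not use Celery: MiniApp and LazyProxy are hypothetical names invented for this example, and LazyProxy only imitates the idea of a proxy that evaluates on first use.

```python
class LazyProxy:
    """Hypothetical stand-in for a promise proxy: builds the real object on first use."""

    def __init__(self, factory, *args):
        self._factory = factory
        self._args = args
        self._obj = None

    def _resolve(self):
        # Call the factory only once, the first time the proxy is used.
        if self._obj is None:
            self._obj = self._factory(*self._args)
        return self._obj

    def __call__(self, *args, **kwargs):
        return self._resolve()(*args, **kwargs)


class MiniApp:
    """Toy application with a task registry (not Celery's real class)."""

    def __init__(self):
        self.tasks = {}
        self.finalized = False

    def _task_from_fun(self, fun):
        # Register the function under its name and hand it back.
        return self.tasks.setdefault(fun.__name__, fun)

    def task(self, fun):
        if self.finalized:
            return self._task_from_fun(fun)
        # Not finalized yet: defer registration until first use.
        return LazyProxy(self._task_from_fun, fun)


app = MiniApp()


@app.task
def add(x, y):
    return x + y


registered_before = "add" in app.tasks   # registration has not happened yet
result = add(2, 17)                      # first use triggers _task_from_fun
registered_after = "add" in app.tasks
```

In this sketch the decorator returns immediately, and the registry is only filled when the task is first called.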

Let's analyze this decorator in detail.

2.2.1 adding tasks

During initialization, when this task is added for each app, app._task_from_fun(fun, **options) is called.

Specifically, it does the following:

  • Checks the various parameter settings;
  • Creates the task dynamically;
  • Adds the task to _tasks;
  • Binds the relevant attributes to the instance with the task's bind method;

The code is as follows:

    def _task_from_fun(self, fun, name=None, base=None, bind=False, **options):

        name = name or self.gen_task_name(fun.__name__, fun.__module__)         # Use the name passed in, if any; otherwise generate one from the module name and function name
        base = base or self.Task                                                # Use the base class passed in, if any; otherwise default to the app's own Task class (app.Task)

        if name not in self._tasks:                                             # If the task to be added is not yet in _tasks
            run = fun if bind else staticmethod(fun)                            # If bind is set, use the function directly as a method; otherwise wrap it as a static method
            task = type(fun.__name__, (base,), dict({                           # Dynamically create a Task class and instantiate it
                'app': self,                                                    # The app the task belongs to
                'name': name,                                                   # Name of the task
                'run': run,                                                     # run method of the task
                '_decorated': True,                                             # Marks the task as created by the decorator
                '__doc__': fun.__doc__,
                '__module__': fun.__module__,
                '__header__': staticmethod(head_from_fun(fun, bound=bind)),
                '__wrapped__': run}, **options))()
            # for some reason __qualname__ cannot be set in type()
            # so we have to set it here.
            try:
                task.__qualname__ = fun.__qualname__
            except AttributeError:
                pass
            self._tasks[task.name] = task                                       # Add the task to _tasks
            task.bind(self)  # connects task to this app                        # Call the task's bind method to bind the relevant attributes to the instance

            add_autoretry_behaviour(task, **options)
        else:
            task = self._tasks[name]
        return task  
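
The core trick in _task_from_fun is building a class at runtime with type() and instantiating it at once. The following is a minimal sketch of that technique only; BaseTask, task_from_fun and the registry dict are hypothetical names for this example and leave out everything else the real method does.

```python
class BaseTask:
    """Hypothetical base class standing in for the real Task class."""
    name = None

    def run(self, *args, **kwargs):
        raise NotImplementedError

    def __call__(self, *args, **kwargs):
        return self.run(*args, **kwargs)


def task_from_fun(registry, fun, name=None, bind=False):
    name = name or f"{fun.__module__}.{fun.__name__}"
    if name not in registry:
        # bind=True: fun receives the task instance as its first argument.
        # Otherwise it is wrapped as a staticmethod so no self is passed.
        run = fun if bind else staticmethod(fun)
        task_cls = type(fun.__name__, (BaseTask,), {
            "name": name,
            "run": run,
            "__doc__": fun.__doc__,
        })
        registry[name] = task_cls()      # store an instance, not the class
    return registry[name]


def add(x, y):
    return x + y


def whoami(self):
    return self.name


registry = {}
add_task = task_from_fun(registry, add, name="myTest.add")
bound_task = task_from_fun(registry, whoami, name="myTest.whoami", bind=True)
```

Calling task_from_fun again with the same name returns the instance already stored in the registry, which mirrors the else branch of the listing above.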

2.2.2 binding

The bind method binds the relevant attributes to the instance. Knowing only the task name or code is not enough; the task instance itself is also needed at runtime.

@classmethod
def bind(cls, app):
    was_bound, cls.__bound__ = cls.__bound__, True
    cls._app = app                                          # Set the class's _app attribute
    conf = app.conf                                         # Get app configuration information
    cls._exec_options = None  # clear option cache

    if cls.typing is None:
        cls.typing = app.strict_typing

    for attr_name, config_name in cls.from_config:          # Set default values in class
        if getattr(cls, attr_name, None) is None:           # If the property obtained is empty
            setattr(cls, attr_name, conf[config_name])      # Use the default values in the app configuration

    # decorate with annotations from config.
    if not was_bound:
        cls.annotate()

        from celery.utils.threads import LocalStack
        cls.request_stack = LocalStack()                    # Save data using thread stack

    # PeriodicTask uses this to add itself to the PeriodicTask schedule.
    cls.on_bound(app)

    return app
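
The from_config loop in bind fills in only those class attributes that are still None. A minimal sketch of that behaviour follows; MiniTask, the two attribute/config pairs and the plain dict used as configuration are assumptions chosen for illustration.

```python
class MiniTask:
    """Toy task class; only the default-filling part of bind is modelled."""

    # (attribute name, config key) pairs, in the same shape as from_config.
    from_config = (
        ("serializer", "task_serializer"),
        ("rate_limit", "task_default_rate_limit"),
    )
    serializer = None      # unset: will be filled from the configuration
    rate_limit = "10/m"    # already set on the class: left untouched

    @classmethod
    def bind(cls, conf):
        for attr_name, config_name in cls.from_config:
            if getattr(cls, attr_name, None) is None:
                setattr(cls, attr_name, conf[config_name])
        return cls


conf = {"task_serializer": "json", "task_default_rate_limit": None}
MiniTask.bind(conf)
```

After bind, serializer comes from the configuration while rate_limit keeps the value declared on the class.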

2.3 summary

So far, on the client (user side), the Celery application has been started, a task instance has been generated, and its properties are bound to the instance.

0x03 amqp class

When apply_async is called on the client, app.send_task is called to send the task, and amqp is used in the process, so let's discuss the AMQP class first.

3.1 generation

send_task contains the following code:

    def send_task(self, ....):
        """Send task by name.
        """
        parent = have_parent = None
        amqp = self.amqp # The amqp instance is generated here

At this point, self is the Celery application itself. Let's print it out to see what the Celery application looks like.

self = {Celery} <Celery myTest at 0x1eeb5590488>
 AsyncResult = {type} <class 'celery.result.AsyncResult'>
 Beat = {type} <class 'celery.apps.beat.Beat'>
 GroupResult = {type} <class 'celery.result.GroupResult'>
 Pickler = {type} <class 'celery.app.utils.AppPickler'>
 ResultSet = {type} <class 'celery.result.ResultSet'>
 Task = {type} <class 'celery.app.task.Task'>
 WorkController = {type} <class 'celery.worker.worker.WorkController'>
 Worker = {type} <class 'celery.apps.worker.Worker'>
 amqp = {AMQP} <celery.app.amqp.AMQP object at 0x000001EEB5884188>
 amqp_cls = {str} 'celery.app.amqp:AMQP'
 backend = {DisabledBackend} <celery.backends.base.DisabledBackend object at 0x000001EEB584E248>
 clock = {LamportClock} 0
 control = {Control} <celery.app.control.Control object at 0x000001EEB57B37C8>
 events = {Events} <celery.app.events.Events object at 0x000001EEB56C7188>
 loader = {AppLoader} <celery.loaders.app.AppLoader object at 0x000001EEB5705408>
 main = {str} 'myTest'
 pool = {ConnectionPool} <kombu.connection.ConnectionPool object at 0x000001EEB57A9688>
 producer_pool = {ProducerPool} <kombu.pools.ProducerPool object at 0x000001EEB6297508>
 registry_cls = {type} <class 'celery.app.registry.TaskRegistry'>
 tasks = {TaskRegistry: 10} {'myTest.add': <@task: myTest.add of myTest at 0x1eeb5590488>, 'celery.accumulate': <@task: celery.accumulate of myTest at 0x1eeb5590488>, 'celery.chord_unlock': <@task: celery.chord_unlock of myTest at 0x1eeb5590488>, 'celery.chunks': <@task: celery.chunks of myTest at 0x1eeb5590488>, 'celery.backend_cleanup': <@task: celery.backend_cleanup of myTest at 0x1eeb5590488>, 'celery.group': <@task: celery.group of myTest at 0x1eeb5590488>, 'celery.map': <@task: celery.map of myTest at 0x1eeb5590488>, 'celery.chain': <@task: celery.chain of myTest at 0x1eeb5590488>, 'celery.starmap': <@task: celery.starmap of myTest at 0x1eeb5590488>, 'celery.chord': <@task: celery.chord of myTest at 0x1eeb5590488>}

The call stack is:

amqp, base.py:1205
__get__, objects.py:43
send_task, base.py:705
apply_async, task.py:565
<module>, myclient.py:4

Why can a plain assignment statement generate amqp? Because amqp is decorated with cached_property.

A function decorated with cached_property becomes an attribute of the object. The first time the attribute is accessed, the function is called; on later accesses, the value is taken directly from the instance dictionary. In other words, the return value of the get method is cached on the first call.

    @cached_property
    def amqp(self):
        """AMQP related functionality: :class:`~@amqp`."""
        return instantiate(self.amqp_cls, app=self)
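
To make the caching behaviour concrete, here is a minimal descriptor that behaves as described. It is a simplified stand-in written for this article, not the implementation used by Celery or Kombu; cached_property_sketch and MiniApp are hypothetical names.

```python
class cached_property_sketch:
    """Minimal non-data descriptor that caches its result on the instance."""

    def __init__(self, fget):
        self.fget = fget
        self.name = fget.__name__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        # Store the computed value in the instance dictionary. Because this
        # descriptor defines no __set__, later lookups find the instance
        # attribute first and never reach __get__ again.
        value = obj.__dict__[self.name] = self.fget(obj)
        return value


class MiniApp:
    created = 0

    @cached_property_sketch
    def amqp(self):
        MiniApp.created += 1      # count how often the factory really runs
        return object()


app = MiniApp()
first = app.amqp     # the function is called here
second = app.amqp    # served from app.__dict__
```

Both accesses return the same object, and the decorated function runs only once.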

3.2 definitions

The AMQP class is a wrapper around the AMQP protocol implementation; in practice, it is a further wrapper around Kombu's classes.

class AMQP:
    """App AMQP API: app.amqp."""

    Connection = Connection
    Consumer = Consumer
    Producer = Producer

    #: compat alias to Connection
    BrokerConnection = Connection

    queues_cls = Queues

    #: Cached and prepared routing table.
    _rtable = None

    #: Underlying producer pool instance automatically
    #: set by the :attr:`producer_pool`.
    _producer_pool = None

    # Exchange class/function used when defining automatic queues.
    # For example, you can use ``autoexchange = lambda n: None`` to use the
    # AMQP default exchange: a shortcut to bypass routing
    # and instead send directly to the queue named in the routing key.
    autoexchange = None

Let's print out the details and see what amqp looks like.

amqp = {AMQP}  
 BrokerConnection = {type} <class 'kombu.connection.Connection'>
 Connection = {type} <class 'kombu.connection.Connection'>
 Consumer = {type} <class 'kombu.messaging.Consumer'>
 Producer = {type} <class 'kombu.messaging.Producer'>
 app = {Celery} <Celery myTest at 0x252bd2903c8>
 argsrepr_maxsize = {int} 1024
 autoexchange = {NoneType} None
 default_exchange = {Exchange} Exchange celery(direct)
 default_queue = {Queue} <unbound Queue celery -> <unbound Exchange celery(direct)> -> celery>
 kwargsrepr_maxsize = {int} 1024
 producer_pool = {ProducerPool} <kombu.pools.ProducerPool object at 0x00000252BDC8F408>
 publisher_pool = {ProducerPool} <kombu.pools.ProducerPool object at 0x00000252BDC8F408>
 queues = {Queues: 1} {'celery': <unbound Queue celery -> <unbound Exchange celery(direct)> -> celery>}
 queues_cls = {type} <class 'celery.app.amqp.Queues'>
 router = {Router} <celery.app.routes.Router object at 0x00000252BDC6B248>
 routes = {tuple: 0} ()
 task_protocols = {dict: 2} {1: <bound method AMQP.as_task_v1 of <celery.app.amqp.AMQP object at 0x00000252BDC74148>>, 2: <bound method AMQP.as_task_v2 of <celery.app.amqp.AMQP object at 0x00000252BDC74148>>}
 utc = {bool} True
  _event_dispatcher = {EventDispatcher} <celery.events.dispatcher.EventDispatcher object at 0x00000252BE750348>
  _producer_pool = {ProducerPool} <kombu.pools.ProducerPool object at 0x00000252BDC8F408>
  _rtable = {tuple: 0} ()

The specific logic is as follows:

+---------+
| Celery  |    +----------------------------+
|         |    |   celery.app.amqp.AMQP     |
|         |    |                            |
|         |    |                            |
|         |    |          BrokerConnection +----->  kombu.connection.Connection
|         |    |                            |
|   amqp+----->+          Connection       +----->  kombu.connection.Connection
|         |    |                            |
+---------+    |          Consumer         +----->  kombu.messaging.Consumer
               |                            |
               |          Producer         +----->  kombu.messaging.Producer
               |                            |
               |          producer_pool    +----->  kombu.pools.ProducerPool
               |                            |
               |          queues           +----->  celery.app.amqp.Queues
               |                            |
               |          router           +----->  celery.app.routes.Router
               +----------------------------+

0x04 send Task

Let's then look at how the client sends the task.

from myTest import add
re = add.apply_async((2,17))

The logic can be summarized as follows:

  • Producer initialization sets up the connection, for example by calling self.connect to connect to the configured Transport class and initializing the channel (self.channel = self.connection);
  • Message is called to encapsulate the message;
  • The Exchange routes the message to a queue according to routing_key;
  • amqp is called to send the message;
  • The Channel is responsible for the final publication of the message;

Let's read it in detail below.

4.1 apply_async in task

There are two important points here:

  • If task_always_eager is set, a kombu producer is acquired and the task is executed locally through self.apply;
  • Otherwise, amqp is called to send the task (this is the path we focus on);

The abridged code is as follows:

    def apply_async(self, args=None, kwargs=None, task_id=None, producer=None,
                    link=None, link_error=None, shadow=None, **options):
        """Apply tasks asynchronously by sending a message.
        """
        
        preopts = self._get_exec_options()
        options = dict(preopts, **options) if options else preopts

        app = self._get_app()
        if app.conf.task_always_eager:
            # Get producer
            with app.producer_or_acquire(producer) as eager_producer:      
                serializer = options.get('serializer')
                body = args, kwargs
                content_type, content_encoding, data = serialization.dumps(
                    body, serializer,
                )
                args, kwargs = serialization.loads(
                    data, content_type, content_encoding,
                    accept=[content_type]
                )
            with denied_join_result():
                return self.apply(args, kwargs, task_id=task_id or uuid(),
                                  link=link, link_error=link_error, **options)
        else:
            return app.send_task( #Call here
                self.name, args, kwargs, task_id=task_id, producer=producer,
                link=link, link_error=link_error, result_cls=self.AsyncResult,
                shadow=shadow, task_type=self,
                **options
            )
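
In the eager branch above, the arguments are serialized and immediately loaded back before self.apply is called. Presumably this makes local execution see the same data a worker would receive. A minimal sketch of such a round trip, using the standard json module instead of Celery's serialization helpers (roundtrip is a hypothetical helper name):

```python
import json


def roundtrip(args, kwargs):
    """Serialize and deserialize call arguments, as a message would."""
    data = json.dumps([args, kwargs])
    new_args, new_kwargs = json.loads(data)
    return new_args, new_kwargs


args, kwargs = roundtrip((2, 17), {"unit": "m"})
```

Note that the tuple (2, 17) comes back as the list [2, 17]: JSON has no tuple type, so the round trip exposes this difference even when no broker is involved.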

The flow at this point is:

         1  apply_async       +-------------------+
                              |                   |
User  +---------------------> | task: myTest.add  |
                              |                   |
                              +-------------------+

4.2 send_task

This function generates the task message and calls amqp to send it:

  • Obtain the amqp instance;
  • Set the task id, generating one if it was not passed in;
  • Obtain the router; if none was passed in, use amqp's router;
  • Generate the routing information;
  • Generate the task message;
  • If a connection was passed in, create a producer from it;
  • Send the task message;
  • Generate the asynchronous result instance;
  • Return the result;

The details are as follows:

def send_task(self, name, ...):
    """Send task by name.
    """
    parent = have_parent = None
    amqp = self.amqp                                                    # Get amqp instance
    task_id = task_id or uuid()                                         # Set the task id; generate one if it was not passed in
    producer = producer or publisher  # XXX compat                      # Use the producer passed in (publisher is the older name kept for compatibility)
    router = router or amqp.router                                      # Use the router passed in; otherwise use amqp's router
    options = router.route(
        options, route_name or name, args, kwargs, task_type)           # Generate route information

    message = amqp.create_task_message( # Generate task information
        task_id, name, args, kwargs, countdown, eta, group_id, group_index,
        expires, retries, chord,
        maybe_list(link), maybe_list(link_error),
        reply_to or self.thread_oid, time_limit, soft_time_limit,
        self.conf.task_send_sent_event,
        root_id, parent_id, shadow, chain,
        argsrepr=options.get('argsrepr'),
        kwargsrepr=options.get('kwargsrepr'),
    )

    if connection:
        producer = amqp.Producer(connection)                            # If there is a connection, the producer is generated
    
    with self.producer_or_acquire(producer) as P:                       
        with P.connection._reraise_as_library_errors():
            self.backend.on_task_call(P, task_id)
            amqp.send_task_message(P, name, message, **options)         # Send task message 
    
    result = (result_cls or self.AsyncResult)(task_id)                  # Generate asynchronous task instance
    if add_to_parent:
        if not have_parent:
            parent, have_parent = self.current_worker_task, True
        if parent:
            parent.add_trail(result)
    return result                                                       # Return results
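
The sequence of steps in send_task can be condensed into a toy pipeline. Everything here is an assumption made for illustration: the in-memory broker dict, the route helper, the ResultHandle class and the reduced message layout are not Celery's.

```python
import uuid
from collections import deque

broker = {"celery": deque()}          # hypothetical in-memory broker


class ResultHandle:
    """Toy stand-in for an asynchronous result: it only remembers the id."""

    def __init__(self, task_id):
        self.id = task_id


def route(name, options):
    routed = dict(options)
    routed.setdefault("queue", "celery")   # assumed default queue name
    return routed


def send_task(name, args=(), kwargs=None, task_id=None, **options):
    task_id = task_id or str(uuid.uuid4())      # generate an id if none given
    options = route(name, options)              # routing information
    message = {                                 # task message
        "headers": {"task": name, "id": task_id},
        "body": [list(args), kwargs or {}],
    }
    broker[options["queue"]].append(message)    # "send" the message
    return ResultHandle(task_id)                # handle returned to the caller


result = send_task("myTest.add", (2, 17))
```

The caller gets back only a handle carrying the task id; the message itself now sits in the queue chosen by the routing step.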

The flow at this point is:

         1  apply_async       +-------------------+
                              |                   |
User  +---------------------> | task: myTest.add  |
                              |                   |
                              +--------+----------+
                                       |
                                       |
                        2 send_task    |
                                       |
                                       v
                                +------+--------+
                                | Celery myTest |
                                |               |
                                +------+--------+
                                       |
                                       |
                  3 send_task_message  |
                                       |
                                       v
                               +-------+---------+
                               |      amqp       |
                               |                 |
                               |                 |
                               +-----------------+

4.3 generated message content

as_task_v2 generates the actual message content. As you can see, a message is built from several parts:

  • Headers, including the task name, task id, expires, and so on;
  • Message type and encoding: content-encoding and content-type;
  • Routing parameters: used to distinguish different queues, such as exchange and routing_key;
  • Body: the message body;

The final specific message is as follows:

{
	"body": "W1syLCA4XSwge30sIHsiY2FsbGJhY2tzIjogbnVsbCwgImVycmJhY2tzIjogbnVsbCwgImNoYWluIjogbnVsbCwgImNob3JkIjogbnVsbH1d",
	"content-encoding": "utf-8",
	"content-type": "application/json",
	"headers": {
		"lang": "py",
		"task": "myTest.add",
		"id": "243aac4a-361b-4408-9e0c-856e2655b7b5",
		"shadow": null,
		"eta": null,
		"expires": null,
		"group": null,
		"group_index": null,
		"retries": 0,
		"timelimit": [null, null],
		"root_id": "243aac4a-361b-4408-9e0c-856e2655b7b5",
		"parent_id": null,
		"argsrepr": "(2, 8)",
		"kwargsrepr": "{}",
		"origin": "gen33652@DESKTOP-0GO3RPO"
	},
	"properties": {
		"correlation_id": "243aac4a-361b-4408-9e0c-856e2655b7b5",
		"reply_to": "b34fcf3d-da9a-3717-a76f-44b6a6362da1",
		"delivery_mode": 2,
		"delivery_info": {
			"exchange": "",
			"routing_key": "celery"
		},
		"priority": 0,
		"body_encoding": "base64",
		"delivery_tag": "fa1bc9c8-3709-4c02-9543-8d0fe3cf4e6c"
	}
}
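
The body field in the message above is not readable at a glance, because body_encoding is base64. A short sketch using only the standard library decodes it; the string is copied from the example message, and the variable names are chosen for this example.

```python
import base64
import json

# Body string copied from the example message above.
encoded_body = (
    "W1syLCA4XSwge30sIHsiY2FsbGJhY2tzIjogbnVsbCwgImVycmJhY2tzIjogbnVsbCwg"
    "ImNoYWluIjogbnVsbCwgImNob3JkIjogbnVsbH1d"
)

# body_encoding is base64 and content-type is application/json,
# so decode the base64 layer first and then parse the JSON text.
decoded = base64.b64decode(encoded_body).decode("utf-8")
args, kwargs, embed = json.loads(decoded)
```

The decoded body is a three-element list: the positional arguments, the keyword arguments, and a dictionary holding callbacks, errbacks, chain and chord. This matches the body tuple built in as_task_v2.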

The specific code is as follows. Note that sent_event is needed for the subsequent event dispatch and does not appear in the message content itself:

def as_task_v2(self, task_id, name, args=None, kwargs=None, ......):

    ......
    
    return task_message(
        headers={
            'lang': 'py',
            'task': name,
            'id': task_id,
            'shadow': shadow,
            'eta': eta,
            'expires': expires,
            'group': group_id,
            'group_index': group_index,
            'retries': retries,
            'timelimit': [time_limit, soft_time_limit],
            'root_id': root_id,
            'parent_id': parent_id,
            'argsrepr': argsrepr,
            'kwargsrepr': kwargsrepr,
            'origin': origin or anon_nodename()
        },
        properties={
            'correlation_id': task_id,
            'reply_to': reply_to or '',
        },
        body=(
            args, kwargs, {
                'callbacks': callbacks,
                'errbacks': errbacks,
                'chain': chain,
                'chord': chord,
            },
        ),
        sent_event={
            'uuid': task_id,
            'root_id': root_id,
            'parent_id': parent_id,
            'name': name,
            'args': argsrepr,
            'kwargs': kwargsrepr,
            'retries': retries,
            'eta': eta,
            'expires': expires,
        } if create_sent_event else None,
    )
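
The shape of the value returned by as_task_v2 can be sketched with a four-field record, assuming task_message is a simple container for headers, properties, body and sent_event, as its keyword arguments suggest. build_message is a hypothetical, much reduced helper, and it uses plain repr where the real code has its own argument formatting.

```python
from collections import namedtuple

# Assumed record type with the four fields seen in the listing above.
task_message = namedtuple(
    "task_message", ("headers", "properties", "body", "sent_event"))


def build_message(task_id, name, args=(), kwargs=None, create_sent_event=False):
    kwargs = kwargs or {}
    argsrepr, kwargsrepr = repr(tuple(args)), repr(kwargs)
    return task_message(
        headers={"lang": "py", "task": name, "id": task_id,
                 "argsrepr": argsrepr, "kwargsrepr": kwargsrepr},
        properties={"correlation_id": task_id, "reply_to": ""},
        body=(args, kwargs, {"callbacks": None, "errbacks": None,
                             "chain": None, "chord": None}),
        # The event payload is only built when the caller asks for it.
        sent_event={"uuid": task_id, "name": name, "args": argsrepr,
                    "kwargs": kwargsrepr} if create_sent_event else None,
    )


quiet = build_message("id-1", "myTest.add", (2, 8))
traced = build_message("id-2", "myTest.add", (2, 8), create_sent_event=True)
```

The sent_event field travels alongside the message parts but is None unless event sending is requested, which is why it never shows up in the serialized message.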

4.4 send_task_message in amqp

amqp.send_task_message(P, name, message, **options) is used to send the task through AMQP.

This method mainly assembles the parameters of the task to be sent, such as connection, queue, exchange, and routing_key, and then calls the producer's publish method to send the task.

The basic routine is:

  • Obtain the queue;
  • Obtain delivery_mode;
  • Obtain the exchange;
  • Obtain the retry policy, and so on;
  • Call the producer to send the message;

The abridged code is as follows:

        def send_task_message(producer, name, message,
                              exchange=None, routing_key=None, queue=None,
                              event_dispatcher=None,
                              retry=None, retry_policy=None,
                              serializer=None, delivery_mode=None,
                              compression=None, declare=None,
                              headers=None, exchange_type=None, **kwargs):
            # Get the queue, delivery_mode, exchange, retry policy, etc.

            if before_receivers:
                send_before_publish(
                    sender=name, body=body,
                    exchange=exchange, routing_key=routing_key,
                    declare=declare, headers=headers2,
                    properties=properties, retry_policy=retry_policy,
                )
            
            ret = producer.publish(
                body,
                exchange=exchange,
                routing_key=routing_key,
                serializer=serializer or default_serializer,
                compression=compression or default_compressor,
                retry=retry, retry_policy=_rp,
                delivery_mode=delivery_mode, declare=declare,
                headers=headers2,
                **properties
            )
            if after_receivers:
                send_after_publish(sender=name, body=body, headers=headers2,
                                   exchange=exchange, routing_key=routing_key)
 
            .....
  
            if sent_event: # Here we deal with sent_event
                evd = event_dispatcher or default_evd
                exname = exchange
                if isinstance(exname, Exchange):
                    exname = exname.name
                sent_event.update({
                    'queue': qname,
                    'exchange': exname,
                    'routing_key': routing_key,
                })
                evd.publish('task-sent', sent_event,
                            producer, retry=retry, retry_policy=retry_policy)
            return ret
        return send_task_message
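
The listing above elides how queue, exchange and routing_key are resolved. The following is only a simplified sketch of how such a resolution could work, written so that it reproduces the values seen in this article (exchange '' and routing_key 'celery' for the default queue). The Exchange and Queue records and resolve_destination are hypothetical stand-ins, and the shortcut for direct exchanges is an assumption, not a copy of Celery's code.

```python
from collections import namedtuple

# Hypothetical, simplified stand-ins for kombu's Exchange and Queue.
Exchange = namedtuple("Exchange", "name type")
Queue = namedtuple("Queue", "name exchange routing_key")

default_queue = Queue("celery", Exchange("celery", "direct"), "celery")
queues = {"celery": default_queue}


def resolve_destination(queue=None, exchange=None, routing_key=None):
    """Return (qname, exchange_name, routing_key) for a publish call."""
    if queue is None and exchange is None:
        queue = default_queue                 # nothing given: use the default
    if isinstance(queue, str):
        queue = queues[queue]                 # look the queue up by name
    if queue is None:
        # The caller chose an exchange explicitly; nothing to resolve.
        return None, exchange, routing_key
    if (not exchange or not routing_key) and queue.exchange.type == "direct":
        # Assumed shortcut: publish to the anonymous exchange, which
        # delivers straight to the queue named by the routing key.
        return queue.name, "", queue.name
    return (queue.name,
            exchange or queue.exchange.name,
            routing_key or queue.routing_key)


default_target = resolve_destination()
topic_queue = Queue("logs", Exchange("logs", "topic"), "app.#")
topic_target = resolve_destination(queue=topic_queue)
```

With no arguments the sketch yields ('celery', '', 'celery'), while a queue bound to a non-direct exchange keeps its own exchange name and routing key.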

At this point, the stack is:

send_task_message, amqp.py:473
send_task, base.py:749
apply_async, task.py:565
<module>, myclient.py:4

The variables are:

qname = {str} 'celery'
queue = {Queue} <unbound Queue celery -> <unbound Exchange celery(direct)> -> celery>
 ContentDisallowed = {type} <class 'kombu.exceptions.ContentDisallowed'>
 alias = {NoneType} None
 attrs = {tuple: 18} (('name', None), ('exchange', None), ('routing_key', None), ('queue_arguments', None), ('binding_arguments', None), ('consumer_arguments', None), ('durable', <class 'bool'>), ('exclusive', <class 'bool'>), ('auto_delete', <class 'bool'>), ('no_ack', None), ('alias', None), ('bindings', <class 'list'>), ('no_declare', <class 'bool'>), ('expires', <class 'float'>), ('message_ttl', <class 'float'>), ('max_length', <class 'int'>), ('max_length_bytes', <class 'int'>), ('max_priority', <class 'int'>))
 auto_delete = {bool} False
 binding_arguments = {NoneType} None
 bindings = {set: 0} set()
 can_cache_declaration = {bool} True
 channel = {str} 'Traceback (most recent call last):\n  File "C:\\Program Files\\JetBrains\\PyCharm Community Edition 2020.2.2\\plugins\\python-ce\\helpers\\pydev\\_pydevd_bundle\\pydevd_resolver.py", line 178, in _getPyDictionary\n    attr = getattr(var, n)\n  File "C:\\User
 consumer_arguments = {NoneType} None
 durable = {bool} True
 exchange = {Exchange} Exchange celery(direct)
 exclusive = {bool} False
 expires = {NoneType} None
 is_bound = {bool} False
 max_length = {NoneType} None
 max_length_bytes = {NoneType} None
 max_priority = {NoneType} None
 message_ttl = {NoneType} None
 name = {str} 'celery'
 no_ack = {bool} False
 no_declare = {NoneType} None
 on_declared = {NoneType} None
 queue_arguments = {NoneType} None
 routing_key = {str} 'celery'
  _channel = {NoneType} None
  _is_bound = {bool} False
queues = {Queues: 1} {'celery': <unbound Queue celery -> <unbound Exchange celery(direct)> -> celery>}

The logic is as follows:

         1  apply_async       +-------------------+
                              |                   |
User  +---------------------> | task: myTest.add  |
                              |                   |
                              +--------+----------+
                                       |
                                       |
                          2 send_task  |
                                       |
                                       v
                                +------+--------+
                                | Celery myTest |
                                |               |
                                +------+--------+
                                       |
                                       |
                  3 send_task_message  |
                                       |
                                       v
                               +-------+---------+
                               |      amqp       |
                               +-------+---------+
                                       |
                                       |
                            4 publish  |
                                       |
                                       v
                                  +----+------+
                                  | producer  |
                                  |           |
                                  +-----------+

4.5 publish in producer

In the producer, the channel is called to send the message.

def _publish(self, body, priority, content_type, content_encoding,
             headers, properties, routing_key, mandatory,
             immediate, exchange, declare):
    channel = self.channel
    message = channel.prepare_message(
        body, priority, content_type,
        content_encoding, headers, properties,
    )
    if declare:
        maybe_declare = self.maybe_declare
        [maybe_declare(entity) for entity in declare]

    # handle autogenerated queue names for reply_to
    reply_to = properties.get('reply_to')
    if isinstance(reply_to, Queue):
        properties['reply_to'] = reply_to.name
    return channel.basic_publish( # send message
        message,
        exchange=exchange, routing_key=routing_key,
        mandatory=mandatory, immediate=immediate,
    )
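
The two channel calls used by _publish, prepare_message and basic_publish, can be imitated with an in-memory toy. ToyChannel is a hypothetical class for this example: it models only the anonymous exchange and keeps messages in Python lists, whereas a real transport would hand them to the broker.

```python
import json
from collections import defaultdict


class ToyChannel:
    """Hypothetical in-memory channel; a real transport would talk to a broker."""

    def __init__(self):
        self.queues = defaultdict(list)   # queue name -> serialized messages

    def prepare_message(self, body, priority=0, content_type=None,
                        content_encoding=None, headers=None, properties=None):
        props = dict(properties or {}, priority=priority)
        return {
            "body": body,
            "content-type": content_type,
            "content-encoding": content_encoding,
            "headers": headers or {},
            "properties": props,
        }

    def basic_publish(self, message, exchange="", routing_key=""):
        # Only the anonymous exchange is modelled: the routing key is
        # taken as the name of the destination queue.
        if exchange != "":
            raise NotImplementedError("only the anonymous exchange is sketched")
        self.queues[routing_key].append(json.dumps(message))


channel = ToyChannel()
message = channel.prepare_message(
    '[[2, 8], {}, {}]', content_type="application/json",
    content_encoding="utf-8", headers={"task": "myTest.add"},
    properties={"delivery_mode": 2},
)
channel.basic_publish(message, exchange="", routing_key="celery")
stored = json.loads(channel.queues["celery"][0])
```

The stored entry has the same top-level keys as the example message shown earlier: body, content-type, content-encoding, headers and properties.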

The variables are:

body = {str} '[[2, 8], {}, {"callbacks": null, "errbacks": null, "chain": null, "chord": null}]'
compression = {NoneType} None
content_encoding = {str} 'utf-8'
content_type = {str} 'application/json'
declare = {list: 1} [<unbound Queue celery -> <unbound Exchange celery(direct)> -> celery>]
delivery_mode = {int} 2
exchange = {str} ''
exchange_name = {str} ''
expiration = {NoneType} None
headers = {dict: 15} {'lang': 'py', 'task': 'myTest.add', 'id': 'af0e4c14-a618-41b4-9340-1479cb7cde4f', 'shadow': None, 'eta': None, 'expires': None, 'group': None, 'group_index': None, 'retries': 0, 'timelimit': [None, None], 'root_id': 'af0e4c14-a618-41b4-9340-1479cb7cde4f', 'parent_id': None, 'argsrepr': '(2, 8)', 'kwargsrepr': '{}', 'origin': 'gen11468@DESKTOP-0GO3RPO'}
immediate = {bool} False
mandatory = {bool} False
priority = {int} 0
properties = {dict: 3} {'correlation_id': 'af0e4c14-a618-41b4-9340-1479cb7cde4f', 'reply_to': '2c938063-64b8-35f5-ac9f-a1c0915b6f71', 'delivery_mode': 2}
retry = {bool} True
retry_policy = {dict: 4} {'max_retries': 3, 'interval_start': 0, 'interval_max': 1, 'interval_step': 0.2}
routing_key = {str} 'celery'
self = {Producer} <Producer: <promise: 0x1eeb62c44c8>>
serializer = {str} 'json'
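
To make the role of these variables more concrete, here is a minimal sketch of how a channel could wrap them into a single message envelope. It is loosely modeled on what prepare_message does for virtual transports such as Redis, but the function below is a simplified stand-in written for illustration, not the library's code, and the default_priority parameter is an assumption. The values come from the variable dump above, with the headers trimmed.

```python
def prepare_message(body, priority=None, content_type=None,
                    content_encoding=None, headers=None, properties=None,
                    default_priority=0):
    """Wrap the serialized body and its metadata into one envelope dict."""
    properties = dict(properties or {})
    # Routing details are filled in later, just before the message is stored.
    properties.setdefault('delivery_info', {})
    properties.setdefault('priority', priority or default_priority)
    return {
        'body': body,
        'content-encoding': content_encoding,
        'content-type': content_type,
        'headers': headers or {},
        'properties': properties,
    }


# Values taken from the variable dump above (headers trimmed for brevity).
message = prepare_message(
    body='[[2, 8], {}, {"callbacks": null, "errbacks": null, '
         '"chain": null, "chord": null}]',
    priority=0,
    content_type='application/json',
    content_encoding='utf-8',
    headers={'lang': 'py', 'task': 'myTest.add',
             'id': 'af0e4c14-a618-41b4-9340-1479cb7cde4f'},
    properties={'correlation_id': 'af0e4c14-a618-41b4-9340-1479cb7cde4f',
                'reply_to': '2c938063-64b8-35f5-ac9f-a1c0915b6f71',
                'delivery_mode': 2},
)
print(sorted(message))
```

The resulting dict has the same top-level keys (body, content-encoding, content-type, headers, properties) as the JSON document that ends up in Redis.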

At this point, the flow is:

         1  apply_async       +-------------------+
                              |                   |
User  +---------------------> | task: myTest.add  |
                              |                   |
                              +--------+----------+
                                       |
                          2 send_task  |
                                       |
                                       v
                                +------+--------+
                                | Celery myTest |
                                |               |
                                +------+--------+
                                       |
                  3 send_task_message  |
                                       |
                                       v
                               +-------+---------+
                               |      amqp       |
                               +-------+---------+
                                       |
                            4 publish  |
                                       |
                                       v
                                  +----+------+
                                  | producer  |
                                  |           |
                                  +----+------+
                                       |
                                       |
                      5 basic_publish  |
                                       v
                                  +----+------+
                                  |  channel  |
                                  |           |
                                  +-----------+

At this point the task has been sent and is waiting for a consumer to pick it up.

4.6 Redis content

After the task is sent, it is stored in a Redis list named after the queue (here, celery). The Redis contents are:

127.0.0.1:6379> keys *
1) "_kombu.binding.reply.testMailbox.pidbox"
2) "_kombu.binding.testMailbox.pidbox"
3) "celery"
4) "_kombu.binding.celeryev"
5) "_kombu.binding.celery"
6) "_kombu.binding.reply.celery.pidbox"
127.0.0.1:6379> lrange celery 0 -1
1) "{\"body\": \"W1syLCA4XSwge30sIHsiY2FsbGJhY2tzIjogbnVsbCwgImVycmJhY2tzIjogbnVsbCwgImNoYWluIjogbnVsbCwgImNob3JkIjogbnVsbH1d\", \"content-encoding\": \"utf-8\", \"content-type\": \"application/json\", \"headers\": {\"lang\": \"py\", \"task\": \"myTest.add\", \"id\": \"243aac4a-361b-4408-9e0c-856e2655b7b5\", \"shadow\": null, \"eta\": null, \"expires\": null, \"group\": null, \"group_index\": null, \"retries\": 0, \"timelimit\": [null, null], \"root_id\": \"243aac4a-361b-4408-9e0c-856e2655b7b5\", \"parent_id\": null, \"argsrepr\": \"(2, 8)\", \"kwargsrepr\": \"{}\", \"origin\": \"gen33652@DESKTOP-0GO3RPO\"}, \"properties\": {\"correlation_id\": \"243aac4a-361b-4408-9e0c-856e2655b7b5\", \"reply_to\": \"b34fcf3d-da9a-3717-a76f-44b6a6362da1\", \"delivery_mode\": 2, \"delivery_info\": {\"exchange\": \"\", \"routing_key\": \"celery\"}, \"priority\": 0, \"body_encoding\": \"base64\", \"delivery_tag\": \"fa1bc9c8-3709-4c02-9543-8d0fe3cf4e6c\"}}"
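
The body field in this entry is base64 encoded, as indicated by body_encoding in the properties. As a small sketch, using only the standard library, it can be decoded back into the task arguments. The stored dict below is a trimmed copy of the entry above that keeps only the fields needed for decoding, and decode_body is a hypothetical helper name, not a Celery or Kombu function.

```python
import base64
import json

# A trimmed copy of the stored message: only the fields needed for decoding.
stored = {
    "body": ("W1syLCA4XSwge30sIHsiY2FsbGJhY2tzIjogbnVsbCwgImVycmJhY2tzIjog"
             "bnVsbCwgImNoYWluIjogbnVsbCwgImNob3JkIjogbnVsbH1d"),
    "content-encoding": "utf-8",
    "content-type": "application/json",
    "properties": {"body_encoding": "base64"},
}


def decode_body(message):
    """Return the task payload (args, kwargs, embed) from a stored message."""
    raw = message["body"]
    if message["properties"].get("body_encoding") == "base64":
        raw = base64.b64decode(raw).decode(message["content-encoding"])
    return json.loads(raw)


args, kwargs, embed = decode_body(stored)
print(args, kwargs, embed)
```

Decoding yields the same body that was seen in the producer in section 4.5: the positional arguments [2, 8], empty keyword arguments, and the callbacks/errbacks/chain/chord options.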

4.6.1 The role of delivery_tag

As you can see, the final message contains a delivery_tag field, which deserves a special note.

delivery_tag can be regarded as the unique identifier of the message in Redis. It is in UUID format.

Specific examples are as follows:

"delivery_tag": "fa1bc9c8-3709-4c02-9543-8d0fe3cf4e6c".

Later, QoS uses delivery_tag for various operations, such as ack and nack. For example, when a message is delivered, it is recorded as unacknowledged under its delivery_tag:

with self.pipe_or_acquire() as pipe:
    pipe.zadd(self.unacked_index_key, *zadd_args) \
        .hset(self.unacked_key, delivery_tag,
              dumps([message._raw, EX, RK])) \
        .execute()
    super().append(message, delivery_tag)
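
To illustrate the bookkeeping, here is a toy in-memory sketch of the same idea. Two plain dicts stand in for the Redis sorted set (unacked_index_key) and hash (unacked_key). The class name, the ack method body and the expired helper are assumptions made for illustration; in particular, using the timestamp index to find messages that were delivered long ago but never acknowledged is my reading of why the index is kept, not something shown in the snippet above.

```python
import json
import time


class InMemoryUnacked:
    """Toy stand-in for the two Redis structures that track unacked messages."""

    def __init__(self):
        self.unacked_index = {}  # delivery_tag -> delivery time (the sorted set)
        self.unacked = {}        # delivery_tag -> serialized message (the hash)

    def append(self, raw_message, delivery_tag, exchange, routing_key):
        # Mirrors zadd + hset: remember when and what was delivered.
        self.unacked_index[delivery_tag] = time.time()
        self.unacked[delivery_tag] = json.dumps(
            [raw_message, exchange, routing_key])

    def ack(self, delivery_tag):
        # An acknowledged message is dropped from both structures.
        self.unacked_index.pop(delivery_tag, None)
        self.unacked.pop(delivery_tag, None)

    def expired(self, visibility_timeout, now=None):
        # Tags delivered more than visibility_timeout seconds ago, never acked.
        now = time.time() if now is None else now
        return [tag for tag, ts in self.unacked_index.items()
                if now - ts > visibility_timeout]


qos = InMemoryUnacked()
tag = "fa1bc9c8-3709-4c02-9543-8d0fe3cf4e6c"
qos.append({"body": "..."}, tag, "", "celery")
pending_before = list(qos.unacked)  # the tag is tracked until it is acked
qos.ack(tag)
pending_after = list(qos.unacked)   # nothing left after the ack
```

The point of the sketch is that delivery_tag is the key for every lookup: without a unique tag per message there would be nothing to acknowledge against.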

4.6.2 When is delivery_tag generated

What we care about is when delivery_tag is generated while a message is being sent.

It turns out that it is generated in the Channel's _next_delivery_tag function, which is called when the message is further augmented just before it is sent.

def _next_delivery_tag(self):
    return uuid()
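
The augmentation step mentioned above can be sketched with the standard library as follows. This is a simplified approximation written for illustration, not the Kombu source: the function names mirror the ones in the article, but the bodies are assumptions that simply reproduce the fields visible in the stored Redis message (body_encoding, delivery_info, delivery_tag).

```python
import base64
import uuid


def next_delivery_tag():
    # Same idea as the function above: one fresh UUID string per message.
    return str(uuid.uuid4())


def inplace_augment_message(message, exchange, routing_key):
    """Encode the body and stamp routing info plus a delivery_tag in place."""
    message['body'] = base64.b64encode(
        message['body'].encode('utf-8')).decode('ascii')
    message['properties'].update(
        body_encoding='base64',
        delivery_info={'exchange': exchange, 'routing_key': routing_key},
        delivery_tag=next_delivery_tag(),
    )
    return message


message = {'body': '[[2, 8], {}, {}]', 'properties': {'delivery_mode': 2}}
inplace_augment_message(message, exchange='', routing_key='celery')
print(message['properties']['delivery_tag'])
```

Each call produces a different tag, so two publishes of the same task body still end up as two distinguishable messages.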

The specific stack is as follows:

_next_delivery_tag, base.py:595
_inplace_augment_message, base.py:614
basic_publish, base.py:599
_publish, messaging.py:200
_ensured, connection.py:525
publish, messaging.py:178
send_task_message, amqp.py:532
send_task, base.py:749
apply_async, task.py:565
<module>, myclient.py:4

This concludes the process of sending a task from the client. If you are interested, see the next article, "[source code analysis] consumption dynamic process of parallel distributed task queue Celery", which explains how a received Task is consumed from the perspective of the server.

0xFF reference

celery source code analysis - Task initialization and sending tasks

Celery source code analysis III: implementation of Task object

Distributed task queue Celery -- detailed workflow

