Python: Why Do Threads Have setDaemon?

Posted by mjm on Fri, 12 Jul 2019 18:04:14 +0200

Preface

You can hardly avoid threads when using Python, yet every time threads come up, everyone reflexively brings up the GIL, the global interpreter lock.

Beyond that well-worn topic, though, there are plenty of other things worth exploring, such as setDaemon().

Use of Threads and Problems

We usually start multiple threads with code like this:

import time
import threading

def test():
    while True:
        print threading.currentThread()
        time.sleep(1)

if __name__ == '__main__':
    t1 = threading.Thread(target=test)
    t2 = threading.Thread(target=test)
    t1.start()
    t2.start()

Output:

^C<Thread(Thread-2, started 123145414086656)>
<Thread(Thread-1, started 123145409880064)>
^C^C^C^C^C^C<Thread(Thread-2, started 123145414086656)>    # pressing Ctrl-C repeatedly does not interrupt the program
 <Thread(Thread-1, started 123145409880064)>
^C<Thread(Thread-1, started 123145409880064)>
 <Thread(Thread-2, started 123145414086656)>
<Thread(Thread-1, started 123145409880064)>
 <Thread(Thread-2, started 123145414086656)>
<Thread(Thread-2, started 123145414086656)><Thread(Thread-1, started 123145409880064)>
...(Two threads competing to print)

Threads make it easy to run things concurrently, but they also leave us with a big problem: how do we exit?

While the program above was running, I pressed Ctrl-C several times, and nothing could dampen its enthusiasm for work! In the end I had to kill it.

So how can we avoid this problem? Or, put differently, how can a child thread exit automatically when the main thread exits?

Daemon Threads

Veterans who have been through this will already know the answer: just call setDaemon() to turn the threads into daemon threads:

import time
import threading

def test():
    while True:
        print threading.currentThread()
        time.sleep(1)

if __name__ == '__main__':
    t1 = threading.Thread(target=test)
    t1.setDaemon(True)
    t1.start()

    t2 = threading.Thread(target=test)
    t2.setDaemon(True)
    t2.start()

Output:

python2.7 1.py
<Thread(Thread-1, started daemon 123145439883264)>
<Thread(Thread-2, started daemon 123145444089856)>
(Quit directly)

It exits right away? Of course it does: the main thread has finished executing, so it really is done, and because the child threads were marked as daemon threads, they exit along with it.
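
If you want to see the difference for yourself, one way is to run the same endless worker in a child process, once as a daemon thread and once as a normal thread, and check whether the process ends on its own. This is only a sketch: it uses Python 3 syntax rather than the Python 2.7 used in this post, and the helper name exits_on_its_own and the timeout values are my own choices for illustration.

```python
import subprocess
import sys
import textwrap

# Child script: one endless worker thread; the main thread returns right away.
CHILD_TEMPLATE = textwrap.dedent("""
    import threading, time

    def worker():
        while True:
            time.sleep(0.1)

    t = threading.Thread(target=worker)
    t.daemon = {daemon}
    t.start()
    print("main thread done")
""")


def exits_on_its_own(daemon, timeout=5.0):
    """Run the child script; return True if the process ends within `timeout` seconds.

    Hypothetical helper for this demo, not part of any library.
    """
    code = CHILD_TEMPLATE.format(daemon=daemon)
    proc = subprocess.Popen([sys.executable, "-c", code],
                            stdout=subprocess.DEVNULL)
    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        # A non-daemon worker keeps the interpreter alive, so we kill it here.
        proc.kill()
        proc.wait()
        return False


if __name__ == "__main__":
    print("daemon=True  exits on its own:", exits_on_its_own(True))
    print("daemon=False exits on its own:", exits_on_its_own(False, timeout=2.0))
```

Running each case in a separate process keeps the experiment from hanging the script that performs it.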

Unexpected daemon

But here is the question: back when we learned C, we never seemed to need anything like daemon threads. For example:

#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <pthread.h>

void *test(void *args)
{
    while (1)
    {
        /* syscall() returns long, so print it with %ld */
        printf("ThreadID: %ld\n", syscall(SYS_gettid));
        sleep(1);
    }
    return NULL;
}

int main()
{
    pthread_t t1;
    int ret = pthread_create(&t1, NULL, test, NULL);
    if (ret != 0)
    {
        printf("Thread create failed\n");
        return 1;
    }

    // Avoid exiting directly
    sleep(2);
    printf("Main run..\n");
    return 0;
}

Output:

# gcc test_pthread.c -lpthread -o a.out && ./a.out
ThreadID: 31233
ThreadID: 31233
Main run.. (Quit without hesitation)

Since Python is itself written in C, why does a multithreaded Python program need setDaemon in order to exit?

To answer that, I'm afraid we have to start the story from the moment the main thread exits...

Tracing the Exit Path

When the Python interpreter is shutting down, it calls wait_for_thread_shutdown as part of its routine cleanup:

// python2.7/python/pythonrun.c

static void
wait_for_thread_shutdown(void)
{
#ifdef WITH_THREAD
    PyObject *result;
    PyThreadState *tstate = PyThreadState_GET();
    PyObject *threading = PyMapping_GetItemString(tstate->interp->modules,
                                                  "threading");
    if (threading == NULL) {
        /* threading not imported */
        PyErr_Clear();
        return;
    }
    result = PyObject_CallMethod(threading, "_shutdown", "");
    if (result == NULL)
        PyErr_WriteUnraisable(threading);
    else
        Py_DECREF(result);
    Py_DECREF(threading);
#endif
}

Seeing #ifdef WITH_THREAD, we can guess that this function only does real work when the interpreter is built with thread support.

Our script above clearly goes down this path: the function looks up the threading module among the modules that have already been imported (if it was never imported, it simply returns) and calls its _shutdown function.

The contents of this function can be seen from the threading module:

# /usr/lib/python2.7/threading.py

_shutdown = _MainThread()._exitfunc

class _MainThread(Thread):

    def __init__(self):
        Thread.__init__(self, name="MainThread")
        self._Thread__started.set()
        self._set_ident()
        with _active_limbo_lock:
            _active[_get_ident()] = self

    def _set_daemon(self):
        return False

    def _exitfunc(self):
        self._Thread__stop()
        t = _pickSomeNonDaemonThread()
        if t:
            if __debug__:
                self._note("%s: waiting for other threads", self)
        while t:
            t.join()
            t = _pickSomeNonDaemonThread()
        if __debug__:
            self._note("%s: exiting", self)
        self._Thread__delete()

def _pickSomeNonDaemonThread():
    for t in enumerate():
        if not t.daemon and t.is_alive():
            return t
    return None

_shutdown is simply _MainThread()._exitfunc. It goes through the threads returned by enumerate() and calls join() on every one that is still alive and not a daemon, one at a time.
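
The same waiting loop can be imitated with nothing but the public threading API. The sketch below is written in Python 3 and is not the interpreter's own code: the names pick_some_non_daemon_thread and wait_like_exitfunc are made up for this example, and it skips the calling thread explicitly because, unlike the real _exitfunc, it does not first mark the main thread as stopped.

```python
import threading
import time


def pick_some_non_daemon_thread():
    """Return one live non-daemon thread other than the caller, or None."""
    me = threading.current_thread()
    for t in threading.enumerate():
        if t is not me and not t.daemon and t.is_alive():
            return t
    return None


def wait_like_exitfunc():
    """Join non-daemon threads one at a time until none are left.

    Returns the names of the threads that were joined (illustration only).
    """
    joined = []
    t = pick_some_non_daemon_thread()
    while t:
        t.join()                 # blocks for as long as that thread runs
        joined.append(t.name)
        t = pick_some_non_daemon_thread()
    return joined


if __name__ == "__main__":
    stop = threading.Event()
    short = threading.Thread(target=time.sleep, args=(0.2,), name="short-job")
    forever = threading.Thread(target=stop.wait, name="background", daemon=True)
    short.start()
    forever.start()

    print("joined:", wait_like_exitfunc())       # the daemon thread is skipped
    print("daemon still alive:", forever.is_alive())
    stop.set()                                   # let the daemon thread finish
```

Had "background" not been a daemon thread, wait_like_exitfunc() would have blocked on it indefinitely, which is exactly the behaviour seen in the first example of this post.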

And what is enumerate()?

It is the function we use all the time: it returns every Python Thread object in the current process that qualifies:

>>> print threading.enumerate()
[<_MainThread(MainThread, started 140691994822400)>]

# /usr/lib/python2.7/threading.py

def enumerate():
    """Return a list of all Thread objects currently alive.

    The list includes daemonic threads, dummy thread objects created by
    current_thread(), and the main thread. It excludes terminated threads and
    threads that have not yet been started.

    """
    with _active_limbo_lock:
        return _active.values() + _limbo.values()

Qualifies? What conditions does a thread have to meet? Don't worry, let me explain:

From Thread Creation to the Conditions for Being Alive

In Python's thread model, threads are real native threads, even though the GIL gets in their way.

Python simply adds one more layer of wrapping, t_bootstrap, and runs the real target function inside that wrapper.

Inside the threading module we can see a similar wrapper:

# /usr/lib/python2.7/threading.py

class Thread(_Verbose):
    def start(self):
        # ... (omitted)
        with _active_limbo_lock:
            _limbo[self] = self             # key step: registered in _limbo
        try:
            _start_new_thread(self.__bootstrap, ())
        except Exception:
            with _active_limbo_lock:
                del _limbo[self]            # key step: removed if starting fails
            raise
        self.__started.wait()

    def __bootstrap(self):
        try:
            self.__bootstrap_inner()
        except:
            if self.__daemonic and _sys is None:
                return
            raise

    def __bootstrap_inner(self):
        try:
            # ... (omitted)
            with _active_limbo_lock:
                _active[self.__ident] = self # key step: now counted as active
                del _limbo[self]             # key step: moved out of _limbo
            # ... (omitted)

In the code above, every change to _limbo and _active has been marked with a comment, which gives us the following definitions:

    _limbo: Thread objects whose start() has been called but whose new thread has not yet registered itself in __bootstrap_inner
    _active: Thread objects that are currently alive
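
These definitions can be observed from the outside through the public threading.enumerate(), without touching the private dictionaries. The following is a small Python 3 sketch; the function name enumerate_snapshot and the thread names are invented for the demonstration.

```python
import threading


def enumerate_snapshot():
    """Return the names reported by threading.enumerate() while three sample
    threads are in different states: never started, running, and finished."""
    release = threading.Event()

    not_started = threading.Thread(target=release.wait, name="not-started")
    running = threading.Thread(target=release.wait, name="running", daemon=True)
    finished = threading.Thread(target=lambda: None, name="finished")

    running.start()
    finished.start()
    finished.join()            # make sure this one has really terminated
    # not_started.start() is deliberately never called

    names = {t.name for t in threading.enumerate()}

    release.set()              # clean up: let the running thread finish
    running.join()
    return names


if __name__ == "__main__":
    names = enumerate_snapshot()
    for name in ("not-started", "running", "finished"):
        print(name, "listed:", name in names)
    print("main thread listed:", threading.main_thread().name in names)
```

Note that the running thread is listed even though it is a daemon thread: enumerate() does not filter on the daemon flag. That filtering happens later, in _pickSomeNonDaemonThread().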

So, back to where we were: when _MainThread()._exitfunc runs, it looks through all the Thread objects in _limbo and _active for the whole process.

As long as one of them is still alive and is not a daemon thread, join() is called on it, and that is what causes the blocking.

setDaemon Uses

Blocking forever is not acceptable, but it would not be right either for the interpreter to take it upon itself to kill the user's threads. So what would be a more elegant solution?

The answer is to give users a way to flag a thread as one that should go away together with the process: setDaemon.

class Thread():
    # ... (omitted)
    def setDaemon(self, daemonic):
        self.daemon = daemonic

    # ... (omitted)

# Already shown above; repeated here for convenience
def _pickSomeNonDaemonThread():
    for t in enumerate():
        if not t.daemon and t.is_alive():
            return t
    return None

Once setDaemon(True) has been set on every thread, nothing is left for the main thread to wait for. As soon as it is ready to exit, the process ends, and the remaining threads are destroyed and reclaimed by the operating system.
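
One note for readers on Python 3: setDaemon() still works there but has been deprecated since Python 3.10 in favour of the daemon attribute, which can also be passed to the Thread constructor. The flag has to be set before start(); changing it afterwards raises RuntimeError. A short sketch, where the worker function and variable names are placeholders of my own:

```python
import threading


def worker(stop):
    """Placeholder worker: just waits until it is told to stop."""
    stop.wait()


stop = threading.Event()

# Option 1: pass daemon=True to the constructor.
t1 = threading.Thread(target=worker, args=(stop,), daemon=True)

# Option 2: assign the attribute before calling start().
t2 = threading.Thread(target=worker, args=(stop,))
t2.daemon = True

t1.start()
t2.start()

# Once the thread has been started, the flag can no longer be changed.
try:
    t1.daemon = False
    changed_after_start = True
except RuntimeError:
    changed_after_start = False

print("t1.daemon:", t1.daemon)
print("t2.daemon:", t2.daemon)
print("changed after start:", changed_after_start)

# Clean up so the demo itself does not leave threads behind.
stop.set()
t1.join()
t2.join()
```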

I had always wondered why pthread has no daemon attribute while Python does.

It turns out that the attribute only means something at the Python layer (wry smile).

Epilogue

A single setDaemon call can open the door to a lot of fundamentals, such as how threads are created and how they are managed.

These are interesting things. We should explore them boldly instead of contenting ourselves with merely using them.

Please indicate the source for reprinting: https://segmentfault.com/a/11...
