Python 3 Chapter 9: metaprogramming

Posted by abda53 on Sat, 22 Jan 2022 14:13:32 +0100

Chapter 9: metaprogramming

One of the most classic mantras in software development is "don't repeat yourself." Whenever your program contains highly repetitive (or cut-and-pasted) code, it's worth asking whether there is a better solution. In Python, such problems can often be solved through metaprogramming. In a nutshell, metaprogramming means creating functions and classes whose main purpose is to manipulate code, for example by modifying, generating, or wrapping existing code. The main techniques are decorators, class decorators, and metaclasses, but there are others, including signature objects, executing code with exec(), and reflection on the internals of functions and classes. The main purpose of this chapter is to introduce these metaprogramming techniques and show, with examples, how they can customize the behavior of your source code.

9.1 adding wrappers to functions

problem

You want to put a wrapper around a function that performs extra processing (such as logging or timing).

Solution

If you want to wrap a function with additional code, you can define a decorator function, for example:

import time
from functools import wraps

def timethis(func):
    '''
    Decorator that reports the execution time.
    '''
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()
        print(func.__name__, end-start)
        return result
    return wrapper

The following is an example of using a decorator:

>>> @timethis
... def countdown(n):
...     '''
...     Counts down
...     '''
...     while n > 0:
...         n -= 1
...
>>> countdown(100000)
countdown 0.008917808532714844
>>> countdown(10000000)
countdown 0.87188299392912
>>>

discuss

A decorator is a function that accepts a function as input and returns a new function as output. Whenever you write code like this:

@timethis
def countdown(n):
    pass

The effect is the same as the following:

def countdown(n):
    pass
countdown = timethis(countdown)

Incidentally, built-in decorators such as @staticmethod, @classmethod, and @property work the same way. For example, the following two code fragments are equivalent:

class A:
    @classmethod
    def method(cls):
        pass

class B:
    # Equivalent definition of a class method
    def method(cls):
        pass
    method = classmethod(method)

In the solution above, the decorator defines an inner wrapper() function that accepts any arguments using *args and **kwargs. Inside this function, the original function is invoked and its result returned, but additional code (here, timing) can be placed around the call. The newly created wrapper function is then returned as the result, taking the place of the original function.

It should be emphasized that the decorator does not change the calling signature or return value of the original function. Using *args and **kwargs ensures that any input arguments can be accepted, and the return value is simply the result of calling the original function, func(*args, **kwargs), where func is the original function.

When first learning about decorators, it is easiest to start with simple examples such as the one shown above. However, in real code there are important details to watch for. For instance, the use of @wraps(func) on the wrapper is essential: it preserves the original function's metadata (discussed in the next recipe), a detail that newcomers often overlook. The next few recipes explore decorators in more depth, so look at them carefully if you want to construct your own decorator functions.

9.2 preserve function meta information when creating decorators

problem

You've written a decorator and applied it to a function, but the function's important metadata, such as its name, docstring, annotations, and parameter signature, has been lost.

Solution

Whenever you define a decorator, you should apply the @wraps decorator from the functools library to the underlying wrapper function. For example:

import time
from functools import wraps
def timethis(func):
    '''
    Decorator that reports the execution time.
    '''
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()
        print(func.__name__, end-start)
        return result
    return wrapper

Let's use this wrapped function and check its meta information:

>>> @timethis
... def countdown(n:int):
...     '''
...     Counts down
...     '''
...     while n > 0:
...         n -= 1
...
>>> countdown(100000)
countdown 0.008917808532714844
>>> countdown.__name__
'countdown'
>>> countdown.__doc__
'\n\tCounts down\n\t'
>>> countdown.__annotations__
{'n': <class 'int'>}
>>>

discuss

Copying decorator metadata is an important part of writing decorators. If you forget to use @wraps, the decorated function loses all sorts of useful information. For instance, with @wraps omitted, the metadata looks like this:

>>> countdown.__name__
'wrapper'
>>> countdown.__doc__
>>> countdown.__annotations__
{}
>>>

An important feature of @wraps is that it makes the wrapped function available through the __wrapped__ attribute. For example:

>>> countdown.__wrapped__(100000)
>>>

The __wrapped__ attribute also makes decorated functions properly expose the underlying signature of the wrapped function. For example:

>>> from inspect import signature
>>> print(signature(countdown))
(n:int)
>>>

One common question is how to make a decorator directly copy the calling signature of the original function. Implementing that by hand involves significant effort; it is best to simply use @wraps and rely on the underlying __wrapped__ attribute to carry the signature. See section 9.16 for more on signatures.
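To see exactly what @wraps preserves, here is a minimal sketch with a do-nothing decorator (the names passthrough and greet are purely illustrative):

```python
from functools import wraps

def passthrough(func):
    # A do-nothing decorator; @wraps copies the wrapped function's
    # metadata (__name__, __doc__, etc.) onto wrapper and records
    # the original function in __wrapped__.
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@passthrough
def greet(name):
    'Return a greeting.'
    return 'Hello, ' + name

print(greet.__name__)              # greet (not 'wrapper')
print(greet.__doc__)               # Return a greeting.
print(greet.__wrapped__('World'))  # Hello, World (bypasses the wrapper)
```

Without the @wraps line, greet.__name__ would report 'wrapper' and the docstring would be lost.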

9.3 unwrapping a decorator

problem

A decorator has been applied to a function, but you want to undo it and gain access to the original unwrapped function.

Solution

Assuming that the decorator was implemented using @wraps (see section 9.2), you can get at the original function through the __wrapped__ attribute:

>>> @somedecorator
... def add(x, y):
...     return x + y
...
>>> orig_add = add.__wrapped__
>>> orig_add(3, 4)
7
>>>

discuss

Direct access to the unwrapped function can be useful for debugging, introspection, and other operations involving functions. However, this recipe only works if the wrapper properly uses @wraps or sets the __wrapped__ attribute directly.

If multiple decorators have been applied, the behavior of accessing __wrapped__ is undefined and should probably be avoided. In Python 3.3, it bypasses all of the layers. For example, suppose you have the following code:

from functools import wraps

def decorator1(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        print('Decorator 1')
        return func(*args, **kwargs)
    return wrapper

def decorator2(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        print('Decorator 2')
        return func(*args, **kwargs)
    return wrapper

@decorator1
@decorator2
def add(x, y):
    return x + y

Here is what happens under Python 3.3:

>>> add(2, 3)
Decorator 1
Decorator 2
5
>>> add.__wrapped__(2, 3)
5
>>>

And here is what happens under Python 3.4:

>>> add(2, 3)
Decorator 1
Decorator 2
5
>>> add.__wrapped__(2, 3)
Decorator 2
5
>>>

Finally, bear in mind that not all decorators use @wraps, so this recipe won't always apply. In particular, the built-in decorators @staticmethod and @classmethod don't follow this convention (they store the original function in the __func__ attribute instead).
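As a sketch of that last point, the underlying function of a class or static method can be recovered through __func__ (the class and method names here are only for illustration):

```python
class C:
    @classmethod
    def cmeth(cls):
        return 'classmethod'

    @staticmethod
    def smeth():
        return 'staticmethod'

# Go through the class __dict__ to get the raw descriptor objects
# (avoiding the normal attribute lookup machinery), then pull out
# the underlying function via __func__:
orig_cmeth = C.__dict__['cmeth'].__func__
orig_smeth = C.__dict__['smeth'].__func__
print(orig_cmeth(C))   # classmethod
print(orig_smeth())    # staticmethod
```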

9.4 define a decorator with parameters

problem

You want to write a decorator that takes arguments.

Solution

Let's illustrate the process of accepting arguments with an example. Suppose you want to write a decorator that adds logging to a function, but allows the user to specify the logging level and other options as arguments. Here is the definition of the decorator along with example usage:

from functools import wraps
import logging

def logged(level, name=None, message=None):
    """
    Add logging to a function. level is the logging
    level, name is the logger name, and message is the
    log message. If name and message aren't specified,
    they default to the function's module and name.
    """
    def decorate(func):
        logname = name if name else func.__module__
        log = logging.getLogger(logname)
        logmsg = message if message else func.__name__

        @wraps(func)
        def wrapper(*args, **kwargs):
            log.log(level, logmsg)
            return func(*args, **kwargs)
        return wrapper
    return decorate

# Example use
@logged(logging.DEBUG)
def add(x, y):
    return x + y

@logged(logging.CRITICAL, 'example')
def spam():
    print('Spam!')

At first glance, this implementation looks tricky, but the idea is simple. The outermost function, logged(), accepts the desired arguments and makes them available to the inner decorator function. The inner function, decorate(), accepts a function and wraps it. The key point is that the wrapper can use the arguments passed to logged().

discuss

Defining a wrapper that accepts parameters looks complex, mainly because of the underlying call sequence. In particular, if you have the following code:

@decorator(x, y, z)
def func(a, b):
    pass

The decoration process is equivalent to the following call:

def func(a, b):
    pass
func = decorator(x, y, z)(func)

The result of decorator(x, y, z) must be a callable that, in turn, takes a function as input and wraps it. See section 9.7 for another example of a decorator that takes arguments.
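The calling convention can be seen in a minimal sketch (the repeat/record names are hypothetical, not from the recipe above):

```python
from functools import wraps

def repeat(n):
    # Decorator factory: the outer repeat(n) takes the arguments;
    # the inner decorate(func) does the actual wrapping.
    def decorate(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            result = None
            for _ in range(n):
                result = func(*args, **kwargs)
            return result
        return wrapper
    return decorate

calls = []

@repeat(3)
def record(msg):
    calls.append(msg)

record('hi')
print(calls)   # ['hi', 'hi', 'hi']
```

The @repeat(3) line is equivalent to writing record = repeat(3)(record): repeat(3) is called first, and the callable it returns is then applied to the function.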

9.5 decorator with customizable attributes

problem

You want to write a decorator to wrap a function and allow the user to provide parameters to control the behavior of the decorator at run time.

Solution

Introduce accessor functions that change internal variables through nonlocal declarations, then attach the accessor functions to the wrapper function as attributes:

from functools import wraps, partial
import logging
# Utility decorator to attach a function as an attribute of obj
def attach_wrapper(obj, func=None):
    if func is None:
        return partial(attach_wrapper, obj)
    setattr(obj, func.__name__, func)
    return func

def logged(level, name=None, message=None):
    '''
    Add logging to a function. level is the logging
    level, name is the logger name, and message is the
    log message. If name and message aren't specified,
    they default to the function's module and name.
    '''
    def decorate(func):
        logname = name if name else func.__module__
        log = logging.getLogger(logname)
        logmsg = message if message else func.__name__

        @wraps(func)
        def wrapper(*args, **kwargs):
            log.log(level, logmsg)
            return func(*args, **kwargs)

        # Attach setter functions
        @attach_wrapper(wrapper)
        def set_level(newlevel):
            nonlocal level
            level = newlevel

        @attach_wrapper(wrapper)
        def set_message(newmsg):
            nonlocal logmsg
            logmsg = newmsg

        return wrapper

    return decorate

# Example use
@logged(logging.DEBUG)
def add(x, y):
    return x + y

@logged(logging.CRITICAL, 'example')
def spam():
    print('Spam!')

The following is an example of use in an interactive environment:

>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)
>>> add(2, 3)
DEBUG:__main__:add
5
>>> # Change the log message
>>> add.set_message('Add called')
>>> add(2, 3)
DEBUG:__main__:Add called
5
>>> # Change the log level
>>> add.set_level(logging.WARNING)
>>> add(2, 3)
WARNING:__main__:Add called
5
>>>

discuss

The key to this recipe lies in the accessor functions, such as set_message() and set_level(), which are attached to the wrapper as attributes. Each accessor allows internal parameters to be adjusted through nonlocal assignments.

An astonishing aspect of this recipe is that the accessor functions propagate through multiple levels of decoration (as long as each decorator uses @functools.wraps). For example, suppose you introduce an additional decorator, such as @timethis from section 9.2, like this:

@timethis
@logged(logging.DEBUG)
def countdown(n):
    while n > 0:
        n -= 1

You'll find that the accessor functions still work:

>>> countdown(10000000)
DEBUG:__main__:countdown
countdown 0.8198461532592773
>>> countdown.set_level(logging.WARNING)
>>> countdown.set_message("Counting down to zero")
>>> countdown(10000000)
WARNING:__main__:Counting down to zero
countdown 0.8225970268249512
>>>

You'll also find that it still works even if the decorators are composed in the opposite order, like this:

@logged(logging.DEBUG)
@timethis
def countdown(n):
    while n > 0:
        n -= 1

You can also make the accessor functions return internal state, either as attached functions or with lambda expressions:

@attach_wrapper(wrapper)
def get_level():
    return level

# Alternative
wrapper.get_level = lambda: level

One tricky issue is whether accessor functions are needed at all. For example, you might consider an alternative based entirely on direct access to function attributes, like this:

@wraps(func)
def wrapper(*args, **kwargs):
    wrapper.log.log(wrapper.level, wrapper.logmsg)
    return func(*args, **kwargs)

# Attach adjustable attributes
wrapper.level = level
wrapper.logmsg = logmsg
wrapper.log = log

This approach would work, but only if it is the topmost decorator. If another decorator is applied on top (such as the @timethis example above), it shadows the underlying attributes, and modifying them has no effect. The use of accessor functions avoids this limitation.

Finally, the solution shown in this recipe can serve as an alternative to decorator classes, as discussed in section 9.9.

9.6 decorator with optional parameters

problem

If you want to write a decorator, you can either pass no parameters to it, such as @ decorator, or you can pass optional parameters to it, such as @ decorator(x,y,z).

Solution

Here is a modified version of the logging decorator from section 9.5:

from functools import wraps, partial
import logging

def logged(func=None, *, level=logging.DEBUG, name=None, message=None):
    if func is None:
        return partial(logged, level=level, name=name, message=message)

    logname = name if name else func.__module__
    log = logging.getLogger(logname)
    logmsg = message if message else func.__name__

    @wraps(func)
    def wrapper(*args, **kwargs):
        log.log(level, logmsg)
        return func(*args, **kwargs)

    return wrapper

# Example use
@logged
def add(x, y):
    return x + y

@logged(level=logging.CRITICAL, name='example')
def spam():
    print('Spam!')

As you can see, the @logged decorator can be used both with and without arguments.

discuss

The problem addressed here is really one of programming consistency. When using decorators, most programmers are accustomed to applying them either without any arguments or with exact arguments. Technically, a decorator whose arguments are all optional could be applied like this:

@logged()
def add(x, y):
    return x+y

However, this form doesn't match the usual convention, and forgetting to add the trailing parentheses is a common source of errors. The solution shows how the decorator can be applied both with and without the parentheses in a consistent style.

To understand how the code works, you need to be very familiar with how decorators work on functions and their call rules. For a simple decorator like the following:

# Example use
@logged
def add(x, y):
    return x + y

This call sequence is equivalent to the following:

def add(x, y):
    return x + y

add = logged(add)

At this time, the decorated function will be directly passed to the logged decorator as the first parameter. Therefore, the first parameter in logged() is the wrapped function itself. All other parameters must have default values.

For a decorator with the following parameters:

@logged(level=logging.CRITICAL, name='example')
def spam():
    print('Spam!')

The call sequence is equivalent to the following:

def spam():
    print('Spam!')
spam = logged(level=logging.CRITICAL, name='example')(spam)

On this initial invocation of logged(), the function to be wrapped is not passed. Thus, in the decorator it must be optional, which in turn forces the other arguments to be specified by keyword. Furthermore, when those arguments have been passed, the decorator must return a callable that accepts a function and wraps it (see section 9.5). To do this, the solution uses a clever trick involving functools.partial(): it returns a partially-applied version of logged() in which all the arguments except the function to be wrapped are fixed. See section 7.8 for more about partial().
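The partial() trick can be isolated in a stripped-down sketch (the level attribute here merely stands in for the real wrapping logic and is not part of the recipe above):

```python
from functools import partial

def tagged(func=None, *, level=10):
    if func is None:
        # Called as @tagged(...): return a partially-applied tagged()
        # in which everything except the function is already fixed.
        return partial(tagged, level=level)
    func.level = level   # stand-in for real wrapping, for illustration
    return func

@tagged
def a():
    pass

@tagged(level=50)
def b():
    pass

print(a.level, b.level)   # 10 50
```

In the first case tagged() receives the function directly; in the second it receives only keyword arguments, hands back the partial object, and that object is then called with the function.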

9.7 using decorators to force type checking on functions

problem

As a programming convention, you want to force type checking on function parameters.

Solution

Before presenting the actual code, let's state the goal: a way to assert types on function arguments, like this:

>>> @typeassert(int, int)
... def add(x, y):
...     return x + y
...
>>>
>>> add(2, 3)
5
>>> add(2, 'hello')
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "contract.py", line 33, in wrapper
TypeError: Argument y must be <class 'int'>
>>>

Here is an implementation of @typeassert using decorators:

from inspect import signature
from functools import wraps

def typeassert(*ty_args, **ty_kwargs):
    def decorate(func):
        # If in optimized mode, disable type checking
        if not __debug__:
            return func

        # Map function argument names to supplied types
        sig = signature(func)
        bound_types = sig.bind_partial(*ty_args, **ty_kwargs).arguments

        @wraps(func)
        def wrapper(*args, **kwargs):
            bound_values = sig.bind(*args, **kwargs)
            # Enforce type assertions across supplied arguments
            for name, value in bound_values.arguments.items():
                if name in bound_types:
                    if not isinstance(value, bound_types[name]):
                        raise TypeError(
                            'Argument {} must be {}'.format(name, bound_types[name])
                            )
            return func(*args, **kwargs)
        return wrapper
    return decorate

As you can see, the decorator is quite flexible: types can be specified for all of the arguments or only a subset, and they can be given either positionally or by keyword. Here is an example:

>>> @typeassert(int, z=int)
... def spam(x, y, z=42):
...     print(x, y, z)
...
>>> spam(1, 2, 3)
1 2 3
>>> spam(1, 'hello', 3)
1 hello 3
>>> spam(1, 'hello', 'world')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "contract.py", line 33, in wrapper
TypeError: Argument z must be <class 'int'>
>>>

discuss

This recipe is an example of a fairly advanced decorator and introduces a number of important concepts.

First, a decorator is applied only once, at the time a function is defined. In some cases you may want to disable the functionality added by a decorator, and then you can simply have it return the function unwrapped. In the following fragment, if the global variable __debug__ is False (as it is when Python runs in optimized mode with the -O or -OO options), the unmodified function is returned directly:

def decorate(func):
    # If in optimized mode, disable type checking
    if not __debug__:
        return func

Next, the solution examines the calling signature of the wrapped function using inspect.signature(). In short, it allows you to extract signature information from a callable. For example:

>>> from inspect import signature
>>> def spam(x, y, z=42):
...     pass
...
>>> sig = signature(spam)
>>> print(sig)
(x, y, z=42)
>>> sig.parameters
mappingproxy(OrderedDict([('x', <Parameter at 0x10077a050 'x'>),
('y', <Parameter at 0x10077a158 'y'>), ('z', <Parameter at 0x10077a1b0 'z'>)]))
>>> sig.parameters['z'].name
'z'
>>> sig.parameters['z'].default
42
>>> sig.parameters['z'].kind
<_ParameterKind: 'POSITIONAL_OR_KEYWORD'>
>>>

In the first part of the decorator, the bind_partial() method is used to perform a partial binding of the supplied types to argument names. Here is an example:

>>> bound_types = sig.bind_partial(int,z=int)
>>> bound_types
<inspect.BoundArguments object at 0x10069bb50>
>>> bound_types.arguments
OrderedDict([('x', <class 'int'>), ('z', <class 'int'>)])
>>>

In this partial binding, notice that missing arguments are simply ignored (the argument y is not bound). The most important part, however, is the creation of the ordered dictionary bound_types.arguments. This dictionary maps argument names to the supplied type values in the same order as the function signature. In the decorator, this mapping holds the type assertions to enforce.

In the actual wrapper function made by the decorator, sig.bind() is used. bind() is like bind_partial() except that it does not allow for missing arguments. So the result looks like this:

>>> bound_values = sig.bind(1, 2, 3)
>>> bound_values.arguments
OrderedDict([('x', 1), ('y', 2), ('z', 3)])
>>>

Using this mapping, we can easily implement our mandatory type checking:

>>> for name, value in bound_values.arguments.items():
...     if name in bound_types.arguments:
...         if not isinstance(value, bound_types.arguments[name]):
...             raise TypeError()
...
>>>

However, this scheme has a subtle flaw: it does not check arguments that take their default values. For example, the following code works even though the default value of items is of the "wrong" type:

>>> @typeassert(int, list)
... def bar(x, items=None):
...     if items is None:
...         items = []
...     items.append(x)
...     return items
>>> bar(2)
[2]
>>> bar(2,3)
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "contract.py", line 33, in wrapper
TypeError: Argument items must be <class 'list'>
>>> bar(4, [1, 2, 3])
[1, 2, 3, 4]
>>>

A final point of design discussion is the choice between decorator arguments and function annotations. For instance, why not write the decorator to look at annotations, like this?

@typeassert
def spam(x:int, y, z:int = 42):
    print(x,y,z)

One possible reason is versatility: if annotations were used for type assertions, they could not be used for anything else, and @typeassert would stop working with functions that use annotations for a different purpose. Accepting the types as decorator arguments keeps the decorator more general.
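For comparison, here is a hedged sketch of what such an annotation-based checker might look like (the name typeassert_annotated is hypothetical, not part of the recipe):

```python
from inspect import signature
from functools import wraps

def typeassert_annotated(func):
    # Hypothetical variant that reads the expected types from the
    # function's annotations instead of decorator arguments.
    sig = signature(func)

    @wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            ann = func.__annotations__.get(name)
            if ann is not None and not isinstance(value, ann):
                raise TypeError(
                    'Argument {} must be {}'.format(name, ann)
                )
        return func(*args, **kwargs)
    return wrapper

@typeassert_annotated
def spam(x: int, y, z: int = 42):
    return (x, y, z)

print(spam(1, 'hello'))   # (1, 'hello', 42)
# spam('oops', 2) would raise TypeError
```

Note that this version monopolizes the annotations, which is exactly the limitation described above.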

More information about function parameter objects can be found in PEP 362 and the inspect module. There is another example in section 9.16.

9.8 defining decorators as part of a class

problem

You want to define decorators in your class and apply them to other functions or methods.

Solution

Defining a decorator inside a class is straightforward, but you first need to sort out how the decorator will be applied: as an instance method or as a class method. The following example illustrates the difference:

from functools import wraps

class A:
    # Decorator as an instance method
    def decorator1(self, func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            print('Decorator 1')
            return func(*args, **kwargs)
        return wrapper

    # Decorator as a class method
    @classmethod
    def decorator2(cls, func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            print('Decorator 2')
            return func(*args, **kwargs)
        return wrapper

The following is an example:

# As an instance method
a = A()
@a.decorator1
def spam():
    pass
# As a class method
@A.decorator2
def grok():
    pass

Look carefully: one decorator is applied from an instance and the other from the class.

discuss

Defining decorators inside a class may look odd at first, but the standard library contains many examples. In particular, the built-in @property decorator is actually a class with three decorator methods: getter(), setter(), and deleter(). For example:

class Person:
    # Create a property instance
    first_name = property()

    # Apply decorator methods
    @first_name.getter
    def first_name(self):
        return self._first_name

    @first_name.setter
    def first_name(self, value):
        if not isinstance(value, str):
            raise TypeError('Expected a string')
        self._first_name = value

The key reason for this arrangement is that the various decorator methods operate on state held by the associated property instance. So whenever you need decorators to record or combine information behind the scenes, this is a workable approach.
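The same idea can be sketched with a small hypothetical class (Registry and its methods are illustrative names, not standard library API) whose method acts as a decorator that records state on the associated instance:

```python
class Registry:
    # Instances hold state; the on() method is a decorator
    # factory that records functions in that state.
    def __init__(self):
        self.handlers = {}

    def on(self, event):
        def decorate(func):
            self.handlers[event] = func
            return func
        return decorate

    def fire(self, event, *args):
        return self.handlers[event](*args)

events = Registry()

@events.on('greet')
def hello(name):
    return 'Hello, ' + name

print(events.fire('greet', 'World'))   # Hello, World
```

Just as with property(), each use of the decorator manipulates the state of the instance it was called on, rather than wrapping the function itself.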

One subtlety of defining decorators in a class is the proper handling of the extra self or cls argument. Although the outermost decorator functions, such as decorator1() or decorator2(), must supply a self or cls argument, the wrapper() functions created inside do not need an extra self argument. The only time this argument is needed is when you actually must access parts of the instance inside the wrapper; otherwise, you don't have to worry about it.

Decorators defined in classes also get tricky when inheritance is involved. For example, suppose you want the decorator defined in class A to be usable in subclass B. You need to write it like this:

class B(A):
    @A.decorator2
    def bar(self):
        pass

In other words, the decorator must be defined as a class method, and you must explicitly use the parent class name A when applying it. You can't use @B.decorator2 because, at the point of the method definition, class B has not been created yet.

9.9 define decorators as classes

problem

You want to wrap functions with a decorator, but the result should be a callable instance, and the decorator needs to work both inside and outside class definitions.

Solution

To define a decorator as a class whose instances wrap functions, you need to make sure it implements the __call__() and __get__() methods. For example, the following code defines a class that puts a simple profiling layer around another function:

import types
from functools import wraps

class Profiled:
    def __init__(self, func):
        wraps(func)(self)
        self.ncalls = 0

    def __call__(self, *args, **kwargs):
        self.ncalls += 1
        return self.__wrapped__(*args, **kwargs)

    def __get__(self, instance, cls):
        if instance is None:
            return self
        else:
            return types.MethodType(self, instance)

You can use it as an ordinary decorator, either inside or outside the class:

@Profiled
def add(x, y):
    return x + y

class Spam:
    @Profiled
    def bar(self, x):
        print(self, x)

Examples of use in interactive environments:

>>> add(2, 3)
5
>>> add(4, 5)
9
>>> add.ncalls
2
>>> s = Spam()
>>> s.bar(1)
<__main__.Spam object at 0x10069e9d0> 1
>>> s.bar(2)
<__main__.Spam object at 0x10069e9d0> 2
>>> s.bar(3)
<__main__.Spam object at 0x10069e9d0> 3
>>> Spam.bar.ncalls
3

discuss

Defining a decorator as a class is usually straightforward, but there are some subtleties worth explaining, especially if you plan to apply it to instance methods.

First, the functools.wraps() function is used, just as before, to copy the metadata of the wrapped function onto the callable instance.

Second, it is easy to overlook the __get__() method shown in the solution. If you omit it but keep all of the other code the same, you'll find that bizarre things happen when you invoke the decorated instance method. For example:

>>> s = Spam()
>>> s.bar(3)
Traceback (most recent call last):
...
TypeError: bar() missing 1 required positional argument: 'x'

The reason it breaks is that whenever functions implementing methods are looked up in a class, their __get__() method is invoked as part of the descriptor protocol, described in section 8.9. In this case, the purpose of __get__() is to create a bound method object (which ultimately supplies the self argument to the method). Here is an example showing the underlying mechanics:

>>> s = Spam()
>>> def grok(self, x):
...     pass
...
>>> grok.__get__(s, Spam)
<bound method Spam.grok of <__main__.Spam object at 0x100671e90>>
>>>

In this recipe, the __get__() method ensures that bound method objects get created properly: types.MethodType() creates a bound method manually, and only when the instance is being used. If the method is accessed on the class itself, the instance argument to __get__() is set to None, and the Profiled instance is returned unchanged. This makes it possible to extract its ncalls attribute, as shown.

If you want to avoid some of this confusion, you might consider an alternative decorator implemented with closures and nonlocal variables, as described in section 9.5. For example:

import types
from functools import wraps

def profiled(func):
    ncalls = 0
    @wraps(func)
    def wrapper(*args, **kwargs):
        nonlocal ncalls
        ncalls += 1
        return func(*args, **kwargs)
    wrapper.ncalls = lambda: ncalls
    return wrapper

# Example
@profiled
def add(x, y):
    return x + y

This example works in almost exactly the same way, except that access to ncalls is now provided through a function attached as an attribute, for example:

>>> add(2, 3)
5
>>> add(4, 5)
9
>>> add.ncalls()
2
>>>

9.10 provide decorators for classes and static methods

problem

You want to provide decorators for classes or static methods.

Solution

Applying decorators to class and static methods is straightforward, but make sure your decorators are applied before (i.e., listed below) @classmethod or @staticmethod. For example:

import time
from functools import wraps

# A simple decorator
def timethis(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        r = func(*args, **kwargs)
        end = time.time()
        print(end-start)
        return r
    return wrapper

# Class illustrating application of the decorator to different kinds of methods
class Spam:
    @timethis
    def instance_method(self, n):
        print(self, n)
        while n > 0:
            n -= 1

    @classmethod
    @timethis
    def class_method(cls, n):
        print(cls, n)
        while n > 0:
            n -= 1

    @staticmethod
    @timethis
    def static_method(n):
        print(n)
        while n > 0:
            n -= 1

The resulting class and static methods work normally, but with the extra timing added:

>>> s = Spam()
>>> s.instance_method(1000000)
<__main__.Spam object at 0x1006a6050> 1000000
0.11817407608032227
>>> Spam.class_method(1000000)
<class '__main__.Spam'> 1000000
0.11334395408630371
>>> Spam.static_method(1000000)
1000000
0.11740279197692871
>>>

discuss

If you get the order of the decorators wrong, you'll get an error. For example, suppose you write this:

class Spam:
    @timethis
    @staticmethod
    def static_method(n):
        print(n)
        while n > 0:
            n -= 1

Then calling the static method raises an error:

>>> Spam.static_method(1000000)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "timethis.py", line 6, in wrapper
start = time.time()
TypeError: 'staticmethod' object is not callable
>>>

The problem is that @classmethod and @staticmethod don't actually create objects that are directly callable; instead, they create special descriptor objects (see section 8.9). So if you try to use them like functions inside another decorator, the decorator crashes. Making sure @classmethod or @staticmethod appears first (outermost) in the decorator stack fixes the problem.

This knowledge is also useful when defining class methods and static methods in abstract base classes (see section 8.12). For example, to define an abstract class method, you can use code like this:

from abc import ABCMeta, abstractmethod
class A(metaclass=ABCMeta):
    @classmethod
    @abstractmethod
    def method(cls):
        pass

In this code, the order of @classmethod and @abstractmethod matters; if you swap them, you will get an error.
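As a quick check on this ordering, here is a sketch (the class names are illustrative) showing that the abstract class method both blocks instantiation of the base class and can be satisfied by a concrete override in a subclass:

```python
from abc import ABCMeta, abstractmethod

class A(metaclass=ABCMeta):
    @classmethod
    @abstractmethod
    def method(cls):
        pass

class B(A):
    @classmethod
    def method(cls):        # concrete override satisfies the ABC
        return 'B.method'

try:
    A()                     # abstract class cannot be instantiated
except TypeError as e:
    print('TypeError:', e)

print(B.method())           # 'B.method'
```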

9.11 using a decorator to add arguments to a wrapped function

problem

You want to use a decorator to add an extra argument to a wrapped function, without changing the function's existing calling conventions.

Solution

Extra arguments can be injected into the wrapped function as keyword-only arguments. Consider the following decorator:

from functools import wraps

def optional_debug(func):
    @wraps(func)
    def wrapper(*args, debug=False, **kwargs):
        if debug:
            print('Calling', func.__name__)
        return func(*args, **kwargs)

    return wrapper

Here is how it works:

>>> @optional_debug
... def spam(a,b,c):
...     print(a,b,c)
...
>>> spam(1,2,3)
1 2 3
>>> spam(1,2,3, debug=True)
Calling spam
1 2 3
>>>

discuss

Adding arguments to wrapped functions through a decorator is not a common practice, but it can sometimes eliminate repetitive code. For example, suppose you have code like this:

def a(x, debug=False):
    if debug:
        print('Calling a')

def b(x, y, z, debug=False):
    if debug:
        print('Calling b')

def c(x, y, debug=False):
    if debug:
        print('Calling c')

Then you can refactor it like this:

from functools import wraps
import inspect

def optional_debug(func):
    # Note: inspect.getargspec() was removed in Python 3.11;
    # inspect.signature() is the modern replacement
    if 'debug' in inspect.signature(func).parameters:
        raise TypeError('debug argument already defined')

    @wraps(func)
    def wrapper(*args, debug=False, **kwargs):
        if debug:
            print('Calling', func.__name__)
        return func(*args, **kwargs)
    return wrapper

@optional_debug
def a(x):
    pass

@optional_debug
def b(x, y, z):
    pass

@optional_debug
def c(x, y):
    pass

This implementation works because keyword-only arguments are easy to add to functions that also accept *args and **kwargs. The keyword-only argument singles debug out as a special case: it is excluded from the remaining positional and keyword arguments when the wrapped function is called. In other words, it never shows up in **kwargs.
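The special-casing of a keyword-only argument can be seen in isolation with a small sketch (the names are illustrative): debug is captured separately and never shows up in **kwargs:

```python
def wrapper(*args, debug=False, **kwargs):
    # debug is peeled off by the keyword-only slot; everything else
    # flows through *args and **kwargs untouched
    return args, debug, kwargs

print(wrapper(1, 2, debug=True, extra=3))   # ((1, 2), True, {'extra': 3})
print(wrapper(1))                           # ((1,), False, {})
```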

One tricky part is handling a name clash between the added argument and the arguments of the function being wrapped. For example, the @optional_debug decorator would fail if applied to a function that already had a debug argument. That is why a checking step was added to the decorator.

The scheme can be polished a bit further, because a sharp-eyed programmer will notice that the signature reported for the wrapped function is wrong. For example:

>>> @optional_debug
... def add(x,y):
...     return x+y
...
>>> import inspect
>>> print(inspect.signature(add))
(x, y)
>>>

This problem can be solved through the following modifications:

from functools import wraps
import inspect

def optional_debug(func):
    # Note: inspect.getargspec() was removed in Python 3.11;
    # inspect.signature() is the modern replacement
    if 'debug' in inspect.signature(func).parameters:
        raise TypeError('debug argument already defined')

    @wraps(func)
    def wrapper(*args, debug=False, **kwargs):
        if debug:
            print('Calling', func.__name__)
        return func(*args, **kwargs)

    sig = inspect.signature(func)
    parms = list(sig.parameters.values())
    parms.append(inspect.Parameter('debug',
                inspect.Parameter.KEYWORD_ONLY,
                default=False))
    wrapper.__signature__ = sig.replace(parameters=parms)
    return wrapper

With this change, the wrapped function's signature correctly reports the presence of the debug argument. For example:

>>> @optional_debug
... def add(x,y):
...     return x+y
...
>>> print(inspect.signature(add))
(x, y, *, debug=False)
>>> add(2,3)
5
>>>

See section 9.16 for more information on function signatures.

9.12 using a decorator to extend the functionality of a class

problem

You want to modify the behavior of a class by inspecting or rewriting part of its definition, but you don't want to use inheritance or a metaclass.

Solution

This might be the perfect use case for a class decorator. For example, here is a class decorator that rewrites the special method __getattribute__ to add logging:

def log_getattribute(cls):
    # Get the original implementation
    orig_getattribute = cls.__getattribute__

    # Make a new definition
    def new_getattribute(self, name):
        print('getting:', name)
        return orig_getattribute(self, name)

    # Attach to the class and return
    cls.__getattribute__ = new_getattribute
    return cls

# Example use
@log_getattribute
class A:
    def __init__(self,x):
        self.x = x
    def spam(self):
        pass

Here are the usage effects:

>>> a = A(42)
>>> a.x
getting: x
42
>>> a.spam()
getting: spam
>>>

discuss

Class decorators can often be used as a concise alternative to more advanced techniques, such as mixins or metaclasses. For example, an alternative implementation of the above example uses inheritance:

class LoggedGetattribute:
    def __getattribute__(self, name):
        print('getting:', name)
        return super().__getattribute__(name)

# Example:
class A(LoggedGetattribute):
    def __init__(self,x):
        self.x = x
    def spam(self):
        pass

This works, but to understand it you need some awareness of the method resolution order, super(), and the other aspects of inheritance described in section 8.7. In some sense, the class decorator solution is more direct, and it doesn't introduce a new level into the inheritance hierarchy. It can also run slightly faster, since it does not rely on the super() function.

If you apply multiple class decorators to a class, the order in which they are listed may matter. For example, a decorator that replaces a method with an entirely new implementation must be applied before a decorator that merely wraps an existing method with extra logic (that is, the replacing decorator should appear closer to the class in the source).
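A sketch of this ordering rule with two hypothetical class decorators, one replacing a method and one wrapping it:

```python
def replace_spam(cls):
    # Completely replaces spam() with a new implementation
    cls.spam = lambda self: 'replaced'
    return cls

def wrap_spam(cls):
    # Wraps the existing spam() with extra logic
    orig = cls.spam
    cls.spam = lambda self: 'wrapped:' + orig(self)
    return cls

@wrap_spam        # applied second: wraps whatever spam() is by then
@replace_spam     # applied first: replaces the original spam()
class A:
    def spam(self):
        return 'original'

print(A().spam())   # 'wrapped:replaced'
```

Listing them the other way around would let replace_spam discard the wrapper that wrap_spam installed.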

Section 8.13 has another useful example of a class decorator.

9.13 use metaclass to control instance creation

problem

You want to implement singleton, cache, or other similar features by changing the way instances are created.

Solution

Python programmers know that if you define a class, you call it like a function to create instances, for example:

class Spam:
    def __init__(self, name):
        self.name = name

a = Spam('Guido')
b = Spam('Diana')

If you want to customize this step, you can do so by defining a metaclass and reimplementing its __call__() method.

To illustrate, suppose you don't want anyone to be able to create instances of a class at all:

class NoInstances(type):
    def __call__(self, *args, **kwargs):
        raise TypeError("Can't instantiate directly")

# Example
class Spam(metaclass=NoInstances):
    @staticmethod
    def grok(x):
        print('Spam.grok')

In this case, users can call the static methods of the class, but it is impossible to create an instance in the usual way. For example:

>>> Spam.grok(42)
Spam.grok
>>> s = Spam()
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "example1.py", line 7, in __call__
        raise TypeError("Can't instantiate directly")
TypeError: Can't instantiate directly
>>>

Now suppose you want to implement the singleton pattern (a class for which only one instance can ever be created). That is also straightforward:

class Singleton(type):
    def __init__(self, *args, **kwargs):
        self.__instance = None
        super().__init__(*args, **kwargs)

    def __call__(self, *args, **kwargs):
        if self.__instance is None:
            self.__instance = super().__call__(*args, **kwargs)
            return self.__instance
        else:
            return self.__instance

# Example
class Spam(metaclass=Singleton):
    def __init__(self):
        print('Creating Spam')

With this, the Spam class only ever creates a single instance, as the following demonstrates:

>>> a = Spam()
Creating Spam
>>> b = Spam()
>>> a is b
True
>>> c = Spam()
>>> a is c
True
>>>

Finally, suppose you want to create cached instances, as described in section 8.25. Here is a metaclass that implements it:

import weakref

class Cached(type):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.__cache = weakref.WeakValueDictionary()

    def __call__(self, *args):
        if args in self.__cache:
            return self.__cache[args]
        else:
            obj = super().__call__(*args)
            self.__cache[args] = obj
            return obj

# Example
class Spam(metaclass=Cached):
    def __init__(self, name):
        print('Creating Spam({!r})'.format(name))
        self.name = name

A quick test:

>>> a = Spam('Guido')
Creating Spam('Guido')
>>> b = Spam('Diana')
Creating Spam('Diana')
>>> c = Spam('Guido') # Cached
>>> a is b
False
>>> a is c # Cached value returned
True
>>>
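To see the weak-reference behavior in action, here is a sketch of the same caching idea (the attribute name _cache is illustrative, not from the recipe) showing that a cached entry disappears once the last ordinary reference to the instance is dropped:

```python
import gc
import weakref

class Cached(type):
    def __init__(cls, *args, **kwargs):
        super().__init__(*args, **kwargs)
        cls._cache = weakref.WeakValueDictionary()

    def __call__(cls, *args):
        if args in cls._cache:
            return cls._cache[args]
        obj = super().__call__(*args)
        cls._cache[args] = obj
        return obj

class Spam(metaclass=Cached):
    def __init__(self, name):
        self.name = name

a = Spam('Guido')
print(('Guido',) in Spam._cache)   # entry held while a live reference exists
del a
gc.collect()                       # on CPython the weak entry is now dropped
print(len(Spam._cache))            # 0
```

Because the cache holds only weak references, it never keeps an otherwise-unreferenced instance alive.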

discuss

Using a metaclass to implement various instance-creation patterns is often far more elegant than the alternatives.

Without a metaclass, you would have to hide the class behind some kind of factory function. For example, to implement a singleton, you might write code like this:

class _Spam:
    def __init__(self):
        print('Creating Spam')

_spam_instance = None

def Spam():
    global _spam_instance

    if _spam_instance is not None:
        return _spam_instance
    else:
        _spam_instance = _Spam()
        return _spam_instance

Although using a metaclass involves more advanced machinery, the resulting code is cleaner, more comfortable, and more intuitive.

For more information about creating cache instances and weak references, please refer to section 8.25.

9.14 attribute definition order of capture class

problem

You want to automatically record the order of attribute and method definitions in a class, and then you can use it to do many operations (such as serialization, mapping to database, etc.).

Solution

Metaclasses make it easy to capture information about a class definition. Here is an example that uses an OrderedDict to record the order in which descriptors are defined:

from collections import OrderedDict

# A set of descriptors for various types
class Typed:
    _expected_type = type(None)
    def __init__(self, name=None):
        self._name = name

    def __set__(self, instance, value):
        if not isinstance(value, self._expected_type):
            raise TypeError('Expected ' + str(self._expected_type))
        instance.__dict__[self._name] = value

class Integer(Typed):
    _expected_type = int

class Float(Typed):
    _expected_type = float

class String(Typed):
    _expected_type = str

# Metaclass that uses an OrderedDict for class body
class OrderedMeta(type):
    def __new__(cls, clsname, bases, clsdict):
        d = dict(clsdict)
        order = []
        for name, value in clsdict.items():
            if isinstance(value, Typed):
                value._name = name
                order.append(name)
        d['_order'] = order
        return type.__new__(cls, clsname, bases, d)

    @classmethod
    def __prepare__(cls, clsname, bases):
        return OrderedDict()

In this metaclass, an OrderedDict captures the order in which descriptors are defined while the class body executes, and the resulting ordered list of names is then extracted from the dictionary and stored in the class attribute _order. Methods of the class can then use it in various ways. For example, here is a simple class that uses the ordering to serialize the data of an instance as a row of CSV data:

class Structure(metaclass=OrderedMeta):
    def as_csv(self):
        return ','.join(str(getattr(self,name)) for name in self._order)

# Example use
class Stock(Structure):
    name = String()
    shares = Integer()
    price = Float()

    def __init__(self, name, shares, price):
        self.name = name
        self.shares = shares
        self.price = price

Testing the Stock class in an interactive session:

>>> s = Stock('GOOG',100,490.1)
>>> s.name
'GOOG'
>>> s.as_csv()
'GOOG,100,490.1'
>>> t = Stock('AAPL','a lot', 610.23)
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "dupmethod.py", line 34, in __init__
TypeError: Expected <class 'int'>
>>>

discuss

A key part of this recipe is the __prepare__() method defined on the OrderedMeta metaclass. This method is invoked immediately at the start of a class definition, with the class name and base classes, and it must return a mapping object to use as the class-body namespace. By returning an OrderedDict instead of an ordinary dictionary, the definition order is easily captured. (Since Python 3.6, class bodies preserve definition order by default, but __prepare__() remains the hook for supplying a custom mapping.)

It is easy to extend this functionality further by constructing your own mapping-like class-dictionary object. For example, here is a variation that rejects duplicate definitions:

from collections import OrderedDict

class NoDupOrderedDict(OrderedDict):
    def __init__(self, clsname):
        self.clsname = clsname
        super().__init__()
    def __setitem__(self, name, value):
        if name in self:
            raise TypeError('{} already defined in {}'.format(name, self.clsname))
        super().__setitem__(name, value)

class OrderedMeta(type):
    def __new__(cls, clsname, bases, clsdict):
        d = dict(clsdict)
        d['_order'] = [name for name in clsdict if name[0] != '_']
        return type.__new__(cls, clsname, bases, d)

    @classmethod
    def __prepare__(cls, clsname, bases):
        return NoDupOrderedDict(clsname)

Here is what happens when a definition is duplicated:

>>> class A(metaclass=OrderedMeta):
...     def spam(self):
...         pass
...     def spam(self):
...         pass
...
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "<stdin>", line 4, in A
    File "dupmethod2.py", line 25, in __setitem__
        (name, self.clsname))
TypeError: spam already defined in A
>>>

A final important point concerns the treatment of the modified dictionary in the metaclass __new__() method. Even though the class was defined using an alternative dictionary, you still have to convert it to a proper dict instance when constructing the final class object. That is the purpose of the statement d = dict(clsdict).

For many applications, being able to capture the order of class definitions is a subtle yet important feature. For example, in an object-relational mapper, classes are typically written in a style like this:

class Stock(Model):
    name = String()
    shares = Integer()
    price = Float()

Underneath the covers, the framework must capture the definition order to map objects to, say, tuples or rows in a database table (similar to what as_csv() does in the example above). The technique shown here is very simple, and often simpler than the usual alternatives (typically, maintaining a hidden counter within the descriptor classes).
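As an aside, on Python 3.6 and later the class body preserves definition order by default and descriptors receive a __set_name__() hook, so a similar effect can be sketched without any metaclass (the names here are illustrative):

```python
class Field:
    # __set_name__ is called automatically when the owning class is created
    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value

class Stock:
    name = Field()
    shares = Field()
    price = Field()

# Class bodies preserve definition order since Python 3.6
order = [k for k, v in vars(Stock).items() if isinstance(v, Field)]
print(order)   # ['name', 'shares', 'price']
```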

9.15 defining metaclasses with optional parameters

problem

You want to define a metaclass that allows you to provide optional parameters during class definition, so that you can control or configure the type creation process.

Solution

When defining a class, Python allows a metaclass to be specified with the metaclass keyword argument. For example, with an abstract base class:

from abc import ABCMeta, abstractmethod
class IStream(metaclass=ABCMeta):
    @abstractmethod
    def read(self, maxsize=None):
        pass

    @abstractmethod
    def write(self, data):
        pass

However, in a custom metaclass, additional keyword arguments can also be supplied, like this:

class Spam(metaclass=MyMeta, debug=True, synchronize=True):
    pass

To support such keyword arguments in a metaclass, you must make sure you declare them as keyword-only arguments in the __prepare__(), __new__(), and __init__() methods, like this:

class MyMeta(type):
    # Optional
    @classmethod
    def __prepare__(cls, name, bases, *, debug=False, synchronize=False):
        # Custom processing
        pass
        return super().__prepare__(name, bases)

    # Required
    def __new__(cls, name, bases, ns, *, debug=False, synchronize=False):
        # Custom processing
        pass
        return super().__new__(cls, name, bases, ns)

    # Required
    def __init__(self, name, bases, ns, *, debug=False, synchronize=False):
        # Custom processing
        pass
        super().__init__(name, bases, ns)

discuss

Adding optional keyword arguments to a metaclass requires that you understand all of the steps involved in class creation, because the extra arguments are passed to every method involved. The __prepare__() method is called first, before the body of any class definition is executed, and is used to create the class namespace; normally, this method simply returns a dictionary or other mapping object. The __new__() method is used to instantiate the resulting class object; it is called after the class body has been fully executed. The __init__() method is called last, to perform any additional initialization steps.
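The order of these three steps can be verified with a small tracing metaclass (a sketch; the names are illustrative):

```python
calls = []

class TracingMeta(type):
    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        calls.append('__prepare__')     # runs before the class body executes
        return super().__prepare__(name, bases, **kwargs)

    def __new__(mcs, name, bases, ns, **kwargs):
        calls.append('__new__')         # runs after the class body executes
        return super().__new__(mcs, name, bases, ns)

    def __init__(cls, name, bases, ns, **kwargs):
        calls.append('__init__')        # runs last, on the created class
        super().__init__(name, bases, ns)

class Demo(metaclass=TracingMeta):
    pass

print(calls)   # ['__prepare__', '__new__', '__init__']
```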

When writing metaclasses, it is common to define only a __new__() or __init__() method, but not both. However, if extra keyword arguments are to be accepted, both methods must be provided, with compatible argument signatures. The default __prepare__() method accepts arbitrary keyword arguments and ignores them, so you only need to define it yourself when the extra arguments would affect creation of the class namespace.

The use of keyword-only arguments means that these arguments must be specified by keyword during class creation.

Passing keyword arguments to configure a metaclass can also be viewed as an alternative to using class variables. For example:

class Spam(metaclass=MyMeta):
    debug = True
    synchronize = True
    pass

The advantage of supplying such attributes as arguments is that they don't pollute the class namespace: they pertain only to the creation of the class, not the subsequent execution of statements in the class body. In addition, they are available to the __prepare__() method, which runs before the class body is executed, whereas class variables only become visible in the metaclass's __new__() and __init__() methods.

9.16 enforcing an argument signature on *args and **kwargs

problem

You have a function or method that uses *args and **kwargs to make it general purpose, but you also want to check that the passed arguments match a specific signature.

Solution

For any problem involving the manipulation of function calling signatures, you should use the signature features of the inspect module. Two classes, Signature and Parameter, are of particular interest. Here is an interactive example of creating a function signature:

>>> from inspect import Signature, Parameter
>>> # Make a signature for a func(x, y=42, *, z=None)
>>> parms = [ Parameter('x', Parameter.POSITIONAL_OR_KEYWORD),
...         Parameter('y', Parameter.POSITIONAL_OR_KEYWORD, default=42),
...         Parameter('z', Parameter.KEYWORD_ONLY, default=None) ]
>>> sig = Signature(parms)
>>> print(sig)
(x, y=42, *, z=None)
>>>

Once you have a signature object, you can easily bind it to *args and **kwargs using its bind() method. Here is a simple demonstration:

>>> def func(*args, **kwargs):
...     bound_values = sig.bind(*args, **kwargs)
...     for name, value in bound_values.arguments.items():
...         print(name,value)
...
>>> # Try various examples
>>> func(1, 2, z=3)
x 1
y 2
z 3
>>> func(1)
x 1
>>> func(1, z=3)
x 1
z 3
>>> func(y=2, x=1)
x 1
y 2
>>> func(1, 2, 3, 4)
Traceback (most recent call last):
...
    File "/usr/local/lib/python3.3/inspect.py", line 1972, in _bind
        raise TypeError('too many positional arguments')
TypeError: too many positional arguments
>>> func(y=2)
Traceback (most recent call last):
...
    File "/usr/local/lib/python3.3/inspect.py", line 1961, in _bind
        raise TypeError(msg) from None
TypeError: 'x' parameter lacking default value
>>> func(1, y=2, x=3)
Traceback (most recent call last):
...
    File "/usr/local/lib/python3.3/inspect.py", line 1985, in _bind
        '{arg!r}'.format(arg=param.name))
TypeError: multiple values for argument 'x'
>>>

As you can see, binding a signature to the passed arguments enforces all of the usual function-calling rules concerning required arguments, defaults, duplicates, and so forth.
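Besides bind(), signature objects also offer bind_partial() and BoundArguments.apply_defaults(), which are handy when not all arguments are available up front. A short sketch:

```python
from inspect import Signature, Parameter

sig = Signature([
    Parameter('x', Parameter.POSITIONAL_OR_KEYWORD),
    Parameter('y', Parameter.POSITIONAL_OR_KEYWORD, default=42),
])

# bind() requires all mandatory arguments; apply_defaults() fills the rest
full = sig.bind(1)
full.apply_defaults()
print(dict(full.arguments))     # {'x': 1, 'y': 42}

# bind_partial() tolerates missing arguments without raising
partial = sig.bind_partial()
print(dict(partial.arguments))  # {}
```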

Here is a more concrete example of enforcing a function signature. In this code, a base class defines an extremely general-purpose __init__() method, and subclasses are then forced to supply an expected signature.

from inspect import Signature, Parameter

def make_sig(*names):
    parms = [Parameter(name, Parameter.POSITIONAL_OR_KEYWORD)
            for name in names]
    return Signature(parms)

class Structure:
    __signature__ = make_sig()
    def __init__(self, *args, **kwargs):
        bound_values = self.__signature__.bind(*args, **kwargs)
        for name, value in bound_values.arguments.items():
            setattr(self, name, value)

# Example use
class Stock(Structure):
    __signature__ = make_sig('name', 'shares', 'price')

class Point(Structure):
    __signature__ = make_sig('x', 'y')

The following is an example of using this Stock class:

>>> import inspect
>>> print(inspect.signature(Stock))
(name, shares, price)
>>> s1 = Stock('ACME', 100, 490.1)
>>> s2 = Stock('ACME', 100)
Traceback (most recent call last):
...
TypeError: 'price' parameter lacking default value
>>> s3 = Stock('ACME', 100, 490.1, shares=50)
Traceback (most recent call last):
...
TypeError: multiple values for argument 'shares'
>>>

discuss

Uses of *args and **kwargs are very common when building general-purpose libraries, writing decorators, or implementing proxies. However, one downside of such functions is that implementing your own argument checking quickly becomes an unwieldy mess (section 8.11 has an example). A signature object simplifies this.

In the last example, the signature objects could also be created through the use of a custom metaclass. Here is how to do it:

from inspect import Signature, Parameter

def make_sig(*names):
    parms = [Parameter(name, Parameter.POSITIONAL_OR_KEYWORD)
            for name in names]
    return Signature(parms)

class StructureMeta(type):
    def __new__(cls, clsname, bases, clsdict):
        clsdict['__signature__'] = make_sig(*clsdict.get('_fields',[]))
        return super().__new__(cls, clsname, bases, clsdict)

class Structure(metaclass=StructureMeta):
    _fields = []
    def __init__(self, *args, **kwargs):
        bound_values = self.__signature__.bind(*args, **kwargs)
        for name, value in bound_values.arguments.items():
            setattr(self, name, value)

# Example
class Stock(Structure):
    _fields = ['name', 'shares', 'price']

class Point(Structure):
    _fields = ['x', 'y']

When defining a custom signature, it is often useful to store it in the special attribute __signature__, as shown. If you do, code that performs introspection with the inspect module will see the signature and report it as the calling convention:

>>> import inspect
>>> print(inspect.signature(Stock))
(name, shares, price)
>>> print(inspect.signature(Point))
(x, y)
>>>

9.17 enforcing coding conventions in classes

problem

Your program consists of a large class hierarchy, and you would like to enforce certain kinds of coding conventions (or perform diagnostics) to help keep programmers sane.

Solution

If you want to monitor the definition of classes, you can usually do it by defining a metaclass. A basic metaclass is usually defined by inheriting from type and redefining its __new__() or __init__() method. For example:

class MyMeta(type):
    def __new__(cls, clsname, bases, clsdict):
        # clsname is name of class being defined
        # bases is tuple of base classes
        # clsdict is class dictionary
        return super().__new__(cls, clsname, bases, clsdict)

Alternatively, you can define an __init__() method:

class MyMeta(type):
    def __init__(self, clsname, bases, clsdict):
        super().__init__(clsname, bases, clsdict)
        # clsname is name of class being defined
        # bases is tuple of base classes
        # clsdict is class dictionary

To use a metaclass, you generally incorporate it into a top-level base class, from which other classes then inherit. For example:

class Root(metaclass=MyMeta):
    pass

class A(Root):
    pass

class B(Root):
    pass

A key feature of a metaclass is that it lets you examine the contents of a class at the time of definition. Inside the redefined __init__() method, you are free to inspect the class dictionary, base classes, and more. Moreover, once a metaclass has been specified for a class, it gets inherited by all of its subclasses. Thus, the author of a framework can specify a metaclass for one of the top-level classes in a large hierarchy and capture the definition of every class beneath it.
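A minimal sketch of this inheritance behavior (the names are illustrative): a metaclass attached to a root class observes every subclass definition:

```python
seen = []

class RecordingMeta(type):
    def __init__(cls, name, bases, ns):
        seen.append(name)   # fires once per class definition, including subclasses
        super().__init__(name, bases, ns)

class Root(metaclass=RecordingMeta):
    pass

class A(Root):
    pass

class B(A):
    pass

print(seen)   # ['Root', 'A', 'B']
```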

As a concrete but whimsical example, here is a metaclass that rejects any class definition containing methods with mixed-case names (perhaps just to annoy Java programmers):

class NoMixedCaseMeta(type):
    def __new__(cls, clsname, bases, clsdict):
        for name in clsdict:
            if name.lower() != name:
                raise TypeError('Bad attribute name: ' + name)
        return super().__new__(cls, clsname, bases, clsdict)

class Root(metaclass=NoMixedCaseMeta):
    pass

class A(Root):
    def foo_bar(self): # Ok
        pass

class B(Root):
    def fooBar(self): # TypeError
        pass

As a more advanced and useful example, here is a metaclass that checks the definitions of redefined methods, ensuring they have the same calling signature as the original method in the superclass.

from inspect import signature
import logging

class MatchSignaturesMeta(type):

    def __init__(self, clsname, bases, clsdict):
        super().__init__(clsname, bases, clsdict)
        sup = super(self, self)
        for name, value in clsdict.items():
            if name.startswith('_') or not callable(value):
                continue
            # Get the previous definition (if any) and compare the signatures
            prev_dfn = getattr(sup,name,None)
            if prev_dfn:
                prev_sig = signature(prev_dfn)
                val_sig = signature(value)
                if prev_sig != val_sig:
                    logging.warning('Signature mismatch in %s. %s != %s',
                                    value.__qualname__, prev_sig, val_sig)

# Example
class Root(metaclass=MatchSignaturesMeta):
    pass

class A(Root):
    def foo(self, x, y):
        pass

    def spam(self, x, *, z):
        pass

# Class with redefined methods, but slightly different signatures
class B(A):
    def foo(self, a, b):
        pass

    def spam(self,x,z):
        pass

If you run this code, you will get the following output:

WARNING:root:Signature mismatch in B.spam. (self, x, *, z) != (self, x, z)
WARNING:root:Signature mismatch in B.foo. (self, x, y) != (self, a, b)

Such warnings can be useful in catching subtle program bugs. For example, code that relies on keyword-argument passing to a method will break if a subclass changes the argument names.

discuss

In large object-oriented programs, it can often be useful to put class definitions under the control of a metaclass. The metaclass can observe class definitions, warning programmers about potential problems they might not even be aware of.

One might argue that such errors would be better caught by program analysis tools or an IDE. To be sure, such tools are useful. However, if you are creating a framework or library to be used by others, you have no control over what tools they choose to use. Thus, for certain kinds of programs, putting the checks in a metaclass may provide a better user experience.

The choice of redefining __new__() or __init__() in a metaclass depends on how you want to work with the resulting class. __new__() is invoked before the class has been created and is typically used when a metaclass wants to alter the class definition in some way (for example, by changing the contents of the class dictionary). __init__() is invoked after the class has been created, which is useful if you need the fully formed class object to work with. In the last example, this is essential, since it uses the super() function to search for prior definitions — something that only works once the class object has been created and the underlying method resolution order has been set.

The last example also illustrates the use of Python's function signature objects. Essentially, the metaclass takes each callable definition in a class, searches for a prior definition (if any), and then compares their calling signatures using inspect.signature().

Last, but not least, the use of super(self, self) in one line of code is not a typo. When working with a metaclass, always remember that self is actually a class object. So that statement is being used to find definitions located further up the class hierarchy, in the parents of self.

9.18 defining classes programmatically

problem

You are writing code that ultimately needs to create a new class object. You have thought about emitting the class source code as a string and using a function such as exec() to evaluate it, but you would prefer a more elegant solution.

Solution

You can use the function types.new_class() to instantiate new class objects. All you need to do is provide the name of the class, a tuple of parent classes, a dict of keyword arguments, and a callback that populates the class dictionary with members. For example:

# stock.py
# Example of making a class manually from parts

# Methods
def __init__(self, name, shares, price):
    self.name = name
    self.shares = shares
    self.price = price
def cost(self):
    return self.shares * self.price

cls_dict = {
    '__init__' : __init__,
    'cost' : cost,
}

# Make a class
import types

Stock = types.new_class('Stock', (), {}, lambda ns: ns.update(cls_dict))
Stock.__module__ = __name__

This approach builds a normal class object that works just as you would expect:

>>> s = Stock('ACME', 50, 91.1)
>>> s
<stock.Stock object at 0x1006a9b10>
>>> s.cost()
4555.0
>>>

A subtle facet of this approach is the assignment to Stock.__module__ after the call to types.new_class(). Whenever a class is defined, its __module__ attribute contains the name of the module in which it was defined. This name is used to produce the output of the __repr__() method, and it is also used by various libraries, such as pickle. Thus, in order for the class you create to be "proper", you need to make sure this attribute is set correctly.

If the class you want to create involves a different metaclass, it is specified in the third argument to types.new_class(). For example:

>>> import abc
>>> Stock = types.new_class('Stock', (), {'metaclass': abc.ABCMeta},
...                         lambda ns: ns.update(cls_dict))
...
>>> Stock.__module__ = __name__
>>> Stock
<class '__main__.Stock'>
>>> type(Stock)
<class 'abc.ABCMeta'>
>>>

The third argument may also contain other keyword arguments. For example, a class definition like this:

class Spam(Base, debug=True, typecheck=False):
    pass

can be translated into the following new_class() call:

Spam = types.new_class('Spam', (Base,),
                        {'debug': True, 'typecheck': False},
                        lambda ns: ns.update(cls_dict))

The fourth argument of new_class() is the most mysterious. It is a function that receives the mapping object used for the class namespace. This is normally an ordinary dictionary, but it is actually whatever object is returned by the __prepare__() method, as described in section 9.14. This function should add content to the namespace using the update() method, as shown above.

discuss

The ability to construct new class objects like this can be very useful in certain contexts. A familiar example is the collections.namedtuple() function:

>>> import collections
>>> Stock = collections.namedtuple('Stock', ['name', 'shares', 'price'])
>>> Stock
<class '__main__.Stock'>
>>>

namedtuple() uses exec() instead of the technique shown above. However, here is a simple variation that creates a class directly:

import operator
import types
import sys

def named_tuple(classname, fieldnames):
    # Populate a dictionary of field property accessors
    cls_dict = { name: property(operator.itemgetter(n))
                for n, name in enumerate(fieldnames) }

    # Make a __new__ function and add to the class dict
    def __new__(cls, *args):
        if len(args) != len(fieldnames):
            raise TypeError('Expected {} arguments'.format(len(fieldnames)))
        return tuple.__new__(cls, args)

    cls_dict['__new__'] = __new__

    # Make the class
    cls = types.new_class(classname, (tuple,), {},
                        lambda ns: ns.update(cls_dict))

    # Set the module to that of the caller
    cls.__module__ = sys._getframe(1).f_globals['__name__']
    return cls

The last part of this code uses a so-called "frame hack", calling sys._getframe() to obtain the module name of the caller. Another example of frame hacking appears in section 2.15.
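The frame hack can be illustrated in isolation: a function can climb one level up the call stack and read the __name__ global of whoever called it.

```python
import sys

# Peek at the caller's module name via the call stack
def calling_module():
    return sys._getframe(1).f_globals['__name__']

print(calling_module())   # '__main__' when called from a top-level script
```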

The following example demonstrates how the named_tuple() function works:

>>> Point = named_tuple('Point', ['x', 'y'])
>>> Point
<class '__main__.Point'>
>>> p = Point(4, 5)
>>> len(p)
2
>>> p.x
4
>>> p.y
5
>>> p.x = 2
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
AttributeError: can't set attribute
>>> print('%s %s' % p)
4 5
>>>

An important aspect of this technique is its proper support for metaclasses. You might be tempted to create a class directly by instantiating a metaclass:

Stock = type('Stock', (), cls_dict)

The problem with this approach is that it skips certain key steps, such as the invocation of the metaclass's __prepare__() method. By using types.new_class() instead, you ensure that all of the necessary initialization steps are carried out. For instance, the callback function given as the fourth argument to types.new_class() receives the mapping object returned by the __prepare__() method.

If you only want to carry out the preparation step, use types.prepare_class(). For example:

import types
metaclass, kwargs, ns = types.prepare_class('Stock', (), {'metaclass': type})

This finds the appropriate metaclass and invokes its __prepare__() method. The metaclass, its remaining keyword arguments, and the prepared namespace are then returned.
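You can then finish the job by hand with the three returned values. A minimal sketch (the greeting attribute is just an illustration):

```python
import types

metaclass, kwargs, ns = types.prepare_class('Stock', (), {'metaclass': type})
ns['greeting'] = 'hello'                      # fill the prepared namespace
Stock = metaclass('Stock', (), ns, **kwargs)  # then call the metaclass
print(Stock.greeting)   # hello
```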

For more information, refer to PEP 3115 and the Python documentation.

9.19 initializing class members at definition time

problem

You want to initialize the members of a class when the class is defined, rather than wait until the instance is created.

Solution

Performing initialization or setup actions at the time of class definition is a classic use of metaclasses. Essentially, a metaclass is triggered at the point of a class definition, at which time you can perform additional steps.

The following example uses this idea to create classes similar to the named tuples in the collections module:

import operator

class StructTupleMeta(type):
    def __init__(cls, *args, **kwargs):
        super().__init__(*args, **kwargs)
        for n, name in enumerate(cls._fields):
            setattr(cls, name, property(operator.itemgetter(n)))

class StructTuple(tuple, metaclass=StructTupleMeta):
    _fields = []
    def __new__(cls, *args):
        if len(args) != len(cls._fields):
            raise ValueError('{} arguments required'.format(len(cls._fields)))
        return super().__new__(cls,args)

This code allows simple tuple-based data structures to be defined, like this:

class Stock(StructTuple):
    _fields = ['name', 'shares', 'price']

class Point(StructTuple):
    _fields = ['x', 'y']

Here's how it works:

>>> s = Stock('ACME', 50, 91.1)
>>> s
('ACME', 50, 91.1)
>>> s[0]
'ACME'
>>> s.name
'ACME'
>>> s.shares * s.price
4555.0
>>> s.shares = 23
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
AttributeError: can't set attribute
>>>

discuss

In this section, the StructTupleMeta class takes the list of attribute names in the _fields class attribute and turns them into property methods that access particular tuple slots. The operator.itemgetter() function creates an accessor function, which the property() function then turns into a property.

The hardest part of this section is knowing when the different initialization steps occur. The __init__() method in StructTupleMeta is called only once for each class that is defined. The cls argument is the class that has just been defined. In essence, the code uses the _fields class variable of the newly defined class and adds new parts to it.

The StructTuple class serves as a common base class for users to inherit from. Its __new__() method is responsible for constructing new instances. The use of __new__() here is somewhat unusual, but it is needed to modify the calling signature of tuples so that instances can be created with code that looks like a normal call. As follows:

s = Stock('ACME', 50, 91.1) # OK
s = Stock(('ACME', 50, 91.1)) # Error

Unlike __init__(), the __new__() method is triggered before an instance is created. Since tuples are immutable, it is not possible to change them once they have been created; __init__() is triggered too late in instance creation to do what we want. That is why __new__() is defined.

Although this section is short, careful study will reward you with a deeper insight into how Python classes are defined, how instances are created, and the points at which the different methods of metaclasses and classes are invoked.
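Those timing rules can be checked directly. The following sketch (hypothetical VerboseMeta/Example names) records when the metaclass __init__() fires: once per class definition, not when instances are created.

```python
created = []

class VerboseMeta(type):
    def __init__(cls, *args, **kwargs):
        created.append(cls.__name__)   # fires at class-definition time
        super().__init__(*args, **kwargs)

class Example(metaclass=VerboseMeta):  # appends 'Example' right here
    pass

e = Example()                          # no further metaclass __init__ call
print(created)   # ['Example']
```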

PEP 422 provides an alternative means of accomplishing the task described in this section. However, as of this writing it has not been adopted. Nevertheless, if you are using Python 3.3 or newer, it may be worth a look.

9.20 implementing method overloading with function annotation

problem

You have learned about function parameter annotations, and you suspect they might be usable to implement type-based method overloading. However, you are not quite sure how to achieve it (or whether it would even work).

Solution

The technique in this section is based on a simple observation: since Python allows parameters to be annotated, perhaps code could be written like this:

class Spam:
    def bar(self, x:int, y:int):
        print('Bar 1:', x, y)

    def bar(self, s:str, n:int = 0):
        print('Bar 2:', s, n)

s = Spam()
s.bar(2, 3) # Prints Bar 1: 2 3
s.bar('hello') # Prints Bar 2: hello 0

The following is a first attempt, using a metaclass and descriptors:

# multiple.py
import inspect
import types

class MultiMethod:
    '''
    Represents a single multimethod.
    '''
    def __init__(self, name):
        self._methods = {}
        self.__name__ = name

    def register(self, meth):
        '''
        Register a new method as a multimethod
        '''
        sig = inspect.signature(meth)

        # Build a type signature from the method's annotations
        types = []
        for name, parm in sig.parameters.items():
            if name == 'self':
                continue
            if parm.annotation is inspect.Parameter.empty:
                raise TypeError(
                    'Argument {} must be annotated with a type'.format(name)
                )
            if not isinstance(parm.annotation, type):
                raise TypeError(
                    'Argument {} annotation must be a type'.format(name)
                )
            if parm.default is not inspect.Parameter.empty:
                self._methods[tuple(types)] = meth
            types.append(parm.annotation)

        self._methods[tuple(types)] = meth

    def __call__(self, *args):
        '''
        Call a method based on type signature of the arguments
        '''
        types = tuple(type(arg) for arg in args[1:])
        meth = self._methods.get(types, None)
        if meth:
            return meth(*args)
        else:
            raise TypeError('No matching method for types {}'.format(types))

    def __get__(self, instance, cls):
        '''
        Descriptor method needed to make calls work in a class
        '''
        if instance is not None:
            return types.MethodType(self, instance)
        else:
            return self

class MultiDict(dict):
    '''
    Special dictionary to build multimethods in a metaclass
    '''
    def __setitem__(self, key, value):
        if key in self:
            # If key already exists, it must be a multimethod or callable
            current_value = self[key]
            if isinstance(current_value, MultiMethod):
                current_value.register(value)
            else:
                mvalue = MultiMethod(key)
                mvalue.register(current_value)
                mvalue.register(value)
                super().__setitem__(key, mvalue)
        else:
            super().__setitem__(key, value)

class MultipleMeta(type):
    '''
    Metaclass that allows multiple dispatch of methods
    '''
    def __new__(cls, clsname, bases, clsdict):
        return type.__new__(cls, clsname, bases, dict(clsdict))

    @classmethod
    def __prepare__(cls, clsname, bases):
        return MultiDict()

To use this class, you can write as follows:

class Spam(metaclass=MultipleMeta):
    def bar(self, x:int, y:int):
        print('Bar 1:', x, y)

    def bar(self, s:str, n:int = 0):
        print('Bar 2:', s, n)

# Example: overloaded __init__
import time

class Date(metaclass=MultipleMeta):
    def __init__(self, year: int, month:int, day:int):
        self.year = year
        self.month = month
        self.day = day

    def __init__(self):
        t = time.localtime()
        self.__init__(t.tm_year, t.tm_mon, t.tm_mday)

The following is an interactive example to verify that it works correctly:

>>> s = Spam()
>>> s.bar(2, 3)
Bar 1: 2 3
>>> s.bar('hello')
Bar 2: hello 0
>>> s.bar('hello', 5)
Bar 2: hello 5
>>> s.bar(2, 'hello')
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "multiple.py", line 42, in __call__
        raise TypeError('No matching method for types {}'.format(types))
TypeError: No matching method for types (<class 'int'>, <class 'str'>)
>>> # Overloaded __init__
>>> d = Date(2012, 12, 21)
>>> # Get today's date
>>> e = Date()
>>> e.year
2012
>>> e.month
12
>>> e.day
3
>>>

discuss

Frankly speaking, compared with the usual code, this section uses a lot of magic code. However, it can give us an in-depth understanding of the underlying working principles of metaclasses and descriptors, and deepen our impression of these concepts. Therefore, even if you don't immediately apply the technology in this section, some of its underlying ideas will affect other programming technologies involving metaclasses, descriptors and function annotations.

The main idea in this implementation is actually rather simple. The MultipleMeta metaclass uses its __prepare__() method to supply a custom class dictionary as an instance of MultiDict. Unlike an ordinary dictionary, MultiDict checks whether entries already exist when items are set. If so, the duplicate entries get merged together inside a MultiMethod instance.

MultiMethod instances collect methods by building a mapping from type signatures to functions. During construction, function annotations are used to collect these signatures and build the mapping. This takes place in the MultiMethod.register() method. One critical part of this mapping is that for multimethods, all of the argument types must be specified or an error occurs.
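The signature inspection that register() relies on can be shown in isolation (the bar function here is just an illustration):

```python
import inspect

def bar(self, x: int, y: int):
    pass

sig = inspect.signature(bar)
# Collect annotations of every parameter except self, as register() does
annotations = [p.annotation for name, p in sig.parameters.items()
               if name != 'self']
print(annotations)   # [<class 'int'>, <class 'int'>]
```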

To make MultiMethod instances emulate a callable, the __call__() method is implemented. It builds a type tuple from all of the arguments except self, looks up the method in the internal map, and invokes the appropriate method. The __get__() method is required to make MultiMethod instances operate correctly inside class definitions; it is used to build proper bound methods. For example:

>>> b = s.bar
>>> b
<bound method Spam.bar of <__main__.Spam object at 0x1006a46d0>>
>>> b.__self__
<__main__.Spam object at 0x1006a46d0>
>>> b.__func__
<__main__.MultiMethod object at 0x1006a4d50>
>>> b(2, 3)
Bar 1: 2 3
>>> b('hello')
Bar 2: hello 0
>>>

However, the implementation in this section has some limitations. One of them is that keyword arguments cannot be used. For example:

>>> s.bar(x=2, y=3)
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
TypeError: __call__() got an unexpected keyword argument 'y'

>>> s.bar(s='hello')
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
TypeError: __call__() got an unexpected keyword argument 's'
>>>

There might be some way to add such support, but it would require a completely different approach to method mapping. The problem is that keyword arguments arrive in no particular order; when mixed with positional arguments, you would end up with a jumbled mess of arguments that you would have to sort out somehow in the __call__() method.

There are also certain limitations concerning inheritance. For example, code like the following does not work:

class A:
    pass

class B(A):
    pass

class C:
    pass

class Spam(metaclass=MultipleMeta):
    def foo(self, x:A):
        print('Foo 1:', x)

    def foo(self, x:C):
        print('Foo 2:', x)

The reason is that the x:A annotation fails to match instances of a subclass (such as instances of B). As follows:

>>> s = Spam()
>>> a = A()
>>> s.foo(a)
Foo 1: <__main__.A object at 0x1006a5310>
>>> c = C()
>>> s.foo(c)
Foo 2: <__main__.C object at 0x1007a1910>
>>> b = B()
>>> s.foo(b)
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "multiple.py", line 44, in __call__
        raise TypeError('No matching method for types {}'.format(types))
TypeError: No matching method for types (<class '__main__.B'>,)
>>>

As an alternative to using metaclasses and annotations, a similar effect can be achieved with descriptors. For example:

import types

class multimethod:
    def __init__(self, func):
        self._methods = {}
        self.__name__ = func.__name__
        self._default = func

    def match(self, *types):
        def register(func):
            ndefaults = len(func.__defaults__) if func.__defaults__ else 0
            for n in range(ndefaults+1):
                self._methods[types[:len(types) - n]] = func
            return self
        return register

    def __call__(self, *args):
        types = tuple(type(arg) for arg in args[1:])
        meth = self._methods.get(types, None)
        if meth:
            return meth(*args)
        else:
            return self._default(*args)

    def __get__(self, instance, cls):
        if instance is not None:
            return types.MethodType(self, instance)
        else:
            return self

To use the descriptor version, you would write code like this:

class Spam:
    @multimethod
    def bar(self, *args):
        # Default method called if no match
        raise TypeError('No matching method for bar')

    @bar.match(int, int)
    def bar(self, x, y):
        print('Bar 1:', x, y)

    @bar.match(str, int)
    def bar(self, s, n = 0):
        print('Bar 2:', s, n)

The descriptor approach also suffers from the limitations mentioned earlier (no support for keyword arguments or inheritance).

All things being equal, it is probably best to stay away from method overloading in ordinary code. However, there are special situations where it makes sense, such as programs based on pattern matching. For instance, the visitor pattern in section 8.21 could be rewritten as a class that used method overloading. Beyond such cases, however, it is usually simpler to just use differently named methods.

Debate about implementing method overloading has raged in the Python community for a long time. For a good account of the reasons for this debate, see Guido van Rossum's blog post: Five-Minute Multimethods in Python.

9.21 avoid duplicate attribute methods

problem

In your class you need to repeatedly define property methods that perform common logic, such as type checking. How can you simplify this repetitive code?

Solution

Consider the next simple class whose properties are wrapped by property methods:

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        if not isinstance(value, str):
            raise TypeError('name must be a string')
        self._name = value

    @property
    def age(self):
        return self._age

    @age.setter
    def age(self, value):
        if not isinstance(value, int):
            raise TypeError('age must be an int')
        self._age = value

As you can see, a lot of repetitive code is being written here just to implement type checking on attribute values. Whenever you see code like this, you should explore different ways of simplifying it. One possible approach is to make a function that defines the property and returns it. For example:

def typed_property(name, expected_type):
    storage_name = '_' + name

    @property
    def prop(self):
        return getattr(self, storage_name)

    @prop.setter
    def prop(self, value):
        if not isinstance(value, expected_type):
            raise TypeError('{} must be a {}'.format(name, expected_type))
        setattr(self, storage_name, value)

    return prop

# Example use
class Person:
    name = typed_property('name', str)
    age = typed_property('age', int)

    def __init__(self, name, age):
        self.name = name
        self.age = age

discuss

This section demonstrates an important feature of inner functions and closures: they behave a lot like a macro. The typed_property() function in the example may look a little strange, but all it really does is generate the property for you and return the property object. Thus, when it is used inside a class, the effect is the same as if the code had been written in the class definition itself. Although the property getter and setter methods access local variables such as name, expected_type, and storage_name, this works fine: the values of those variables are kept alive in the closure.
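The closure behavior can be demonstrated in miniature. Each call to the factory below (a hypothetical make_checker helper) captures its own expected_type, just as each call to typed_property() captures its own name and type:

```python
def make_checker(expected_type):
    def check(value):
        # expected_type lives on in this closure after make_checker returns
        return isinstance(value, expected_type)
    return check

is_str = make_checker(str)
is_int = make_checker(int)
print(is_str('hi'), is_int('hi'))   # True False
```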

The example can also be tweaked in an interesting way with functools.partial(). For instance, you can do this:

from functools import partial

String = partial(typed_property, expected_type=str)
Integer = partial(typed_property, expected_type=int)

# Example:
class Person:
    name = String('name')
    age = Integer('age')

    def __init__(self, name, age):
        self.name = name
        self.age = age

In fact, you can find that the code here is somewhat similar to the type system descriptor code in section 8.13.

9.22 simple method of defining context manager

problem

You want to implement a new context manager yourself to use the with statement.

Solution

The easiest way to write a new context manager is to use the @contextmanager decorator in the contextlib module. Here is an example of a context manager that times the execution of a code block:

import time
from contextlib import contextmanager

@contextmanager
def timethis(label):
    start = time.time()
    try:
        yield
    finally:
        end = time.time()
        print('{}: {}'.format(label, end - start))

# Example use
with timethis('counting'):
    n = 10000000
    while n > 0:
        n -= 1

In the function timethis(), all of the code before the yield executes as the __enter__() method of the context manager, and all of the code after the yield executes as the __exit__() method. If an exception occurred, it is raised at the yield statement.
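This is exactly why timethis() wraps the yield in try/finally: an exception raised inside the with block surfaces at the yield, and without finally the cleanup code after it would be skipped. A quick check (hypothetical managed() function):

```python
from contextlib import contextmanager

trace = []

@contextmanager
def managed():
    trace.append('enter')
    try:
        yield           # an exception in the with block is raised right here
    finally:
        trace.append('exit')

try:
    with managed():
        raise RuntimeError('boom')
except RuntimeError:
    trace.append('caught')

print(trace)   # ['enter', 'exit', 'caught']
```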

The following is a more advanced context manager that implements certain transactions on list objects:

@contextmanager
def list_transaction(orig_list):
    working = list(orig_list)
    yield working
    orig_list[:] = working

The idea here is that modifications to the list only take effect once the entire code block runs to completion with no exceptions. Here is a demonstration:

>>> items = [1, 2, 3]
>>> with list_transaction(items) as working:
...     working.append(4)
...     working.append(5)
...
>>> items
[1, 2, 3, 4, 5]
>>> with list_transaction(items) as working:
...     working.append(6)
...     working.append(7)
...     raise RuntimeError('oops')
...
Traceback (most recent call last):
    File "<stdin>", line 4, in <module>
RuntimeError: oops
>>> items
[1, 2, 3, 4, 5]
>>>

discuss

Normally, to write a context manager, you define a class with an __enter__() method and an __exit__() method, like this:

import time

class timethis:
    def __init__(self, label):
        self.label = label

    def __enter__(self):
        self.start = time.time()

    def __exit__(self, exc_ty, exc_val, exc_tb):
        end = time.time()
        print('{}: {}'.format(self.label, end - self.start))

Although this is not difficult to write, it is a bit tedious compared to a simple function written with @contextmanager.

@contextmanager should only be used for writing self-contained context-management functions. If you have an object (e.g., a file, network connection, or lock) that needs to support the with statement, you still need to implement the __enter__() and __exit__() methods separately.
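For such objects, a minimal sketch (a hypothetical Resource class wrapping a lock) looks like this:

```python
import threading

class Resource:
    def __init__(self):
        self._lock = threading.Lock()

    def __enter__(self):
        self._lock.acquire()
        return self

    def __exit__(self, exc_ty, exc_val, exc_tb):
        self._lock.release()
        # Returning None (falsy) lets any exception propagate

r = Resource()
with r:
    held = r._lock.locked()   # True while inside the with block
print(held, r._lock.locked())
```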

9.23 executing code in a local variable field

problem

You are using exec() to execute a fragment of code in a local scope, but after execution, none of its results seem to be visible.

Solution

To understand this problem, try a simple scenario first. First, execute a code fragment in the global namespace:

>>> a = 13
>>> exec('b = a + 1')
>>> print(b)
14
>>>

Then, execute the same code in a function:

>>> def test():
...     a = 13
...     exec('b = a + 1')
...     print(b)
...
>>> test()
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "<stdin>", line 4, in test
NameError: global name 'b' is not defined
>>>

As you can see, a NameError is raised at the end, almost as if the exec() statement never executed. This is a problem if you want to use the result of exec() in a later calculation.

To fix this kind of problem, you need to use the locals() function to obtain a dictionary of the local variables prior to the call to exec(). Immediately afterward, you can extract modified values from the locals dictionary. For example:

>>> def test():
...     a = 13
...     loc = locals()
...     exec('b = a + 1')
...     b = loc['b']
...     print(b)
...
>>> test()
14
>>>

discuss

Correctly using exec() is actually quite tricky in practice. In most situations where you might be considering it, a better solution probably exists (e.g., decorators, closures, metaclasses, etc.).

However, if you still must use exec(), this section outlines some guidelines for doing it correctly. By default, exec() executes code in the local and global scope of the caller. However, inside functions, the local scope passed to exec() is a dictionary that is a copy of the actual local variables. Thus, if exec() performs any modifications, the results are never reflected in the actual local variables. Here is another example demonstrating this effect:

>>> def test1():
...     x = 0
...     exec('x += 1')
...     print(x)
...
>>> test1()
0
>>>

When you call locals() inside a function, you get the very same copy of the local variables that gets passed to exec(). By inspecting the dictionary after execution, the modified values can be obtained. Here is an example that demonstrates this:

>>> def test2():
...     x = 0
...     loc = locals()
...     print('before:', loc)
...     exec('x += 1')
...     print('after:', loc)
...     print('x =', x)
...
>>> test2()
before: {'x': 0}
after: {'loc': {...}, 'x': 1}
x = 0
>>>

Carefully observe the output of the last step: unless the modified value in loc is manually copied back to x, the variable x keeps its old value.

When using locals(), you must be careful about the order of operations. Each time it is invoked, locals() takes the current values of the local variables and overwrites the corresponding entries in the dictionary. Observe the output of the following experiment:

>>> def test3():
...     x = 0
...     loc = locals()
...     print(loc)
...     exec('x += 1')
...     print(loc)
...     locals()
...     print(loc)
...
>>> test3()
{'x': 0}
{'loc': {...}, 'x': 1}
{'loc': {...}, 'x': 0}
>>>

Notice how the value of x was overwritten the last time locals() was called.

As an alternative to locals(), you can use your own dictionary and pass it to exec(). For example:

>>> def test4():
...     a = 13
...     loc = { 'a' : a }
...     glb = { }
...     exec('b = a + 1', glb, loc)
...     b = loc['b']
...     print(b)
...
>>> test4()
14
>>>

In most cases, this is the best practice for using exec(): you simply need to make sure that the global and local dictionaries have been properly initialized with any names that the executed code will access.

Finally, before using exec(), ask yourself whether better alternatives exist. In most situations where you might consider exec(), other solutions such as decorators, closures, metaclasses, or other metaprogramming features tend to be more elegant.

9.24 parsing and analyzing Python source code

problem

You want to write a program that parses and analyzes Python source code.

Solution

Most programmers know that Python can calculate or execute source code in string form. For example:

>>> x = 42
>>> eval('2 + 3*4 + x')
56
>>> exec('for i in range(10): print(i)')
0
1
2
3
4
5
6
7
8
9
>>>

Nevertheless, the ast module can be used to compile Python source code into an abstract syntax tree (AST) that can be analyzed. For example:

>>> import ast
>>> ex = ast.parse('2 + 3*4 + x', mode='eval')
>>> ex
<_ast.Expression object at 0x1007473d0>
>>> ast.dump(ex)
"Expression(body=BinOp(left=BinOp(left=Num(n=2), op=Add(),
right=BinOp(left=Num(n=3), op=Mult(), right=Num(n=4))), op=Add(),
right=Name(id='x', ctx=Load())))"

>>> top = ast.parse('for i in range(10): print(i)', mode='exec')
>>> top
<_ast.Module object at 0x100747390>
>>> ast.dump(top)
"Module(body=[For(target=Name(id='i', ctx=Store()),
iter=Call(func=Name(id='range', ctx=Load()), args=[Num(n=10)],
keywords=[], starargs=None, kwargs=None),
body=[Expr(value=Call(func=Name(id='print', ctx=Load()),
args=[Name(id='i', ctx=Load())], keywords=[], starargs=None,
kwargs=None))], orelse=[])])"
>>>

Analyzing the source tree requires a bit of study on your part; it consists of a collection of AST nodes. The easiest way to work with these nodes is to define a visitor class that implements various visit_NodeName() methods, where NodeName matches the node of interest. Here is an example of such a class that records information about which names are loaded, stored, and deleted.

import ast

class CodeAnalyzer(ast.NodeVisitor):
    def __init__(self):
        self.loaded = set()
        self.stored = set()
        self.deleted = set()

    def visit_Name(self, node):
        if isinstance(node.ctx, ast.Load):
            self.loaded.add(node.id)
        elif isinstance(node.ctx, ast.Store):
            self.stored.add(node.id)
        elif isinstance(node.ctx, ast.Del):
            self.deleted.add(node.id)

# Sample usage
if __name__ == '__main__':
    # Some Python code
    code = '''
for i in range(10):
    print(i)
del i
'''

    # Parse into an AST
    top = ast.parse(code, mode='exec')

    # Feed the AST to analyze name usage
    c = CodeAnalyzer()
    c.visit(top)
    print('Loaded:', c.loaded)
    print('Stored:', c.stored)
    print('Deleted:', c.deleted)

If you run this program, you will get the following output:

Loaded: {'i', 'range', 'print'}
Stored: {'i'}
Deleted: {'i'}

Finally, AST can be compiled and executed through the compile() function. For example:

>>> exec(compile(top,'<stdin>', 'exec'))
0
1
2
3
4
5
6
7
8
9
>>>

discuss

Being able to analyze source code and get information from it lets you write all sorts of code-analysis, optimization, and verification tools. For instance, instead of just blindly passing code fragments into a function like exec(), you could first turn them into an AST and inspect the details to see what they are doing. You could also write tools that look at the entire source code of a module and perform some sort of static analysis over it.
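As a small sketch of such a verification tool (the CallChecker class and the banned-name set are hypothetical), the following visitor rejects source that calls disallowed functions before anything is ever executed:

```python
import ast

class CallChecker(ast.NodeVisitor):
    def __init__(self, banned):
        self.banned = banned
        self.violations = []

    def visit_Call(self, node):
        # Flag direct calls to banned names, e.g. eval(...)
        if isinstance(node.func, ast.Name) and node.func.id in self.banned:
            self.violations.append(node.func.id)
        self.generic_visit(node)

checker = CallChecker({'eval', 'exec'})
checker.visit(ast.parse("x = eval('2+2')\nprint(x)"))
print(checker.violations)   # ['eval']
```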

Finally, note that if you know what you are doing, it is also possible to rewrite the AST to represent new code. Here is an example of a decorator that lowers globally accessed names into the body of a function by reparsing the function body's source code, rewriting the AST, and re-creating the function's code object:

# namelower.py
import ast
import inspect

# Node visitor that lowers globally accessed names into
# the function body as local variables.
class NameLower(ast.NodeVisitor):
    def __init__(self, lowered_names):
        self.lowered_names = lowered_names

    def visit_FunctionDef(self, node):
        # Compile some assignments to lower the constants
        code = '__globals = globals()\n'
        code += '\n'.join("{0} = __globals['{0}']".format(name)
                            for name in self.lowered_names)
        code_ast = ast.parse(code, mode='exec')

        # Inject new statements into the function body
        node.body[:0] = code_ast.body

        # Save the function object
        self.func = node

# Decorator that turns global names into locals
def lower_names(*namelist):
    def lower(func):
        srclines = inspect.getsource(func).splitlines()
        # Skip source lines prior to the @lower_names decorator
        for n, line in enumerate(srclines):
            if '@lower_names' in line:
                break

        src = '\n'.join(srclines[n+1:])
        # Hack to deal with indented code
        if src.startswith((' ','\t')):
            src = 'if 1:\n' + src
        top = ast.parse(src, mode='exec')

        # Transform the AST
        cl = NameLower(namelist)
        cl.visit(top)

        # Execute the modified AST
        temp = {}
        exec(compile(top,'','exec'), temp, temp)

        # Pull out the modified code object
        func.__code__ = temp[func.__name__].__code__
        return func
    return lower

To use this code, you can write it like this:

INCR = 1
@lower_names('INCR')
def countdown(n):
    while n > 0:
        n -= INCR

The decorator rewrites the countdown() function to look like this:

def countdown(n):
    __globals = globals()
    INCR = __globals['INCR']
    while n > 0:
        n -= INCR

In a performance test, it makes the function run about 20% faster.

Now, should you go out and apply this decorator to all of your functions? Probably not. However, it is a good demonstration of some very advanced techniques, such as AST manipulation and source-code manipulation.

This section was inspired by a similar recipe at ActiveState that worked with Python byte code. Working with the AST is a higher-level approach that is arguably simpler. See the next section for more information about byte code.

9.25 disassembly of Python bytecode

problem

You want to know in detail what your code is doing under the covers by disassembling it into lower-level byte code.

Solution

The dis module can be used to output the disassembly of any Python function. For example:

>>> def countdown(n):
...     while n > 0:
...         print('T-minus', n)
...         n -= 1
...     print('Blastoff!')
...
>>> import dis
>>> dis.dis(countdown)
  2           0 SETUP_LOOP              30 (to 32)
        >>    2 LOAD_FAST                0 (n)
              4 LOAD_CONST               1 (0)
              6 COMPARE_OP               4 (>)
              8 POP_JUMP_IF_FALSE       30

  3          10 LOAD_GLOBAL              0 (print)
             12 LOAD_CONST               2 ('T-minus')
             14 LOAD_FAST                0 (n)
             16 CALL_FUNCTION            2
             18 POP_TOP

  4          20 LOAD_FAST                0 (n)
             22 LOAD_CONST               3 (1)
             24 INPLACE_SUBTRACT
             26 STORE_FAST               0 (n)
             28 JUMP_ABSOLUTE            2
        >>   30 POP_BLOCK

  5     >>   32 LOAD_GLOBAL              0 (print)
             34 LOAD_CONST               4 ('Blastoff!')
             36 CALL_FUNCTION            1
             38 POP_TOP
             40 LOAD_CONST               0 (None)
             42 RETURN_VALUE
>>>

discuss

The dis module is useful when you want to understand the low-level workings of your program, for example, when trying to understand its performance characteristics. Keep in mind that bytecode is an implementation detail of CPython and changes between versions (the format switched to two-byte "wordcode" in Python 3.6). The raw bytecode interpreted by the dis() function looks like this:

>>> countdown.__code__.co_code
b"x'\x00|\x00\x00d\x01\x00k\x04\x00r)\x00t\x00\x00d\x02\x00|\x00\x00\x83
\x02\x00\x01|\x00\x00d\x03\x008}\x00\x00q\x03\x00Wt\x00\x00d\x04\x00\x83
\x01\x00\x01d\x00\x00S"
>>>

If you want to explain this code yourself, you need to use some constants defined in the opcode module. For example:

>>> c = countdown.__code__.co_code
>>> import opcode
>>> opcode.opname[c[0]]
'SETUP_LOOP'
>>> opcode.opname[c[2]]
'LOAD_FAST'
>>>

Oddly, the dis module has historically provided no convenient function for processing bytecode programmatically. However, the following generator function converts a raw bytecode sequence (in the two-byte "wordcode" format used since Python 3.6, matching the disassembly shown above) into (opcode, argument) pairs:

import opcode

def generate_opcodes(codebytes):
    # Python 3.6+ bytecode is "wordcode": every instruction occupies
    # two bytes -- an opcode followed by a one-byte argument
    extended_arg = 0
    for i in range(0, len(codebytes), 2):
        op = codebytes[i]
        if op >= opcode.HAVE_ARGUMENT:
            oparg = codebytes[i+1] | extended_arg
            if op == opcode.EXTENDED_ARG:
                # Argument too large for one byte; accumulate and continue
                extended_arg = oparg << 8
                continue
            extended_arg = 0
        else:
            oparg = None
        yield (op, oparg)

The usage is as follows:

>>> for op, oparg in generate_opcodes(countdown.__code__.co_code):
...     print(op, opcode.opname[op], oparg)
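Alternatively, since Python 3.4 the dis module itself provides dis.get_instructions(), which yields Instruction objects with the opcode names and arguments already decoded, so no manual byte-twiddling is needed:

```python
import dis

def countdown(n):
    while n > 0:
        print('T-minus', n)
        n -= 1
    print('Blastoff!')

# Each Instruction carries fields such as offset, opname, arg, and argval
for instr in dis.get_instructions(countdown):
    print(instr.offset, instr.opname, instr.argval)
```

This is usually the better choice for tooling, since it tracks bytecode format changes across interpreter versions for you.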

It is a little-known fact that you can replace the raw bytecode of any function. Let's walk through the whole process with an example:

>>> def add(x, y):
...     return x + y
...
>>> c = add.__code__
>>> c
<code object add at 0x1007beed0, file "<stdin>", line 1>
>>> c.co_code
b'|\x00\x00|\x01\x00\x17S'
>>>
>>> # Make a completely new code object with bogus byte code
>>> import types
>>> newbytecode = b'xxxxxxx'
>>> nc = types.CodeType(c.co_argcount, c.co_kwonlyargcount,
...     c.co_nlocals, c.co_stacksize, c.co_flags, newbytecode, c.co_consts,
...     c.co_names, c.co_varnames, c.co_filename, c.co_name,
...     c.co_firstlineno, c.co_lnotab)
>>> nc
<code object add at 0x10069fe40, file "<stdin>", line 1>
>>> add.__code__ = nc
>>> add(2,3)
Segmentation fault
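Incidentally, on Python 3.8 and later you no longer have to spell out every CodeType field by hand: code objects have a replace() method that copies all fields except the ones you override. A minimal sketch:

```python
def add(x, y):
    return x + y

c = add.__code__
# replace() clones the code object, overriding only the named co_* fields.
# Here we pass the original bytecode back in, so the function still works;
# passing bogus bytes would crash the interpreter just as above.
add.__code__ = c.replace(co_code=c.co_code)
print(add(2, 3))   # 5
```

This also sidesteps the fact that the CodeType constructor signature itself changed in Python 3.8 (a co_posonlyargcount argument was added), which breaks code that passes the fields positionally as in the example above.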

Playing tricks like this is an easy way to crash the interpreter. However, programmers writing advanced optimization and metaprogramming tools may have a genuine need to rewrite bytecode, and the last part of this section demonstrates how it is done. You can also refer to a similar example: this code on ActiveState.
