Treevalue (0x03) -- detailed analysis of function treelization (Part 2)

Posted by Morthian on Mon, 29 Nov 2021 08:37:08 +0100

Long time no see; we are back to the treevalue series. Building on the previous treevalue article, this one continues the detailed analysis of the function treelization mechanism and spends more time on its derived features and applications.

Treelized methods and class methods

First, building on the treelized functions from the previous article, we can extend treelization to functions in the broader sense. The category of "function" naturally includes two special cases, object methods and class methods, which are essentially similar to ordinary functions (for more on this, see the chapter "the nature of object methods" in Python Popular Science Series -- classes and methods (Part 2)). Because of this similarity, both object methods and class methods can be treelized as well.
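As a small reminder of why methods are "essentially similar to ordinary functions", the sketch below uses a hypothetical Counter class (not part of treevalue) to show that a method call is just a function call with the object or class passed as the first argument:

```python
class Counter:
    """Hypothetical class used only for illustration."""

    def __init__(self, start):
        self.value = start

    def plus(self, x):
        return self.value + x

    @classmethod
    def describe(cls, x):
        return cls.__name__, x


c = Counter(1)

# An object method is a plain function whose first argument is the object.
via_object = c.plus(2)
via_class = Counter.plus(c, 2)
print(via_object, via_class)  # 3 3

# A class method is a plain function whose first argument is the class.
bound = Counter.describe(5)
unbound = Counter.describe.__func__(Counter, 5)
print(bound, unbound)  # ('Counter', 5) ('Counter', 5)
```

Since the only difference is the implicit first argument, anything that can wrap a function can in principle wrap a method as well.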

Based on these properties of object methods and class methods, we can treelize them in a similar way. Let's look at an example:

from treevalue import TreeValue, method_treelize, classmethod_treelize


class MyTreeValue(TreeValue):
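    # Inside a treelized method, self and x are the values at each node
    # (a non-tree x such as 2 is used as-is), so this is ordinary addition.
    @method_treelize()
    def plus(self, x):
        return self + x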

    # with the rise option, the final return value is a tuple of 2 trees
    @classmethod
    @classmethod_treelize(rise=True)
    def add_all(cls, a, b):
        return cls, a + b

We have thus built our own TreeValue class, MyTreeValue, and can write object-oriented programs with its methods and class methods. For example, we can do the following with MyTreeValue (continuing the code above):

t1 = MyTreeValue({'a': 1, 'b': 2, 'x': {'c': 3, 'd': 4}})
t2 = MyTreeValue({'a': 5, 'b': 6, 'x': {'c': 7, 'd': 8}})

print(t1.plus(2))
# <MyTreeValue 0x7fe023375ee0>
# ├── a --> 3
# ├── b --> 4
# └── x --> <MyTreeValue 0x7fe023375eb0>
#     ├── c --> 5
#     └── d --> 6

print(t1.plus(t2))
# <MyTreeValue 0x7fe023375eb0>
# ├── a --> 6
# ├── b --> 8
# └── x --> <MyTreeValue 0x7fe021dd16a0>
#     ├── c --> 10
#     └── d --> 12

print(MyTreeValue.add_all(t1, t2))
# (<MyTreeValue 0x7effa62c6250>
# ├── a --> <class '__main__.MyTreeValue'>
# ├── b --> <class '__main__.MyTreeValue'>
# └── x --> <MyTreeValue 0x7effa62a0790>
#     ├── c --> <class '__main__.MyTreeValue'>
#     └── d --> <class '__main__.MyTreeValue'>
# , <MyTreeValue 0x7effa629df70>
# ├── a --> 6
# ├── b --> 8
# └── x --> <MyTreeValue 0x7effa62c6d90>
#     ├── c --> 10
#     └── d --> 12
# )

In addition, an object method obviously has one operand, self, and there is often a need for in-place operations, similar to sin_ in the torch library. For treelized object methods we provide the self_copy option. When it is enabled, the result computed on each node is mounted back onto the current tree object, and that same object is returned. A simple example follows:

from treevalue import TreeValue, method_treelize


class MyTreeValue(TreeValue):
    @method_treelize(self_copy=True)
    def plus_(self, x):
        return self + x


t1 = MyTreeValue({'a': 1, 'b': 2, 'x': {'c': 3, 'd': 4}})

print(t1)
# <MyTreeValue 0x7f543c83cd60>
# ├── a --> 1
# ├── b --> 2
# └── x --> <MyTreeValue 0x7f543c83cd00>
#     ├── c --> 3
#     └── d --> 4

print(t1.plus_(2))
# <MyTreeValue 0x7f543c83cd60>
# ├── a --> 3
# ├── b --> 4
# └── x --> <MyTreeValue 0x7f543c83cd00>
#     ├── c --> 5
#     └── d --> 6

In the above code, you can see that the return value of plus_ is still the original tree object, with its node values replaced by the computed results. If you access t1 afterwards, you get this same object.
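The difference between returning a new tree and updating the current one can be pictured with plain nested dicts. This is only an analogy: map_tree and map_tree_ are hypothetical helpers written for this sketch, not treevalue's implementation.

```python
def map_tree(func, tree):
    """Return a new nested dict with func applied to every leaf."""
    return {k: map_tree(func, v) if isinstance(v, dict) else func(v)
            for k, v in tree.items()}


def map_tree_(func, tree):
    """Apply func to every leaf in place and return the same dict object."""
    for k, v in tree.items():
        if isinstance(v, dict):
            map_tree_(func, v)
        else:
            tree[k] = func(v)  # existing key, so the dict size is unchanged
    return tree


t = {'a': 1, 'b': 2, 'x': {'c': 3, 'd': 4}}

fresh = map_tree(lambda v: v + 2, t)
print(fresh is t, t['a'])      # False 1  (original untouched)

same = map_tree_(lambda v: v + 2, t)
print(same is t, t['x']['d'])  # True 6   (original updated and returned)
```

The trailing underscore in map_tree_ follows the same naming habit as sin_ and plus_.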

Extended thinking 1: how would you treelize static methods? Please verify your guess by writing code.

Extended thinking 2: for properties, if only reading (__get__) is considered, how would you treelize them? Please verify your guess by writing code.

Discussion is welcome in the comments section!

Treelized operations

If you know how arithmetic operations work in Python, you know that they too are backed by a special kind of object method. Addition, for example, is supported jointly by __add__ (self + x), __radd__ (x + self) and __iadd__ (self += x), and operator overloading is usually implemented through magic methods of this kind. For more on this, see the chapter "the wonderful use of magic methods" in Python Popular Science Series -- classes and methods (Part 2).
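To make the three addition hooks concrete, here is a minimal sketch with a hypothetical Traced class (introduced only for this illustration) that records which magic method Python picks for each expression:

```python
class Traced:
    """Hypothetical class: each magic method records its own name."""

    def __init__(self):
        self.calls = []

    def __add__(self, other):   # self + other
        self.calls.append('__add__')
        return self

    def __radd__(self, other):  # other + self, when other cannot handle it
        self.calls.append('__radd__')
        return self

    def __iadd__(self, other):  # self += other
        self.calls.append('__iadd__')
        return self


t = Traced()
t + 1    # Traced.__add__
1 + t    # int.__add__ returns NotImplemented, so Python tries t.__radd__(1)
t += 1   # Traced.__iadd__
print(t.calls)  # ['__add__', '__radd__', '__iadd__']
```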

That being so, what happens if we apply treelization to these special methods? As you might expect, the operators themselves become treelized, with an effect like the following code:

from treevalue import TreeValue, method_treelize


class AddTreeValue(TreeValue):
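    # self and other are the values at each node here, not the trees,
    # so self + other is the ordinary addition of the leaf values.
    @method_treelize()
    def __add__(self, other):
        return self + other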

    @method_treelize()
    def __radd__(self, other):
        return other + self

    @method_treelize(self_copy=True)
    def __iadd__(self, other):
        return self + other

Running it gives the following:

t1 = AddTreeValue({'a': 1, 'x': {'c': 3}})
t2 = AddTreeValue({'a': 5, 'x': {'c': 7}})

print(t1)
# <AddTreeValue 0x7ff25d729e50>
# ├── a --> 1
# └── x --> <AddTreeValue 0x7ff25d729e20>
#     └── c --> 3
print(t1 + 2)
# <AddTreeValue 0x7ff25d72caf0>
# ├── a --> 3
# └── x --> <AddTreeValue 0x7ff25c17aa90>
#     └── c --> 5
print(3 + t1)
# <AddTreeValue 0x7ff25d72caf0>
# ├── a --> 4
# └── x --> <AddTreeValue 0x7ff25c17aa90>
#     └── c --> 6
print(t1 + t2)
# <AddTreeValue 0x7ff25d72caf0>
# ├── a --> 6
# └── x --> <AddTreeValue 0x7ff25c17aa90>
#     └── c --> 10

t1 += t2 + 10
print(t1)
# <AddTreeValue 0x7ff25d729e50>
# ├── a --> 16
# └── x --> <AddTreeValue 0x7ff25d729e20>
#     └── c --> 20

As a developer of treevalue, the author had the same idea. The library therefore provides a subclass, FastTreeValue, which builds on TreeValue and adds many common functions and operations to make it quicker and easier to use. This class has appeared many times since the first article of this series, and here we can finally reveal how it works. FastTreeValue already implements arithmetic operations such as the above in a similar way, so they can be used directly. For example:

from treevalue import FastTreeValue

t1 = FastTreeValue({'a': 1, 'x': {'c': 3}})
t2 = FastTreeValue({'a': 5, 'x': {'c': 7}})

print(t1 * (1 - t1 + t2) % 10 + (t2 // t1))  # complex calculation
# <FastTreeValue 0x7f973be1eaf0>
# ├── a --> 10
# └── x --> <FastTreeValue 0x7f973be1ea00>
#     └── c --> 7

t3 = FastTreeValue({'a': 1, 'b': 'sdjkfh', 'x': {'c': [1, 2], 'd': 1.2}})
t4 = FastTreeValue({'a': 4, 'b': 'anstr', 'x': {'c': [4, 5, -2], 'd': -8.5}})

print(t3 + t4)  # add all together, not only int or float
# <FastTreeValue 0x7f973be1e970>
# ├── a --> 5
# ├── b --> 'sdjkfhanstr'
# └── x --> <FastTreeValue 0x7f973be1eac0>
#     ├── c --> [1, 2, 4, 5, -2]
#     └── d --> -7.3

t5 = FastTreeValue({'a': {2, 3}, 'x': {'c': 8937}})
t6 = FastTreeValue({'a': {1, 2, 4}, 'x': {'c': 910}})

print(t5 | t6)  # | and &, between sets and ints
# <FastTreeValue 0x7f973be1e640>
# ├── a --> {1, 2, 3, 4}
# └── x --> <FastTreeValue 0x7f973be1e8e0>
#     └── c --> 9199
print(t5 & t6)
# <FastTreeValue 0x7f973be1e640>
# ├── a --> {2}
# └── x --> <FastTreeValue 0x7f973be1e8e0>
#     └── c --> 648

At this point the conventional arithmetic operations are all covered. Moreover, because Python dispatches each operator to the magic methods of the operands themselves, these operations are not limited to numeric values but work for any types that support the operator.
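The same point can be checked without any tree at all. The sketch below uses the standard operator module on the leaf values from the examples above; a treelized operator simply performs these calls node by node:

```python
import operator

# The same + works for ints, strings and lists.
pairs = [(1, 4), ('sdjkfh', 'anstr'), ([1, 2], [4, 5, -2])]
sums = [operator.add(x, y) for x, y in pairs]
print(sums)  # [5, 'sdjkfhanstr', [1, 2, 4, 5, -2]]

# | and & mean union/intersection on sets and bitwise or/and on ints.
set_or = operator.or_({2, 3}, {1, 2, 4})
int_or = operator.or_(8937, 910)
int_and = operator.and_(8937, 910)
print(set_or, int_or, int_and)  # {1, 2, 3, 4} 9199 648
```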

Extended thinking 3: drawing on the chapter "the wonderful use of magic methods" in Python Popular Science Series -- classes and methods (Part 2), think about how these arithmetic magic methods should be implemented. Then check the treevalue source code to verify your conjecture.

Discussion is welcome in the comments section!

Applications based on treelized operations

In fact, the special methods in Python whose names start and end with double underscores are not limited to the arithmetic operations above; a whole series of other operations can be extended in a similar way. The most typical are the functional magic methods. For example, we can extend __getitem__ and __setitem__ as follows:

from treevalue import TreeValue, method_treelize


class MyTreeValue(TreeValue):
    @method_treelize()
    def __getitem__(self, item):
        return self[item]

    @method_treelize()
    def __setitem__(self, key, value):
        self[key] = value

FastTreeValue has a similar implementation. One effect is that all the objects in the tree can be indexed at once. The code is as follows:

import torch

from treevalue import FastTreeValue

t1 = FastTreeValue({
    'a': torch.randn(2, 3),
    'x': {
        'c': torch.randn(3, 4),
    }
})

print(t1)
# <FastTreeValue 0x7f93f19b9c40>
# ├── a --> tensor([[-0.5878,  0.8615, -0.1703],
# │                 [ 1.5826, -0.5806,  1.5869]])
# └── x --> <FastTreeValue 0x7f93f19b9d00>
#     └── c --> tensor([[-0.3380, -0.6968,  0.7013, -0.8895],
#                       [-0.2798,  0.6196,  0.8141, -2.5651],
#                       [ 0.0113, -2.0468,  0.1121,  0.3606]])

print(t1[0])
# <FastTreeValue 0x7f93f19b9d30>
# ├── a --> tensor([-0.5878,  0.8615, -0.1703])
# └── x --> <FastTreeValue 0x7f93901c1fd0>
#     └── c --> tensor([-0.3380, -0.6968,  0.7013, -0.8895])
print(t1[:, 1:-1])
# <FastTreeValue 0x7f93f19b9d30>
# ├── a --> tensor([[ 0.8615],
# │                 [-0.5806]])
# └── x --> <FastTreeValue 0x7f93901c1fd0>
#     └── c --> tensor([[-0.6968,  0.7013],
#                       [ 0.6196,  0.8141],
#                       [-2.0468,  0.1121]])
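It may help to see what an index expression such as [:, 1:-1] looks like by the time it reaches __getitem__. The sketch below uses a hypothetical Probe class that simply returns the index object it receives; a treelized __getitem__ hands this same object to every node:

```python
class Probe:
    """Hypothetical helper: returns whatever index object it receives."""

    def __getitem__(self, item):
        return item


p = Probe()
single = p[0]
multi = p[:, 1:-1]
print(single)  # 0
print(multi)   # (slice(None, None, None), slice(1, -1, None))
```

Because the index is an ordinary Python object, it can be forwarded unchanged to each tensor in the tree.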

In addition, the TreeValue class has an _attr_extern method. A value contained in a TreeValue object is normally obtained by accessing it as an attribute; when the current tree node has no such key, _attr_extern is called instead. In the native TreeValue class this method simply raises a KeyError, while FastTreeValue implements an extension along the following lines (for illustration only; the real implementation differs slightly):

from treevalue import TreeValue, method_treelize


class MyTreeValue(TreeValue):
    @method_treelize()
    def _attr_extern(self, key):
        return getattr(self, key)

This makes the following possible:

import torch

from treevalue import FastTreeValue

t1 = FastTreeValue({
    'a': torch.randn(2, 3),
    'x': {
        'c': torch.randn(3, 4),
    }
})

print(t1.shape)
# <FastTreeValue 0x7fac48ac66d0>
# ├── a --> torch.Size([2, 3])
# └── x --> <FastTreeValue 0x7fac48ac6700>
#     └── c --> torch.Size([3, 4])
print(t1.sin)
# <FastTreeValue 0x7f0fcd0e36a0>
# ├── a --> <built-in method sin of Tensor object at 0x7f0fcd0ea040>
# └── x --> <FastTreeValue 0x7f0fcd0e3df0>
#     └── c --> <built-in method sin of Tensor object at 0x7f0fcd0ea080>

As you can see, not only can ordinary attributes (such as shape) be fetched and assembled into a tree, but object methods are extracted and assembled in the same way. This is because in Python the concept of an attribute (more precisely, a field) covers a lot, including methods (for details, see Python Popular Science Series -- classes and methods (Part 1)). We can therefore obtain a tree of object methods in the same way as in the code above, just like the sin method here.
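A quick check with a built-in type shows that data attributes and methods really are fetched through the same getattr mechanism. The complex number here is just a convenient stand-in for a tensor:

```python
z = 3 + 4j

real = getattr(z, 'real')       # a plain data attribute
conj = getattr(z, 'conjugate')  # a bound method object, not yet called

print(real)            # 3.0
print(callable(conj))  # True
print(conj())          # (3-4j)
```

The method object remembers the value it was fetched from, which is why a tree of such objects can be called later.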

At this point we can extend one more magic method, __call__, which lets an object be invoked like a function. It is overloaded as follows:

from treevalue import TreeValue, method_treelize


class MyTreeValue(TreeValue):
    @method_treelize()
    def __call__(self, *args, **kwargs):
        return self(*args, **kwargs)

FastTreeValue has a similar implementation, so the tree of object methods obtained above can actually be called. Combining the _attr_extern and __call__ extensions gives an even nicer usage: methods of the objects inside the tree can be called directly on the tree object, as shown below:

import torch

from treevalue import FastTreeValue

t1 = FastTreeValue({
    'a': torch.randn(2, 4),
    'x': {
        'c': torch.randn(3, 4),
    }
})

print(t1)
# <FastTreeValue 0x7f7e7534bc40>
# ├── a --> tensor([[ 1.4246,  0.4117, -1.1805,  0.1825],
# │                 [ 0.5865, -0.8895, -0.8055,  0.9112]])
# └── x --> <FastTreeValue 0x7f7e7534bd00>
#     └── c --> tensor([[ 1.6239e+00, -2.3074e+00, -2.8613e-01,  1.3310e+00],
#                       [-1.8917e-01,  1.6694e+00, -8.2944e-01,  2.8590e-01],
#                       [-4.0992e-01, -5.8827e-01,  2.0444e-03,  7.0647e-01]])

print(t1.sin())
# <FastTreeValue 0x7f7e7534bd30>
# ├── a --> tensor([[ 0.9893,  0.4002, -0.9248,  0.1814],
# │                 [ 0.5534, -0.7768, -0.7212,  0.7902]])
# └── x --> <FastTreeValue 0x7f7e7534bd60>
#     └── c --> tensor([[ 0.9986, -0.7407, -0.2822,  0.9714],
#                       [-0.1880,  0.9951, -0.7376,  0.2820],
#                       [-0.3985, -0.5549,  0.0020,  0.6491]])
print(t1.reshape((4, -1)))
# <FastTreeValue 0x7f7e13b43fa0>
# ├── a --> tensor([[ 1.4246,  0.4117],
# │                 [-1.1805,  0.1825],
# │                 [ 0.5865, -0.8895],
# │                 [-0.8055,  0.9112]])
# └── x --> <FastTreeValue 0x7f7e7534bd30>
#     └── c --> tensor([[ 1.6239e+00, -2.3074e+00, -2.8613e-01],
#                       [ 1.3310e+00, -1.8917e-01,  1.6694e+00],
#                       [-8.2944e-01,  2.8590e-01, -4.0992e-01],
#                       [-5.8827e-01,  2.0444e-03,  7.0647e-01]])

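# a different target shape for each node
# (the output below comes from a separate run, so its values differ from t1 above)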
new_shapes = FastTreeValue({'a': (1, -1), 'x': {'c': (2, -1)}})
print(t1.reshape(new_shapes))
# <FastTreeValue 0x7f98d95241f0>
# ├── a --> tensor([[ 2.0423, -0.5339, -0.4458, -0.3386,  0.1002,  0.6809, -0.3839,  1.9945]])
# └── x --> <FastTreeValue 0x7f993b3e3d30>
#     └── c --> tensor([[ 0.9726,  0.2787,  1.2419, -0.4118,  2.2535, -0.7826],
#                       [-0.9467,  0.3230, -0.6319, -0.2424,  0.4348,  1.3872]])

Readers may be confused at this point. Taking the reshape call above as an example, the mechanism is as follows:

First, t1.reshape enters the treelized _attr_extern method, which yields a tree of method objects; call it t1_m.
Next, t1_m((4, -1)) enters the treelized __call__ method, which runs each method in the tree and assembles the return values into a tree of final results, that is, t1.reshape((4, -1)).
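The two steps can be reproduced in a few lines of plain Python. MiniTree below is a hypothetical toy written for this sketch, with strings as leaves instead of tensors; it is not treevalue's implementation, and it leaves out features such as passing a tree of per-node arguments.

```python
class MiniTree:
    """Toy stand-in for a tree of values (hypothetical, not treevalue's code)."""

    def __init__(self, data):
        # Wrap nested dicts recursively; anything else is a leaf value.
        self._data = {k: MiniTree(v) if isinstance(v, dict) else v
                      for k, v in data.items()}

    def _map(self, func):
        # Apply func to every leaf, recursing into subtrees, and build a new tree.
        out = MiniTree({})
        out._data = {k: v._map(func) if isinstance(v, MiniTree) else func(v)
                     for k, v in self._data.items()}
        return out

    def __getattr__(self, name):
        # Step 1: only reached when normal attribute lookup fails.
        if name.startswith('_'):
            raise AttributeError(name)
        return self._map(lambda leaf: getattr(leaf, name))

    def __call__(self, *args, **kwargs):
        # Step 2: call every leaf and collect the results.
        return self._map(lambda leaf: leaf(*args, **kwargs))

    def to_dict(self):
        return {k: v.to_dict() if isinstance(v, MiniTree) else v
                for k, v in self._data.items()}


t1 = MiniTree({'a': 'hello', 'x': {'c': 'world'}})

t1_m = t1.upper   # step 1: a tree of bound str.upper methods
result = t1_m()   # step 2: call each method and assemble the results
print(result.to_dict())  # {'a': 'HELLO', 'x': {'c': 'WORLD'}}

replaced = t1.replace('l', 'L')  # both steps in one expression
print(replaced.to_dict())  # {'a': 'heLLo', 'x': {'c': 'worLd'}}
```

The design point is that neither step knows anything about str or torch: the tree only forwards attribute lookups and calls, so any leaf type with methods works.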

With these features, treevalue as a whole is enough to implement very rich and flexible functionality that is also easy to understand and maintain. treetensor, a tree wrapper library built specifically for torch, has already been released. If you are interested, you can learn more at opendilab / DI-treetensor.

Extended thinking 4: besides computing libraries such as numpy and torch and the reshape and sin examples above, what other common libraries and objects could achieve similar effects through these dynamic features?

Extended thinking 5: suppose the example above used a method like sum rather than reshape, and in some cases you wanted the sum over all objects in the whole tree. How should the design meet this requirement?

Extended thinking 6: what other operations are similar to sum? What do these operations have in common logically? How do they differ logically from methods such as reshape and sin?

Discussion is welcome in the comments section!

Coming up next

This article has focused on treevalue's core feature, function treelization, and shown its applications to class methods and magic methods. Due to space constraints we could only show these highlights. The next article will explain in detail how treevalue applies to numpy, torch and other computing libraries, and compare it with similar products. Stay tuned.

You are also welcome to check out OpenDILab's open source projects:

Open sourced Decision Intelligence (DI)
