Reading guide
This article mainly explains how to deploy a PyTorch model to a C++ platform. It is divided into four parts, covering in detail model conversion, saving the serialized model, loading the serialized PyTorch model in C++, and executing the Script Module.
Recently, for work, I needed to deploy a PyTorch model to a C++ platform. The basic process follows the tutorial examples on the official website. Along the way I ran into quite a few pits, which I record here.
1. Model conversion
Libtorch does not depend on Python. A model trained in Python must be converted to a ScriptModule before it can be loaded and run for inference by libtorch. For this step, the official website provides two methods:
Method 1: Tracing
This method is relatively simple. You only need to give the model a set of inputs, walk it through the inference network, and then use torch.jit.trace to record the information along the path and save it. An example is as follows:
import torch
import torchvision

# An instance of your model.
model = torchvision.models.resnet18()

# An example input you would normally provide to your model's forward() method.
example = torch.rand(1, 3, 224, 224)

# Use torch.jit.trace to generate a torch.jit.ScriptModule via tracing.
traced_script_module = torch.jit.trace(model, example)
The disadvantage is that if the model contains control flow, such as if/else statements, one set of inputs can only traverse one branch, so there is no way to record the model information completely.
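A minimal sketch of this pitfall (the Gate module here is a made-up example): tracing bakes in whichever branch the example input happens to take, and the other branch is silently dropped.

import torch

class Gate(torch.nn.Module):
    def forward(self, x):
        if x.sum() > 0:  # data-dependent branch
            return x * 2
        return x + 1

# The example input takes the `x * 2` branch, so only that branch is recorded
# (torch.jit.trace also emits a TracerWarning about the tensor-to-bool conversion).
traced = torch.jit.trace(Gate(), torch.ones(3))

print(traced(-torch.ones(3)))  # tensor([-2., -2., -2.]), not the x + 1 result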
Method 2: Scripting
Write the model directly in Torch Script and annotate it accordingly; torch.jit.script compiles the module and converts it into a ScriptModule. An example is as follows:
class MyModule(torch.nn.Module):
    def __init__(self, N, M):
        super(MyModule, self).__init__()
        self.weight = torch.nn.Parameter(torch.rand(N, M))

    def forward(self, input):
        if input.sum() > 0:
            output = self.weight.mv(input)
        else:
            output = self.weight + input
        return output

my_module = MyModule(10, 20)
sm = torch.jit.script(my_module)
The forward method is compiled by default, and methods called within forward are also compiled, in the order they are called.
If you want to compile a method other than forward that is not called by forward, you can add @torch.jit.export.
If you want a method not to be compiled, you can use @torch.jit.ignore (https://pytorch.org/docs/master/generated/torch.jit.ignore.html#torch.jit.ignore) or @torch.jit.unused (https://pytorch.org/docs/master/generated/torch.jit.unused.html#torch.jit.unused).
# Same behavior as pre-PyTorch 1.2
@torch.jit.script
def some_fn():
    return 2

# Marks a function as ignored, if nothing
# ever calls it then this has no effect
@torch.jit.ignore
def some_fn2():
    return 2

# As with ignore, if nothing calls it then it has no effect.
# If it is called in script it is replaced with an exception.
@torch.jit.unused
def some_fn3():
    import pdb; pdb.set_trace()
    return 4

# Doesn't do anything, this function is already
# the main entry point
@torch.jit.export
def some_fn4():
    return 2
There are many pits in this step. The main causes fall into the following two categories:
1. Unsupported operations
The operations supported by TorchScript are a subset of Python. Corresponding implementations can be found for most operations used in torch, but some operations are, awkwardly, not supported. See the detailed list at https://pytorch.org/docs/master/jit_unsupported.html#jit-unsupported. Here are some I have encountered:
1) Variable numbers of parameters / return values are not supported, for example:
def __init__(self, **kwargs):
or
if output_flag == 0:
    return reshape_logits
else:
    loss = self.loss(reshape_logits, term_mask, labels_id)
    return reshape_logits, loss
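A common workaround, sketched below with a hypothetical stand-in for the loss call: always return a tuple of fixed arity, filling in a dummy value on the branch that has no loss.

import torch
from typing import Tuple

@torch.jit.script
def forward_fixed(reshape_logits: torch.Tensor,
                  output_flag: int) -> Tuple[torch.Tensor, torch.Tensor]:
    # Both branches return the same number of values;
    # a zero tensor stands in for the missing loss.
    if output_flag == 0:
        return reshape_logits, torch.zeros(1)
    else:
        loss = reshape_logits.sum()  # hypothetical stand-in for self.loss(...)
        return reshape_logits, loss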
2) Various iteration operations
eg1.
layers = [int(a) for a in layers]
Error: torch.jit.frontend.UnsupportedNodeError: ListComp aren't supported
It can be changed to:
for k in range(len(layers)):
    layers[k] = int(layers[k])
eg2.
seq_iter = enumerate(scores)
try:
    _, inivalues = seq_iter.__next__()
except:
    _, inivalues = seq_iter.next()
eg3.
line = next(infile)
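A sketch of the usual workaround for these iterator patterns, assuming the data is a tensor: replace enumerate()/next() with explicit index-based access.

import torch

@torch.jit.script
def accumulate(scores: torch.Tensor) -> torch.Tensor:
    # instead of: seq_iter = enumerate(scores); _, inivalues = seq_iter.__next__()
    inivalues = scores[0]
    total = inivalues
    # instead of advancing an iterator with next(), loop over explicit indices
    for i in range(1, scores.size(0)):
        total = total + scores[i]
    return total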
3) Unsupported statements
eg1. continue is not supported (a workaround sketch follows these examples)
torch.jit.frontend.UnsupportedNodeError: continue statements aren't supported
eg2. try/except is not supported
torch.jit.frontend.UnsupportedNodeError: try blocks aren't supported
eg3. The with statement is not supported
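As promised above, a minimal sketch of the usual continue workaround: invert the condition so the loop body is guarded by an if instead.

import torch

@torch.jit.script
def sum_nonnegative(values: torch.Tensor) -> torch.Tensor:
    total = torch.zeros(1)
    for i in range(values.size(0)):
        # inverted condition replaces `if values[i] < 0: continue`
        if values[i] >= 0:
            total = total + values[i]
    return total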
4) Other common ops / modules
eg1. torch.autograd.Variable
Solution: use torch.ones/torch.randn etc. together with .float()/.long() etc.
eg2. torch.Tensor/torch.LongTensor etc.
Solution: same as above
eg3. The requires_grad parameter is only supported in torch.tensor; torch.ones/torch.zeros etc. do not support it
eg4. tensor.numpy()
eg5. tensor.bool()
Solution: replace tensor.bool() with tensor > 0
eg6. self.seg_emb(seg_fea_ids).to(embeds.device)
Solution: explicitly call .cuda() wherever data needs to be moved to the GPU
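Putting these fixes together, a minimal sketch of script-friendly replacements (the shapes here are arbitrary):

import torch

# instead of torch.autograd.Variable / torch.Tensor / torch.LongTensor:
x = torch.ones(2, 3).float()
ids = torch.zeros(4).long()

# instead of tensor.bool():
mask = x > 0

# instead of tensor.to(other.device): move data explicitly
if torch.cuda.is_available():
    x = x.cuda()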
In a word: avoid libraries other than native Python and PyTorch, such as numpy, and try to use PyTorch's own APIs for everything.
2. Specifying data types
1) Attributes: most member data types can be inferred from their values, but empty lists / dictionaries need their types specified in advance, for example:
from typing import Dict

class MyModule(torch.nn.Module):
    my_dict: Dict[str, int]

    def __init__(self):
        super(MyModule, self).__init__()
        # This type cannot be inferred and must be specified
        self.my_dict = {}

        # The attribute type here is inferred to be `int`
        self.my_int = 20

    def forward(self):
        pass

m = torch.jit.script(MyModule())
2) Constants: use the Final keyword, for example:
try:
    from typing_extensions import Final
except:
    # If you don't have `typing_extensions` installed, you can use a
    # polyfill from `torch.jit`.
    from torch.jit import Final

class MyModule(torch.nn.Module):
    my_constant: Final[int]

    def __init__(self):
        super(MyModule, self).__init__()
        self.my_constant = 2

    def forward(self):
        pass

m = torch.jit.script(MyModule())
3) Variables: parameters default to the tensor type and cannot change type, so non-tensor types must be annotated explicitly, for example:
def forward(self, batch_size:int, seq_len:int, use_cuda:bool):
Method 3: mixing Tracing and Scripting
One approach is to call a scripted function within traced code, which is suitable when only a small part of the model requires control flow, for example:
import torch

@torch.jit.script
def foo(x, y):
    if x.max() > y.max():
        r = x
    else:
        r = y
    return r

def bar(x, y, z):
    return foo(x, y) + z

traced_bar = torch.jit.trace(bar, (torch.rand(3), torch.rand(3), torch.rand(3)))
The other approach is to use tracing to generate submodules inside a scripted module. For layers that use Python features the script module does not support, you can wrap the relevant layers and record their flow with trace, leaving the other layers unmodified. A usage example is as follows:
import torch
import torchvision

class MyScriptModule(torch.nn.Module):
    def __init__(self):
        super(MyScriptModule, self).__init__()
        self.means = torch.nn.Parameter(torch.tensor([103.939, 116.779, 123.68])
                                        .resize_(1, 3, 1, 1))
        self.resnet = torch.jit.trace(torchvision.models.resnet18(),
                                      torch.rand(1, 3, 224, 224))

    def forward(self, input):
        return self.resnet(input - self.means)

my_script_module = torch.jit.script(MyScriptModule())
2. Saving the serialized model
If you made it through all the pits in the previous step, saving the model is very simple: you only need to call save and pass a filename. Note that if you train the model on GPU but want to run inference on CPU, you must convert it before saving, and remember to call model.eval() first, like so:
gpu_model.eval()
cpu_model = gpu_model.cpu()
sample_input_cpu = sample_input_gpu.cpu()

traced_cpu = torch.jit.trace(cpu_model, sample_input_cpu)
torch.jit.save(traced_cpu, "cpu.pth")

traced_gpu = torch.jit.trace(gpu_model, sample_input_gpu)
torch.jit.save(traced_gpu, "gpu.pth")
3. Loading the trained model in C++
To load a serialized PyTorch model in C++, you must rely on the PyTorch C++ API (also known as libtorch). Installing libtorch is very simple: just download the appropriate version from the PyTorch website (https://pytorch.org/) and unzip it. You will get a folder with the following structure:
libtorch/
  bin/
  include/
  lib/
  share/
Then you can build the application. A simple example directory structure is as follows:
example-app/
  CMakeLists.txt
  example-app.cpp
Example code for example-app.cpp and CMakeLists.txt is as follows:
#include <torch/script.h> // One-stop header.

#include <iostream>
#include <memory>

int main(int argc, const char* argv[]) {
  if (argc != 2) {
    std::cerr << "usage: example-app <path-to-exported-script-module>\n";
    return -1;
  }

  torch::jit::script::Module module;
  try {
    // Deserialize the ScriptModule from a file using torch::jit::load().
    module = torch::jit::load(argv[1]);
  }
  catch (const c10::Error& e) {
    std::cerr << "error loading the model\n";
    return -1;
  }

  std::cout << "ok\n";
}
cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(custom_ops)

find_package(Torch REQUIRED)

add_executable(example-app example-app.cpp)
target_link_libraries(example-app "${TORCH_LIBRARIES}")
set_property(TARGET example-app PROPERTY CXX_STANDARD 14)
At this point, you can run the following commands from the example-app/ folder:
mkdir build
cd build
cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..
cmake --build . --config Release
where /path/to/libtorch is the path to the unpacked libtorch folder. If this step succeeds, you will see the build reach 100% completion. Then run the compiled executable, and you should see the output "ok". Congratulations!
4. Executing the Script Module
Finally, the last step! Now you only need to construct the input, pass it to the model, and execute forward to get the output. A simple example is as follows:
// Create a vector of inputs.
std::vector<torch::jit::IValue> inputs;
inputs.push_back(torch::ones({1, 3, 224, 224}));

// Execute the model and turn its output into a tensor.
at::Tensor output = module.forward(inputs).toTensor();
std::cout << output.slice(/*dim=*/1, /*start=*/0, /*end=*/5) << '\n';
The first two lines create a vector of torch::jit::IValue and add a single input tensor, created with torch::ones(), the C++ API equivalent of torch.ones. Then the forward method of the script::Module is run, and the returned IValue is converted into a tensor by calling toTensor(). The C++ API is quite friendly to the various torch operations: via the torch:: namespace, or with a trailing underscore for in-place variants, corresponding implementations can be found for most methods, for example:
torch::tensor(input_list[j]).to(at::kLong).resize_({batch, 128}).clone()
// torch::tensor corresponds to PyTorch's torch.tensor;
// at::kLong corresponds to torch.int64;
// resize_ corresponds to resize
Finally, check that the C++ output is consistent with PyTorch's, and you're done~
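A minimal sketch of the Python-side check, assuming a model with the ResNet-style input shape from the examples above and the cpu.pth file from the saving step: run the same input through the saved module and compare the first few values against the C++ printout.

import torch

module = torch.jit.load("cpu.pth")  # the file saved in step 2
module.eval()

# same input as the C++ side: torch::ones({1, 3, 224, 224})
output = module(torch.ones(1, 3, 224, 224))
print(output[0, :5])  # compare with the C++ output.slice(1, 0, 5)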
I stepped into countless pits and pulled out countless hairs along the way. Much of this I figured out on my own, so if there are mistakes, please point them out!
Reference material
PyTorch C++ API - PyTorch master documentation
Torch Script - PyTorch master documentation