I want to trace a PyTorch model (specifically a voice processing system) into TorchScript (the counterpart of a frozen model or a TFLite model in TensorFlow).
PyTorch itself does some type inference, but it has its limits. A speech processing system in particular contains a lot of code besides the neural network part, and because it uses dynamic arrays and recursion, you need to annotate types appropriately.
Assume PyTorch v1.4.0 (the latest stable as of 04/04/2020).
https://pytorch.org/docs/stable/jit.html
Is tracing the ideal approach, with scripting (TorchScript) as the runner-up? It seems that you can combine both.
typing
(Use the typing module and Python's type annotation syntax.) You can find out the types by stepping through your current Python script.
Should I use the Python debugger or IPython? Would it be displayed in JupyterLab?
I only know basic vim + command-line execution, so I check by inserting print(type(x)) and the like wherever I want to know a type...
As https://pytorch.org/docs/stable/jit.html describes, for Python 2 and 3.5 you can use torch.jit.annotate or write the type in a comment, but it is recommended to use the typing module and Python 3 type annotation syntax, like so:
from typing import List, Tuple

import torch

def forward(self, x: List[torch.Tensor]) -> Tuple[torch.Tensor]:
    my_list: List[Tuple[int, float]] = []
    ...
Optional
Like std::optional in C++, a value that may be None or hold some type T can be typed as Optional[T], e.g. Optional[int].
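For example, a minimal sketch (the function name and the None-check pattern are just an illustration):

from typing import Optional

import torch

@torch.jit.script
def add_bias(x: torch.Tensor, bias: Optional[torch.Tensor] = None) -> torch.Tensor:
    # TorchScript requires an explicit None check to refine Optional[Tensor] to Tensor
    if bias is not None:
        x = x + bias
    return x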
@torch.jit.export
Normally, only the forward() method and the functions called from forward are JIT compiled, but you can explicitly export (JIT compile) other methods with the @torch.jit.export decorator.
(forward is implicitly decorated with @torch.jit.export.)
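A minimal sketch (the module and method names here are made up):

import torch

class MyModule(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * 2.0

    @torch.jit.export
    def preprocess(self, x: torch.Tensor) -> torch.Tensor:
        # Not reachable from forward(), so it would normally be skipped;
        # @torch.jit.export forces it to be JIT compiled as well.
        return x - x.mean()

scripted = torch.jit.script(MyModule())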
nn.ModuleList
self.mods = nn.ModuleList([...])
...
for i in range(10):
    self.mods[i](x)

Array-index access like this is currently not possible.
[jit] Can't index nn.ModuleList in script function #16123 https://github.com/pytorch/pytorch/issues/16123
For the time being, the workaround seems to be to register the list in __constants__ and iterate in the form for mod in modules, but if you use multiple nn.ModuleLists you end up having to define a dedicated class for each (see the sketch below).
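A sketch of that pattern (the layer sizes are made up):

import torch

class Stack(torch.nn.Module):
    __constants__ = ['mods']  # declare the ModuleList constant so TorchScript can unroll it

    def __init__(self):
        super().__init__()
        self.mods = torch.nn.ModuleList([torch.nn.Linear(8, 8) for _ in range(10)])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Iterate instead of indexing; this form is supported
        for mod in self.mods:
            x = mod(x)
        return x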
However, in this case the network op definitions change (the names in state_dict change), so the weights of a pretrained model have to be remapped accordingly.
Also, in v1.5.0 (v1.6.0?), array indexing with constant integers such as self.mods[0] has begun to be supported.
[JIT] Add modulelist indexing for integer literal #29236 https://github.com/pytorch/pytorch/pull/29236
In my case an error occurred when evaluating the resulting TorchScript (on the libtorch side): the generated code contains an expression like getattr(xxx, 10), which cannot be parsed at runtime. We need to wait a little longer for this feature to mature.
In addition, when iterating over an nn.ModuleList, reverse iteration with reversed() is not supported.
print, assert
print and assert also work in TorchScript (though maybe not under tracing).
They can be used to emit messages for debugging.
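For example (a made-up scripted function):

import torch

@torch.jit.script
def clamp_positive(x: torch.Tensor) -> torch.Tensor:
    assert x.dim() == 1, "expected a 1-D tensor"
    print("clamp_positive: numel =", x.numel())  # also shows up when run from libtorch
    return x.clamp(min=0.0)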
torch.jit.is_scripting()
There are cases where you want to skip some processing when running as a script, or where a value is usually None but otherwise has an arbitrary type, so it cannot be typed as Optional[T] and you want to branch into separate code paths.
torch.jit.is_scripting() determines at runtime whether the code is running as a script (executed by libtorch), so it cannot be used to determine whether the code is being traced (compiled).
It would be nice if there were a decorator for this, but there doesn't seem to be one at present. So it doesn't appear possible to switch between Python and TorchScript within a single expression; as shown in the TorchScript documentation, you split the paths into separate functions:
@torch.jit.ignore
def forward_pytorch():
    # Python-only path (may use numpy, etc.)
    ...

def forward_for_torchscript():
    # TorchScript-compatible path
    ...

def forward():
    if torch.jit.is_scripting():
        return forward_for_torchscript()
    else:
        return forward_pytorch()
However, since every expression (statement) is itself subject to compilation, code that uses .numpy() and the like will fail to compile with an error. You need to factor such code into functions as above and move the parts that only run under PyTorch (+ numpy) behind @torch.jit.ignore (code under @torch.jit.unused is still compiled).
The generator expression (GeneratorExp) used internally is not supported by TorchScript. In my case it only adjusted the memory layout for the GPU, so it could safely be ignored (the code deleted).
@torch.jit.unused decorator, @torch.jit.ignore decorator
unused and ignore can be used when a method such as forward is defined for training purposes but you want it ignored in TorchScript.
The difference between unused and ignore is that unused raises an exception when the method is called, while ignore does nothing when the method is called. Basically, it seems better to use unused.
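A minimal sketch of the difference (the class and method bodies are made up):

import torch

class Net(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * 2.0

    @torch.jit.unused
    def augment(self, x: torch.Tensor) -> torch.Tensor:
        # Training-only path; in a scripted model, calling this raises an exception
        import numpy as np
        return torch.from_numpy(np.flip(x.numpy(), 0).copy())

    @torch.jit.ignore
    def log_stats(self, x: torch.Tensor):
        # Left as a plain Python call and not compiled
        print(float(x.mean()))

scripted = torch.jit.script(Net())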
F.pad(x, [0, 0, maxlen - m.size(2), 0])
         ^^^^^^^^^^^^^^^^^^^^^

The list argument was not type-inferred as List[int] (m is a torch.Tensor). It was solved by explicitly creating an int-typed variable first.
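The fix looked something like this (assuming maxlen is already an int):

pad_len: int = maxlen - m.size(2)  # an explicit int, so the list below is inferred as List[int]
x = F.pad(x, [0, 0, pad_len, 0])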
.numpy()
It seems that .numpy() cannot be used, e.g. x.cpu().data.numpy(). On the C++ side ATen handles tensors directly, so you may not need .numpy() at all... maybe?
Also, it is desirable not to use numpy functions in code that will be traced.
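For example, rather than round-tripping through numpy, it seems safer to stay on torch ops (a made-up example):

# Not scriptable: leaves TorchScript via numpy
# y = torch.from_numpy(np.log(x.cpu().numpy()))
# Scriptable equivalent built from aten ops only
y = torch.log(x)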
T.B.W.
forward
It seems good to explicitly specify the return type of the forward of the model that serves as the entry point.
This makes it easier to see what the type is when running on the C++ side.
(If the types do not match, an assertion is raised at runtime.)
When returning a single Tensor, it can be handled as torch::Tensor.
If you return multiple tensors, the result is a Tuple, so
model.forward(inputs).toTuple()
will do.
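On the Python side this amounts to something like (the names are made up):

from typing import Tuple

import torch

class Entry(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
        # Explicit return type: on the C++ side this arrives as a Tuple,
        # so model.forward(inputs).toTuple() can unpack it
        return x, x.sum()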
TODO
def myfun(x, activation=None):
    if activation is not None:
        x = activation(x)
    return x

myfun(x, activation=torch.relu)
What should I do when I want to accept something like this, where the argument is either an arbitrary function or None?
class Mod(torch.nn.Module):
    def forward(self, x, alpha=1.0):
        ...

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.mod = Mod()

    def forward(self, x):
        self.mod.forward(x)
        ...
As in this example, the forward of an internally called class has an optional argument. If I'm not actually using it, should I just delete it?