Using Torch-TensorRT Directly From PyTorch
You can now access TensorRT directly from PyTorch APIs. The process of using this feature is very similar to the compilation workflow described in Using Torch-TensorRT in Python.
Start by loading torch_tensorrt into your application.
import torch
import torch_tensorrt
Then, given a TorchScript module, you can compile it with TensorRT using the torch._C._jit_to_backend("tensorrt", ...) API.
import torchvision.models as models
model = models.mobilenet_v2(pretrained=True)
script_model = torch.jit.script(model)
Unlike the compile API in Torch-TensorRT, which assumes you are trying to compile the forward function of a module, or convert_method_to_trt_engine, which converts a specified function to a TensorRT engine, the backend API takes a dictionary that maps the names of the functions to compile to Compilation Spec objects, which wrap the same sort of dictionary you would provide to compile. For more information on the compile spec dictionary, take a look at the documentation for the Torch-TensorRT TensorRTCompileSpec API.
spec = {
    "forward": torch_tensorrt.ts.TensorRTCompileSpec({
        "inputs": [torch_tensorrt.Input([1, 3, 300, 300])],
        "enabled_precisions": {torch.float, torch.half},
        "refit": False,
        "debug": False,
        "strict_types": False,
        "device": {
            "device_type": torch_tensorrt.DeviceType.GPU,
            "gpu_id": 0,
            "dla_core": 0,
            "allow_gpu_fallback": True,
        },
        "capability": torch_tensorrt.EngineCapability.default,
        "num_min_timing_iters": 2,
        "num_avg_timing_iters": 1,
        "max_batch_size": 0,
    })
}
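For comparison, the same input and precision settings expressed through the top-level torch_tensorrt.compile API described in Using Torch-TensorRT in Python would look roughly like the sketch below; this call is shown only to illustrate how the backend spec mirrors the compile dictionary and is not part of the backend workflow itself.
# Rough equivalent using the top-level compile API, shown only for comparison.
# The backend spec above wraps the same kind of settings dictionary.
trt_ts_module = torch_tensorrt.compile(
    script_model,
    inputs=[torch_tensorrt.Input([1, 3, 300, 300])],
    enabled_precisions={torch.float, torch.half},
)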
Now to compile with Torch-TensorRT, provide the target module object and the spec dictionary to torch._C._jit_to_backend("tensorrt", ...).
trt_model = torch._C._jit_to_backend("tensorrt", script_model, spec)
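The lowered module is still a TorchScript module, so it should be possible to persist it with the standard TorchScript serialization APIs. The following is a minimal sketch, assuming the TensorRT backend in your build supports TorchScript serialization; the file name is illustrative.
# Sketch: save and reload the backend-compiled module.
# Assumes the TensorRT backend supports standard TorchScript serialization;
# importing torch_tensorrt is required so the backend is registered at load time.
torch.jit.save(trt_model, "trt_mobilenet_v2.ts")
trt_model = torch.jit.load("trt_mobilenet_v2.ts")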
To run the model, explicitly call the method you want to run (as opposed to calling the module itself, as you would in standard PyTorch).
input = torch.randn((1, 3, 300, 300)).to("cuda").to(torch.half)
print(trt_model.forward(input))
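To sanity-check the compiled module, you can compare its output against the original model run in half precision on the GPU. This is a minimal sketch and not part of the original example.
# Sketch: compare the TensorRT-backed output against the original PyTorch model.
model = model.eval().half().to("cuda")
with torch.no_grad():
    reference = model(input)
# Report the largest elementwise difference between the two outputs.
print(torch.max(torch.abs(trt_model.forward(input) - reference)))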