Using Torch-TensorRT Directly From PyTorch

You can now access TensorRT directly from PyTorch APIs. The process for using this feature is very similar to the compilation workflow described in Using Torch-TensorRT in Python.

Start by loading torch_tensorrt into your application.

import torch
import torch_tensorrt

Then, given a TorchScript module, you can compile it with TensorRT using the torch._C._jit_to_backend("tensorrt", ...) API.

import torchvision.models as models

# Load a pretrained model and put it in eval mode so batch-norm and
# dropout layers behave correctly for inference, then script it
model = models.mobilenet_v2(pretrained=True).eval()
script_model = torch.jit.script(model)
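
If torch.jit.script cannot handle your model, tracing is a common alternative way to obtain a TorchScript module. This is a minimal sketch using the standard torch.jit.trace API, with an example input matching the shape used in the spec below:

# Alternative: trace the model with an example input if scripting fails
example_input = torch.randn(1, 3, 300, 300)
traced_model = torch.jit.trace(model, example_input)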

Unlike the compile API in Torch-TensorRT, which assumes you are trying to compile the forward method of a module, or convert_method_to_trt_engine, which converts a specified method to a TensorRT engine, the backend API takes a dictionary that maps the names of the methods to compile to Compilation Spec objects. These objects wrap the same sort of dictionary you would provide to compile. For more information on the compile spec dictionary, take a look at the documentation for the Torch-TensorRT TensorRTCompileSpec API.

spec = {
    "forward": torch_tensorrt.ts.TensorRTCompileSpec({
        "inputs": [torch_tensorrt.Input([1, 3, 300, 300])],
        "enabled_precisions": {torch.float, torch.half},
        "refit": False,
        "debug": False,
        "strict_types": False,
        "device": {
            "device_type": torch_tensorrt.DeviceType.GPU,
            "gpu_id": 0,
            "dla_core": 0,
            "allow_gpu_fallback": True,
        },
        "capability": torch_tensorrt.EngineCapability.default,
        "num_min_timing_iters": 2,
        "num_avg_timing_iters": 1,
        "max_batch_size": 0,
    })
}
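
Since the inference example below feeds half-precision inputs, you can also pin the expected input dtype on the Input object. This is an optional sketch; it assumes your version of torch_tensorrt.Input supports the dtype keyword:

# Optional: declare the expected input dtype so the engine is built
# for half-precision inputs
torch_tensorrt.Input([1, 3, 300, 300], dtype=torch.half)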

Now, to compile with Torch-TensorRT, provide the target module and the spec dictionary to torch._C._jit_to_backend("tensorrt", ...).

trt_model = torch._C._jit_to_backend("tensorrt", script_model, spec)

To run inference, explicitly call the method you want to run (as opposed to calling the module itself, as you can in standard PyTorch).

# The input must match the compiled shape (1, 3, 300, 300); half precision
# is among the enabled precisions in the spec above
input_data = torch.randn((1, 3, 300, 300)).to("cuda").to(torch.half)
print(trt_model.forward(input_data))
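
Because the lowered module is still a TorchScript module, it can usually be saved and reloaded with the standard TorchScript serialization APIs. This is a sketch rather than part of the example above; the file name is arbitrary, and torch_tensorrt must be imported before loading so the "tensorrt" backend is registered:

# Save the lowered module like any other TorchScript module
torch.jit.save(trt_model, "trt_mobilenet_v2.ts")

# In a fresh process, import torch_tensorrt before loading so the
# "tensorrt" backend is available, then run as before
import torch
import torch_tensorrt  # registers the "tensorrt" backend

reloaded = torch.jit.load("trt_mobilenet_v2.ts")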