torch_tensorrt — Torch-TensorRT v1.0.0 documentation


torch_tensorrt.set_device(gpu_id)


torch_tensorrt.compile(module: Any, ir='default', inputs=[], enabled_precisions={<dtype.float: 0>}, **kwargs)

Compile a PyTorch module for NVIDIA GPUs using TensorRT

Takes an existing PyTorch module and a set of settings to configure the compiler, then lowers and compiles the module to TensorRT using the path specified in ir, returning a PyTorch module.

Only the forward method of the module is converted.

Parameters

module (Union[torch.nn.Module, torch.jit.ScriptModule]) – Source module

Keyword Arguments

inputs (List[Union[torch_tensorrt.Input, torch.Tensor]]) – Required. List of specifications of input shape, dtype and memory layout for inputs to the module. Input sizes can be specified as torch sizes, tuples or lists. dtypes can be specified using torch datatypes or torch_tensorrt datatypes, and you can use either torch devices or the torch_tensorrt device type enum to select the device type.

    inputs=[
        torch_tensorrt.Input((1, 3, 224, 224)),  # Static NCHW input shape for input #1
        torch_tensorrt.Input(
            min_shape=(1, 224, 224, 3),
            opt_shape=(1, 512, 512, 3),
            max_shape=(1, 1024, 1024, 3),
            dtype=torch.int32,
            format=torch.channels_last,
        ),  # Dynamic input shape for input #2
        torch.randn((1, 3, 224, 244)),  # Use an example tensor and let torch_tensorrt infer settings
    ]

              

enabled_precisions (Set[Union[torch.dtype, torch_tensorrt.dtype]]) – The set of datatypes that TensorRT can use when selecting kernels
ir (str) – The requested compilation strategy. Options: default (let Torch-TensorRT decide), ts (TorchScript with scripting path)
**kwargs – Additional settings for the specific requested strategy (see submodules for more info)

Returns

Compiled module; when run, it executes via TensorRT

Return type

torch.nn.Module
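A minimal usage sketch of the call described above, assuming a CUDA-capable machine with torch and torch_tensorrt installed. The wrapper name compile_with_trt and the 1x3x224x224 default shape are illustrative assumptions, not part of the library. The imports are deferred so that the helper can be defined on a machine without the GPU stack.

```python
def compile_with_trt(model, input_shape=(1, 3, 224, 224)):
    """Compile `model`'s forward method with Torch-TensorRT (hypothetical helper)."""
    # Deferred imports: both packages are heavy and expect a CUDA setup.
    import torch
    import torch_tensorrt

    # TensorRT targets NVIDIA GPUs, so put the module on the GPU in eval mode.
    model = model.eval().cuda()

    return torch_tensorrt.compile(
        model,
        inputs=[torch_tensorrt.Input(input_shape)],    # one static-shape input
        enabled_precisions={torch.float, torch.half},  # allow FP32 and FP16 kernels
    )
```

The returned module is intended to be called like the original one, with input tensors that already live on the GPU.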


torch_tensorrt.convert_method_to_trt_engine(module: Any, method_name: str, ir='default', inputs=[], enabled_precisions={<dtype.float: 0>}, **kwargs)

Convert a TorchScript module method to a serialized TensorRT engine

Converts a specified method of a module to a serialized TensorRT engine given a dictionary of conversion settings

Parameters

module (Union[torch.nn.Module, torch.jit.ScriptModule]) – Source module
method_name (str) – Name of the method to convert

Keyword Arguments

inputs (List[Union[torch_tensorrt.Input, torch.Tensor]]) – Required. List of specifications of input shape, dtype and memory layout for inputs to the module. Input sizes can be specified as torch sizes, tuples or lists. dtypes can be specified using torch datatypes or torch_tensorrt datatypes, and you can use either torch devices or the torch_tensorrt device type enum to select the device type.

    inputs=[
        torch_tensorrt.Input((1, 3, 224, 224)),  # Static NCHW input shape for input #1
        torch_tensorrt.Input(
            min_shape=(1, 224, 224, 3),
            opt_shape=(1, 512, 512, 3),
            max_shape=(1, 1024, 1024, 3),
            dtype=torch.int32,
            format=torch.channels_last,
        ),  # Dynamic input shape for input #2
        torch.randn((1, 3, 224, 244)),  # Use an example tensor and let torch_tensorrt infer settings
    ]

              

enabled_precisions (Set[Union[torch.dtype, torch_tensorrt.dtype]]) – The set of datatypes that TensorRT can use when selecting kernels
ir (str) – The requested compilation strategy. Options: default (let Torch-TensorRT decide), ts (TorchScript with scripting path)
**kwargs – Additional settings for the specific requested strategy (see submodules for more info)

Returns

Serialized TensorRT engine, which can either be saved to a file or deserialized via TensorRT APIs

Return type

bytes
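Because the return value is plain bytes, saving it needs nothing beyond the standard library. The sketch below shows one way to do that; save_engine and the file name model.engine are hypothetical names chosen for illustration, and the commented call assumes torch_tensorrt is installed with a module that has a forward method.

```python
from pathlib import Path


def save_engine(engine: bytes, path: str) -> int:
    """Write a serialized TensorRT engine to `path` and return the byte count."""
    if not isinstance(engine, (bytes, bytearray)):
        raise TypeError("expected the serialized engine as bytes")
    return Path(path).write_bytes(engine)


# With torch_tensorrt available, the bytes would come from a call such as:
#   engine = torch_tensorrt.convert_method_to_trt_engine(
#       module, "forward", inputs=[torch_tensorrt.Input((1, 3, 224, 224))])
#   save_engine(engine, "model.engine")
```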


torch_tensorrt.get_build_info() → str

Returns a string containing the build information of the torch_tensorrt distribution

Returns: String containing the build information for the torch_tensorrt distribution
Return type: str


torch_tensorrt.dump_build_info()

Prints build information about the torch_tensorrt distribution to stdout

class torch_tensorrt.Input(*args, **kwargs)

Defines an input to a module in terms of expected shape, data type and tensor format.

__init__(*args, **kwargs)

__init__ Method for torch_tensorrt.Input

Input accepts one of a few construction patterns

Parameters

shape (Tuple or List, optional) – Static shape of input tensor

Keyword Arguments

shape (Tuple or List, optional) – Static shape of input tensor
min_shape (Tuple or List, optional) – Minimum size of the input tensor's shape range
opt_shape (Tuple or List, optional) – Optimal size of the input tensor's shape range
max_shape (Tuple or List, optional) – Maximum size of the input tensor's shape range
Note: For a dynamic input, all three of min_shape, opt_shape and max_shape must be provided, there must be no positional arguments, and shape must not be defined. This implicitly sets the Input's shape_mode to DYNAMIC.
dtype (torch.dtype or torch_tensorrt.dtype) – Expected data type for input tensor (default: torch_tensorrt.dtype.float32)
format (torch.memory_format or torch_tensorrt.TensorFormat) – The expected format of the input tensor (default: torch_tensorrt.TensorFormat.NCHW)

Examples

Input([1,3,32,32], dtype=torch.float32, format=torch.channels_last)
Input(shape=(1,3,32,32), dtype=torch_tensorrt.dtype.int32, format=torch_tensorrt.TensorFormat.NCHW)
Input(min_shape=(1,3,32,32), opt_shape=[2,3,32,32], max_shape=(3,3,32,32)) #Implicitly dtype=torch_tensorrt.dtype.float32, format=torch_tensorrt.TensorFormat.NCHW
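The dynamic-shape pattern can be illustrated with a small pure-Python check. This is a sketch, not the library's own validation: check_shape_range is a hypothetical helper, and the per-dimension ordering min <= opt <= max is an assumption based on how TensorRT optimization profiles are normally specified. The returned dict mirrors the form that the shape attribute is documented to take for dynamic inputs.

```python
from typing import Dict, Sequence, Tuple


def check_shape_range(min_shape: Sequence[int],
                      opt_shape: Sequence[int],
                      max_shape: Sequence[int]) -> Dict[str, Tuple[int, ...]]:
    """Return the dict form of a dynamic shape, or raise ValueError."""
    shapes = [tuple(min_shape), tuple(opt_shape), tuple(max_shape)]
    # All three shapes must describe tensors of the same rank.
    if len({len(s) for s in shapes}) != 1:
        raise ValueError("min_shape, opt_shape and max_shape must have the same rank")
    # Assumed constraint: each dimension satisfies min <= opt <= max.
    for lo, opt, hi in zip(*shapes):
        if not lo <= opt <= hi:
            raise ValueError(f"expected min <= opt <= max, got {lo}, {opt}, {hi}")
    return {"min_shape": shapes[0], "opt_shape": shapes[1], "max_shape": shapes[2]}
```

Running such a check before constructing an Input can surface a swapped min and max early, rather than at engine build time.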

dtype = <dtype.unknown: 5>

The expected data type of the input tensor (default: torch_tensorrt.dtype.float32)

format = <TensorFormat.contiguous: 0>

The expected format of the input tensor (default: torch_tensorrt.TensorFormat.NCHW)

shape = None

Either a single Tuple or a dict of tuples defining the input shape. Static shaped inputs will have a single tuple. Dynamic inputs will have a dict of the form { "min_shape": Tuple, "opt_shape": Tuple, "max_shape": Tuple }

Type: Tuple or Dict

shape_mode = None

Whether the input is statically or dynamically shaped

Type: torch_tensorrt.Input._ShapeMode

class torch_tensorrt.Device(*args, **kwargs)

Defines a device that can be used to specify target devices for engines

__init__(*args, **kwargs)

__init__ Method for torch_tensorrt.Device

Device accepts one of a few construction patterns

Parameters

spec (str) – String with device spec, e.g. "dla:0" for DLA with core_id 0

Keyword Arguments

gpu_id (int) – ID of target GPU (overridden with the ID of the GPU managing the DLA if dla_core is specified). If specified, no positional arguments should be provided.
dla_core (int) – ID of target DLA core. If specified, no positional arguments should be provided.
allow_gpu_fallback (bool) – Allow TensorRT to schedule operations on the GPU if they are not supported on DLA (ignored if device type is not DLA)

Examples

Device("gpu:1")
Device("cuda:1")
Device("dla:0", allow_gpu_fallback=True)
Device(gpu_id=0, dla_core=0, allow_gpu_fallback=True)
Device(dla_core=0, allow_gpu_fallback=True)
Device(gpu_id=1)
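To make the spec-string form concrete, here is a small pure-Python sketch of how strings like those in the examples could be split into a device kind and an ID. parse_device_spec is a hypothetical helper written for illustration; it is not the parser torch_tensorrt uses, and it only covers the gpu, cuda and dla prefixes shown above.

```python
def parse_device_spec(spec: str) -> dict:
    """Split a spec such as "gpu:1" or "dla:0" into a device type and an integer ID."""
    kind, sep, index = spec.strip().lower().partition(":")
    if not sep or not index.isdigit():
        raise ValueError(f"expected '<kind>:<id>', got {spec!r}")
    if kind in ("gpu", "cuda"):
        # Both prefixes name a GPU target in the examples above.
        return {"device_type": "GPU", "gpu_id": int(index)}
    if kind == "dla":
        return {"device_type": "DLA", "dla_core": int(index)}
    raise ValueError(f"unknown device kind {kind!r}")
```

For instance, parse_device_spec("cuda:1") yields a GPU entry with gpu_id 1, which corresponds to the keyword form Device(gpu_id=1).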

allow_gpu_fallback = False

Whether falling back to the GPU is allowed when DLA cannot support an op

Type: bool

device_type = None

Target device type (GPU or DLA). Set implicitly based on whether dla_core is specified.

Type: torch_tensorrt.DeviceType

dla_core = -1

Core ID for target DLA core

Type: int

gpu_id = -1

Device ID for target GPU

Type: int

class torch_tensorrt.dtype

Enum to specify operating precision for engine execution

Members:

float : 32 bit floating point number

float32 : 32 bit floating point number

half : 16 bit floating point number

float16 : 16 bit floating point number

int8 : 8 bit integer number

int32 : 32 bit integer number

bool : Boolean value

unknown : Unknown data type

class torch_tensorrt.DeviceType

Enum to specify device kinds to build TensorRT engines for

Members:

GPU : Specify using GPU to execute TensorRT Engine

DLA : Specify using DLA to execute TensorRT Engine (Jetson Only)

class torch_tensorrt.EngineCapability

Enum to specify engine capability settings (selections of kernels to meet safety requirements)

Members:

safe_gpu : Use safety GPU kernels only

safe_dla : Use safety DLA kernels only

default : Use default behavior

class torch_tensorrt.TensorFormat

Enum to specify the memory layout of tensors

Members:

contiguous : Contiguous memory layout (NCHW / Linear)

channels_last : Channels last memory layout (NHWC)
