
[Feature Request] Use ONNX Runtime TensorRT execution provider with lean TensorRT runtime #23082

Open
jcdatin opened this issue Dec 11, 2024 · 1 comment
Labels: ep:TensorRT (issues related to TensorRT execution provider), feature request (request for unsupported feature or enhancement)

Comments

jcdatin commented Dec 11, 2024

Describe the feature request

I am running inference with the ONNX Runtime TensorRT Execution Provider by loading a cached TensorRT engine that was previously generated, with libnvinfer_builder_resource.so present, from an original .onnx model.
So at runtime I no longer need the huge 1 GB+ libnvinfer_builder_resource.so, and I can deploy my ONNX Runtime environment without it (tested).
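
For reference, the session setup looks roughly like this; a minimal sketch using the Python API (the deployment itself uses the C++ API, and the model/cache paths are placeholders), with the documented trt_engine_cache_enable / trt_engine_cache_path options of the TensorRT EP:

```python
# Minimal sketch (Python API, placeholder paths): create a session with the
# TensorRT EP and point it at a directory holding a previously built engine,
# so no engine build (and no libnvinfer_builder_resource.so) is needed.
import onnxruntime as ort

trt_options = {
    "trt_engine_cache_enable": True,         # reuse cached engines instead of rebuilding
    "trt_engine_cache_path": "./trt_cache",  # directory holding the prebuilt engine
}
sess = ort.InferenceSession(
    "model.onnx",
    providers=[("TensorrtExecutionProvider", trt_options)],
)
```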

I would also like to avoid deploying the 230 MB libnvinfer.so and ship the 30 MB libnvinfer_lean.so instead. NVIDIA TensorRT says this is possible if we no longer build the TensorRT engine (quoting https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html):

We provide the possibility to install TensorRT in three different modes:
- A full installation of TensorRT, including TensorRT plan file builder functionality. This mode is the same as the runtime provided before TensorRT 8.6.0.
- A lean runtime installation, significantly smaller than the full installation. It allows you to load and run engines built with the version-compatible builder flag, but it does not provide the functionality to build a TensorRT plan file.
- A dispatch runtime installation. This allows for deployments with minimum memory consumption. It can load and run engines built with the version-compatible builder flag and includes the lean runtime, but it does not provide the functionality to build a TensorRT plan file.
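
For context, the lean runtime can only load engines built with that version-compatible flag. Outside of ORT, this is roughly how such an engine is produced with the TensorRT Python API (a sketch assuming TensorRT >= 8.6; file names are placeholders):

```python
# Sketch: build a version-compatible TensorRT plan from an ONNX model, so it
# can later be deserialized by the lean runtime. TensorRT >= 8.6 assumed.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
# VERSION_COMPATIBLE marks the plan as loadable by the lean/dispatch runtimes.
config.set_flag(trt.BuilderFlag.VERSION_COMPATIBLE)
plan = builder.build_serialized_network(network, config)
with open("model.plan", "wb") as f:
    f.write(plan)
```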

However, when I redirect libnvinfer.so to libnvinfer_lean.so as advised, I get the following error from the ONNX Runtime TensorRT Execution Provider:
terminate called after throwing an instance of 'Ort::Exception'
what(): /tmp/onnxruntime/onnxruntime/core/session/provider_bridge_ort.cc:1419 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_tensorrt.so with error: /usr/local/lib/libonnxruntime_providers_tensorrt.so: undefined symbol: createInferBuilder_INTERNAL
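
The missing symbol is consistent with the lean library not exporting the builder entry point that the EP links against unconditionally. A quick probe along these lines illustrates it (the .so names/versions are assumptions for this TRT 8.6 setup):

```python
# Sketch: check both libraries for the builder entry point the EP needs.
# Library file names are assumptions for a TensorRT 8.6 installation.
import ctypes

for lib in ("libnvinfer.so.8", "libnvinfer_lean.so.8"):
    try:
        handle = ctypes.CDLL(lib)
        found = hasattr(handle, "createInferBuilder_INTERNAL")
        print(f"{lib}: createInferBuilder_INTERNAL {'found' if found else 'missing'}")
    except OSError as err:
        print(f"{lib}: could not load ({err})")
# Expected: found in libnvinfer.so.8, missing in libnvinfer_lean.so.8,
# matching the dlopen failure above.
```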

Am I making a mistake, or is the ONNX Runtime TRT EP currently not capable of using libnvinfer_lean.so alone instead of libnvinfer.so? If it is the latter, adding this improvement to ORT would be nice for reducing the ONNX Runtime GPU image footprint.

Describe scenario use case

1. Build a cached TRT engine from an ONNX model with libnvinfer_builder_resource.so deployed.
2. Deploy ONNX Runtime and the cached TRT engine in an environment where libnvinfer.so now points to libnvinfer_lean.so and libnvinfer_builder_resource.so has been removed.
3. Load the cached engine from the ONNX Runtime TRT EP: the engine should be loaded and inference run with libnvinfer_lean.so only (sketched below).
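
The end state of step 3 would look roughly like this; a sketch assuming the lean Python wheel (tensorrt_lean) mirrors the full runtime's deserialization API and that the cached engine was built with the version-compatible flag:

```python
# Sketch of the desired deployment: deserialize and run the cached engine with
# only the lean runtime present; no builder functionality is involved.
# tensorrt_lean and the engine file name are assumptions.
import tensorrt_lean as trt

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)
with open("trt_cache/model.engine", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()
# ... allocate device buffers and call context.execute_v2(...) as usual
```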

ORT version 1.18.1 on Linux SLES 15, compiled with GCC 11 and the --use_tensorrt_oss_parser flag, as in this build command:
CC=gcc-11 CXX=g++-11 ./build.sh \
  --skip_submodule_sync --nvcc_threads 2 \
  --config $ORT_BUILD_MODE --use_cuda \
  --cudnn_home /usr/local/cuda/lib64 \
  --cuda_home /usr/local/cuda/ --use_tensorrt \
  --use_tensorrt_oss_parser --tensorrt_home /usr/local/TensorRT \
  --build_shared_lib --parallel --skip_tests \
  --allow_running_as_root --cmake_extra_defines "CMAKE_CUDA_ARCHITECTURES=75" \
  --cmake_extra_defines "CMAKE_CUDA_HOST_COMPILER=/usr/bin/gcc-11"

CUDA 12.4
cuDNN 8.9.7.29
TensorRT 8.6.1

@jcdatin jcdatin added the feature request request for unsupported feature or enhancement label Dec 11, 2024
@github-actions github-actions bot added the ep:TensorRT issues related to TensorRT execution provider label Dec 11, 2024
chilo-ms (Contributor) commented:

Thanks for raising this request.
The TensorRT EP doesn't support the TRT lean runtime yet, but it's in our backlog.
We will discuss this internally.
