[Feature Request] Use ONNX Runtime TensorRT execution provider with the lean TensorRT runtime #23082
Labels: ep:TensorRT, feature request
Describe the feature request
I am running inference with the ONNX Runtime TensorRT Execution Provider by loading a TensorRT cached engine that was previously generated (with libnvinfer_builder_resource.so available) from an original .onnx model.
At runtime I therefore no longer need the huge 1 GB+ libnvinfer_builder_resource.so, and I can deploy my ONNX Runtime environment without it (tested).
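For context, here is a minimal sketch of the session setup this workflow relies on, using the ONNX Runtime C++ API with the TensorRT EP engine cache enabled so the previously built plan is deserialized instead of rebuilt; the model and cache paths are hypothetical:

```cpp
// Minimal sketch: create an ORT session with the TensorRT EP and an engine
// cache so that a previously built plan is reused. Paths are hypothetical.
#include <onnxruntime_cxx_api.h>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "trt-lean-test");
  Ort::SessionOptions session_options;

  const OrtApi& api = Ort::GetApi();
  OrtTensorRTProviderOptionsV2* trt_options = nullptr;
  Ort::ThrowOnError(api.CreateTensorRTProviderOptions(&trt_options));

  // Point the EP at the cache directory that already contains the engine
  // built earlier, while libnvinfer_builder_resource.so was still installed.
  const char* keys[] = {"trt_engine_cache_enable", "trt_engine_cache_path"};
  const char* values[] = {"1", "/opt/model/trt_cache"};
  Ort::ThrowOnError(api.UpdateTensorRTProviderOptions(trt_options, keys, values, 2));

  session_options.AppendExecutionProvider_TensorRT_V2(*trt_options);
  api.ReleaseTensorRTProviderOptions(trt_options);

  // Session creation is where libonnxruntime_providers_tensorrt.so gets
  // loaded, and where the undefined-symbol failure reported below occurs
  // when only libnvinfer_lean.so is present.
  Ort::Session session(env, "/opt/model/model.onnx", session_options);
  return 0;
}
```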
I would also like to avoid deploying the 230 MB libnvinfer.so and ship the 30 MB libnvinfer_lean.so instead. The NVIDIA TensorRT documentation says this is possible as long as the TensorRT engine is no longer built at runtime (quoting https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html):
We provide the possibility to install TensorRT in three different modes:
- A full installation of TensorRT, including TensorRT plan file builder functionality. This mode is the same as the runtime provided before TensorRT 8.6.0.
- A lean runtime installation is significantly smaller than the full installation. It allows you to load and run engines built with a version-compatible builder flag. However, this installation does not provide the functionality to build a TensorRT plan file.
- A dispatch runtime installation. This installation allows for deployments with minimum memory consumption. It allows you to load and run engines built with a version-compatible builder flag and includes the lean runtime. However, it does not provide the functionality to build a TensorRT plan file.
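For reference, the "version-compatible builder flag" mentioned above corresponds to kVERSION_COMPATIBLE in the plain TensorRT C++ builder API. Below is a rough sketch of building such a plan directly from an ONNX model; the paths are hypothetical, and this is not necessarily what the ORT TRT EP does internally when it writes its engine cache:

```cpp
// Hedged sketch (plain TensorRT API, outside ONNX Runtime): serialize a plan
// with the version-compatible flag that the lean runtime can later load.
#include <NvInfer.h>
#include <NvOnnxParser.h>
#include <fstream>
#include <iostream>
#include <memory>

class Logger : public nvinfer1::ILogger {
  void log(Severity severity, const char* msg) noexcept override {
    if (severity <= Severity::kWARNING) std::cerr << msg << std::endl;
  }
};

int main() {
  Logger logger;
  auto builder = std::unique_ptr<nvinfer1::IBuilder>(nvinfer1::createInferBuilder(logger));
  auto network = std::unique_ptr<nvinfer1::INetworkDefinition>(builder->createNetworkV2(
      1U << static_cast<uint32_t>(nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH)));
  auto parser = std::unique_ptr<nvonnxparser::IParser>(nvonnxparser::createParser(*network, logger));
  parser->parseFromFile("/opt/model/model.onnx",
                        static_cast<int>(nvinfer1::ILogger::Severity::kWARNING));

  auto config = std::unique_ptr<nvinfer1::IBuilderConfig>(builder->createBuilderConfig());
  // Without this flag the plan can only be deserialized by the exact same
  // full TensorRT version; with it, the lean/dispatch runtimes can load it.
  config->setFlag(nvinfer1::BuilderFlag::kVERSION_COMPATIBLE);

  auto plan = std::unique_ptr<nvinfer1::IHostMemory>(
      builder->buildSerializedNetwork(*network, *config));
  std::ofstream out("/opt/model/trt_cache/model.engine", std::ios::binary);
  out.write(static_cast<const char*>(plan->data()), plan->size());
  return 0;
}
```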
However, when I redirect libnvinfer.so to libnvinfer_lean.so as advised, I get an error from the ONNX Runtime TensorRT Execution Provider:
```
terminate called after throwing an instance of 'Ort::Exception'
what(): /tmp/onnxruntime/onnxruntime/core/session/provider_bridge_ort.cc:1419 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_tensorrt.so with error: /usr/local/lib/libonnxruntime_providers_tensorrt.so: undefined symbol: createInferBuilder_INTERNAL
```
Am I making a mistake, or is the ONNX Runtime TRT EP currently not capable of using libnvinfer_lean alone instead of libnvinfer? In the latter case, adding this improvement to ORT would be nice when trying to reduce the ONNX Runtime GPU image footprint.
Describe scenario use case
1. Build a TRT cached engine from an ONNX model with libnvinfer_builder_resource.so deployed.
2. Then deploy ONNX Runtime and the TRT cached engine in an environment where libnvinfer.so now points to libnvinfer_lean.so and libnvinfer_builder_resource.so has been removed.
3. Then load the cached engine from the ONNX Runtime TRT EP: the model should be loaded and inference run with libnvinfer_lean.so only (see the sketch below).
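As a rough illustration of the last step, here is a sketch at the plain TensorRT API level of what the lean runtime can still do, namely deserialize and execute an existing plan but not build one; the engine path is hypothetical and this is not the EP's actual code path:

```cpp
// Hedged sketch: libnvinfer_lean.so exports createInferRuntime (deserialize
// and run engines) but not createInferBuilder, which is exactly the symbol
// the ORT TRT EP fails to resolve above. Path is hypothetical.
#include <NvInferRuntime.h>
#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>

class Logger : public nvinfer1::ILogger {
  void log(Severity severity, const char* msg) noexcept override {
    if (severity <= Severity::kWARNING) std::cerr << msg << std::endl;
  }
};

int main() {
  Logger logger;
  std::ifstream f("/opt/model/trt_cache/model.engine", std::ios::binary);
  std::vector<char> blob((std::istreambuf_iterator<char>(f)),
                         std::istreambuf_iterator<char>());

  // Deserialization only: no builder and no libnvinfer_builder_resource.so
  // are needed for this path.
  nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(logger);
  nvinfer1::ICudaEngine* engine =
      runtime->deserializeCudaEngine(blob.data(), blob.size());
  std::cout << (engine ? "engine loaded" : "deserialization failed") << std::endl;
  return 0;
}
```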
ORT version 1.18.1 on Linux SLES 15, compiled with GCC 11 and the --use_tensorrt_oss_parser flag, as in the build command:
```bash
CC=gcc-11 CXX=g++-11 ./build.sh \
  --skip_submodule_sync --nvcc_threads 2 \
  --config $ORT_BUILD_MODE --use_cuda \
  --cudnn_home /usr/local/cuda/lib64 \
  --cuda_home /usr/local/cuda/ --use_tensorrt \
  --use_tensorrt_oss_parser --tensorrt_home /usr/local/TensorRT \
  --build_shared_lib --parallel --skip_tests \
  --allow_running_as_root --cmake_extra_defines "CMAKE_CUDA_ARCHITECTURES=75" \
  --cmake_extra_defines "CMAKE_CUDA_HOST_COMPILER=/usr/bin/gcc-11"
```
CUDA 12.4
cuDNN 8.9.7.29
TensorRT 8.6.1