
[CUDAProvider] Graph Optimization output an invalid model #23118

Open
Cookiee235 opened this issue Dec 16, 2024 · 2 comments
Labels
core runtime (issues related to core runtime)
model:transformer (issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc.)

Comments

@Cookiee235

Describe the issue

Bug Report

The initial ONNX model was fed into optimizer.optimize_model for optimization on CUDA, which produced the optimized model.
However, loading the optimized model then failed with "This is an invalid model. Graph output (v4_0) does not exist in the graph."

Expected Behavior:

The optimized model should be a valid model.

The traceback:

Traceback (most recent call last):
  File "/share_container/optfuzz/ONNX/bugs/bug8.py", line 10, in <module>
    optimized_session = ort.InferenceSession(optimized_model_path)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/software/onnxruntime/build/Linux/Release/onnxruntime/capi/onnxruntime_inference_collection.py", line 465, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/software/onnxruntime/build/Linux/Release/onnxruntime/capi/onnxruntime_inference_collection.py", line 526, in _create_inference_session
    sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime.capi.onnxruntime_pybind11_state.Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from ./opt.onnx failed:/software/onnxruntime/onnxruntime/core/graph/graph.cc:1467 void onnxruntime::Graph::InitializeStateFromModelFileGraphProto() This is an invalid model. Graph output (v4_0) does not exist in the graph.

The original ONNX model (part): [image]

The optimized ONNX model (part): [image]

To reproduce

Step 1: Download the model via this link
Step 2: run the following script:

import onnx
import onnxruntime as ort
from onnxruntime.transformers import optimizer

model_path = "duplicate_output.onnx"
optimized_model_path = "./opt.onnx"
optimized_model = optimizer.optimize_model(model_path, opt_level=1, use_gpu=True)  # opt_level=1 removes the duplicate output
optimized_model.save_model_to_file(optimized_model_path)
print(onnx.load(optimized_model_path).graph.output)  # the dangling "v4_0" entry should have been removed from graph.output as well
optimized_session = ort.InferenceSession(optimized_model_path)
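
Until this is fixed, a possible workaround (a minimal sketch; the file names match the script above) is to prune any graph output that nothing in the optimized graph produces before creating the session:

import onnx

# Load the (invalid) optimized model and drop dangling graph outputs.
opt = onnx.load("./opt.onnx")
graph = opt.graph

# Collect every value name that actually exists in the optimized graph:
# node outputs, graph inputs, and initializers.
produced = {out for node in graph.node for out in node.output}
produced.update(i.name for i in graph.input)
produced.update(init.name for init in graph.initializer)

# Keep only the graph outputs (e.g. "v4_0") that still have a producer.
kept = [o for o in graph.output if o.name in produced]
del graph.output[:]
graph.output.extend(kept)

onnx.save(opt, "./opt_fixed.onnx")  # this copy should now load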

Urgency

No response

Platform

Linux

OS Version

Ubuntu 20.04

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

5c1b7cc

ONNX Runtime API

Python

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

No response

@Cookiee235 Cookiee235 changed the title Graph Optimization output an invalid model [CUDAProvider] Graph Optimization output an invalid model Dec 16, 2024
@github-actions github-actions bot added the model:transformer issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc. label Dec 16, 2024
@yuslepukhin yuslepukhin added the core runtime issues related to core runtime label Dec 16, 2024
@xadupre
Member

xadupre commented Dec 17, 2024

In your model, the operator Div computes x/x. That should reduce to a constant 1 (for non-zero x). You could replace this operator with ConstantOfShape(Shape(x)) in your case.
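
For illustration, a minimal sketch of that substitution with the onnx.helper API (the names "x", "y", "x_shape" and the float dtype are assumptions about your model):

from onnx import helper, TensorProto

# Instead of: y = Div(x, x)
# emit a tensor of ones with the same shape as x.
shape_node = helper.make_node("Shape", inputs=["x"], outputs=["x_shape"])
ones_node = helper.make_node(
    "ConstantOfShape",
    inputs=["x_shape"],
    outputs=["y"],
    # ConstantOfShape fills with zeros by default, so the value attribute is needed here.
    value=helper.make_tensor("one", TensorProto.FLOAT, [1], [1.0]),
)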

@Cookiee235
Author

@xadupre Thank you for your suggestion. With your guidance, optimizing this model no longer crashes.

However, even if the given model is technically invalid, ONNX Runtime should ideally produce a valid optimized model instead of crashing unexpectedly. Fixing this issue in the ONNX Runtime source code would not only prevent such crashes but also ensure that a valid optimized model is generated. This would be an exciting improvement!

@xadupre Do you think we should fix this issue in ONNX Runtime? Thanks a lot!
