
[BUG][CUDAProvider] No attribute with name:'activation'is defined #23119

Open · Cookiee235 opened this issue Dec 16, 2024 · 5 comments
Labels
model:transformer issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc.

Comments

@Cookiee235

Describe the issue

Bug Report

Loading and optimizing the model with the CUDA provider crashes. In comparison, the same optimization runs fine on the CPU.

The crash stack trace:

Traceback (most recent call last):
  File "test", line 7, in <module>
    optimized_model = optimizer.optimize_model(model_path, opt_level=1, use_gpu=True)  
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/software/onnxruntime/build/Linux/Release/onnxruntime/transformers/optimizer.py", line 381, in optimize_model
    temp_model_path = optimize_by_onnxruntime(
                      ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/software/onnxruntime/build/Linux/Release/onnxruntime/transformers/optimizer.py", line 206, in optimize_by_onnxruntime
    onnxruntime.InferenceSession(onnx_model, sess_options, providers=providers, **kwargs)
  File "/software/onnxruntime/build/Linux/Release/onnxruntime/capi/onnxruntime_inference_collection.py", line 465, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/software/onnxruntime/build/Linux/Release/onnxruntime/capi/onnxruntime_inference_collection.py", line 537, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: /software/onnxruntime/onnxruntime/contrib_ops/cuda/fused_conv.cc:67 onnxruntime::contrib::cuda::FusedConv<T>::FusedConv(const onnxruntime::OpKernelInfo&) [with T = float] [ONNXRuntimeError] : 1 : FAIL : No attribute with name:'activation'is defined.

To reproduce

  1. Download model here
  2. Run the test script:
from onnxruntime.transformers import optimizer

model_path = "model_with_activation.onnx"
optimized_model_path = "./opt.onnx"
# Crashes during session initialization when use_gpu=True and opt_level >= 1:
optimized_model = optimizer.optimize_model(model_path, opt_level=1, use_gpu=True)
optimized_model.save_model_to_file(optimized_model_path)

Notice:

  1. use_gpu=True and opt_level >= 1 --> crash
  2. use_gpu=False --> runs fine
  3. opt_level=0 --> runs fine
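
For reference, a minimal sketch (using the standard onnx Python package; the filename matches the repro above) that lists the FusedConv nodes in the model and whether each carries an activation attribute:

import onnx

# Load the model from the repro above and look for FusedConv nodes
# in the com.microsoft domain that lack the 'activation' attribute.
model = onnx.load("model_with_activation.onnx")
for node in model.graph.node:
    if node.op_type == "FusedConv" and node.domain == "com.microsoft":
        has_activation = any(attr.name == "activation" for attr in node.attribute)
        print(node.name or "<unnamed FusedConv>", "activation attribute present:", has_activation)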

Urgency

No response

Platform

Linux

OS Version

Ubuntu 12.04

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

5c1b7cc

ONNX Runtime API

Python

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

No response

@github-actions bot added the model:transformer label Dec 16, 2024
@Cookiee235 Cookiee235 changed the title [BUG][ONNXProvider] No attribute with name:'activation'is defined [BUG][CUDAProvider] No attribute with name:'activation'is defined Dec 16, 2024
@tianleiwu (Contributor)

This is an invalid model. It has a node using FusedConv, but the node does not have the required activation attribute.
See https://github.com/microsoft/onnxruntime/blob/main/docs/ContribOperators.md#com.microsoft.FusedConv
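
For illustration, a minimal sketch of adding the missing attribute with the onnx helper API; note that "Relu" here is only an assumed placeholder, since the correct value depends on what the model was actually meant to compute:

import onnx
from onnx import helper

model = onnx.load("model_with_activation.onnx")
for node in model.graph.node:
    if node.op_type == "FusedConv" and node.domain == "com.microsoft":
        if not any(attr.name == "activation" for attr in node.attribute):
            # "Relu" is an assumed placeholder; use the activation the
            # model was actually meant to fuse.
            node.attribute.append(helper.make_attribute("activation", "Relu"))
onnx.save(model, "model_with_activation_fixed.onnx")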

@Cookiee235 (Author)

@tianleiwu Thanks for your explanation. I’m still a bit confused. If the FusedConv operator requires an activation attribute, why does the provided model pass validation (onnx.checker.check_model()) and execute inference successfully on the CPU? Additionally, why does the model run without any exception when optimizations are disabled (opt_level=0)?

@tianleiwu (Contributor)

FusedConv is not an operator in the onnx domain; it is in the com.microsoft domain. onnx.checker.check_model() has no knowledge of operators in the com.microsoft domain, so it may simply ignore them.
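
A self-contained sketch of that behavior: build a tiny graph whose only node is a com.microsoft FusedConv without an activation attribute, then run the checker on it (the shapes and opset versions here are arbitrary choices). Consistent with what was observed above, the checker raises no error:

import onnx
from onnx import TensorProto, helper

# A FusedConv node in the com.microsoft domain, deliberately missing
# the 'activation' attribute.
node = helper.make_node("FusedConv", ["X", "W"], ["Y"], domain="com.microsoft")
graph = helper.make_graph(
    [node],
    "fused_conv_check",
    [helper.make_tensor_value_info("X", TensorProto.FLOAT, [1, 1, 4, 4]),
     helper.make_tensor_value_info("W", TensorProto.FLOAT, [1, 1, 3, 3])],
    [helper.make_tensor_value_info("Y", TensorProto.FLOAT, [1, 1, 2, 2])],
)
model = helper.make_model(
    graph,
    opset_imports=[helper.make_opsetid("", 17),
                   helper.make_opsetid("com.microsoft", 1)],
)
# The checker has no schemas for the com.microsoft domain, so this passes
# even though the CUDA FusedConv kernel would reject the node.
onnx.checker.check_model(model)
print("check_model raised no error")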

The CPU provider has a different implementation for the activation. When activation is not defined, it falls back to Identity:

activation.ActivationKind = MlasIdentityActivation;
std::string activation_type;
if (info.GetAttr<std::string>("activation", &activation_type).IsOK()) {
  if (activation_type == "Relu") {
    activation.ActivationKind = MlasReluActivation;
  } else if (activation_type == "Tanh") {
    activation.ActivationKind = MlasTanhActivation;
  } else if (activation_type == "Sigmoid") {
    activation.ActivationKind = MlasLogisticActivation;
  } else {
    // The remaining activation types have additional parameters to be pulled out.
    size_t activation_params_count;
    if (activation_type == "LeakyRelu") {
      activation.ActivationKind = MlasLeakyReluActivation;
      activation_params_count = 1;
    } else if (activation_type == "Clip") {
      activation.ActivationKind = MlasClipActivation;
      activation_params_count = 2;
    } else if (activation_type == "HardSigmoid") {
      activation.ActivationKind = MlasHardSigmoidActivation;
      activation_params_count = 2;
    } else {
      return Status(common::ONNXRUNTIME, common::INVALID_ARGUMENT, "unimplemented activation: " + activation_type);
    }

So the model can be handled by the CPU provider.

@Cookiee235 (Author)

@tianleiwu Thanks for the clarification.

Is it possible to modify the CUDA provider implementation to align it with the CPU behavior? Specifically, the default value of activation could be treated as Identity. This would ensure that CPU and GPU exhibit the same behavior.

@tianleiwu Do you think these changes would be a valid solution? Thanks again!

@tianleiwu (Contributor)

Is it possible to modify the CUDA provider implementation to align it with the CPU behavior? Specifically, the default value of activation could be treated as Identity. This would ensure that CPU and GPU exhibit the same behavior.

Yes, it is possible to change the CUDA code to align with the CPU behavior here, and also to update the spec to mark the attribute as optional.
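
Until such a change lands, one possible user-side workaround (a sketch, not an official fix): since the CPU kernel treats a missing activation as Identity, a FusedConv node without the attribute computes the same thing as a plain Conv, so it can be rewritten back to the standard op, provided it does not use the optional residual input Z (a fourth input):

import onnx

model = onnx.load("model_with_activation.onnx")
for node in model.graph.node:
    if (node.op_type == "FusedConv" and node.domain == "com.microsoft"
            and not any(a.name == "activation" for a in node.attribute)
            and len(node.input) <= 3):  # Conv has no residual-add input Z
        # No activation means Identity, so FusedConv(X, W[, B]) is just Conv;
        # the shared Conv attributes (pads, strides, ...) carry over unchanged.
        node.op_type = "Conv"
        node.domain = ""
onnx.save(model, "model_conv_only.onnx")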
