Regarding the issue of starting services in multithreading #23094

Open
MenGuangwen0411 opened this issue Dec 12, 2024 · 0 comments
Labels
performance issues related to performance regressions

Comments

@MenGuangwen0411

Describe the issue

How can I use onnxruntime with CUDAExecutionProvider in a multithreaded Python server?

To reproduce

from waitress import serve
from flask import Flask
import onnxruntime
import time

app = Flask(__name__)
session = onnxruntime.InferenceSession("best.onnx", providers=['CUDAExecutionProvider'])

@app.route('/')
def infer_model():
    ......
    t1 = time.time()
    outputs = session.run(["output"], {"input": img})[0]
    t2 = time.time()
    ts = t2 - t1
    print(ts)
    ......

if __name__ == '__main__':
    serve(app, host='0.0.0.0', port=8080)  # case 1

When I send POST requests from multiple threads, ts ranges between 0.02 s and 0.4 s. But if I configure serve as follows:

if __name__ == '__main__':
    serve(app, host='0.0.0.0', port=8080, threads=1)  # case 2

ts is approximately 0.025 s and very stable.
How can I get results from case 1 that are as smooth and fast as case 2?
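One workaround worth trying (my own assumption, not an official ONNX Runtime recommendation) is to serialize calls on the shared session with a threading.Lock, so concurrent waitress worker threads never run inference on the single CUDA session at the same time. A minimal self-contained sketch, with session.run replaced by a hypothetical stand-in (fake_session_run) so it runs without a model or GPU:

```python
import threading
import time

# Stand-in for onnxruntime.InferenceSession.run; in the real app this
# would be the single session created once at startup.
def fake_session_run(output_names, feeds):
    time.sleep(0.01)  # simulate inference latency
    return [sum(feeds["input"])]

# One lock shared by all request-handling threads.
infer_lock = threading.Lock()

def infer_model(img):
    # Serialize access to the shared session so concurrent
    # request threads do not interleave on the GPU.
    with infer_lock:
        t1 = time.time()
        out = fake_session_run(["output"], {"input": img})[0]
        ts = time.time() - t1
    return out, ts

# Simulate four concurrent requests, like a multithreaded server would.
results = []
threads = [threading.Thread(target=lambda: results.append(infer_model([1, 2, 3])))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print([out for out, ts in results])
```

With the lock, per-call latency should be close to the single-threaded (threads=1) case, since the GPU only ever serves one request at a time; throughput is of course capped the same way as in case 2.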

Urgency

No response

Platform

Windows

OS Version

win10

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.16

ONNX Runtime API

Python

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

No response

Model File

No response

Is this a quantized model?

Yes

@MenGuangwen0411 MenGuangwen0411 added the performance issues related to performance regressions label Dec 12, 2024