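With streaming inference, the first generation is much slower than the second. What causes this? The first inference also prints some warnings; are they related?
Model loading:
Output of the first run:
Second run (no warnings):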
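This is not limited to streaming inference; it happens whenever the model is loaded. My understanding is that loading the model does not fully initialize it: some variables and objects are only created during the first inference, so later inferences skip that setup step. My workaround was to modify api_v2.py to run one inference on an arbitrary piece of text before starting the API service. After that, the latency of subsequent API calls is normal.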
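The warm-up idea above can be sketched roughly as follows. This is a minimal illustration, not the actual change made to api_v2.py: `warm_up`, `DummyPipeline`, and its `infer` method are hypothetical names, and the dummy class only simulates lazy first-call setup with a short sleep. In a real service you would pass in whatever streaming inference callable your pipeline exposes.

```python
import time
from typing import Callable, Iterable


def warm_up(infer: Callable[[str], Iterable[bytes]], text: str = "warm-up") -> float:
    """Run one throwaway inference and return the elapsed time in seconds.

    `infer` is assumed to be a callable that takes text and yields audio
    chunks (a streaming generator). The output is drained and discarded.
    """
    start = time.perf_counter()
    for _ in infer(text):  # consume the whole stream so all lazy setup runs
        pass
    return time.perf_counter() - start


class DummyPipeline:
    """Hypothetical stand-in for a TTS pipeline with lazy initialization."""

    def __init__(self) -> None:
        self.initialized = False
        self.setup_calls = 0

    def infer(self, text: str):
        if not self.initialized:
            # Simulates one-time work done only on the first inference.
            self.setup_calls += 1
            time.sleep(0.05)
            self.initialized = True
        for ch in text:
            yield ch.encode("utf-8")


if __name__ == "__main__":
    pipeline = DummyPipeline()
    first = warm_up(pipeline.infer)   # pays the one-time setup cost
    second = warm_up(pipeline.infer)  # setup is already done
    print(f"first call: {first:.3f}s, second call: {second:.3f}s")
    # Start the API server only after this point.
```

Calling `warm_up` once before the server starts accepting requests moves the one-time cost to startup, so the first real request should see roughly the same latency as later ones.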