How to perform inference on an MoE model with expert parallelism #6891

Guodanding · 2024-12-18T13:13:52Z

Hello, I want to run inference on the Hugging Face MoE model Qwen1.5-MoE-A2.7B with expert parallelism, using DeepSpeed in a multi-GPU environment. However, the official tutorials are not comprehensive enough, and even after reviewing the documentation I still don't know how to proceed.

Could you please help me refine this request?
