Pull requests: huggingface/text-generation-inference

Pull requests list

Add fp8 kv cache for ROCm
#2856 opened Dec 18, 2024 by mht-sharma Draft
5 tasks
Add Flash decoding kernel ROCm
#2855 opened Dec 18, 2024 by mht-sharma
5 tasks
change xpu lib download link
#2852 opened Dec 17, 2024 by sywangyi
5 tasks
Update Dockerfile to use devel image for compatibility
#2848 opened Dec 16, 2024 by YaserJaradeh
2 of 5 tasks
[TRTLLM] Expose finish reason
#2841 opened Dec 13, 2024 by mfuntowicz
Update tensor_parallel.py
#2798 opened Dec 3, 2024 by Lacacy
Install text-generation-server from poetry.lock export
#2786 opened Nov 29, 2024 by alvarobartt
1 of 5 tasks
Enable qwen2vl video
#2756 opened Nov 18, 2024 by drbh Draft
6 of 9 tasks
Add llama.cpp backend
#2723 opened Nov 4, 2024 by mfuntowicz
[WIP] Add gfx1100 support to AMD pytorch build
#2642 opened Oct 13, 2024 by cazlo Draft
1 of 5 tasks
Improve vlm support (add idefics3 support)
#2437 opened Aug 20, 2024 by drbh
4 tasks done
Add model_load_time metric
#2311 opened Jul 26, 2024 by Edwinhr716
2 of 5 tasks