Announcing the Edge LLM Leaderboard – Now Live with Support from Hugging Face! #10865
Arnav0400 started this conversation in Show and tell
We’re thrilled to introduce the Edge LLM Leaderboard – a platform for benchmarking compressed LLMs on real edge hardware, starting with the Raspberry Pi 5 (8GB), powered by the Arm Cortex-A76 CPU and running models optimized with llama.cpp.
Link - https://huggingface.co/spaces/nyunai/edge-llm-leaderboard
🔑 Key Highlights
🔹 Real-World Performance Metrics:
We focus on critical metrics that matter for edge deployments:
• Prefill Latency (Time to First Token)
• Decode Latency (Generation Speed)
• Model Size (Efficiency for limited storage)
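As a rough illustration of how the two latency metrics relate (this is a sketch, not the leaderboard's actual measurement harness), prefill latency and decode speed can be derived from per-token arrival timestamps:

```python
def latency_metrics(token_timestamps, request_start):
    """Compute prefill and decode metrics from per-token arrival times.

    token_timestamps: monotonic timestamps (seconds) at which each
    generated token arrived; request_start: timestamp when the prompt
    was submitted. Both names are assumptions for this sketch, not
    the leaderboard's real instrumentation.
    """
    # Prefill latency = time to first token (prompt processing dominates).
    ttft = token_timestamps[0] - request_start
    # Decode speed = tokens generated after the first, over the time they took.
    decode_tokens = len(token_timestamps) - 1
    decode_time = token_timestamps[-1] - token_timestamps[0]
    tok_per_s = decode_tokens / decode_time if decode_time > 0 else 0.0
    return {"prefill_s": ttft, "decode_tok_per_s": tok_per_s}

# Synthetic trace: prompt submitted at t=0, first token at 0.5 s,
# then one token every 0.1 s.
stamps = [0.5 + 0.1 * i for i in range(11)]
print(latency_metrics(stamps, 0.0))  # prefill 0.5 s, ~10 tok/s decode
```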
🔹 130+ Models at Launch:
We’ve benchmarked sub-8B models with ARM-optimized quantization formats such as:
• Q8_0
• Q4_K_M
• Q4_0_4_4 (ARM Neon Optimized)
This provides a comprehensive, real-world comparison of throughput, latency, and memory utilization on accessible, low-cost devices.
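To give a feel for why model size matters at these quantization levels, here is a back-of-the-envelope size estimate from parameter count and bits per weight. The bits-per-weight figures below are approximate ballpark values, not exact llama.cpp constants:

```python
# Approximate bits per weight for common GGUF quantization types.
# These are rough figures for illustration only.
APPROX_BITS_PER_WEIGHT = {
    "Q8_0": 8.5,
    "Q4_K_M": 4.85,
    "Q4_0_4_4": 4.5,  # same storage cost as Q4_0, repacked for Arm Neon
}

def estimate_size_gb(n_params: float, quant: str) -> float:
    """Estimate on-disk model size in decimal GB for a given quant type."""
    bits = n_params * APPROX_BITS_PER_WEIGHT[quant]
    return bits / 8 / 1e9  # bits -> bytes -> GB

# An 8B-parameter model at each quantization level:
for quant in APPROX_BITS_PER_WEIGHT:
    print(f"8B model at {quant}: ~{estimate_size_gb(8e9, quant):.1f} GB")
```

This makes the trade-off concrete: on a Raspberry Pi 5 with 8GB of RAM, an 8B model is only deployable at the 4-bit levels, which is why the ARM-optimized 4-bit formats feature prominently on the leaderboard.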
🔮 What’s Next?
📈 Expanded Backend Support: Adding frameworks with ARM compatibility.
🖥️ Additional Edge Hardware: Exploring underutilized devices for LLM deployment.
📩 Share your ideas or model requests at: [email protected]