Announcing the Edge LLM Leaderboard – Now Live with Support from Hugging Face! #10865
Arnav0400 started this conversation in Show and tell
We’re thrilled to introduce the Edge LLM Leaderboard – a platform for benchmarking compressed LLMs on real edge hardware, starting with the Raspberry Pi 5 (8GB), powered by the Arm Cortex-A76 CPU and running models optimized with llama.cpp.
Link - https://huggingface.co/spaces/nyunai/edge-llm-leaderboard
🔑 Key Highlights
🔹 Real-World Performance Metrics:
We focus on critical metrics that matter for edge deployments:
• Prefill Latency (Time to First Token)
• Decode Latency (Generation Speed)
• Model Size (Efficiency for limited storage)
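As a rough illustration of how the two latency metrics relate (this is a sketch, not the leaderboard's actual measurement harness), prefill latency and decode speed can be derived from per-token arrival timestamps:

```python
def latency_metrics(token_timestamps, request_start):
    """Compute prefill and decode metrics from per-token arrival times.

    token_timestamps: monotonic timestamps (seconds) at which each
    generated token arrived; request_start: timestamp when the prompt
    was submitted. Both names are assumptions for this sketch, not
    the leaderboard's real instrumentation.
    """
    # Prefill latency = time to first token (prompt processing dominates).
    ttft = token_timestamps[0] - request_start
    # Decode speed = tokens generated after the first, over the time they took.
    decode_tokens = len(token_timestamps) - 1
    decode_time = token_timestamps[-1] - token_timestamps[0]
    tok_per_s = decode_tokens / decode_time if decode_time > 0 else 0.0
    return {"prefill_s": ttft, "decode_tok_per_s": tok_per_s}

# Synthetic trace: prompt submitted at t=0, first token at 0.5 s,
# then one token every 0.1 s.
stamps = [0.5 + 0.1 * i for i in range(11)]
print(latency_metrics(stamps, 0.0))  # prefill 0.5 s, ~10 tok/s decode
```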
🔹 130+ Models at Launch:
We’ve benchmarked sub-8B models with ARM-optimized quantization formats such as:
• Q8_0
• Q4_K_M
• Q4_0_4_4 (ARM Neon Optimized)
This provides a comprehensive, real-world comparison of throughput, latency, and memory utilization on accessible, low-cost devices.
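To give a feel for why model size matters at these quantization levels, here is a back-of-the-envelope size estimate from parameter count and bits per weight. The bits-per-weight figures below are approximate ballpark values, not exact llama.cpp constants:

```python
# Approximate bits per weight for common GGUF quantization types.
# These are rough figures for illustration only.
APPROX_BITS_PER_WEIGHT = {
    "Q8_0": 8.5,
    "Q4_K_M": 4.85,
    "Q4_0_4_4": 4.5,  # same storage cost as Q4_0, repacked for Arm Neon
}

def estimate_size_gb(n_params: float, quant: str) -> float:
    """Estimate on-disk model size in decimal GB for a given quant type."""
    bits = n_params * APPROX_BITS_PER_WEIGHT[quant]
    return bits / 8 / 1e9  # bits -> bytes -> GB

# An 8B-parameter model at each quantization level:
for quant in APPROX_BITS_PER_WEIGHT:
    print(f"8B model at {quant}: ~{estimate_size_gb(8e9, quant):.1f} GB")
```

This makes the trade-off concrete: on a Raspberry Pi 5 with 8GB of RAM, an 8B model is only deployable at the 4-bit levels, which is why the ARM-optimized 4-bit formats feature prominently on the leaderboard.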
🔮 What’s Next?
📈 Expanded Backend Support: Adding frameworks with ARM compatibility.
🖥️ Additional Edge Hardware: Exploring underutilized devices for LLM deployment.
📩 Share your ideas or model requests at: [email protected]