Hacker News Viewer

Client-side GPU load balancing with Redis and Lua

by lneiman on 12/2/2025, 5:39:07 PM

https://galileo.ai/blog/how-we-boosted-gpu-utilization-by-40-with-redis-lua

Comments

by: artyom

If I understand the article correctly, any sufficiently capable attacker can:

- Learn the global state of your GPU cluster via the client.
- Target the most struggling GPU instances specifically, since the client decides which one to hit.

You offer a free tier, which means anyone can get an account and try this (e.g. one "harmless, mostly inactive" free account whose only purpose is retrieving GPU cluster status, plus a bunch of burner accounts to overload the struggling instances).

I may be completely wrong, but this sounds like a DDoS served on a silver platter to me.

12/8/2025, 12:54:22 PM


by: lneiman

Author here. We were hitting tail latency and low GPU utilization issues serving SLMs via Triton.

I built a scrappy client-side router using Redis and Lua to track real-time GPU load. It boosted utilization by ~40% and improved latencies.

Happy to hear feedback on the implementation or thoughts on better ways to do this!

12/2/2025, 5:39:25 PM
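
For readers who want a concrete picture of the "Redis and Lua" routing idea mentioned above, here is a minimal sketch. It is not the author's implementation. It assumes a least-loaded policy in which each GPU instance is a member of a Redis sorted set scored by its in-flight request count, and a Lua script picks the lowest-scored member and increments it in one atomic step. The script text, the key layout, and the class name InMemoryLoadTable are all hypothetical. The Python class mirrors the script's logic in memory so the example runs without a Redis server.

```python
import threading

# Hypothetical Lua script (an assumption, not taken from the article).
# KEYS[1] is a sorted set whose members are GPU instance names and whose
# scores are in-flight request counts. Running it inside Redis makes the
# read and the increment a single atomic step.
PICK_LEAST_LOADED_LUA = """
local picked = redis.call('ZRANGE', KEYS[1], 0, 0)
if #picked == 0 then
  return false
end
redis.call('ZINCRBY', KEYS[1], 1, picked[1])
return picked[1]
"""


class InMemoryLoadTable:
    """In-memory stand-in that mirrors the Lua script's pick-and-increment."""

    def __init__(self, instances):
        self._lock = threading.Lock()
        self._inflight = {name: 0 for name in instances}

    def acquire(self):
        """Return the least-loaded instance and count one more request on it."""
        with self._lock:
            if not self._inflight:
                return None
            # Ties break by name, matching sorted-set ordering for equal scores.
            name = min(self._inflight, key=lambda n: (self._inflight[n], n))
            self._inflight[name] += 1
            return name

    def release(self, name):
        """Mark one request on the instance as finished (never below zero)."""
        with self._lock:
            if self._inflight.get(name, 0) > 0:
                self._inflight[name] -= 1

    def snapshot(self):
        """Return a copy of the current in-flight counts."""
        with self._lock:
            return dict(self._inflight)
```

A client would call acquire() before sending a request and release() when the response arrives. Note that in this sketch the client can see every instance's load, which is exactly the exposure the first comment raises.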