High-Fidelity KV Cache Summarization Using Entropy and Low-Rank Reconstruction
by jchandra on 4/19/2026, 11:35:23 AM
https://jchandra.com/posts/hae-ols/
Comments
by: vivahir215
Interesting approach. Curious about the latency tradeoff: OLS + SVD are much heavier than Top-K. Have you benchmarked end-to-end inference latency?
4/19/2026, 11:50:30 AM
by: jchandra
[dead]
4/19/2026, 11:36:37 AM