The eighth-generation TPU: An architecture deep dive
by meetpateltech on 4/22/2026, 12:28:00 PM
https://cloud.google.com/blog/products/compute/tpu-8t-and-tpu-8i-technical-deep-dive
Comments
by: zshn25
Splitting TPUs into dedicated training vs. inference chips feels like an admission that the bottleneck has shifted from FLOPs to memory bandwidth and latency. Will future gains come more from memory/system design than from raw compute scaling? And what does that say about scaling laws? A quick roofline back-of-the-envelope makes the point (Python, with placeholder hardware numbers, not published TPU specs):
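    # Roofline sketch: is single-token decode FLOPs- or bandwidth-bound?
    # Both hardware numbers are illustrative placeholders, not real specs.
    PEAK_FLOPS = 1.0e15   # hypothetical peak compute: 1 PFLOP/s
    PEAK_BW = 3.0e12      # hypothetical HBM bandwidth: 3 TB/s

    # Ridge point: FLOPs per byte needed before compute becomes the limit.
    ridge = PEAK_FLOPS / PEAK_BW  # ~333 FLOPs/byte with these numbers

    # Matmul decode at batch size B: each weight byte is streamed once
    # and feeds ~2*B FLOPs (one multiply-accumulate per token in flight).
    def arithmetic_intensity(batch: int, bytes_per_weight: int = 2) -> float:
        return 2 * batch / bytes_per_weight

    for b in (1, 8, 64, 512):
        ai = arithmetic_intensity(b)
        bound = "compute" if ai >= ridge else "bandwidth"
        print(f"batch={b:4d}  {ai:6.1f} FLOPs/byte -> {bound}-bound")

At batch 1 you're a couple of orders of magnitude below the ridge point, so adding FLOPs does nothing for decode; only bandwidth helps.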
4/22/2026, 1:30:26 PM
by: speedping
2.764 petabytes of HBM per 8i? So that's where all the RAM went.
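One hedged way to get a number like that (both figures below are hypothetical, not from the article; they just land near the quoted total):

    # Hypothetical aggregation check: per-chip HBM times chips per pod.
    chips_per_pod = 9216       # hypothetical pod size
    hbm_per_chip_gb = 300      # hypothetical per-chip HBM, in GB
    total_pb = chips_per_pod * hbm_per_chip_gb / 1e6
    print(f"~{total_pb:.3f} PB of HBM per pod")  # ~2.765 PB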
4/22/2026, 2:32:40 PM
by: ttul
No matter how smart your large language model is, if you can’t find the energy to power it, it won’t run. I could imagine Google winning merely because their chips are more efficient. Of course, the other labs are capable of making chips, but Google has been doing it for years.
4/22/2026, 2:15:01 PM
by: ricardo81
dupe: https://news.ycombinator.com/item?id=47862497
4/22/2026, 2:06:57 PM