The eighth-generation TPU: An architecture deep dive
by meetpateltech on 4/22/2026, 12:28:00 PM
https://cloud.google.com/blog/products/compute/tpu-8t-and-tpu-8i-technical-deep-dive
Comments
by: zshn25
Splitting TPUs into dedicated training vs. inference chips feels like an admission that the bottleneck has shifted from FLOPs to memory bandwidth and latency. Will future gains come more from memory/system design than from raw compute scaling? And what does that say about scaling laws? A quick roofline back-of-the-envelope makes the point (Python, with placeholder hardware numbers, not published TPU specs):
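    # Roofline sketch: is single-token decode FLOPs- or bandwidth-bound?
    # Both hardware numbers are illustrative placeholders, not real specs.
    PEAK_FLOPS = 1.0e15   # hypothetical peak compute: 1 PFLOP/s
    PEAK_BW = 3.0e12      # hypothetical HBM bandwidth: 3 TB/s

    # Ridge point: FLOPs per byte needed before compute becomes the limit.
    ridge = PEAK_FLOPS / PEAK_BW  # ~333 FLOPs/byte with these numbers

    # Matmul decode at batch size B: each weight byte is streamed once
    # and feeds ~2*B FLOPs (one multiply-accumulate per token in flight).
    def arithmetic_intensity(batch: int, bytes_per_weight: int = 2) -> float:
        return 2 * batch / bytes_per_weight

    for b in (1, 8, 64, 512):
        ai = arithmetic_intensity(b)
        bound = "compute" if ai >= ridge else "bandwidth"
        print(f"batch={b:4d}  {ai:6.1f} FLOPs/byte -> {bound}-bound")

At batch 1 you're a couple of orders of magnitude below the ridge point, so adding FLOPs does nothing for decode; only bandwidth helps.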
4/22/2026, 1:30:26 PM
by: speedping
2.764 petabytes of HBM per 8i? So that's where all the RAM went.
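One hedged way to get a number like that (both figures below are hypothetical, not from the article; they just land near the quoted total):

    # Hypothetical aggregation check: per-chip HBM times chips per pod.
    chips_per_pod = 9216       # hypothetical pod size
    hbm_per_chip_gb = 300      # hypothetical per-chip HBM, in GB
    total_pb = chips_per_pod * hbm_per_chip_gb / 1e6
    print(f"~{total_pb:.3f} PB of HBM per pod")  # ~2.765 PB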
4/22/2026, 2:32:40 PM
by: ttul
No matter how smart your large language model is, if you can’t find the energy to power it, it won’t run. I could imagine Google winning merely because their chips are more efficient. Of course, the other labs are capable of making chips, but Google has been doing it for years.
4/22/2026, 2:15:01 PM
by: ricardo81
dupe: https://news.ycombinator.com/item?id=47862497
4/22/2026, 2:06:57 PM