Show HN: I built a tiny LLM to demystify how language models work

by armanified on 4/6/2026, 12:20:12 AM

Built a ~9M-parameter LLM from scratch to understand how they actually work: a vanilla transformer trained on 60K synthetic conversations, in ~130 lines of PyTorch. It trains in about 5 minutes on a free Colab T4. The fish thinks the meaning of life is food.

Fork it and swap the personality for your own character.

https://github.com/arman-bd/guppylm
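A decoder-only transformer at roughly this scale can be sketched in a few dozen lines of PyTorch. This is a hypothetical illustration of the general recipe the post describes, not code from the guppylm repo; the vocabulary size, embedding width, and layer count are assumptions chosen to land in the single-digit-millions parameter range.

```python
# Minimal sketch of a tiny decoder-only transformer LM.
# All hyperparameters here are illustrative assumptions, not guppylm's.
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    def __init__(self, vocab_size=4096, d_model=256, n_heads=4,
                 n_layers=6, max_len=256):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)   # token embeddings
        self.pos_emb = nn.Embedding(max_len, d_model)      # learned positions
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model,
            batch_first=True, norm_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)         # next-token logits

    def forward(self, idx):
        B, T = idx.shape
        pos = torch.arange(T, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        # Causal mask: each token may only attend to earlier positions.
        mask = nn.Transformer.generate_square_subsequent_mask(T).to(idx.device)
        x = self.blocks(x, mask=mask, is_causal=True)
        return self.head(x)

model = TinyLM()
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")
```

Training would then be ordinary next-token cross-entropy over the conversation corpus; with dimensions like these the parameter count comes out in the same ballpark as the model described above.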

Comments

by: ordinarily

It's genuinely a great introduction to LLMs. I built my own a while ago based on Milton's Paradise Lost: https://www.wvrk.org/works/milton

4/6/2026, 2:57:33 AM


by: nullbyte808

Adorable! Maybe a personality that speaks in emojis?

4/6/2026, 2:10:12 AM


by: SilentM68

Would have been funny if it were called "DORY", given the fish's memory-recall issues vs. LLMs' similar recall issues :)

4/6/2026, 2:22:34 AM


by: AndrewKemendo

I love these kinds of educational implementations.

I want to really praise the (unintentional?) nod to Nagel: by limiting capabilities to the representation of a fish, the user is immediately able to understand the constraints. It can only talk like a fish because it's very simple.

Especially compared to public models, that's a really simple correspondence to grok intuitively (small LLM → only as verbose as a fish, larger LLM → more verbose), so kudos to the author for making it simple and fun.

4/6/2026, 1:53:17 AM

