Google releases Gemma 4 open models
by jeffmcjunkin on 4/2/2026, 4:10:54 PM
https://deepmind.google/models/gemma/gemma-4/
Comments
by: danielhanchen
Thinking / reasoning + multimodal + tool calling.

We made some quants at https://huggingface.co/collections/unsloth/gemma-4 for folks to run them - they work really well!

Guide for those interested: https://unsloth.ai/docs/models/gemma-4

Also note: use temperature = 1.0, top_p = 0.95, top_k = 64, and the EOS token is "<turn|>". "<|channel>thought\n" is also used for the thinking trace!
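A minimal sketch of those sampling settings with Hugging Face transformers (the model ID is just one of the variants mentioned elsewhere in the thread - swap in whichever variant or quant you actually pull):

```python
# Sketch: Gemma 4 generation with the recommended sampling settings above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-4-E4B-it"  # assumption: adjust to your chosen variant/quant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Summarize MoE routing in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=1.0,  # recommended settings from the comment above
    top_p=0.95,
    top_k=64,
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```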
4/2/2026, 4:16:59 PM
by: scrlk
Comparison of Gemma 4 vs. Qwen 3.5 benchmarks, consolidated from their respective Hugging Face model cards:

```
| Model          | MMLUP | GPQA  | LCB   | ELO  | TAU2  | MMMLU | HLE-n | HLE-t |
|----------------|-------|-------|-------|------|-------|-------|-------|-------|
| G4 31B         | 85.2% | 84.3% | 80.0% | 2150 | 76.9% | 88.4% | 19.5% | 26.5% |
| G4 26B A4B     | 82.6% | 82.3% | 77.1% | 1718 | 68.2% | 86.3% |  8.7% | 17.2% |
| G4 E4B         | 69.4% | 58.6% | 52.0% |  940 | 42.2% | 76.6% |   -   |   -   |
| G4 E2B         | 60.0% | 43.4% | 44.0% |  633 | 24.5% | 67.4% |   -   |   -   |
| G3 27B no-T    | 67.6% | 42.4% | 29.1% |  110 | 16.2% | 70.7% |   -   |   -   |
| GPT-5-mini     | 83.7% | 82.8% | 80.5% | 2160 | 69.8% | 86.2% | 19.4% | 35.8% |
| GPT-OSS-120B   | 80.8% | 80.1% | 82.7% | 2157 |  --   | 78.2% | 14.9% | 19.0% |
| Q3-235B-A22B   | 84.4% | 81.1% | 75.1% | 2146 | 58.5% | 83.4% | 18.2% |  --   |
| Q3.5-122B-A10B | 86.7% | 86.6% | 78.9% | 2100 | 79.5% | 86.7% | 25.3% | 47.5% |
| Q3.5-27B       | 86.1% | 85.5% | 80.7% | 1899 | 79.0% | 85.9% | 24.3% | 48.5% |
| Q3.5-35B-A3B   | 85.3% | 84.2% | 74.6% | 2028 | 81.2% | 85.2% | 22.4% | 47.4% |

MMLUP: MMLU-Pro
GPQA:  GPQA Diamond
LCB:   LiveCodeBench v6
ELO:   Codeforces ELO
TAU2:  TAU2-Bench
MMMLU: MMMLU
HLE-n: Humanity's Last Exam (no tools / CoT)
HLE-t: Humanity's Last Exam (with search / tools)
no-T:  no thinking
```
4/2/2026, 4:38:18 PM
by: simonw
I ran these in LM Studio and got unrecognizable pelicans out of the 2B and 4B models, and an outstanding pelican out of the 26B-A4B model - I think the best I've seen from a model that runs on my laptop.

https://gist.github.com/simonw/12ae4711288637a722fd6bd4b4b56bdb?permalink_comment_id=6074031#gistcomment-6074031

The gemma-4-31b model is completely broken for me - it just spits out "---\n" no matter what prompt I feed it.
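For anyone who wants to reproduce this kind of local test: LM Studio exposes an OpenAI-compatible server on localhost, so a sketch like the following works (port 1234 is LM Studio's default; the model name is an assumption - use whatever identifier LM Studio shows for the model you loaded):

```python
# Sketch: run the "pelican on a bicycle" prompt against a local LM Studio server.
from openai import OpenAI

# LM Studio's local server defaults to port 1234; the API key is ignored.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="gemma-4-26b-a4b",  # assumption: use the name shown in LM Studio
    messages=[{"role": "user",
               "content": "Generate an SVG of a pelican riding a bicycle."}],
)

with open("pelican.svg", "w") as f:
    f.write(response.choices[0].message.content)
```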
4/2/2026, 5:26:17 PM
by: sigbottle
There are so many heavy-hitting cracked people like Daniel from Unsloth and Chris Lattner coming out of the woodwork for this with their own custom stuff.

How does the ecosystem work? Have things converged and standardized enough that it's "easy" (lol, with tooling) to swap out parts such as weights to fit your needs? Do you need to autogen new custom kernels to fix said things? Super cool stuff.
4/2/2026, 6:06:05 PM
by: antirez
Featuring the ELO score as the main benchmark in the chart is *very* misleading. The big dense Gemma 4 model does not seem to reach the Qwen 3.5 27B dense model in most benchmarks, and that is obviously what matters. The small 2B / 4B models are interesting and may turn out to be better ASR models than specialized ones (not just on performance, but because they are easy to serve via llama.cpp / MLX and front-ends). They are also interesting for "fast" OCR, given they are vision models as well. But other than that, the release is a bit disappointing.
4/2/2026, 4:35:12 PM
by: chrislattner
If you want the fastest open source implementation on Blackwell and AMD MI355, check out Modular's MAX nightly. You can pip install it super fast; check it out here: https://www.modular.com/blog/day-zero-launch-fastest-performance-for-gemma-4-on-nvidia-and-amd?utm_campaign=day0&utm_source=hn_chris

-Chris Lattner (yes, affiliated with Modular :-)
4/2/2026, 5:16:30 PM
by: canyon289
Hi all! I work on the Gemma team - one of many people involved, as this was a bigger effort given it was a mainline release. Happy to answer whatever questions I can.
4/2/2026, 5:08:12 PM
by: NitpickLawyer
Best thing is that this is Apache 2.0 (edit: and they have base models available; Gemma 3 was good for finetuning).

The sizes are E2B and E4B (following the Gemma 3n arch, with a focus on mobile), plus a 26B-A4B MoE and a 31B dense. The mobile ones have audio in (so I can see some local, privacy-focused translation apps), and the 31B seems to be strong in agentic stuff. The 26B-A4B sits somewhere in between: a similar VRAM footprint to the dense model, but much faster inference (rough math below).
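To make the VRAM-vs-speed point concrete, here is back-of-the-envelope arithmetic (assuming a ~4.5-bit average per weight for a Q4_K_M-style quant and ignoring KV cache and runtime overhead):

```python
# Rough weight-memory math for a quantized model (assumptions noted above).
def weight_footprint_gb(total_params_billions: float, bits_per_param: float = 4.5) -> float:
    """Approximate GB of weights; ~4.5 bits/param roughly models a Q4_K_M mix."""
    return total_params_billions * 1e9 * bits_per_param / 8 / 1e9

print(f"26B-A4B MoE: ~{weight_footprint_gb(26):.1f} GB of weights")
print(f"31B dense:   ~{weight_footprint_gb(31):.1f} GB of weights")
# Both models keep all weights resident, so the footprints are similar (~15-17 GB),
# but the MoE only activates ~4B params per token, hence the much faster inference.
```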
4/2/2026, 4:25:10 PM
by: originalvichy
The wait is finally over. One or two more iterations and I'll be happy to say that self-hosted language models more than fulfill my most common needs. Thanks to the Gemma team!
4/2/2026, 4:36:57 PM
by: mudkipdev
Can't wait for gemma4-31b-it-claude-opus-4-6-distilled-q4-k-m on huggingface tomorrow
4/2/2026, 4:45:18 PM
by: minimaxir
The benchmark comparisons to Gemma 3 27B on Hugging Face are interesting: the Gemma 4 E4B variant (https://huggingface.co/google/gemma-4-E4B-it) beats the old 27B in every benchmark at a fraction of the parameters.

The E2B/E4B models also support voice input, which is rare.
4/2/2026, 4:25:10 PM
by: whhone
The LiteRT-LM CLI (https://ai.google.dev/edge/litert-lm/cli) provides a way to try the Gemma 4 models:

```
# with uvx
uvx litert-lm run \
  --from-huggingface-repo=litert-community/gemma-4-E2B-it-litert-lm \
  gemma-4-E2B-it.litertlm
```
4/2/2026, 5:51:34 PM
by: ceroxylon
Even with search grounding, it scored 2.5/5 on a basic botanical benchmark. The average human would take much longer to produce a similar write-up, but with access to a search engine they would likely do better than a 50% hallucination rate.
4/2/2026, 4:39:16 PM
by: bertili
The timing is interesting, as Apple will supposedly distill Google models in the upcoming Siri update [1]. So maybe Gemma is a lower bound on what we can expect baked into iPhones.

[1] https://news.ycombinator.com/item?id=47520438
4/2/2026, 5:40:11 PM
by: jwr
Really looking forward to running this through my spam-filtering benchmark. gemma-3-27b was a really strong model, surpassed later by gpt-oss:20b (which was also much faster). Qwen models always had more variance.
4/2/2026, 4:17:07 PM
by: virgildotcodes
Downloaded through LM Studio on an M1 Max 32GB: 26B A4B, Q4_K_M.

First message:

https://i.postimg.cc/yNZzmGMM/Screenshot-2026-04-03-at-12-44-38-AM.png

Not sure if I'm doing something wrong?

This more or less reflects my experience with most local models over the last couple of years (although admittedly most aren't anywhere near this bad). People keep saying they're useful, and yet I can't get them to be consistently useful at all.
4/2/2026, 5:59:50 PM
by: fooker
What's a realistic way to run this locally, or on a single expensive remote dev machine (in a VM, not through API calls)?
4/2/2026, 4:51:41 PM
by: VadimPR
Gemma 3 E4B runs very quickly on my Samsung S26, so I am looking forward to trying Gemma 4! It is fantastic to have local, offline alternatives to frontier models.
4/2/2026, 5:04:44 PM
by: DeepYogurt
Maybe a dumb question, but what does the "it" stand for in 31B-it vs 31B?
4/2/2026, 5:53:35 PM
by: babelfish
Wow, 30B parameters as capable as a 1T parameter model?
4/2/2026, 4:25:17 PM
by: darshanmakwana
This is awesome! I will try using them locally with opencode and see if they can replace Claude Code for basic tasks.
4/2/2026, 4:32:22 PM
by: wg0
Google might not have the best coding models (yet), but they seem to have the most intelligent and knowledgeable models of all - especially Gemini 3.1 Pro, which is something.

One more thing about Google: they have everything that others do not:

1. Huge data: audio, video, geospatial.
2. Tons of expertise - "Attention Is All You Need" was born there.
3. Libraries that they wrote.
4. Their own data centers and cloud.
5. Most of all, their own TPU hardware that no one else has.

Therefore, once the bubble bursts, the only player left standing tall above all would be Google.
4/2/2026, 4:41:16 PM
by: james2doyle
Hmm, just tried google/gemma-4-31B-it through Hugging Face (the inference provider seems to be Novita) and function/tool calling was not enabled...
4/2/2026, 4:35:38 PM
by: flakiness
It's good they still have non-instruction-tuned models.
4/2/2026, 4:25:02 PM
by: rvz
Open-weight models march on, slowly becoming a viable alternative to the larger closed ones.

We are at least one year, and at most two, from them surpassing closed models for everyday tasks that can be done locally to save spending on tokens.
4/2/2026, 4:35:21 PM
by: mwizamwiinga
Curious how this scales with larger datasets. Anyone tried it in production?
4/2/2026, 5:08:41 PM
by: heraldgeezer
Gemma vs Gemini?

I am only a casual AI chatbot user; I use whatever gives me the most and best free limits and versions.
4/2/2026, 5:00:58 PM
by: evanbabaallos
Impressive
4/2/2026, 4:53:15 PM
by: bertili
Qwen: Hold my beer

https://news.ycombinator.com/item?id=47615002
4/2/2026, 4:49:20 PM
by: a7om_com
Gemma models are already in our AIPI inference pricing index. Open-source models like Gemma run 70.7% cheaper than proprietary equivalents at the median, across the 2,614 SKUs we track. With Gemma 4 hitting third-party platforms, the pricing will be worth watching closely. Full data at a7om.com.
4/2/2026, 4:25:00 PM