Hacker News Viewer

Running Google Gemma 4 Locally with LM Studio's New Headless CLI and Claude Code

by vbtechguy on 4/5/2026, 5:13:51 PM

https://ai.georgeliu.com/p/running-google-gemma-4-locally-with

Comments

by: Someone1234

Claude Code seems like a popular frontend for this right now; I wonder how long until Anthropic ships an update that makes it anywhere from a little to a lot less turn-key. They've been very clear that they aren't exactly champions of this stuff being used outside of very specific ways.

4/5/2026, 7:25:52 PM


by: martinald

Just FYI, MoE doesn't really save (V)RAM. You still need all the weights loaded in memory; it just means fewer parameters are consulted per forward pass. So it improves tok/s, but not VRAM usage.
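A back-of-the-envelope sketch of this point (all figures hypothetical, assuming fp16 weights; the 26B-total / 4B-active split is an illustrative guess, not the actual Gemma architecture): memory is set by total parameters, while per-token compute is set by active parameters.

```python
# MoE memory vs. compute, back of the envelope.
# Hypothetical numbers; assumes 2 bytes per parameter (fp16).

BYTES_PER_PARAM = 2  # fp16

def weight_memory_gb(total_params_billions: float) -> float:
    """(V)RAM needed just to hold the weights, in GB."""
    return total_params_billions * 1e9 * BYTES_PER_PARAM / 1e9

# Hypothetical MoE model: 26B total parameters, 4B active per token.
total_b, active_b = 26.0, 4.0

# All experts must be resident, so memory scales with the 26B total...
print(f"Weights in (V)RAM: {weight_memory_gb(total_b):.0f} GB")
# ...but each forward pass only touches the active subset, which is
# where the tok/s improvement comes from.
print(f"Params per forward pass: {active_b:.0f}B of {total_b:.0f}B")
```

The ratio of active to total parameters is roughly the compute saving per token; the memory bill is unchanged.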

4/5/2026, 7:44:45 PM


by: vbtechguy

Here is how I set up Gemma 4 26B for local inference on macOS that can be used with Claude Code.
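For reference, the headless flow looks roughly like this (a sketch, not the article's exact steps: the model key is a placeholder you'd look up with `lms ls`, and port 1234 is LM Studio's default for its OpenAI-compatible server):

```shell
# Sketch of the LM Studio headless CLI flow (model key is a placeholder).
lms server start                      # start the local server headlessly
lms load <gemma-model-key>            # load the model; find the key via `lms ls`
curl http://localhost:1234/v1/models  # sanity-check that the server is serving
```

Once the server is up, any client that speaks the OpenAI-compatible API can be pointed at the local endpoint.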

4/5/2026, 5:13:51 PM


by: trvz

    ollama launch claude --model gemma4:26b

4/5/2026, 7:11:42 PM


by: jonplackett

So wait what is the interaction between Gemma and Claude?

4/5/2026, 7:17:48 PM