Show HN: Ghost Pepper – Local hold-to-talk speech-to-text for macOS
by MattHart88 on 4/6/2026, 7:50:16 PM
I built this because I wanted to see how far I could get with a voice-to-text app that used 100% local models so no data left my computer. I've been using it a ton for coding and emails. I'm also experimenting with using it as a voice interface for my other agents. It's 100% open source under the MIT license; I would love feedback, PRs, and ideas on where to take it.
https://github.com/matthartman/ghost-pepper
Comments
by: goodroot
Nice one! For Linux folks, I developed <a href="https://github.com/goodroot/hyprwhspr" rel="nofollow">https://github.com/goodroot/hyprwhspr</a>.<p>On Linux, there's access to the latest Cohere Transcribe model and it works very, very well. Requires a GPU though. Larger local models generally shouldn't require a subordinate model for clean up.<p>Have you compared WhisperKit to faster-whisper or similar? You might be able to run turbov3 successfully and negate the need for cleanup.<p>Incidentally, waiting for Apple to blow this all up with native STT any day now. :)
4/6/2026, 8:08:38 PM
by: parhamn
I see a lot of Whisper stuff out there. Are these models the same old OpenAI Whisper releases, or have they been updated heavily?<p>I've been using Parakeet v3, which is fantastic (and tiny). Confused about still seeing Whisper out there.
4/6/2026, 9:03:22 PM
by: __mharrison__
Cool, I've been doing a lot of "coding" (and other typing tasks) recently by tapping a button on my Stream Deck. It starts recording me until I tap it again. At which point, it transcribes the recording and plops it into the paste buffer.<p>The button next to it pastes when I press it. If I press it again, it hits the enter command.<p>You can get a lot done with two buttons.
4/6/2026, 9:34:04 PM
by: charlietran
Thank you for sharing, I appreciate the emphasis on local speed and privacy. As a current user of Hex (<a href="https://github.com/kitlangton/Hex" rel="nofollow">https://github.com/kitlangton/Hex</a>), which has similar goals, what are your thoughts on how they compare?
4/6/2026, 8:00:54 PM
by: lostathome
If anyone is interested, I built Hitoku Draft. It is a context-aware voice assistant. Local models only.<p>Here is an example: <a href="https://www.youtube.com/watch?v=Dw_q6l3Cwp4" rel="nofollow">https://www.youtube.com/watch?v=Dw_q6l3Cwp4</a><p>I was mainly motivated by papers like this: <a href="https://arxiv.org/pdf/2602.16800" rel="nofollow">https://arxiv.org/pdf/2602.16800</a>. But I found myself using it during vacation when I did not have an internet connection.<p><a href="https://hitoku.me/draft/" rel="nofollow">https://hitoku.me/draft/</a><p>I set up a code for people to download it (HITOKUHN2026), in case you want to compare, or just give feedback!
4/6/2026, 8:59:02 PM
by: douglaswlance
does it input the text as soon as it hears it? or does it wait until the end?
4/6/2026, 9:49:33 PM
by: konaraddi
That’s awesome! Do you know how it compares to Handy? Handy is open source and local only too. It’s been around a while and what I’ve been using.<p><a href="https://github.com/cjpais/handy" rel="nofollow">https://github.com/cjpais/handy</a>
4/6/2026, 8:13:48 PM
by: ericmcer
I see quite a few of these; the killer feature to me will be one that fine-tunes the model based on your own voice.<p>E.g., if your name is `Donold` (pronounced like Donald), there is not a transcription model in existence that will transcribe your name correctly. That means forget inputting your name or email ever; it will never output it correctly.<p>Combine that with any subtleties of speech you have, or industry jargon you frequently use, and you will have a much more useful tool.<p>We have a ton of options for "predict the most common word that matches this audio data" but I haven't found any "predict MY most common word" setups.
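A rough sketch of the "predict MY most common word" idea, assuming a cheap post-processing pass over the transcript rather than actual fine-tuning of the speech model. The vocabulary list, the `apply_personal_vocab` helper, and the 0.8 similarity cutoff are all hypothetical choices for illustration, not anything Ghost Pepper or the other tools in this thread are known to do:

```python
import difflib
import re

# Hypothetical personal vocabulary: the spellings this user actually wants.
PERSONAL_VOCAB = ["Donold"]


def apply_personal_vocab(text, vocab=PERSONAL_VOCAB, cutoff=0.8):
    """Swap words that closely resemble a personal-vocabulary entry.

    This is plain fuzzy string matching on the transcript, not model
    fine-tuning; the cutoff is an assumed value that would need tuning.
    """
    # Map lowercase forms back to the user's preferred spelling.
    lowered = {entry.lower(): entry for entry in vocab}

    def fix(match):
        word = match.group(0)
        close = difflib.get_close_matches(
            word.lower(), list(lowered), n=1, cutoff=cutoff
        )
        # Keep the original word when nothing is similar enough.
        return lowered[close[0]] if close else word

    # Visit each alphabetic token and leave punctuation/spacing untouched.
    return re.sub(r"[A-Za-z']+", fix, text)


if __name__ == "__main__":
    print(apply_personal_vocab("My name is Donald."))
```

The obvious tradeoff is false positives: with this approach every "Donald" in a transcript becomes "Donold", including references to other people, so a real tool would likely need context or a per-app toggle.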
4/6/2026, 9:17:55 PM
by: romeroej
always mac. when windows? why can't you just make things multi-OS?
4/6/2026, 9:51:19 PM
by: ipsum2
Parakeet is significantly more accurate and faster than Whisper if it supports your language.
4/6/2026, 8:06:45 PM
by: purplehat_
Hi Matt, there's lots of speech-to-text programs out there with varying levels of quality. 100% local is admirable but it's always a tradeoff and users have to decide for themselves what's worth it.<p>Would you consider making available a video showing someone using the app?
4/6/2026, 9:35:41 PM
by: Supercompressor
I've been looking for the opposite: I want to dump in text and have it read to me, coherently. Anyone have good recommendations?
4/6/2026, 9:16:40 PM
by: mathis
If you don't feel like downloading a large model, you can also use `yap dictate`. Yap leverages the built-in models exposed through Speech.framework on macOS 26 (Tahoe).<p>Project repo: <a href="https://github.com/finnvoor/yap" rel="nofollow">https://github.com/finnvoor/yap</a>
4/6/2026, 8:40:58 PM
by: hyperhello
Feature request or beg: let me play a speech video and transcribe it for me.
4/6/2026, 8:55:01 PM
by: gegtik
how does this compare to macOS's built-in Siri STT, in quality and in privacy?
4/6/2026, 9:15:25 PM
by: guzik
Sadly, the app doesn't work. There is no popup asking for microphone permission.<p>EDIT: I see there is an open issue for that on GitHub.
4/6/2026, 9:00:17 PM
by: aristech
Great job. What about the supported languages? Does the system language get recognised?
4/6/2026, 8:57:10 PM