Hacker News Viewer

Qwen3-Omni-Flash-2025-12-01: a next-generation native multimodal large model

by pretext on 12/10/2025, 4:13:38 PM

https://qwen.ai/blog?id=qwen3-omni-flash-20251201

Comments

by: gardnr

This is a 30B parameter MoE with 3B active parameters and is the successor to their previous 7B omni model. [1]

You can expect this model to have similar performance to the non-omni version. [2]

There aren't many open-weights omni models so I consider this a big deal. I would use this model to replace the keyboard and monitor in an application while doing the heavy lifting with other tech behind the scenes. There is also a reasoning version, which might be a bit amusing in an interactive voice chat if it pronounces the thinking tokens while working through to a final answer.

1. https://huggingface.co/Qwen/Qwen2.5-Omni-7B

2. https://artificialanalysis.ai/models/qwen3-30b-a3b-instruct

12/10/2025, 5:37:33 PM


by: sosodev

Does Qwen3-Omni support real-time conversation like GPT-4o? Looking at their documentation it doesn't seem like it does.

Are there any open weight models that do? Not talking about speech to text -> LLM -> text to speech btw I mean a real voice <-> language model.

edit:

It does support real-time conversation! Has anybody here gotten that to work on local hardware? I'm particularly curious if anybody has run it with a non-nvidia setup.

12/10/2025, 4:55:48 PM


by: terhechte

Is there a way to run these Omni models on a MacBook, quantized via GGUF or MLX? I know I can run it in LM Studio or llama.cpp, but they don't have streaming microphone support or streaming webcam support.

Qwen usually provides example code in Python that requires CUDA and a non-quantized model. I wonder if there is by now a good open source project to support this use case?

12/10/2025, 5:54:43 PM


by: mohsen1

Having lots of success with Gemini Flash Live 2.5. I am hoping 3.0 comes out soon. Benchmarks here claim better results than Gemini Live, but I have to test it. In the past I've always been disappointed with Qwen Omni models in my English-first case...

12/11/2025, 8:15:56 AM


by: sim04ful

The main issue I'm facing with realtime responses (speech output) is how to separate non-diegetic outputs (e.g. thinking, structured outputs) from outputs meant to be heard by the end user.

I'm curious how anyone has solved this.

12/10/2025, 5:51:43 PM
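
One possible approach to the separation problem raised above, as a minimal sketch only: if the model marks its reasoning with delimiters, a small streaming filter can drop those spans before the text reaches the speech stage. The `<think>` / `</think>` delimiters and the `spoken_text` helper are assumptions for illustration; nothing in the thread confirms how this model marks thinking output, and structured outputs would need their own delimiter or channel.

```python
from typing import Iterable, Iterator

# Assumed delimiters (hypothetical for this model; adjust to the real format).
OPEN_TAG = "<think>"
CLOSE_TAG = "</think>"


def spoken_text(chunks: Iterable[str]) -> Iterator[str]:
    """Yield only text outside the thinking delimiters.

    Tags may be split across streamed chunks, so any suffix that could be
    the start of a tag is held back until the next chunk arrives.
    """
    buffer = ""
    in_thinking = False
    for chunk in chunks:
        buffer += chunk
        while True:
            tag = CLOSE_TAG if in_thinking else OPEN_TAG
            index = buffer.find(tag)
            if index >= 0:
                # Complete tag found: emit text before it if it is speakable.
                if not in_thinking and index > 0:
                    yield buffer[:index]
                buffer = buffer[index + len(tag):]
                in_thinking = not in_thinking
                continue
            # No complete tag: hold back a possible partial tag at the end.
            keep = 0
            for size in range(min(len(tag) - 1, len(buffer)), 0, -1):
                if buffer.endswith(tag[:size]):
                    keep = size
                    break
            safe = buffer[:len(buffer) - keep]
            buffer = buffer[len(buffer) - keep:]
            if safe and not in_thinking:
                yield safe
            break
    # Flush a held-back suffix that never became a tag.
    if buffer and not in_thinking:
        yield buffer


if __name__ == "__main__":
    stream = ["Hello, ", "<thi", "nk>secret", " plan</th", "ink>world", "!"]
    print("".join(spoken_text(stream)))
```

The filter trades a few characters of latency (at most one partial tag) for never leaking a split delimiter into the audio path.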


by: devinprater

Wow, just 32B? This could almost run on a good device with 64 GB RAM. Once it gets to Ollama I'll have to see just what I can get out of this.

12/10/2025, 6:57:55 PM


by: aschobel

Looks to be API only. Bummer.

12/10/2025, 6:03:04 PM


by: banjoe

Wow, crushing 2.5 Flash on every benchmark is huge. Time to move all of my LLM workloads to a local GPU rig.

12/10/2025, 5:20:27 PM


by: binsquare

Does anyone else find there's a hard-to-pin-down lifelessness in the speech of these voice models?

Especially in the fruit pricing portion of the video for this model. It sounds completely normal, but I can immediately tell it is AI. Maybe it's the intonation or the overly stable rate of speech?

12/10/2025, 4:58:28 PM


by: andy_ppp

Qwen seem to be deliberately confusing about whether they are releasing models as open weights or not. I think largely not any more, and you can go on quite a wild goose chase looking for things that are implied to be released but are actually only available via API.

12/11/2025, 2:20:06 PM


by: dvh

I asked: "How many resistors are used in fuzzhugger phantom octave guitar pedal?". It replied 29 resistors and provided a long list. Answer is 2 resistors: https://tagboardeffects.blogspot.com/2013/04/fuzzhugger-phantom-octave.html

12/10/2025, 4:45:29 PM


by: mettamage

I wonder if with that music analysis mode, you can also make your own synths

12/10/2025, 4:49:10 PM


by: Aissen

Is this a new proprietary model?

12/10/2025, 5:43:48 PM


by: rarisma

GPT4o in the charts is crazy.

12/10/2025, 5:01:42 PM


by: stevenhuang

Wayback for those that can't reach https://web.archive.org/web/20251210164048/https://qwen.ai/blog?id=qwen3-omni-flash-20251201

12/10/2025, 5:53:24 PM


by: forgingahead

I truly enjoy how the naming conventions seem to follow how I did homework assignments back in the day: finalpaper-1-dec2nd, finalpaper-2-dec4th, etc etc.

12/11/2025, 1:34:30 AM


by: vessenes

Interesting - when I asked the omni model at qwen.com what version it was, I got a testy "I don't have a version" and then was told my chat was blocked for inappropriate content. A second try asking for knowledge cutoff got me the more equivocal "2024, but I know stuff after that date, too".

No idea how to check if this is actually deployed on qwen.com right now.

12/10/2025, 7:52:29 PM