Qwen3-Omni-Flash-2025-12-01: a next-generation native multimodal large model

by pretext on 12/10/2025, 4:13:38 PM

https://qwen.ai/blog?id=qwen3-omni-flash-20251201

Comments

by: gardnr

This is a 30B parameter MoE with 3B active parameters and is the successor to their previous 7B omni model. [1] You can expect this model to have similar performance to the non-omni version. [2] There aren't many open-weights omni models, so I consider this a big deal. I would use this model to replace the keyboard and monitor in an application while doing the heavy lifting with other tech behind the scenes. There is also a reasoning version, which might be a bit amusing in an interactive voice chat if it pronounces the thinking tokens while working through to a final answer.

1. <a href="https://huggingface.co/Qwen/Qwen2.5-Omni-7B" rel="nofollow">https://huggingface.co/Qwen/Qwen2.5-Omni-7B</a>

2. <a href="https://artificialanalysis.ai/models/qwen3-30b-a3b-instruct" rel="nofollow">https://artificialanalysis.ai/models/qwen3-30b-a3b-instruct</a>

12/10/2025, 5:37:33 PM

by: sosodev

Does Qwen3-Omni support real-time conversation like GPT-4o? Looking at their documentation, it doesn't seem like it does.

Are there any open-weight models that do? Not talking about speech to text -> LLM -> text to speech, btw. I mean a real voice <-> language model.

Edit: It does support real-time conversation! Has anybody here gotten that to work on local hardware? I'm particularly curious if anybody has run it with a non-NVIDIA setup.

12/10/2025, 4:55:48 PM

by: terhechte

Is there a way to run these Omni models on a MacBook, quantized via GGUF or MLX? I know I can run them in LM Studio or llama.cpp, but those don't have streaming microphone support or streaming webcam support.

Qwen usually provides example code in Python that requires CUDA and a non-quantized model. I wonder if there is by now a good open source project to support this use case?

12/10/2025, 5:54:43 PM

by: mohsen1

Having lots of success with Gemini Flash Live 2.5. I am hoping 3.0 comes out soon. The benchmarks here claim better results than Gemini Live, but I have to test it. In the past I've always been disappointed with Qwen Omni models in my English-first case...

12/11/2025, 8:15:56 AM

by: sim04ful

The main issue I'm facing with realtime responses (speech output) is how to separate non-diegetic outputs (e.g. thinking, structured outputs) from outputs meant to be heard by the end user.

I'm curious how anyone has solved this.
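One possible approach, sketched here under assumptions rather than taken from any Qwen documentation: if the model streams text alongside (or before) audio and marks its reasoning with delimiters, a small filter can drop the delimited spans before the text reaches a speech synthesizer. The `<think>` and `</think>` tags and the function name `speakable_chunks` below are assumptions for illustration; the thread does not say how Qwen3-Omni marks thinking output, and structured outputs would need their own delimiter handled the same way.

```python
from typing import Iterable, Iterator

# Assumed delimiters; not confirmed for Qwen3-Omni.
OPEN_TAG = "<think>"
CLOSE_TAG = "</think>"


def speakable_chunks(chunks: Iterable[str]) -> Iterator[str]:
    """Yield only the text that falls outside <think>...</think> spans.

    Tags may be split across chunk boundaries, so a short tail of the
    buffer is held back while it could still be the start of a tag.
    """
    buffer = ""
    in_thinking = False
    for chunk in chunks:
        buffer += chunk
        # Consume every complete tag currently in the buffer.
        while True:
            tag = CLOSE_TAG if in_thinking else OPEN_TAG
            idx = buffer.find(tag)
            if idx == -1:
                break
            if not in_thinking and idx > 0:
                yield buffer[:idx]
            buffer = buffer[idx + len(tag):]
            in_thinking = not in_thinking
        # Hold back the longest suffix that is a prefix of the next expected tag.
        tag = CLOSE_TAG if in_thinking else OPEN_TAG
        keep = 0
        for n in range(min(len(tag) - 1, len(buffer)), 0, -1):
            if buffer.endswith(tag[:n]):
                keep = n
                break
        cut = len(buffer) - keep
        safe, buffer = buffer[:cut], buffer[cut:]
        if safe and not in_thinking:
            yield safe
    # End of stream: whatever is left outside a thinking span is speakable.
    if buffer and not in_thinking:
        yield buffer
```

Each yielded string could then be handed to whatever text-to-speech step the application uses, while the raw stream is still logged in full for debugging.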

12/10/2025, 5:51:43 PM

by: devinprater

Wow, just 32B? This could almost run on a good device with 64 GB RAM. Once it gets to Ollama I'll have to see just what I can get out of this.
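A rough back-of-the-envelope check of that RAM estimate, as a sketch only: it uses the 30B total parameter figure quoted earlier in the thread, counts weights alone, and ignores the KV cache, activations, and any audio or vision encoders. In a typical MoE setup all experts stay resident in memory even though only about 3B parameters are active per token, so the total count is what matters here. The function name `weight_memory_gb` is made up for this example.

```python
def weight_memory_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in gigabytes (10**9 bytes)."""
    return total_params * bits_per_weight / 8 / 1e9


# 30e9 is the total parameter count quoted in the thread, not an official spec.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(30e9, bits):.0f} GB")
```

By this estimate, 16-bit weights alone would nearly fill 64 GB, while 8-bit or 4-bit quantization would leave substantial headroom.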

12/10/2025, 6:57:55 PM

by: aschobel

Looks to be API only. Bummer.

12/10/2025, 6:03:04 PM

by: banjoe

Wow, crushing 2.5 Flash on every benchmark is huge. Time to move all of my LLM workloads to a local GPU rig.

12/10/2025, 5:20:27 PM

by: binsquare

Does anyone else find that there's a hard-to-pin-down lifelessness in the speech of these voice models?

Especially in the fruit pricing portion of the video for this model. It sounds completely normal, but I can immediately tell it is AI. Maybe it's the intonation or the overly stable rate of speech?

12/10/2025, 4:58:28 PM

by: andy_ppp

Qwen seem to be deliberately confusing about whether they are releasing models as open weights or not. I think largely not any more, and you can go on quite a wild goose chase looking for things that are implied to be released but are actually only available via API.

12/11/2025, 2:20:06 PM

by: dvh

I asked: "How many resistors are used in fuzzhugger phantom octave guitar pedal?". It replied 29 resistors and provided a long list. The answer is 2 resistors: <a href="https://tagboardeffects.blogspot.com/2013/04/fuzzhugger-phantom-octave.html" rel="nofollow">https://tagboardeffects.blogspot.com/2013/04/fuzzhugger-phan...</a>

12/10/2025, 4:45:29 PM

by: mettamage

I wonder if with that music analysis mode, you can also make your own synths

12/10/2025, 4:49:10 PM

by: Aissen

Is this a new proprietary model?

12/10/2025, 5:43:48 PM

by: rarisma

GPT-4o in the charts is crazy.

12/10/2025, 5:01:42 PM

by: stevenhuang

Wayback for those that can't reach <a href="https://web.archive.org/web/20251210164048/https://qwen.ai/blog?id=qwen3-omni-flash-20251201" rel="nofollow">https://web.archive.org/web/20251210164048/https://qwen.ai/b...</a>

12/10/2025, 5:53:24 PM

by: forgingahead

I truly enjoy how the naming conventions seem to follow how I did homework assignments back in the day: finalpaper-1-dec2nd, finalpaper-2-dec4th, etc etc.

12/11/2025, 1:34:30 AM

by: vessenes

Interesting - when I asked the omni model at qwen.com what version it was, I got a testy "I don't have a version" and then was told my chat was blocked for inappropriate content. A second try asking for knowledge cutoff got me the more equivocal "2024, but I know stuff after that date, too".

No idea how to check if this is actually deployed on qwen.com right now.

12/10/2025, 7:52:29 PM
