January 29: The Alibaba Qwen team has officially open-sourced the Qwen3-ASR model series, a strong lineup of speech recognition models developed within the Qwen family. The release includes two full-featured ASR models, Qwen3-ASR-1.7B and Qwen3-ASR-0.6B, as well as an innovative speech forced-alignment model, Qwen3-ForcedAligner-0.6B. Together, the Qwen3-ASR series supports speech recognition and language identification across 52 languages and dialects.
According to Alibaba, Qwen3-ASR combines a newly designed AuT pretrained speech encoder with the strong multimodal foundation of Qwen3-Omni, enabling highly accurate and stable speech recognition. The 1.7B model achieves state-of-the-art (SOTA) performance across multiple scenarios, including Mandarin Chinese, English, Chinese-accented speech, and singing voice recognition, while remaining robust to complex text and high-noise environments.
The 0.6B model strikes a balance between performance and efficiency. While maintaining high recognition accuracy, it supports 128-concurrent asynchronous inference with throughput of up to 2,000×, processing more than 5 hours of audio in just 10 seconds.
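As a quick sanity check on those figures, the arithmetic can be sketched in a few lines of Python. This assumes that "2,000× throughput" means seconds of audio processed per second of wall-clock time, summed across all concurrent requests; the article does not define the term, and the function name here is purely illustrative.

```python
def audio_hours_processed(throughput_factor: float, wall_seconds: float) -> float:
    """Hours of audio handled in `wall_seconds` of wall-clock time.

    Assumes throughput_factor = audio seconds processed per wall-clock second
    (an interpretation, not a definition taken from the article).
    """
    audio_seconds = throughput_factor * wall_seconds
    return audio_seconds / 3600.0


if __name__ == "__main__":
    hours = audio_hours_processed(throughput_factor=2000, wall_seconds=10)
    print(f"{hours:.2f} hours of audio in 10 seconds")  # 5.56 hours
```

Under that reading, 2,000 × 10 s = 20,000 s of audio, or roughly 5.56 hours, which is consistent with the "more than 5 hours in 10 seconds" claim.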
Qwen3-ForcedAligner-0.6B is a timestamp prediction model based on non-autoregressive (NAR) large language model inference, supporting flexible and precise forced alignment across 11 languages at arbitrary positions. Its timestamp accuracy surpasses traditional models such as WhisperX and NeMo-Forced-Aligner, and it achieves a real-time factor (RTF) of 0.0089 under single-concurrency inference.
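To put the RTF figure in concrete terms, here is a minimal sketch using the common definition of real-time factor as processing time divided by audio duration. The article does not spell out its definition, so treat this as an assumption; the helper names are hypothetical.

```python
def processing_seconds(audio_seconds: float, rtf: float) -> float:
    """Wall-clock time needed to process the audio, assuming
    RTF = processing time / audio duration."""
    return audio_seconds * rtf


def speedup_over_real_time(rtf: float) -> float:
    """How many times faster than real time a given RTF implies."""
    return 1.0 / rtf


if __name__ == "__main__":
    rtf = 0.0089  # figure reported in the article
    one_hour = 3600.0
    print(f"1 hour of audio: about {processing_seconds(one_hour, rtf):.1f} s")
    print(f"Speed relative to real time: about {speedup_over_real_time(rtf):.0f}x")
```

With an RTF of 0.0089, aligning one hour of audio would take about 32 seconds on a single request, or roughly 112 times faster than real time.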
The Qwen team stated that open-sourcing the Qwen3-ASR series aims to accelerate research and innovation in speech recognition and speech understanding. The model architectures, weights, and a comprehensive, user-friendly inference framework will all be released as part of the open-source package.
Source: ITHome

