January 22, 2026: Alibaba's Qwen team has officially open-sourced the full Qwen3-TTS text-to-speech model family, featuring multi-codebook speech generation models in two sizes: 1.7B parameters for maximum performance and 0.6B parameters for a balance between quality and efficiency. The models are now available on GitHub, ModelScope, and other platforms, with live access supported through the Qwen API.
Qwen3-TTS offers a comprehensive feature set, including voice cloning, voice creation, human-like speech synthesis, and natural language instruction control. Powered by the self-developed Qwen3-TTS-Tokenizer-12Hz multi-codebook speech encoder, the model preserves rich paralinguistic cues and acoustic environment details, enabling high-fidelity voice reconstruction.
A key innovation is its Dual-Track modeling architecture, which reduces end-to-end synthesis latency to just 97 milliseconds, with the first audio packet generated after a single character of input, making it well suited to real-time conversational applications.
The model supports 10 major languages, including Chinese, English, Japanese, and German, as well as multiple dialects. It can automatically adapt intonation, rhythm, and emotional expression based on semantic context, while showing strong robustness to noisy or imperfect text input. Across multiple benchmarks, Qwen3-TTS achieves state-of-the-art performance: its voice creation capabilities outperform MiniMax-Voice-Design, its cross-lingual voice cloning surpasses CosyVoice3, and its long-form speech generation achieves word error rates as low as 2.36% (Chinese) and 2.81% (English).
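
For readers unfamiliar with the metric, word error rate (WER) is conventionally the number of word substitutions, deletions, and insertions needed to turn a transcript of the synthesized audio into the reference text, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition, not the evaluation pipeline behind the figures above. The function name `word_error_rate` and the whitespace tokenization are assumptions made for the example; Chinese text in particular would need a different tokenization, such as per-character scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Hypothetical helper for illustration; tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words (Levenshtein distance over words).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Read this way, a WER of 2.81% corresponds to roughly three word errors per hundred reference words.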

By combining multilingual support, ultra-low latency, and high audio quality, Qwen3-TTS provides an efficient and scalable solution for global voice interaction and real-time speech applications.
- ModelScope: https://www.modelscope.cn/collections/Qwen/Qwen3-TTS
- Hugging Face: https://huggingface.co/collections/Qwen/qwen3-tts
- GitHub: https://github.com/QwenLM/Qwen3-TTS
Source: IThome

