Alibaba’s Qwen team released a new image generation artificial intelligence (AI) model last week. Dubbed Qwen VLo, it is a successor to the Qwen 2.5 vision language model and comes with several upgrades over the older models. The latest AI image model supports both text-to-image and image-to-image generation. It also supports text input in multiple languages, including English and Chinese. Apart from image generation, the AI model is also capable of making inline edits to generated images as well as input images.
Qwen VLo Accepts Prompts in Multiple Languages
In a post on X (formerly known as Twitter), the official handle of the Qwen team announced the release of the new model. The model’s technical name is Qwen3-235B-A22B, and it is available for free on the company’s chat interface. Users can also use the model without logging in.
Gadgets 360 staff members tested the AI model and found its image generation capability to be on par with Google’s Imagen 2. Its instruction following and image output quality are slightly lower than those of Imagen 3 and OpenAI’s GPT-4o-powered image generation feature. However, it generates images faster than both of them, and it has a higher rate limit.
On its GitHub page, the company said that Qwen VLo comes with improved image understanding, which allows it to make better inline edits without distorting the structural integrity of the input image. This also improves the overall quality of the output. The model also better understands vague and open-ended prompts, and can generate images that are aligned with user expectations.
Apart from image generation and editing, Qwen VLo can also perform image annotation-related tasks such as edge detection, segmentation, prediction mapping, and more. The company said a future version of the model will also be able to accept multiple input images and combine them based on user requests.
Text rendering has also been improved in the latest AI image generator. We were able to generate accurate text across different fonts in our testing of the model. Lastly, Qwen VLo also supports images with dynamic aspect ratios as input, including extreme ratios such as 4:1 and 1:3. The company plans to add the ability to generate images in different aspect ratios soon.

