January 29: Unitree Robotics announced the open-source release of its Vision-Language-Action (VLA) large model, UnifoLM-VLA-0, designed to overcome the limitations of conventional vision-language models (VLMs) in physical interaction. Through targeted pretraining, the model evolves from image–text understanding into an embodied "brain" with physical commonsense reasoning.
According to Unitree, UnifoLM-VLA-0 is part of the UnifoLM family and is built specifically for general-purpose humanoid robot manipulation. The model is based on the open-source Qwen2.5-VL-7B and continually pretrained on a multi-task dataset spanning both general and robotic scenarios, improving the alignment between geometric spatial understanding and semantic reasoning.
A key technical advance lies in its deep integration of text instructions with 2D and 3D spatial information to meet the high demands of manipulation tasks. The model incorporates end-to-end dynamics prediction data to enhance generalization. Notably, Unitree integrated an action prediction head into the architecture and systematically cleaned open-source datasets. Using only around 340 hours of real-robot data, combined with action-chunking prediction and dynamics constraints, the model achieves unified modeling of complex action sequences and long-horizon planning.
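Action chunking, mentioned above, is a widely used technique in robot imitation learning: the policy predicts a block of K future actions per observation rather than a single step, which stabilizes long-horizon behavior and reduces inference frequency. The following is a minimal illustrative sketch of the control-loop idea only, not Unitree's implementation; the chunk size, action dimension, and function names are all hypothetical.

```python
import numpy as np

CHUNK_SIZE = 8   # hypothetical chunk length K
ACTION_DIM = 7   # hypothetical joint-command dimension

def predict_action_chunk(observation: np.ndarray) -> np.ndarray:
    """Stand-in for a VLA policy head: maps one observation to a
    (CHUNK_SIZE, ACTION_DIM) block of future actions."""
    rng = np.random.default_rng(0)
    return rng.standard_normal((CHUNK_SIZE, ACTION_DIM))

def run_episode(num_steps: int, observation: np.ndarray) -> int:
    """Run num_steps control ticks, querying the policy only once
    per chunk. Returns the number of policy invocations."""
    policy_calls = 0
    step = 0
    while step < num_steps:
        chunk = predict_action_chunk(observation)
        policy_calls += 1
        # Execute the chunk open-loop until it runs out (or the
        # episode ends), then re-query the policy.
        for action in chunk[: num_steps - step]:
            step += 1  # the robot would apply `action` here

    return policy_calls

# Chunking cuts policy invocations by a factor of CHUNK_SIZE:
print(run_episode(64, np.zeros(10)))  # 64 steps / 8 per chunk = 8 calls
```

In a real system, each policy call would also re-encode fresh camera and proprioceptive input, so chunking trades reactivity for temporal consistency; the dynamics constraints Unitree describes are one way to keep chunked actions physically plausible.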
Evaluation results show that UnifoLM-VLA-0 significantly outperforms base models on several spatial understanding benchmarks, and in "no-thinking" mode its performance is comparable to Gemini-Robotics-ER 1.5. On the LIBERO simulation benchmark, its multi-task model achieves near state-of-the-art results.
In real-world tests, UnifoLM-VLA-0 demonstrated strong capabilities on Unitree's G1 humanoid robot, completing 12 categories of complex manipulation tasks, including opening and closing drawers, plugging and unplugging connectors, and pick-and-place operations, all with a single policy network. Unitree stated that the model maintains robust execution and disturbance rejection even under external interference.
The project homepage and open-source code are now available on GitHub for developers and researchers.
Project page: https://unigen-x.github.io/unifolm-vla.github.io/
GitHub: https://github.com/unitreerobotics/unifolm-vla
Source: iFeng Tech

