At UC Berkeley, researchers in Sergey Levine’s Robotic AI and Learning Lab eyed a table where a tower of 39 Jenga blocks stood perfectly stacked. Then a white-and-black robot, its single limb doubled over like a hunched-over giraffe, zoomed toward the tower, brandishing a black leather whip. Through what might have seemed to a casual viewer like a miracle of physics, the whip struck in precisely the right spot to send a single block flying from the stack while the rest of the tower remained structurally sound.
This task, known as “Jenga whipping,” is a hobby pursued by people with the dexterity and reflexes to pull it off. Now, it has been mastered by robots, thanks to a novel, AI-powered training method created by Levine and other members of the team.
The new system, called Human-in-the-Loop Sample-Efficient Robotic Reinforcement Learning (HiL-SERL), is described in a study appearing Aug. 20 in the journal Science Robotics.
By studying demonstrations and learning from both human feedback and its own real-world attempts, this training protocol teaches robots how to perform complicated tasks like Jenga whipping with a 100% success rate. What’s more, robots are taught at an impressive speed, enabling them to learn within one to two hours how to perfectly assemble a computer motherboard, build a shelf and more.
The first time the robot conquered the Jenga whipping challenge, “that really surprised me,” said study first author Jianlan Luo, a postdoctoral researcher at UC Berkeley. “The Jenga task is very difficult for most humans. I tried it with a whip in my hand; I had a 0% success rate.”
In recent years, the robot learning field has sought to crack the challenge of how to teach machines activities that are unpredictable or complicated, as opposed to a single action, like repeatedly picking up an object from a specific place on a conveyor belt. To solve this quandary, Levine’s lab has zeroed in on what’s called “reinforcement learning.” In reinforcement learning, a robot attempts a task in the real world and, using feedback from cameras, learns from its mistakes to eventually master that skill.
The new study added human intervention to speed up this process. With a special mouse that controls the robot, a human can correct the robot’s course, and those corrections can be incorporated into the robot’s proverbial memory bank. Using reinforcement learning, the robot analyzes the sum of all its attempts (assisted and unassisted, successful and unsuccessful) to better perform its task. Luo said a human needed to intervene less and less as the robot learned from experience.
“I needed to babysit the robot for maybe the first 30% or something, and then gradually I could actually pay less attention,” he said.
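
The loop described above (try, get corrected, store everything, learn from the whole record) can be illustrated with a toy example. The sketch below is not the HiL-SERL implementation: it assumes a six-cell corridor instead of a real robot, plain tabular Q-learning instead of the study's learning method, and a scripted stand-in for the human operator. Every name in it (`step`, `human_correction`, `train`) is hypothetical and chosen only for illustration.

```python
import random

N_STATES = 6          # toy corridor: cells 0..5, goal at cell 5 (assumption)
ACTIONS = (-1, +1)    # move left or move right


def step(state, action):
    """Toy environment: reward 1.0 only when the goal cell is reached."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def human_correction(action):
    """Scripted stand-in for the operator: override moves away from the goal."""
    return +1 if action != +1 else None


def choose(q, state, rng):
    """Greedy action; ties (an untrained policy) are broken at random."""
    left, right = q[(state, -1)], q[(state, +1)]
    if left == right:
        return rng.choice(ACTIONS)
    return -1 if left > right else +1


def update(q, transition, alpha, gamma):
    """One Q-learning update from a stored transition."""
    s, a, r, s2, done, _assisted = transition
    target = r if done else r + gamma * max(q[(s2, b)] for b in ACTIONS)
    q[(s, a)] += alpha * (target - q[(s, a)])


def train(episodes=30, alpha=0.5, gamma=0.9, replay=4, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    buffer = []          # the "memory bank": every attempt, assisted or not
    interventions = []   # number of human corrections in each episode
    for _ in range(episodes):
        state, done, fixes = 0, False, 0
        while not done:
            action = choose(q, state, rng)
            override = human_correction(action)
            assisted = override is not None
            if assisted:             # the human takes over for this step
                action = override
                fixes += 1
            nxt, reward, done = step(state, action)
            transition = (state, action, reward, nxt, done, assisted)
            buffer.append(transition)
            update(q, transition, alpha, gamma)
            # Also learn from a few past attempts drawn from the record.
            for old in rng.sample(buffer, min(replay, len(buffer))):
                update(q, old, alpha, gamma)
            state = nxt
        interventions.append(fixes)
    return q, buffer, interventions


if __name__ == "__main__":
    _, _, interventions = train()
    print("corrections per episode:", interventions)
```

In this toy setting the corrections taper off to zero after the first few episodes, because the policy learns to prefer the corrected action on its own. That mirrors, in miniature, Luo's remark that he could gradually pay less attention.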

The lab put its robotic system through a gauntlet of complicated tasks beyond Jenga whipping. The robot flipped an egg in a pan; passed an object from one arm to another; and assembled a motherboard, car dashboard and timing belt. The researchers selected these challenges because they were varied and, in Luo’s words, represented “all sorts of uncertainty when performing robotic tasks in the complex real world.”
The researchers also tested the robots’ adaptability by staging mishaps. They would force a gripper to open so it dropped an object, or move a motherboard as the robot tried to install a microchip, training it to react to a shifting situation it might encounter outside a lab environment.
By the end of training, the robot could execute these tasks correctly 100% of the time. The researchers compared their results to a common “copy my behavior” method known as behavioral cloning that was trained on the same amount of demonstration data; their new system made the robots faster and more accurate.
These metrics are crucial, Luo said, because the bar for robot competency is very high. Regular consumers and industrialists alike do not want to buy an inconsistent robot. Luo emphasized that, in particular, “made-to-order” manufacturing processes like those often used for electronics, automobiles and aerospace parts could benefit from robots that can reliably and adaptably learn a range of tasks.
A next step, Luo said, would be to pre-train the system with basic object manipulation capabilities, eliminating the need to learn those from scratch and instead progressing straight to acquiring more complex skills. The lab also chose to make its research open source so that other researchers could use and build on it.
“A key goal of this project is to make the technology as accessible and user-friendly as an iPhone,” Luo said. “I firmly believe that the more people who can use it, the bigger impact we can make.”
Additional authors of the study include Charles Xu and Jeffrey Wu of UC Berkeley.
More information:
Jianlan Luo et al, Precise and dexterous robotic manipulation via human-in-the-loop reinforcement learning, Science Robotics (2025). DOI: 10.1126/scirobotics.ads5033
University of California - Berkeley
Citation:
With human feedback, AI-driven robots learn tasks better and faster (2025, August 20)
retrieved 21 August 2025
from https://techxplore.com/information/2025-08-human-feedback-ai-driven-robots.html

