Naturally, hardware design has received most of the attention in discussions of humanoid robotics. Given how often developers use the phrase “general purpose humanoids,” the “general purpose” part deserves more attention. Moving from decades of single-purpose systems to more generalized ones will be a significant shift. We’re not there yet.
A main focus for researchers has been creating a robotic intelligence that can take full advantage of the wide range of movement that bipedal humanoid design makes possible. The use of generative AI in robotics has also attracted a lot of attention recently. New research from MIT suggests the latter could have a significant impact on the former.
Training is one of the primary roadblocks on the way to general-purpose systems. We have a solid grasp of the best methods for teaching people to perform different tasks. Approaches to training robots, by contrast, remain fragmented. Several methods show promise, including imitation learning and reinforcement learning, but future solutions will likely combine these methods and augment them with generative AI models.
One of the main use cases the MIT team proposes is pulling together relevant information from small, task-specific datasets. The method is called policy composition (PoCo). The tasks include practical robot actions such as pounding in a nail and flipping objects with a spatula.
“A distinct diffusion model is trained by researchers to acquire a strategy, or policy, for accomplishing a single task with a single, particular dataset,” the school states. The researchers then combine the policies learned by the diffusion models into a general policy that enables a robot to perform multiple tasks in a variety of settings.
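To make the idea of combining per-task policies concrete, here is a minimal Python sketch. It is not MIT's implementation: it assumes, purely for illustration, that each policy is a function predicting the noise in a candidate action, and that policies are composed by taking a weighted average of those predictions at every denoising step. All names (`make_toy_policy`, `compose_noise`, `denoise`) and the toy targets are hypothetical.

```python
from typing import Callable, List, Sequence

Action = List[float]
# Hypothetical interface: a policy maps a noisy action to its predicted noise.
NoiseModel = Callable[[Action], Action]


def make_toy_policy(target: Sequence[float]) -> NoiseModel:
    """Toy stand-in for a trained diffusion policy (an assumption, not a real model).

    It treats the offset between the action and `target` as the noise to remove.
    """
    def predict_noise(action: Action) -> Action:
        return [a - t for a, t in zip(action, target)]
    return predict_noise


def compose_noise(policies: Sequence[NoiseModel],
                  weights: Sequence[float],
                  action: Action) -> Action:
    """Weighted average of the noise predicted by each policy."""
    total = sum(weights)
    combined = [0.0] * len(action)
    for policy, weight in zip(policies, weights):
        for i, eps in enumerate(policy(action)):
            combined[i] += (weight / total) * eps
    return combined


def denoise(policies: Sequence[NoiseModel],
            weights: Sequence[float],
            start: Action,
            steps: int = 50,
            step_size: float = 0.2) -> Action:
    """Iteratively remove the composed noise estimate from the action."""
    action = list(start)
    for _ in range(steps):
        eps = compose_noise(policies, weights, action)
        action = [a - step_size * e for a, e in zip(action, eps)]
    return action


# Two toy policies, e.g. one standing in for simulation data, one for real-world data.
sim_policy = make_toy_policy([1.0, 0.0])
real_policy = make_toy_policy([0.0, 1.0])
blended = denoise([sim_policy, real_policy], [1.0, 1.0], [0.0, 0.0])
```

With equal weights, the toy action settles between the two policies' targets; raising one weight pulls the result toward that policy. A real system would replace the toy functions with trained diffusion models and a proper noise schedule.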
Incorporating diffusion models improved task performance by 20%, according to MIT. That includes the ability to perform tasks that require multiple tools and to learn and adapt to new skills. The system can combine relevant information from different datasets into the sequence of actions needed to complete a task.
Lirui Wang, the paper’s lead author, says one advantage of this approach is that policies can be combined to get the best of both worlds. “A policy trained on simulation data may be able to achieve greater generalization, but a policy trained on real-world data may be able to achieve greater dexterity.”
The goal of this particular work is to develop intelligence systems that let robots switch between tools to perform different tasks. The spread of multi-purpose systems would bring the industry a step closer to the dream of a general-purpose system.