AI models let robots carry out tasks in unfamiliar environments
It’s tricky to get robots to do things in environments they’ve never seen before. Typically, researchers need to train them on new data for every new place they encounter, which can become very time-consuming and expensive.
Now, researchers have developed a series of AI models that teach robots to complete basic tasks in new surroundings without further training or fine-tuning. The five AI models, called robot utility models (RUMs), allow machines to complete five separate tasks in unfamiliar environments with a 90% success rate: opening doors, opening drawers, and picking up tissues, bags, and cylindrical objects.
The team, consisting of researchers from New York University, Meta, and the robotics company Hello Robot, hopes its findings will make it quicker and easier to teach robots new skills while helping them function in previously unseen domains. The approach could make it easier and cheaper to deploy robots in our homes in the future.
“In the past, people have focused a lot on the problem of how do we get robots to do everything, but not really asking how do we get robots to do the things that they do know how to do—everywhere,” says Mahi Shafiullah, a PhD student at New York University who worked on the project. “We looked at how do you teach a robot to, say, open any door anywhere.”
Teaching robots new skills generally requires a lot of data, which is pretty hard to come by. Because robotic training data needs to be physically collected—a time-consuming and expensive undertaking—it’s much harder to build and scale training databases than it is for other types of AI like large language models, which are trained on information scraped from the internet.
To make it quicker to gather the data essential for teaching a robot a new skill, the researchers developed a new version of a tool they had used in previous research: an iPhone attached to a cheap reacher-grabber stick, the kind typically used to pick up trash.
The team used the setup to record around 1,000 demonstrations for each of the five tasks across 40 different environments, including homes in New York City and Jersey City; some of these demonstrations had been gathered as part of previous research. They then trained learning algorithms on the five datasets to create the five RUM models.
These models were deployed on Stretch, a robot consisting of a wheeled unit, a tall pole, and a retractable arm holding an iPhone, to test how successfully they could execute the tasks in new environments without additional tweaking. On their own, the models achieved a completion rate of 74.4%, but the researchers were able to increase this to a 90% success rate by taking images from the iPhone and the robot's head-mounted camera, giving them to OpenAI's GPT-4o model, and asking whether the task had been completed successfully. If GPT-4o said no, they simply reset the robot and tried again.
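The verify-and-retry step amounts to a simple loop: run the policy, ask a vision-language model whether the task succeeded, and reset and retry on a "no." The Python sketch below is purely illustrative, not the team's implementation; the stub functions `execute_task` and `vlm_says_success` are hypothetical stand-ins for the robot policy and the GPT-4o check, and the 74.4% per-attempt success rate is the figure reported in the article.

```python
import random

def execute_task(rng):
    """Stand-in for running a RUM policy once on the robot.
    Succeeds with the article's reported 74.4% per-attempt rate."""
    return rng.random() < 0.744

def vlm_says_success(succeeded):
    """Stand-in for showing camera images to a vision-language model
    (such as GPT-4o) and asking whether the task was completed.
    Assumed here, for simplicity, to judge perfectly."""
    return succeeded

def run_with_retries(rng, max_attempts=3):
    """Execute the task, verify with the VLM, and reset-and-retry
    until the verifier says yes or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        succeeded = execute_task(rng)
        if vlm_says_success(succeeded):
            return True, attempt
        # Verifier said no: reset the robot and try again.
    return False, max_attempts

# Simulate many episodes to see how retrying lifts the success rate.
rng = random.Random(0)
trials = 10_000
wins = sum(run_with_retries(rng)[0] for _ in range(trials))
print(f"success rate with retries: {wins / trials:.3f}")
```

With a perfect verifier and three attempts, the simulated rate climbs well above the single-attempt 74.4%; the team's real-world 90% figure reflects the messier conditions of actual homes and an imperfect check.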
A significant challenge facing roboticists is that training and testing models in lab environments isn't representative of what could happen in the real world, so research that helps machines behave more reliably in new settings is very welcome, says Mohit Shridhar, a research scientist specializing in robotic manipulation who wasn't involved in the work.
“It’s nice to see that it’s being evaluated in all these diverse homes and kitchens, because if you can get a robot to work in the wild in a random house, that’s the true goal of robotics,” he says.
The project could serve as a general recipe for building robot utility models for other tasks, helping to teach robots new skills with minimal extra work and making it easier for people who aren't trained roboticists to deploy future robots in their homes, says Shafiullah.
“The dream that we’re going for is that I could train something, put it on the internet, and you should be able to download and run it on a robot in your home,” he says.