Robot ‘chef’ can recreate recipes from watching food videos

Robot chef making a dish
09/06/2023

Wolfson Engineering PhD candidate Grzegorz Sochacki has led a team of researchers in training a robotic ‘chef’ to watch and learn from cooking videos, and to recreate the dish itself.


Grzegorz and researchers from the Engineering Department programmed their robotic chef with a ‘cookbook’ of eight simple salad recipes. After watching a video of a human demonstrating one of the recipes, the robot was able to identify which recipe was being prepared and make it.

In addition, the videos helped the robot incrementally add to its cookbook. At the end of the experiment, the robot came up with a ninth recipe on its own. Their results, reported in the journal IEEE Access, demonstrate how video content can be a valuable and rich source of data for automated food production, and could enable easier and cheaper deployment of robot chefs.

“We wanted to see whether we could train a robot chef to learn in the same incremental way that humans can – by identifying the ingredients and how they go together in the dish,” says Grzegorz, the paper’s first author.

“It’s amazing how much nuance the robot was able to detect,” he says. “These recipes aren’t complex – they’re essentially chopped fruits and vegetables, but it was really effective at recognising, for example, that two chopped apples and two chopped carrots is the same recipe as three chopped apples and three chopped carrots.

“Our robot isn’t interested in the sorts of food videos that go viral on social media – they’re simply too hard to follow. But as these robot chefs get better and faster at identifying ingredients in food videos, they might be able to use sites like YouTube to learn a whole range of recipes.”

Sochacki, a PhD candidate in Professor Fumiya Iida’s Bio-Inspired Robotics Laboratory, and his colleagues devised eight simple salad recipes and filmed themselves making them. They then used a publicly available neural network to train their robot chef. The neural network had already been trained to identify a range of different objects, including the fruits and vegetables used in the eight salad recipes (broccoli, carrot, apple, banana and orange).
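The paper itself doesn’t name the detector, but as a rough, hypothetical sketch of this step: an off-the-shelf model pretrained on the COCO dataset, whose standard label set happens to include all five ingredients as well as a knife, could be queried frame by frame along these lines.

```python
# Hypothetical sketch: per-frame ingredient detection with a COCO-pretrained
# detector from torchvision. Not the team's actual pipeline.
import torch
import torchvision
from PIL import Image
from torchvision.transforms.functional import to_tensor

# Standard COCO class ids used by torchvision detection models; they happen
# to cover the article's five ingredients, plus the knife.
LABELS = {49: "knife", 52: "banana", 53: "apple", 55: "orange",
          56: "broccoli", 57: "carrot"}

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def detect(frame_path, threshold=0.7):
    """Return the salad-relevant objects visible in one video frame."""
    frame = to_tensor(Image.open(frame_path).convert("RGB"))
    with torch.no_grad():
        out = model([frame])[0]  # boxes, labels and confidence scores
    return [LABELS[l.item()]
            for l, s in zip(out["labels"], out["scores"])
            if s >= threshold and l.item() in LABELS]

print(detect("frame_0001.jpg"))  # "frame_0001.jpg" is a made-up file name
```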

Using computer vision techniques, the robot analysed each frame of video and was able to identify the different objects and features, such as a knife and the ingredients, as well as the human demonstrator’s arms, hands and face. Both the recipes and the videos were converted to vectors, and the robot performed mathematical operations on these vectors to determine the similarity between a demonstration and a recipe.
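As an illustrative sketch of that comparison (the paper’s exact encoding and similarity measure may differ): if each recipe and each demonstration were reduced to an ingredient-count vector, cosine similarity would capture the behaviour described above, scoring a scaled-up portion as identical to the original recipe.

```python
# Illustrative only: ingredient-count vectors compared by cosine similarity.
import numpy as np

INGREDIENTS = ["broccoli", "carrot", "apple", "banana", "orange"]

def to_vector(counts):
    """Encode {ingredient: count} as a fixed-order vector."""
    return np.array([counts.get(name, 0) for name in INGREDIENTS], dtype=float)

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two chopped apples and two chopped carrots vs. three of each: the
# proportions match, so the similarity is ~1.0 -- the same recipe in a
# bigger portion, exactly the case Sochacki describes.
demo   = to_vector({"apple": 2, "carrot": 2})
recipe = to_vector({"apple": 3, "carrot": 3})
print(cosine_similarity(demo, recipe))  # ~1.0
```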


By correctly identifying the ingredients and the actions of the human chef, the robot could determine which of the recipes was being prepared. For example, if the human demonstrator was holding a knife in one hand and a carrot in the other, the robot could infer that the carrot would then be chopped up.
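The article doesn’t spell out how that inference is made; a toy, rule-based version of the idea might look like the following (purely hypothetical, not the paper’s method).

```python
# Hypothetical rule: a cutting tool in one hand plus an ingredient in the
# other implies a chopping action on that ingredient.
TOOLS = {"knife"}
INGREDIENTS = {"broccoli", "carrot", "apple", "banana", "orange"}

def infer_action(held_objects):
    """held_objects: labels detected in the demonstrator's hands."""
    held_ingredients = held_objects & INGREDIENTS
    if held_objects & TOOLS and held_ingredients:
        return "chop " + ", ".join(sorted(held_ingredients))
    return "no action"

print(infer_action({"knife", "carrot"}))  # -> "chop carrot"
```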

Of the 16 videos it watched, the robot recognised the correct recipe 93% of the time, even though it detected only 83% of the human chef’s actions. It was also able to recognise that slight variations in a recipe, such as a double portion or ordinary human error, were variations rather than an entirely new recipe. And it correctly recognised the demonstration of a new, ninth salad, added it to its cookbook and made it.

Discover more

This article is an abbreviated version of an article on the University website by Sarah Collins, which you can read here.

The article – Grzegorz Sochacki et al., ‘Recognition of Human Chef’s Intentions for Incremental Learning of Cookbook by Robotic Salad Chef’, IEEE Access (2023) – is available online. DOI: 10.1109/ACCESS.2023.3276234
