Interpreting and Executing Recipes with a Cooking Robot
The creation of a robot chef represents a grand challenge for the field of robotics. Cooking is one of the most important activities that take place in the home, and a robotic chef capable of following arbitrary recipes would have many applications in both household and industrial environments. The kitchen environment is a semi-structured proving ground for algorithms in robotics. It provides many computational challenges, such as accurately perceiving ingredients in cluttered environments, manipulating objects, and engaging in complex activities such as mixing and chopping. Keywords:
Reward Function; Statistical Machine Translation; Human Partner; Motion Primitive; Primitive Action
VirtualHome: Simulating Household Activities via Programs
In this paper, we are interested in modeling complex activities that occur in
a typical household. We propose to use programs, i.e., sequences of atomic
actions and interactions, as a high level representation of complex tasks.
Programs are interesting because they provide a non-ambiguous representation of
a task, and allow agents to execute them. However, nowadays, there is no
database providing this type of information. Towards this goal, we first
crowd-source programs for a variety of activities that happen in people's
homes, via a game-like interface used for teaching kids how to code. Using the
collected dataset, we show how we can learn to extract programs directly from
natural language descriptions or from videos. We then implement the most common
atomic (inter)actions in the Unity3D game engine, and use our programs to
"drive" an artificial agent to execute tasks in a simulated household
environment. Our VirtualHome simulator allows us to create a large activity
video dataset with rich ground-truth, enabling training and testing of video
understanding models. We further showcase examples of our agent performing
tasks in our VirtualHome based on language descriptions. (CVPR 2018, Oral)
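The program representation described above can be sketched in a few lines. This is a toy illustration under assumptions: the Step class, the action names, and the rendering are hypothetical stand-ins, not VirtualHome's actual action set or API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    """One atomic (inter)action with its object arguments (illustrative)."""
    action: str
    args: tuple

def render(program):
    """Render each atomic step as a readable instruction string."""
    return [f"[{s.action}] " + " ".join(s.args) for s in program]

# "Watch TV" expressed as an unambiguous, executable sequence of atomic steps.
watch_tv = [
    Step("walk_to", ("living_room",)),
    Step("grab", ("remote_control",)),
    Step("switch_on", ("television",)),
    Step("sit_on", ("sofa",)),
]

for line in render(watch_tv):
    print(line)
```

Because each step is a discrete, typed action rather than free text, an agent can execute the list directly, which is the non-ambiguity the paper highlights.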
Improving Robotic Cooking using Batch Bayesian Optimization
With advances in the field of robotic manipulation,
sensing and machine learning, robotic chefs are expected to
become prevalent in our kitchens and restaurants. Robotic chefs
are envisioned to replicate human skills in order to reduce
the burden of the cooking process. However, the potential
of robots to enhance the dining experience remains largely
unrecognised. This work introduces the concept of food quality
optimization and its challenges with an automated omelette
cooking robotic system. The design and control of the robotic
system that uses general kitchen tools is presented first. Next, we
investigate new optimization strategies for improving subjective
food quality ratings, a problem that is challenging because of the
qualitative nature of the objective and the tightly constrained
number of function evaluations available. Our results show that
through appropriate design of the optimization routine using
Batch Bayesian Optimization, improvements in the subjective
evaluation of food quality can be achieved reliably, with very
few trials and with the ability to run trials in bulk. This study
paves the way towards a broader vision of personalized food
for taste and nutrition, and transferable recipes.
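The batch strategy described above can be sketched as follows. This is a minimal illustration under assumptions: a single 1-D cooking parameter, a NumPy Gaussian-process surrogate with an upper-confidence-bound acquisition, and a synthetic "taste" function standing in for human ratings; it is not the paper's implementation.

```python
import numpy as np

def rbf(a, b, ls=0.2):
    """Squared-exponential kernel over 1-D inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-3):
    """GP posterior mean and standard deviation at the query points."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_train, x_query)
    mu = Ks.T @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)  # k(x, x) = 1 for RBF
    return mu, np.sqrt(np.clip(var, 1e-9, None))

def propose_batch(x_train, y_train, candidates, q=3, kappa=2.0):
    """Rank candidates by upper confidence bound; return the top q as one batch."""
    mu, sd = gp_posterior(x_train, y_train, candidates)
    return candidates[np.argsort(-(mu + kappa * sd))[:q]]

rng = np.random.default_rng(0)
taste = lambda x: np.exp(-(x - 0.6) ** 2 / 0.05)   # hidden "taste" landscape
x = rng.uniform(0, 1, 4)                            # four initial trial settings
y = taste(x)
for _ in range(5):                                  # five batches of three trials each
    batch = propose_batch(x, y, rng.uniform(0, 1, 100))
    x, y = np.r_[x, batch], np.r_[y, taste(batch)]
print(f"best setting: {x[np.argmax(y)]:.2f}")
```

Batching matters here because each function evaluation means cooking and tasting an omelette: proposing several settings at once lets trials run in bulk. Note this sketch fills the batch greedily by UCB score; practical batch BO methods add a diversity mechanism (e.g. local penalization) so the q points do not cluster.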
An Annotated Corpus for Machine Reading of Instructions in Wet Lab Protocols
We describe an effort to annotate a corpus of natural language instructions
consisting of 622 wet lab protocols to facilitate automatic or semi-automatic
conversion of protocols into a machine-readable format and benefit biological
research. Experimental results demonstrate the utility of our corpus for
developing machine learning approaches to shallow semantic parsing of
instructional texts. We make our annotated Wet Lab Protocol Corpus available to
the research community.
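Shallow semantic parsing of an instruction, as targeted by the corpus above, can be illustrated with a toy example. The action-frame schema and the pattern-based parser below are hypothetical assumptions for illustration, not the corpus's actual annotation scheme or any trained model.

```python
import re

def parse_step(sentence: str) -> dict:
    """Very rough pattern-based parse of one protocol step into an action frame
    (verb, amount, reagent, device). Illustrative only; real systems learn this."""
    m = re.match(r"(\w+) (\d+ ?\w+) of (.+?) (?:into|to) the (\w+)", sentence)
    if not m:
        return {}
    action, amount, reagent, device = m.groups()
    return {"action": action.lower(), "amount": amount,
            "reagent": reagent, "device": device}

print(parse_step("Add 50 ml of buffer into the tube"))
```

An annotated corpus lets such frames be predicted by machine learning rather than brittle patterns, which is the utility the experiments above demonstrate.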
English recipe flow graph corpus
We present an annotated corpus of English cooking recipe procedures, and describe and evaluate computational methods for learning these annotations. The corpus consists of 300 recipes written by members of the public, which we have annotated with domain-specific linguistic and semantic structure. Each recipe is annotated with (1) 'recipe named entities' (r-NEs) specific to the recipe domain, and (2) a flow graph representing in detail the sequencing of steps, and interactions between cooking tools, food ingredients and the products of intermediate steps. For these two kinds of annotations, inter-annotator agreement ranges from 82.3 to 90.5 F1, indicating that our annotation scheme is appropriate and consistent. We experiment with producing these annotations automatically. For r-NE tagging we train a deep neural network NER tool; to compute flow graphs we train a dependency-style parsing procedure which we apply to the entire sequence of r-NEs in a recipe. In evaluations, our systems achieve 71.1 to 87.5 F1, demonstrating that our annotation scheme is learnable.
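The two annotation layers described above can be sketched with a toy recipe fragment. The BIO-style tag names and relation labels here are illustrative assumptions, not the corpus's actual r-NE inventory or arc label set.

```python
tokens   = ["Chop", "the", "onion", "and", "fry", "it", "in", "oil"]
rne_tags = ["B-Action", "O", "B-Food", "O", "B-Action", "O", "O", "B-Food"]

# The r-NEs (non-O tokens) are the nodes of the flow graph.
entities = [(i, tok) for i, (tok, tag) in enumerate(zip(tokens, rne_tags))
            if tag != "O"]

# Flow-graph arcs as (dependent index, head index, relation), dependency-style:
# the onion is the target of "Chop", and the chopped product flows into "fry".
arcs = [
    (2, 0, "targ"),   # onion -> Chop
    (0, 4, "flow"),   # Chop's product -> fry
    (7, 4, "other"),  # oil -> fry
]

for dep, head, rel in arcs:
    print(f"{tokens[dep]} --{rel}--> {tokens[head]}")
```

Treating arcs as head/dependent pairs over the r-NE sequence is what lets a dependency-style parser, as described above, predict the flow graph.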
The Multimodal and Modular AI Chef: Complex Recipe Generation from Imagery
The AI community has embraced multi-sensory, or multi-modal, approaches to
push this generation of AI models toward more human-like
understanding. Combining language and imagery is a familiar method for
specific tasks like image captioning or generation from descriptions. This
paper compares these monolithic approaches to a lightweight and specialized
method based on employing image models to label objects, then serially
submitting this resulting object list to a large language model (LLM). This use
of multiple Application Programming Interfaces (APIs) enables better than 95%
mean average precision for correct object lists, which serve as input to the
latest OpenAI text generator (GPT-4). To demonstrate the API as a modular
alternative, we solve the problem of a user taking a picture of ingredients
available in a refrigerator, then generating novel recipe cards tailored to
complex constraints on cost, preparation time, dietary restrictions, portion
sizes, and multiple meal plans. The research concludes that monolithic
multimodal models currently lack the coherent memory to maintain context and
format for this task and that until recently, the language models like GPT-2/3
struggled to format similar problems without degenerating into repetitive or
nonsensical combinations of ingredients. For the first time, an AI chef or
cook seems not only possible but offers some enhanced capabilities to augment
human recipe libraries in pragmatic ways. The work generates a 100-page recipe
book featuring the thirty top ingredients using over 2000 refrigerator images
as initializing lists.
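The modular pipeline described above (image model produces an object list, which is serialized into an LLM prompt) can be sketched as follows. The detection function is a stub returning canned output and the prompt format is a hypothetical example; the paper's actual vision APIs and GPT-4 calls are not reproduced here.

```python
def detect_ingredients(image_path: str) -> list:
    """Stub for the image-labeling stage (an object-detection API in the paper).
    Returns a canned example object list for illustration."""
    return ["eggs", "spinach", "feta", "tomatoes"]

def build_prompt(ingredients, budget_usd=10, max_minutes=30, diet="vegetarian"):
    """Serialize the detected object list plus user constraints into an LLM prompt."""
    return (
        "Create a recipe card using only: " + ", ".join(sorted(ingredients)) +
        f". Constraints: under ${budget_usd}, ready in {max_minutes} minutes, "
        f"{diet}."
    )

prompt = build_prompt(detect_ingredients("fridge.jpg"))
print(prompt)
```

Keeping the two stages behind separate APIs is the modularity the paper argues for: the object list is an explicit, checkable intermediate (where the reported 95% mean average precision is measured) before any text generation happens.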