8 research outputs found

    Grounding Verbs of Motion in Natural Language Commands to Robots

    To be useful teammates to human partners, robots must be able to follow spoken instructions given in natural language. An important class of instructions involves interacting with people, such as “Follow the person to the kitchen” or “Meet the person at the elevators.” These instructions require that the robot fluidly react to changes in the environment, not simply follow a pre-computed plan. We present an algorithm for understanding natural language commands with three components. First, we create a cost function that scores the language according to how well it matches a candidate plan in the environment, defined as the log-likelihood of the plan given the command. Components of the cost function include novel models for the meanings of motion verbs such as “follow,” “meet,” and “avoid,” as well as spatial relations such as “to” and landmark phrases such as “the kitchen.” Second, an inference method uses this cost function to perform forward search, finding a plan that matches the natural language command. Third, a high-level controller repeatedly calls the inference method at each timestep to compute a new plan in response to changes in the environment, such as the movement of the human partner or other people in the scene. When a command consists of more than a single task, the controller switches to the next task when an earlier one is satisfied. We evaluate our approach on a set of example tasks that require the ability to follow both simple and complex natural language commands.
    Keywords: Cost Function; Spatial Relation; State Sequence; Edit Distance; Statistical Machine Translation
    United States. Office of Naval Research (Grant MURI N00014-07-1-0749)
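The three components described in this abstract can be illustrated with a minimal sketch. All names below (`plan_cost`, `infer_plan`, `control_loop`) and the toy scoring model are hypothetical stand-ins: the paper's verb, spatial-relation, and landmark models are learned components, not hand-written rules.

```python
import math

def plan_cost(plan, command, models):
    # Cost is the negative log-likelihood of the plan given the command,
    # factored into per-component scores (verb, spatial relation, landmark).
    return -sum(math.log(m(plan, command)) for m in models)

def infer_plan(candidates, command, models):
    # Forward-search stand-in: pick the lowest-cost candidate plan.
    return min(candidates, key=lambda p: plan_cost(p, command, models))

def control_loop(tasks, observe, candidates_for, models, satisfied, max_steps=100):
    # High-level controller: replan at every timestep so the robot reacts
    # to a moving partner; advance to the next task once one is satisfied.
    executed = []
    for task in tasks:
        for _ in range(max_steps):
            state = observe()
            executed.append(infer_plan(candidates_for(state), task, models))
            if satisfied(task, state):
                break
    return executed
```

The key design point the abstract emphasizes is that `infer_plan` is called inside the loop at every timestep, so the executed plan tracks a changing world rather than a single pre-computed trajectory.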

    Learning perceptually grounded word meanings from unaligned parallel data

    In order for robots to effectively understand natural language commands, they must be able to acquire meaning representations that can be mapped to perceptual features in the external world. Previous approaches to learning these grounded meaning representations require detailed annotations at training time. In this paper, we present an approach to grounded language acquisition which is capable of jointly learning a policy for following natural language commands such as “Pick up the tire pallet,” as well as a mapping between specific phrases in the language and aspects of the external world; for example, the mapping between the words “the tire pallet” and a specific object in the environment. Our approach assumes a parametric form for the policy that the robot uses to choose actions in response to a natural language command that factors based on the structure of the language. We use a gradient method to optimize model parameters. Our evaluation demonstrates the effectiveness of the model on a corpus of commands given to a robotic forklift by untrained users.
    U.S. Army Research Laboratory (Collaborative Technology Alliance Program, Cooperative Agreement W911NF-10-2-0016)
    United States. Office of Naval Research (MURIs N00014-07-1-0749)
    United States. Army Research Office (MURI N00014-11-1-0688)
    United States. Defense Advanced Research Projects Agency (DARPA BOLT program under contract HR0011-11-2-0008)
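The gradient method mentioned above can be sketched with a toy log-linear policy. The feature map here (word/action indicator pairs) and all function names are illustrative assumptions; the paper's policy factors over the linguistic structure of the command, which this sketch does not model.

```python
import math

def features(command, action):
    # Hypothetical feature map: an indicator for each (word, action) pair,
    # a crude stand-in for phrase-to-grounding associations.
    return {(w, action): 1.0 for w in command.split()}

def score(weights, command, action):
    return sum(weights.get(k, 0.0) * v for k, v in features(command, action).items())

def policy(weights, command, actions):
    # Softmax over candidate actions: p(a | command) ∝ exp(w · f(command, a)).
    exps = {a: math.exp(score(weights, command, a)) for a in actions}
    z = sum(exps.values())
    return {a: e / z for a, e in exps.items()}

def update(weights, command, actions, demo, lr=0.5):
    # One gradient-ascent step on the log-likelihood of the demonstrated
    # action: gradient = f(demo) - E_policy[f].
    probs = policy(weights, command, actions)
    for a in actions:
        grad = (1.0 if a == demo else 0.0) - probs[a]
        for k, v in features(command, a).items():
            weights[k] = weights.get(k, 0.0) + lr * grad * v
    return weights
```

Repeated updates on demonstrated command/action pairs shift probability mass toward the demonstrated action, which is the sense in which the policy and the phrase-to-world mapping are learned jointly from unaligned data.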

    The use of coreference resolution for understanding manipulation commands for the PR2 Robot

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 81-84).
    Natural language interaction can enable us to interface with robots such as the Personal Robot 2 (PR2) without the need for special training or equipment. Programming such a robot to follow commands is challenging because natural language has a complex structure and semantics, a model of which must either be based on linguistic knowledge or learned from examples. In this thesis we first enable the PR2 robot to follow manipulation commands expressed in natural language by applying the Generalized Grounding Graph (G3) framework. We model the PR2's actions and their trajectories in the physical environment, define the state-action space, and learn a grounding model from an annotated corpus of robot actions aligned with commands. We achieved lower overall performance than previous implementations of G3 had reported. We then present an approach that uses the linguistic technique of coreference resolution to improve the robot's ability to understand commands consisting of multiple clauses. We constrain the groundings of coreferent phrases to be identical by merging their nodes in the grounding graph. We show that using coreference information increases the robot's ability to infer the right action sequence. This brings the robotic capabilities of modeling and understanding natural language closer to our theoretical understanding of discourse.
    by Dimitar N. Simeonov. M.Eng.
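The node-merging constraint described in this abstract can be sketched with a simple union-find over grounding variables. This is a hypothetical representation: the thesis merges nodes in a G3 grounding graph, whereas this sketch only captures the resulting "coreferent phrases share one grounding" constraint.

```python
class GroundingVars:
    """Union-find over phrase grounding variables (illustrative sketch)."""

    def __init__(self):
        self.parent = {}

    def add(self, phrase):
        self.parent.setdefault(phrase, phrase)

    def find(self, phrase):
        # Follow parent pointers with path halving to the representative.
        while self.parent[phrase] != phrase:
            self.parent[phrase] = self.parent[self.parent[phrase]]
            phrase = self.parent[phrase]
        return phrase

    def merge(self, a, b):
        # Coreferent phrases are forced to ground to the same object.
        self.add(a)
        self.add(b)
        self.parent[self.find(b)] = self.find(a)
```

For a command like "Pick up the tire pallet and put it on the truck," merging the variables for "the tire pallet" and "it" ensures inference assigns both phrases a single object, shrinking the search space across clauses.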

    Natural language and spatial reasoning

    Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 109-112).
    Making systems that understand language has long been a dream of artificial intelligence. This thesis develops a model for understanding language about space and movement in realistic situations. The system understands language from two real-world domains: finding video clips that match a spatial language description such as "People walking through the kitchen and then going to the dining room," and following natural language commands such as "Go down the hall towards the fireplace in the living room." Understanding spatial language expressions is a challenging problem because linguistic expressions, themselves complex and ambiguous, must be connected to real-world objects and events. The system bridges the gap between language and the world by modeling the meaning of spatial language expressions hierarchically, first capturing the semantics of spatial prepositions and then composing these meanings into higher-level structures. Corpus-based evaluations in different, realistic domains show that the system understands spatial language expressions effectively and robustly.
    by Stefanie Anne Tellex. Ph.D.
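The hierarchical composition described here can be illustrated with toy preposition scorers combined over a path. The geometry below (set membership, endpoint equality) is a deliberately crude stand-in for the thesis's learned preposition models; only the compositional structure is the point.

```python
def score_through(path, region):
    # Toy "through": some point on the path lies inside the region.
    return 1.0 if any(p in region for p in path) else 0.0

def score_to(path, landmark):
    # Toy "to": the path ends at the landmark.
    return 1.0 if path and path[-1] == landmark else 0.0

def score_description(path, clauses):
    # Compose clause scores multiplicatively: every spatial clause in the
    # description must hold for the path to match.
    total = 1.0
    for scorer, arg in clauses:
        total *= scorer(path, arg)
    return total
```

A description like "through the kitchen and then to the dining room" becomes a list of (preposition, landmark) clauses, and candidate video clips or routes are ranked by the composed score.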

    Object schemas for grounding language in a responsive robot

    An approach is introduced for physically grounded natural language interpretation by robots that reacts appropriately to unanticipated physical changes in the environment and dynamically assimilates new information pertinent to ongoing tasks. At the core of the approach is a model of object schemas that enables a robot to encode beliefs about physical objects in its environment using collections of coupled processes responsible for sensorimotor interaction. These interaction processes run concurrently to ensure responsiveness to the environment while coordinating sensorimotor expectations, action planning, and language use. The model has been implemented on a robot that manipulates objects on a tabletop in response to verbal input. The implementation responds to verbal requests such as ‘Group the green block and the red apple’, while adapting in real time to unexpected physical collisions and taking opportunistic advantage of any new information it may receive through perceptual and linguistic channels.
    National Science Foundation (U.S.) (NSF Graduate Research Fellowship)
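A minimal sketch of the object-schema idea, under loose assumptions: a per-object belief record that separate perceptual and linguistic channels update independently, so a reference like "the red apple" can be resolved against whatever the robot currently believes. The class and method names are hypothetical, and the concurrent coupled processes of the actual model are reduced here to two update entry points.

```python
class ObjectSchema:
    """Illustrative belief record for one physical object."""

    def __init__(self, name):
        self.name = name
        self.position = None   # updated by the perceptual channel
        self.labels = set()    # updated by the linguistic channel

    def on_percept(self, position):
        # Perceptual channel: track where the object currently is, so the
        # schema stays valid when the object moves or collides.
        self.position = position

    def on_language(self, label):
        # Linguistic channel: assimilate new descriptions as they arrive.
        self.labels.add(label)

    def matches(self, phrase):
        # Resolve a referring expression against accumulated labels.
        return any(word in self.labels for word in phrase.split())
```

Because both channels write into the same schema, a request mentioning "the red apple" can be grounded to an object whose position was just refreshed by perception, which is the responsiveness the abstract describes.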