5 research outputs found

    Narrated guided tour following and interpretation by an autonomous wheelchair

    Get PDF
    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (p. 79-81).

    This work addresses the fundamental problem of how a robot acquires local knowledge about its environment. The domain we are concerned with is a speech-commandable robotic wheelchair operating in a home/special-care environment, capable of navigating autonomously to a verbally specified location in the environment. We address this problem by incorporating a narrated guided tour following capability into the autonomous wheelchair. In our method, a human gives a narrated guided tour through the environment while the wheelchair follows. The guide carries out a continuous dialogue with the wheelchair, describing the names of the salient locations in and around his/her immediate vicinity. The wheelchair constructs a metrical map of the environment and, based on the spatial structure and the locations of the described places, segments the map into a topological representation with corresponding tagged locations. This representation of the environment allows the wheelchair to interpret and implement high-level navigation commands issued by the user. To achieve this capability, our system consists of an autonomous wheelchair; a person-following module allowing the wheelchair to track and follow the tour guide as s/he conducts the tour; a simultaneous localization and mapping module to construct the metric gridmap; a spoken dialogue manager to acquire semantic information about the environment; a map segmentation module to bind the metrical and topological representations and to relate tagged locations to relevant nodes; and a navigation module to utilize these representations to provide speech-commandable autonomous navigation.

    by Sachithra Madhawa Hemachandra. S.M.
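
    As a rough illustration of the tour-interpretation step described above, the sketch below shows one way spoken place names could be attached to a segmented map. It is not the thesis's implementation: the fixed-length path segmentation, the TaggedLocation/TopoNode types, and all coordinates are invented stand-ins for the structure-based segmentation and metric gridmap the abstract describes.

```python
# Minimal illustrative sketch, not the thesis system: attaching spoken labels
# from a narrated tour to a toy topological segmentation of the tour path.
from dataclasses import dataclass, field
from math import dist

@dataclass
class TaggedLocation:
    name: str            # label spoken by the guide, e.g. "kitchen"
    pose: tuple          # (x, y) in the metric grid-map frame

@dataclass
class TopoNode:
    node_id: int
    poses: list                                   # tour poses grouped into this region
    labels: list = field(default_factory=list)    # spoken names attached to the region

def segment_tour(tour_poses, segment_length=5.0):
    """Toy segmentation: cut the tour path into regions of roughly fixed length
    (a stand-in for segmentation based on the map's spatial structure)."""
    nodes, current, travelled = [], [tour_poses[0]], 0.0
    for prev, cur in zip(tour_poses, tour_poses[1:]):
        travelled += dist(prev, cur)
        current.append(cur)
        if travelled >= segment_length:
            nodes.append(TopoNode(len(nodes), current))
            current, travelled = [cur], 0.0
    if len(current) > 1:
        nodes.append(TopoNode(len(nodes), current))
    return nodes

def tag_nodes(nodes, tagged_locations):
    """Attach each spoken label to the topological node containing the closest pose."""
    for loc in tagged_locations:
        nearest = min(nodes, key=lambda n: min(dist(loc.pose, p) for p in n.poses))
        nearest.labels.append(loc.name)
    return nodes

if __name__ == "__main__":
    tour = [(float(i), 0.0) for i in range(21)]                    # straight 20 m tour
    spoken = [TaggedLocation("kitchen", (2.0, 0.0)),
              TaggedLocation("bedroom", (16.0, 0.0))]
    for node in tag_nodes(segment_tour(tour), spoken):
        print(node.node_id, node.labels)
```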

    Learning Models for Following Natural Language Directions in Unknown Environments

    Get PDF
    Natural language offers an intuitive and flexible means for humans to communicate with the robots that we will increasingly work alongside in our homes and workplaces. Recent advancements have given rise to robots that are able to interpret natural language manipulation and navigation commands, but these methods require a prior map of the robot's environment. In this paper, we propose a novel learning framework that enables robots to successfully follow natural language route directions without any previous knowledge of the environment. The algorithm utilizes spatial and semantic information that the human conveys through the command to learn a distribution over the metric and semantic properties of spatially extended environments. Our method uses this distribution in place of the latent world model and interprets the natural language instruction as a distribution over the intended behavior. A novel belief space planner reasons directly over the map and behavior distributions to solve for a policy using imitation learning. We evaluate our framework on a voice-commandable wheelchair. The results demonstrate that by learning and performing inference over a latent environment model, the algorithm is able to successfully follow natural language route directions within novel, extended environments.

    Comment: ICRA 2015
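
    The following sketch illustrates, in simplified form, the idea of planning over a distribution of world models rather than a single known map. It is not the paper's algorithm: the sampled map hypotheses, the candidate behaviors, and the reward model are invented for the example, and the paper's imitation-learned belief space planner is replaced by a plain expected-reward maximization.

```python
# Minimal illustrative sketch, not the paper's learned models: acting under a
# distribution over possible maps instead of a single known map.
import random

def sample_world_models(command, n=100, seed=0):
    """Stand-in for the learned distribution over the metric/semantic world
    implied by the command (e.g. how far away a mentioned kitchen might be)."""
    rng = random.Random(seed)
    return [{"kitchen_distance": rng.uniform(4.0, 20.0)} for _ in range(n)]

def behavior_reward(action, world):
    """Stand-in for the distribution over intended behaviors: following the
    hallway is rewarded more when the hypothesized kitchen is farther ahead."""
    if action == "follow_hallway":
        return world["kitchen_distance"]
    if action == "enter_nearest_door":
        return 5.0
    return 0.0   # e.g. "stop"

def plan(command, actions):
    worlds = sample_world_models(command)
    # Expected reward under the belief over maps, as a belief-space planner uses.
    expected = {a: sum(behavior_reward(a, w) for w in worlds) / len(worlds)
                for a in actions}
    return max(expected, key=expected.get)

print(plan("go down the hallway to the kitchen",
           ["follow_hallway", "enter_nearest_door", "stop"]))
```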

    Learning semantic maps from natural language

    No full text
    Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from PDF student-submitted version of thesis. Includes bibliographical references (pages 185-193).

    As robots move into human-occupied environments, the need for effective mechanisms to enable interactions with humans becomes vital. Natural language is a flexible, intuitive medium that can enable such interactions, but language understanding requires robots to learn representations of their environments that are compatible with the conceptual models used by people. Current approaches to constructing such spatial-semantic representations rely solely on traditional sensors to acquire knowledge of the environment, which restricts robots to learning limited knowledge of their local surroundings. Furthermore, they can only reason over the limited portion of the environment that is in the robot's field of view. Natural language, on the other hand, allows people to share rich properties of their environment with their robotic partners in a flexible, efficient manner. The ability to integrate such descriptions can allow the robot to learn semantic properties, such as colloquial names, that are difficult to infer using existing methods, and to learn about the world outside its perception range. The spatial and temporal disconnect between language descriptions and the robot's onboard sensors makes fusing the two sources of information challenging.

    This thesis addresses the problem of fusing information contained in natural language descriptions with the robot's onboard sensors to construct spatial-semantic representations useful for interacting with humans. The novelty lies in treating natural language descriptions as another sensor observation that informs the robot about its environment. Towards this end, we introduce the semantic graph, a spatial-semantic representation that provides a common framework in which we integrate information that the user communicates (e.g., labels and spatial relations) with observations from the robot's sensors. Our algorithm efficiently maintains a factored distribution over semantic graphs based upon the stream of natural language and low-level sensor information. We detail the means by which the framework incorporates knowledge conveyed by the user's descriptions, including the ability to reason over expressions that reference yet-unknown regions in the environment. We evaluate the algorithm's ability to learn human-centric maps of several different environments and analyze the knowledge inferred from language and the utility of the learned maps. The results demonstrate that the incorporation of information from free-form descriptions increases the metric, topological, and semantic accuracy of the recovered environment model.

    Next, we outline an algorithm that enables robots to improve their spatial-semantic representation of an environment by engaging users in dialog. The algorithm reasons over the ambiguity of language descriptions provided by the user given the current map, and selects information-gathering actions in the form of targeted questions about its local surroundings and areas distant from the robot. Our algorithm balances the information-theoretic value of candidate questions with a measure of cost associated with dialog. We demonstrate that by asking deliberate questions of the user, the method significantly improves the accuracy of the learned semantic map.

    Finally, we introduce a learning framework that enables robots to successfully follow natural language navigation instructions within previously unknown environments. The algorithm utilizes information about the environment that the human conveys within the command to learn a distribution over the spatial-semantic model of the environment. We achieve this through a formulation of our semantic mapping algorithm that uses information conveyed in the command to directly reason over unobserved spatial structure. The framework then uses this distribution in place of the latent world model to interpret the natural language instruction as a distribution over the intended actions. Next, a belief space planner solves for the action that best satisfies the intent of the command. We apply this towards following directions to objects and natural language route directions in unknown environments. We evaluate this approach through simulation and physical experiments, and demonstrate its ability to follow navigation commands with performance comparable to that of a fully known environment.

    by Sachithra Madhawa Hemachandra. Ph. D.
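
    The dialog component described above balances the information a question would provide against the cost of asking it. The toy sketch below illustrates that trade-off only; the label beliefs, candidate questions, cost values, and the assumption that an answer fully resolves a region's label are all invented, not the thesis's model.

```python
# Illustrative sketch (toy numbers, not the thesis algorithm): trading off the
# information gained by a candidate question against the cost of asking it.
import math

def entropy(dist):
    """Shannon entropy (nats) of a discrete label distribution."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

# Belief over the label of each map region; the distributions are made up.
region_label_beliefs = {
    "region_1": {"kitchen": 0.5, "dining room": 0.5},   # very uncertain
    "region_2": {"hallway": 0.9, "lobby": 0.1},         # fairly certain
}

# Candidate questions: each targets one region; cost is a stand-in for the
# burden of the dialog (e.g. questions about distant areas cost more).
questions = [
    {"text": "What is the room to my left called?", "region": "region_1", "cost": 0.2},
    {"text": "Is the area down the corridor the lobby?", "region": "region_2", "cost": 0.6},
]

def score(question, beliefs, cost_weight=1.0):
    # Toy assumption: an answer fully resolves the targeted region's label,
    # so the information value equals that region's current entropy.
    value = entropy(beliefs[question["region"]])
    return value - cost_weight * question["cost"]

best = max(questions, key=lambda q: score(q, region_label_beliefs))
print(best["text"])
```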

    Inferring Maps and Behaviors from Natural Language Instructions

    No full text
    Natural language provides a flexible, intuitive way for people to command robots, which is becoming increasingly important as robots transition to working alongside people in our homes and workplaces. To follow instructions in unknown environments, robots will be expected to reason about parts of the environments that were described in the instruction, but that the robot has no direct knowledge about. However, most existing approaches to natural language understanding require that the robot's environment be known a priori. This paper proposes a probabilistic framework that enables robots to follow commands given in natural language, without any prior knowledge of the environment. The novelty lies in exploiting environment information implicit in the instruction, thereby treating language as a type of sensor that is used to formulate a prior distribution over the unknown parts of the environment. The algorithm then uses this learned distribution to infer a sequence of actions that are most consistent with the command, updating our belief as we gather new observations.

    Keywords: Natural Language; Mobile Robot; Parse Tree; World Model; Behavior Inference
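
    A minimal sketch of the "language as a sensor" idea above: the instruction re-weights a prior over hypotheses about unobserved space, like a measurement update. The hypotheses, likelihood values, and the language_update function are illustrative assumptions, not the paper's model.

```python
# Illustrative only: an instruction treated as an observation that updates a
# prior over what lies beyond the robot's current field of view.

# Prior over hypotheses about the unobserved space ahead (invented).
hypotheses = {
    "lab ahead, conference room before it": 0.25,
    "lab ahead, no conference room": 0.25,
    "dead end ahead": 0.25,
    "open hallway, no lab": 0.25,
}

# Likelihood of the utterance "go past the conference room to the lab"
# under each hypothesis (invented numbers).
likelihood = {
    "lab ahead, conference room before it": 0.8,
    "lab ahead, no conference room": 0.3,
    "dead end ahead": 0.05,
    "open hallway, no lab": 0.1,
}

def language_update(prior, lik):
    """Bayesian measurement update using the instruction as the observation."""
    unnorm = {h: prior[h] * lik[h] for h in prior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

posterior = language_update(hypotheses, likelihood)
for h, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"{p:.2f}  {h}")
```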