
    Learning Models for Following Natural Language Directions in Unknown Environments

    Natural language offers an intuitive and flexible means for humans to communicate with the robots that we will increasingly work alongside in our homes and workplaces. Recent advancements have given rise to robots that are able to interpret natural language manipulation and navigation commands, but these methods require a prior map of the robot's environment. In this paper, we propose a novel learning framework that enables robots to successfully follow natural language route directions without any previous knowledge of the environment. The algorithm utilizes spatial and semantic information that the human conveys through the command to learn a distribution over the metric and semantic properties of spatially extended environments. Our method uses this distribution in place of the latent world model and interprets the natural language instruction as a distribution over the intended behavior. A novel belief space planner reasons directly over the map and behavior distributions to solve for a policy using imitation learning. We evaluate our framework on a voice-commandable wheelchair. The results demonstrate that by learning and performing inference over a latent environment model, the algorithm is able to successfully follow natural language route directions within novel, extended environments.
    Comment: ICRA 201
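    As a rough illustration of planning over a distribution of hypothesized maps rather than a single known map, the Python sketch below scores candidate actions in expectation over weighted map samples. It is a minimal stand-in, not the paper's system: MapSample, behavior_likelihood, and the coordinate encoding are all invented for illustration, and the paper's planner is trained with imitation learning rather than this greedy one-step rule.

```python
from dataclasses import dataclass

@dataclass
class MapSample:
    """One hypothesized semantic map consistent with the command."""
    semantic_map: dict   # hypothetical encoding, e.g. {"kitchen": (0, 3)}
    weight: float        # belief assigned to this map hypothesis

def expected_score(action, samples, behavior_likelihood):
    """Average, over map hypotheses, how well an action matches the
    behavior distribution inferred from the instruction."""
    return sum(s.weight * behavior_likelihood(action, s.semantic_map)
               for s in samples)

def select_action(actions, samples, behavior_likelihood):
    # Greedy one-step stand-in for the paper's belief-space policy.
    return max(actions,
               key=lambda a: expected_score(a, samples, behavior_likelihood))

# Toy hypotheses for "go to the kitchen past the hallway":
samples = [MapSample({"kitchen": (0, 3)}, 0.7),
           MapSample({"kitchen": (3, 0)}, 0.3)]
likelihood = lambda a, m: 1.0 if a == m["kitchen"] else 0.0
print(select_action([(0, 3), (3, 0)], samples, likelihood))  # -> (0, 3)
```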

    Where Snow is a Landmark: Route Direction Elements in Alpine Contexts

    Research on route directions has so far focused mostly on urban space, highlighting human concepts of street networks based on a range of recurring elements such as route segments, decision points, landmarks and actions. We explored the way route directions reflect the features of space and activity in the context of mountaineering. Alpine route directions are only rarely segmented by decision points related to reorientation; instead, segmentation is based on changing topography. Segments are described with varying degrees of detail, depending on difficulty. For landmark descriptions, direction givers refer to properties such as the type of surface, dimension, and colour of landscape features; terrain properties (such as snow) can also serve as landmarks. Action descriptions reflect the geometrical conceptualization of landscape features and the dimensionality of space. Furthermore, they are very rich in the semantics of manner of motion.
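    The element types the study identifies (topography-delimited segments, landmarks described by surface, dimension, and colour, and manner-rich actions) could be encoded roughly as the hypothetical Python data structures below. All names and example values are invented for illustration and are not taken from the study's annotation scheme.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Landmark:
    name: str                        # e.g. "snowfield", "ridge"
    surface: Optional[str] = None    # e.g. "snow", "scree"
    dimension: Optional[str] = None  # e.g. "broad", "narrow"
    colour: Optional[str] = None     # e.g. "white", "grey"

@dataclass
class Action:
    manner: str                      # manner of motion, e.g. "traverse"
    target: Optional[Landmark] = None

@dataclass
class Segment:
    topography: str                  # what delimits the segment, e.g. "glacier"
    difficulty: str                  # drives description detail, e.g. "II"
    actions: List[Action] = field(default_factory=list)
```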

    A discriminative model for understanding natural language route directions

    To be useful teammates to human partners, robots must be able to follow spoken instructions given in natural language. However, determining the correct sequence of actions in response to a set of spoken instructions is a complex decision-making problem. There is a "semantic gap" between the high-level symbolic models of the world that people use and the low-level models of geometry, state dynamics, and perceptions that robots use. In this paper, we show how this gap can be bridged by inferring the best sequence of actions from a linguistic description and environmental features. This work improves upon previous work in three ways. First, by using a conditional random field (CRF), we learn the relative weight of environmental and linguistic features, enabling the system to learn the meanings of words and reducing the modeling effort in learning how to follow commands. Second, a number of long-range features are added, which help the system to use additional structure in the problem. Third, given a natural language command, we infer both the referred path and landmark directly, thereby requiring the algorithm to pick a landmark by which it should navigate. The CRF achieves 15% error on a held-out dataset, compared with 39% error for a Markov random field (MRF). Finally, by analyzing the additional annotations necessary for this work, we find that natural language route directions map sequentially onto the corresponding path and landmarks 99.6% of the time. In addition, the size of the referred landmark varies from 0 m² to 1964 m² and the length of the referred path varies from 0 m to 40.83 m.
    United States. Office of Naval Research (MURIs N00014-07-1-0749)
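    To make the sequence-labeling setup concrete, the sketch below shows exact MAP decoding for a toy linear-chain model whose per-position score combines linguistic features (words of the command) with environmental features (what the robot perceives), as the paper's CRF does. The feature templates, weights, and labels here are invented for illustration and are far simpler than the paper's long-range features.

```python
import numpy as np

LABELS = ["PATH", "LANDMARK"]  # toy label set, not the paper's

def features(word, env, label):
    # Binary indicator features over (observation, label) pairs.
    return {f"word={word}|{label}": 1.0, f"env={env}|{label}": 1.0}

def score(word, env, label, weights):
    return sum(weights.get(f, 0.0) for f in features(word, env, label))

def viterbi(words, envs, weights, trans):
    """Exact MAP decoding for a linear-chain model."""
    n, k = len(words), len(LABELS)
    delta = np.zeros((n, k))
    back = np.zeros((n, k), dtype=int)
    for j, lab in enumerate(LABELS):
        delta[0, j] = score(words[0], envs[0], lab, weights)
    for t in range(1, n):
        for j, lab in enumerate(LABELS):
            cand = delta[t - 1] + np.array(
                [trans[(LABELS[i], lab)] for i in range(k)])
            back[t, j] = int(np.argmax(cand))
            delta[t, j] = cand[back[t, j]] + score(words[t], envs[t], lab, weights)
    path = [int(np.argmax(delta[-1]))]
    for t in range(n - 1, 0, -1):
        path.append(back[t, path[-1]])
    return [LABELS[j] for j in reversed(path)]

# Toy weights and zero transition scores, invented for illustration:
weights = {"word=hall|LANDMARK": 2.0, "word=go|PATH": 1.5,
           "env=corridor|LANDMARK": 1.0}
trans = {(a, b): 0.0 for a in LABELS for b in LABELS}
print(viterbi(["go", "down", "hall"], ["open", "open", "corridor"],
              weights, trans))  # -> ['PATH', 'PATH', 'LANDMARK']
```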

    When gestures show us the way: Co-speech gestures selectively facilitate navigation and spatial memory.

    How does gesturing during route learning relate to subsequent spatial performance? We examined the relationship between gestures produced spontaneously while studying route directions and spatial representations of the navigated environment. Participants studied route directions, then navigated those routes from memory in a virtual environment, and finally had their memory of the environment assessed. We found that, for navigators with low spatial perspective-taking performance on the Spatial Orientation Test, more gesturing from a survey perspective predicted more accurate memory following navigation. Thus, co-thought gestures accompanying route learning relate to performance selectively, depending on the gesturers’ spatial ability and the perspective of their gestures. Survey gestures may help some individuals visualize an overall route that they can retain in memory.

    Which way to turn? Guide orientation in virtual wayfinding

    In this paper we describe an experiment aimed at determining the most effective and natural orientation of a virtual guide that gives route directions in a 3D virtual environment. We hypothesized that, due to the presence of mirrored gestures, having the route provider directly face the route seeker would result in a less effective and less natural route description than having the route provider adapt his orientation to that of the route seeker. To compare the effectiveness of the different orientations, after having received a route description the participants in our experiment had to ‘virtually’ traverse the route using prerecorded route segments. The results showed no difference in effectiveness between the two orientations, but suggested that the orientation in which the speaker directly faces the route seeker is more natural.

    Verbal planning in route directions


    GeoCAM: A geovisual analytics workspace to contextualize and interpret statements about movement

    This article focuses on integrating computational and visual methods in a system that supports analysts in identifying, extracting, mapping, and relating linguistic accounts of movement. We address two objectives: (1) build the conceptual, theoretical, and empirical framework needed to represent and interpret human-generated directions; and (2) design and implement a geovisual analytics workspace for direction document analysis. We have built a set of geo-enabled computational methods to identify documents containing movement statements, together with a visual analytics environment that uses natural language processing methods iteratively with geographic database support to extract, interpret, and map geographic movement references in context. Additionally, analysts can provide feedback to improve computational results. To demonstrate the value of this integrative approach, we have realized a proof-of-concept implementation focusing on identifying and processing documents that contain human-generated route directions. Using our visual analytic interface, an analyst can explore the results, provide feedback to improve those results, pose queries against a database of route directions, and interactively represent the route on a map.
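    A crude, hypothetical stand-in for the first stage of such a pipeline (triaging documents that contain movement statements and surfacing the movement references for an analyst) might look like the regex-based Python sketch below. The cue list is invented for illustration; GeoCAM itself uses learned natural language processing methods with geographic database support rather than fixed patterns.

```python
import re

# Hand-picked cue patterns (invented, not GeoCAM's classifiers).
MOVEMENT_CUES = re.compile(
    r"\b(turn (left|right)|head (north|south|east|west)|merge onto|"
    r"continue (on|along)|exit at|miles?|take the)\b", re.I)

def contains_movement_statements(text: str) -> bool:
    # Require at least two cues to call a document "movement-bearing".
    return len(MOVEMENT_CUES.findall(text)) >= 2

def extract_movement_references(text: str):
    """Return (cue, sentence) pairs an analyst could inspect and map."""
    hits = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        m = MOVEMENT_CUES.search(sentence)
        if m:
            hits.append((m.group(0), sentence.strip()))
    return hits

doc = ("Head north on Atherton Street. Turn left at the library. "
       "Continue on Route 26 for three miles.")
print(contains_movement_statements(doc))  # True
for cue, sent in extract_movement_references(doc):
    print(cue, "->", sent)
```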

    Automatic Extraction of Destinations, Origins and Route Parts from Human Generated Route Directions

    Researchers from the cognitive and spatial sciences are studying text descriptions of movement patterns in order to examine how humans communicate and understand spatial information. In particular, route directions offer a rich source of information on how cognitive systems conceptualize movement patterns by segmenting them into meaningful parts. Route directions are composed using a plethora of cognitive spatial organization principles: changing levels of granularity, hierarchical organization, incorporation of cognitively and perceptually salient elements, and so forth. Identifying such information in text documents automatically is crucial for enabling machine understanding of human spatial language. The benefits are: a) creating opportunities for large-scale studies of human linguistic behavior; b) extracting and georeferencing salient entities (landmarks) that are used by human route direction providers; c) developing methods to translate route directions into sketches and maps; and d) enabling queries on large corpora of crawled/analyzed movement data. In this paper, we introduce our approach and implementations that bring us closer to the goal of automatically processing linguistic route directions. We report on research directed at one part of the larger problem, that is, extracting the three most critical parts of route directions and movement patterns in general: origin, destination, and route parts. We use machine-learning-based algorithms to extract these parts of routes, including, for example, destination names and types. We demonstrate the effectiveness of our approach in several experiments using hand-tagged corpora.
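    As a minimal sketch of the sentence-level extraction task (not the authors' implementation), the Python below trains a bag-of-words classifier to tag sentences as ORIGIN, DESTINATION, or ROUTE_PART. The tiny corpus and its labels are invented for illustration; the paper's extractors use much larger hand-tagged corpora and richer features.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented toy corpus standing in for a hand-tagged collection.
sentences = [
    "Start from the main parking lot on Curtin Road.",
    "Leave the hotel through the rear entrance.",
    "Turn right onto College Avenue and go two blocks.",
    "Follow the trail along the river for about a mile.",
    "You will arrive at the Old Main building.",
    "The restaurant is your final destination on the left.",
]
labels = ["ORIGIN", "ORIGIN", "ROUTE_PART", "ROUTE_PART",
          "DESTINATION", "DESTINATION"]

# Bag-of-words pipeline as a stand-in for the paper's ML extractors.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(sentences, labels)

print(model.predict(["Start at the visitor center."]))
```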