Learning Models for Following Natural Language Directions in Unknown Environments
Natural language offers an intuitive and flexible means for humans to communicate with the robots that we will increasingly work alongside in our homes and workplaces. Recent advancements have given rise to robots that are able to interpret natural language manipulation and navigation commands, but these methods require a prior map of the robot's environment. In this paper, we propose a novel learning framework that enables robots to successfully follow natural language route directions without any previous knowledge of the environment. The algorithm utilizes spatial and semantic information that the human conveys through the command to learn a distribution over the metric and semantic properties of spatially extended environments. Our method uses this distribution in place of the latent world model and interprets the natural language instruction as a distribution over the intended behavior. A novel belief space planner reasons directly over the map and behavior distributions to solve for a policy using imitation learning. We evaluate our framework on a voice-commandable wheelchair. The results demonstrate that by learning and performing inference over a latent environment model, the algorithm is able to successfully follow natural language route directions within novel, extended environments.
Comment: ICRA 201
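The core idea above, maintaining a distribution over possible world models rather than a single known map and reweighting it with semantic cues from the command, can be sketched roughly as follows. The hypotheses, room labels, and likelihood values are invented for illustration and are not the paper's actual model.

```python
# Minimal sketch: a weighted set of hypothesized world models stands in
# for a known map; semantic cues in the command reweight the hypotheses.
# All hypotheses and likelihood values here are illustrative assumptions.

hypotheses = [
    {"end_of_hall": "kitchen", "weight": 1.0},
    {"end_of_hall": "office", "weight": 1.0},
    {"end_of_hall": "kitchen", "weight": 1.0},
]

def reweight(hyps, mentioned_room, p_consistent=0.9, p_inconsistent=0.1):
    """Bayesian-style update: hypotheses consistent with the command's
    semantic cue gain weight; the weights are then renormalized."""
    for h in hyps:
        like = p_consistent if h["end_of_hall"] == mentioned_room else p_inconsistent
        h["weight"] *= like
    total = sum(h["weight"] for h in hyps)
    for h in hyps:
        h["weight"] /= total
    return hyps

# "Go to the kitchen at the end of the hall" implies a kitchen there.
reweight(hypotheses, "kitchen")
print([round(h["weight"], 3) for h in hypotheses])  # [0.474, 0.053, 0.474]
```

A planner could then reason in expectation over these weighted hypotheses instead of committing to one map.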
Where Snow is a Landmark: Route Direction Elements in Alpine Contexts
Route directions research has mostly focused on urban space so far, highlighting human concepts of street networks based on a range of recurring elements such as route segments, decision points, landmarks and actions. We explored the way route directions reflect the features of space and activity in the context of mountaineering. Alpine route directions are only rarely segmented through decision points related to reorientation; instead, segmentation is based on changing topography. Segments are described with various degrees of detail, depending on difficulty. For landmark description, direction givers refer to properties such as type of surface, dimension, colour of landscape features; terrain properties (such as snow) can also serve as landmarks. Action descriptions reflect the geometrical conceptualization of landscape features and dimensionality of space. Further, they are very rich in the semantics of manner of motion.
A discriminative model for understanding natural language route directions
To be useful teammates to human partners, robots must be able to follow spoken instructions given in natural language. However, determining the correct sequence of actions in response to a set of spoken instructions is a complex decision-making problem. There is a "semantic gap" between the high-level symbolic models of the world that people use, and the low-level models of geometry, state dynamics, and perceptions that robots use. In this paper, we show how this gap can be bridged by inferring the best sequence of actions from a linguistic description and environmental features. This work improves upon previous work in three ways. First, by using a conditional random field (CRF), we learn the relative weight of environmental and linguistic features, enabling the system to learn the meanings of words and reducing the modeling effort in learning how to follow commands. Second, a number of long-range features are added, which help the system to use additional structure in the problem. Finally, given a natural language command, we infer both the referred path and landmark directly, thereby requiring the algorithm to pick a landmark by which it should navigate. The CRF is demonstrated to have 15% error on a held-out dataset, compared with 39% error for a Markov random field (MRF). Finally, by analyzing the additional annotations necessary for this work, we find that natural language route directions map sequentially onto the corresponding path and landmarks 99.6% of the time. In addition, the size of the referred landmark varies from 0 m² to 1964 m² and the length of the referred path varies from 0 m to 40.83 m.
United States. Office of Naval Research (MURIs N00014-07-1-0749)
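Decoding a command with a linear-chain CRF of the kind described above can be sketched as follows. The feature names, weights, transition scores, and toy command are all illustrative assumptions, not the trained model or data from the paper.

```python
# Sketch of linear-chain CRF-style decoding over mixed linguistic and
# environmental features. Weights and features are invented for
# illustration; a real CRF would learn them from annotated commands.

ACTIONS = ["go_straight", "turn_left", "turn_right"]

# Hypothetical learned weights for (feature, action) pairs.
WEIGHTS = {
    ("word=straight", "go_straight"): 2.0,
    ("word=left", "turn_left"): 2.0,
    ("word=right", "turn_right"): 2.0,
    ("env=corridor", "go_straight"): 1.0,
    ("env=junction", "turn_left"): 0.5,
    ("env=junction", "turn_right"): 0.5,
}

# Hypothetical transition scores (mild preference to keep going straight).
TRANS = {("go_straight", "go_straight"): 0.5}

def emit(features, action):
    return sum(WEIGHTS.get((f, action), 0.0) for f in features)

def viterbi(observations):
    """Exact decoding of the highest-scoring action sequence."""
    delta = {a: emit(observations[0], a) for a in ACTIONS}
    back = []
    for obs in observations[1:]:
        new_delta, ptr = {}, {}
        for a in ACTIONS:
            prev = max(ACTIONS, key=lambda p: delta[p] + TRANS.get((p, a), 0.0))
            new_delta[a] = delta[prev] + TRANS.get((prev, a), 0.0) + emit(obs, a)
            ptr[a] = prev
        back.append(ptr)
        delta = new_delta
    last = max(ACTIONS, key=delta.get)
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

# One observation per instruction clause, mixing language and environment.
command = [
    {"word=straight", "env=corridor"},
    {"word=left", "env=junction"},
]
print(viterbi(command))  # ['go_straight', 'turn_left']
```

The transition scores are what distinguish this from scoring each step independently: they let earlier action choices influence later ones, analogous to the long-range structure the abstract mentions.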
When gestures show us the way: Co-speech gestures selectively facilitate navigation and spatial memory.
How does gesturing during route learning relate to subsequent spatial performance? We examined the relationship between gestures produced spontaneously while studying route directions and spatial representations of the navigated environment. Participants studied route directions, then navigated those routes from memory in a virtual environment, and finally had their memory of the environment assessed. We found that, for navigators with low spatial perspective-taking performance on the Spatial Orientation Test, more gesturing from a survey perspective predicted more accurate memory following navigation. Thus, co-thought gestures accompanying route learning relate to performance selectively, depending on the gesturers’ spatial ability and the perspective of their gestures. Survey gestures may help some individuals visualize an overall route that they can retain in memory.
Which way to turn? Guide orientation in virtual way finding
In this paper we describe an experiment aimed at determining the most effective and natural orientation of a virtual guide that gives route directions in a 3D virtual environment. We hypothesized that, due to the presence of mirrored gestures, having the route provider directly face the route seeker would result in a less effective and less natural route description than having the route provider adapt his orientation to that of the route seeker. To compare the effectiveness of the different orientations, after having received a route description the participants in our experiment had to ‘virtually’ traverse the route using prerecorded route segments. The results showed no difference in effectiveness between the two orientations, but suggested that the orientation where the speaker directly faces the route seeker is more natural.
GeoCAM: A geovisual analytics workspace to contextualize and interpret statements about movement
This article focuses on integrating computational and visual methods in a system that supports analysts in identifying, extracting, mapping, and relating linguistic accounts of movement. We address two objectives: (1) build the conceptual, theoretical, and empirical framework needed to represent and interpret human-generated directions; and (2) design and implement a geovisual analytics workspace for direction document analysis. We have built a set of geo-enabled computational methods to identify documents containing movement statements and a visual analytics environment that uses natural language processing methods iteratively with geographic database support to extract, interpret, and map geographic movement references in context. Additionally, analysts can provide feedback to improve computational results. To demonstrate the value of this integrative approach, we have realized a proof-of-concept implementation focusing on identifying and processing documents that contain human-generated route directions. Using our visual analytic interface, an analyst can explore the results, provide feedback to improve those results, pose queries against a database of route directions, and interactively represent the route on a map.
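The front end of such a pipeline, flagging documents that contain movement statements and pulling out candidate place references for mapping, can be sketched in miniature. The cue words and the regular expression below are crude stand-ins for the NLP methods in the actual system, and the sample sentence is invented.

```python
# Toy version of a movement-document pipeline: detect motion language,
# then extract candidate place references. The cue list and regex are
# illustrative heuristics, not the system's real NLP components.
import re

MOTION_CUES = re.compile(r"\b(go|turn|follow|head|continue|walk|drive)\b", re.I)

def contains_movement(doc):
    """Flag documents that look like they describe movement."""
    return bool(MOTION_CUES.search(doc))

def place_candidates(doc):
    """Naive heuristic: capitalized phrases following a spatial preposition."""
    return re.findall(
        r"\b(?:on|at|to|onto|toward)\s+([A-Z]\w*(?:\s+[A-Z]\w*)*)", doc
    )

doc = "Head north on Main Street, then turn left at Market Square."
print(contains_movement(doc))   # True
print(place_candidates(doc))    # ['Main Street', 'Market Square']
```

In the full system these candidates would then be georeferenced against a geographic database and plotted, with analyst feedback correcting the extraction.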
Components of an intelligible route description: Creating graphic maps from written route directions
Automatic Extraction of Destinations, Origins and Route Parts from Human Generated Route Directions
Researchers from the cognitive and spatial sciences are studying text descriptions of movement patterns in order to examine how humans communicate and understand spatial information. In particular, route directions offer a rich source of information on how cognitive systems conceptualize movement patterns by segmenting them into meaningful parts. Route directions are composed using a plethora of cognitive spatial organization principles: changing levels of granularity, hierarchical organization, incorporation of cognitively and perceptually salient elements, and so forth. Identifying such information in text documents automatically is crucial for enabling machine-understanding of human spatial language. The benefits are: a) creating opportunities for large-scale studies of human linguistic behavior; b) extracting and georeferencing salient entities (landmarks) that are used by human route direction providers; c) developing methods to translate route directions to sketches and maps; and d) enabling queries on large corpora of crawled/analyzed movement data. In this paper, we introduce our approach and implementations that bring us closer to the goal of automatically processing linguistic route directions. We report on research directed at one part of the larger problem, that is, extracting the three most critical parts of route directions and movement patterns in general: origin, destination, and route parts. We use machine-learning based algorithms to extract these parts of routes, including, for example, destination names and types. We demonstrate the effectiveness of our approach in several experiments using hand-tagged corpora.
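The extraction task described above amounts to labeling each part of a route description as origin, destination, or route part. A rule-based miniature of that labeling is sketched below; the cue-word lists stand in for learned features and are assumptions for illustration, not the paper's machine-learning model.

```python
# Illustrative sketch of origin/destination/route-part extraction:
# tag each clause of a route description by simple cue words. A real
# system would learn these cues as features of a classifier.

ORIGIN_CUES = {"from", "start", "starting", "leave", "depart"}
DEST_CUES = {"until", "reach", "arrive", "destination"}

def tag_clause(clause):
    """Assign one of ORIGIN, DESTINATION, ROUTE_PART to a clause."""
    words = {w.strip(".,").lower() for w in clause.split()}
    if words & ORIGIN_CUES:
        return "ORIGIN"
    if words & DEST_CUES:
        return "DESTINATION"
    return "ROUTE_PART"

directions = [
    "Start at the main station",
    "follow Elm Street past the church",
    "turn left at the bakery",
    "until you reach the city museum",
]
print([tag_clause(c) for c in directions])
# ['ORIGIN', 'ROUTE_PART', 'ROUTE_PART', 'DESTINATION']
```

A learned model replaces the hand-written cue sets with weighted features and can additionally recover destination names and types, as the abstract describes.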