DeepNav: Learning to Navigate Large Cities
We present DeepNav, a Convolutional Neural Network (CNN) based algorithm for
navigating large cities using locally visible street-view images. The DeepNav
agent learns to reach its destination quickly by making the correct navigation
decisions at intersections. We collect a large-scale dataset of street-view
images organized in a graph where nodes are connected by roads. This dataset
contains 10 city graphs and more than 1 million street-view images. We propose
3 supervised learning approaches for the navigation task and show how A* search
in the city graph can be used to generate supervision for the learning. Our
annotation process is fully automated using publicly available mapping services
and requires no human input. We evaluate the proposed DeepNav models on 4
held-out cities for navigating to 5 different types of destinations. Our
algorithms outperform previous work that uses hand-crafted features and Support
Vector Regression (SVR) [19].
Comment: CVPR 2017 camera ready versio
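The idea of using A* search in the city graph to generate supervision can be sketched as follows. This is a minimal illustration, not the DeepNav implementation: the label form (the next node on the shortest path to the destination), the toy graph, and the names `a_star`, `next_step_labels`, `GRAPH`, and `COORDS` are all assumptions made for this example. The straight-line heuristic assumes every road is at least as long as the straight-line distance between its endpoints.

```python
import heapq
import math


def a_star(graph, coords, start, goal):
    """Return the shortest path from start to goal as a list of nodes, or None."""
    def h(node):
        # Straight-line distance to the goal (admissible if roads are
        # never shorter than the straight line between their endpoints).
        return math.dist(coords[node], coords[goal])

    open_heap = [(h(start), 0.0, start)]
    best_cost = {start: 0.0}
    parent = {start: None}
    while open_heap:
        _, cost, node = heapq.heappop(open_heap)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        if cost > best_cost[node]:
            continue  # stale heap entry, a cheaper route was already found
        for nbr, length in graph[node].items():
            new_cost = cost + length
            if new_cost < best_cost.get(nbr, math.inf):
                best_cost[nbr] = new_cost
                parent[nbr] = node
                heapq.heappush(open_heap, (new_cost + h(nbr), new_cost, nbr))
    return None


def next_step_labels(graph, coords, goal):
    """Label every node with the neighbor that lies on its shortest path to goal."""
    labels = {}
    for node in graph:
        if node == goal:
            continue
        path = a_star(graph, coords, node, goal)
        if path is not None:
            labels[node] = path[1]
    return labels


# Hypothetical toy city: four intersections, road lengths as edge weights.
COORDS = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (0, 1)}
GRAPH = {
    "A": {"B": 1.0, "D": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 4.0},
    "D": {"A": 1.0, "C": 4.0},  # D-C is a long, winding road
}

print(next_step_labels(GRAPH, COORDS, "C"))
```

Each label could then be paired with the street-view images visible at that node to form a training example, with no human annotation involved.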
A Causal And-Or Graph Model for Visibility Fluent Reasoning in Tracking Interacting Objects
Tracking humans that are interacting with other subjects or the environment
remains unsolved in visual tracking, because the visibility of the humans of
interest in videos is unknown and might vary over time. In particular, it is
still difficult for state-of-the-art human trackers to recover complete human
trajectories in crowded scenes with frequent human interactions. In this work,
we consider the visibility status of a subject as a fluent variable, whose
change is mostly attributed to the subject's interaction with its surroundings,
e.g., crossing behind another object, entering a building, or getting into a
vehicle. We introduce a Causal And-Or Graph (C-AOG) to represent the
cause-effect relations between an object's visibility fluent and its
activities, and develop a probabilistic graph model to jointly reason about the
visibility fluent change (e.g., from visible to invisible) and track humans in
videos. We formulate this joint task as an iterative search for a feasible
causal graph structure that admits fast search algorithms, e.g., dynamic
programming. We apply the proposed method to challenging video sequences
to evaluate its capability to estimate visibility fluent changes of
subjects and to track subjects of interest over time. Results with comparisons
demonstrate that our method outperforms the alternative trackers and can
recover complete trajectories of humans in complicated scenarios with frequent
human interactions.
Comment: accepted by CVPR 201
Data-Driven Shape Analysis and Processing
Data-driven methods play an increasingly important role in discovering
geometric, structural, and semantic relationships between 3D shapes in
collections, and applying this analysis to support intelligent modeling,
editing, and visualization of geometric data. In contrast to traditional
approaches, a key feature of data-driven approaches is that they aggregate
information from a collection of shapes to improve the analysis and processing
of individual shapes. In addition, they are able to learn models that reason
about properties and relationships of shapes without relying on hard-coded
rules or explicitly programmed instructions. We provide an overview of the main
concepts and components of these techniques, and discuss their application to
shape classification, segmentation, matching, reconstruction, modeling and
exploration, as well as scene analysis and synthesis, through reviewing the
literature and relating the existing works with both qualitative and numerical
comparisons. We conclude our report with ideas that can inspire future research
in data-driven shape analysis and processing.
Comment: 10 pages, 19 figure
Zero-Shot Object Searching Using Large-scale Object Relationship Prior
Home-assistant robots have been a long-standing research topic, and one of
the biggest challenges is searching for required objects in housing
environments. Previous object-goal navigation requires the robot to search for
a target object category in an unexplored environment, which may not be
suitable for home-assistant robots that typically have some level of semantic
knowledge of the environment, such as the location of static furniture. In our
approach, we leverage this knowledge and the fact that a target object may be
located close to its related objects for efficient navigation. To achieve this,
we train a graph neural network using the Visual Genome dataset to learn the
object co-occurrence relationships and formulate the searching process as
iteratively predicting the possible areas where the target object may be
located. This approach is entirely zero-shot, meaning it does not require new,
accurate object correlations from the test environment. We empirically show that
our method outperforms prior correlational object search algorithms. As our
ultimate goal is to build fully autonomous assistant robots for everyday use,
we further integrate a task planner, which parses natural language and
generates task-completing plans, with object navigation to execute human
instructions. We demonstrate the effectiveness of our proposed pipeline both in
the AI2-THOR simulator and on a Stretch robot in a real-world environment.
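The notion of searching near related objects can be illustrated with a deliberately simplified sketch. It swaps the graph neural network described above for raw co-occurrence counts, and the scene lists stand in for annotations such as those in Visual Genome; the data and the names `cooccurrence_counts`, `relatedness`, and `search_order` are hypothetical and chosen only for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(scenes):
    """Count how often each unordered pair of object labels shares a scene."""
    pair_counts = Counter()
    for objects in scenes:
        for a, b in combinations(sorted(set(objects)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def relatedness(pair_counts, a, b):
    """Number of scenes in which a and b appear together (0 if never)."""
    return pair_counts[tuple(sorted((a, b)))]


def search_order(pair_counts, target, landmarks):
    """Rank known landmarks by how often they co-occur with the target."""
    return sorted(landmarks, key=lambda lm: -relatedness(pair_counts, target, lm))


# Hypothetical scene annotations: each list holds the objects seen in one image.
SCENES = [
    ["mug", "table", "chair"],
    ["mug", "table"],
    ["mug", "sink"],
    ["pillow", "bed"],
    ["table", "chair"],
]

counts = cooccurrence_counts(SCENES)
# Landmarks are static furniture whose locations the robot already knows.
print(search_order(counts, "mug", ["bed", "sink", "table"]))
```

A robot following this ordering would visit the areas around the highest-ranked landmarks first and move down the list until the target is found.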