Toward Crowd-Sensitive Path Planning
If a robot can predict crowds in parts of its environment that are
inaccessible to its sensors, then it can plan to avoid them. This paper
proposes a fast, online algorithm that learns average crowd densities in
different areas. It also describes how these densities can be incorporated into
existing navigation architectures. In simulation across multiple challenging
crowd scenarios, the robot reaches its target faster, travels less, and risks
fewer collisions than if it were to plan with the traditional A* algorithm.
Comment: Accepted at AAAI Fall Symposium 201
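The abstract gives no pseudocode, but the core idea of folding learned average crowd densities into a grid planner's edge costs can be sketched as follows. Everything here (the grid interface, the crowd_density map, the weight alpha) is an illustrative assumption rather than the authors' implementation:

```python
import heapq

def a_star_with_crowd_cost(grid, crowd_density, start, goal, alpha=5.0):
    """A* on a 4-connected grid where entering a cell costs 1 plus a
    penalty proportional to its learned average crowd density."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # admissible heuristic
    g_cost, parent = {start: 0.0}, {start: None}
    open_set = [(h(start), start)]
    closed = set()
    while open_set:
        _, node = heapq.heappop(open_set)
        if node in closed:
            continue
        closed.add(node)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                # Step cost = unit distance + penalty for expected crowding.
                new_g = g_cost[node] + 1.0 + alpha * crowd_density[nr][nc]
                if new_g < g_cost.get((nr, nc), float("inf")):
                    g_cost[(nr, nc)] = new_g
                    parent[(nr, nc)] = node
                    heapq.heappush(open_set, (new_g + h((nr, nc)), (nr, nc)))
    return None  # goal unreachable
```

With alpha = 0 this reduces to plain A*; larger values trade extra path length for less expected crowd exposure.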
LeTS-Drive: Driving in a Crowd by Learning from Tree Search
Autonomous driving in a crowded environment, e.g., a busy traffic
intersection, is an unsolved challenge for robotics. The robot vehicle must
contend with a dynamic and partially observable environment, noisy sensors, and
many agents. A principled approach is to formalize it as a Partially Observable
Markov Decision Process (POMDP) and solve it through online belief-tree search.
To handle a large crowd and achieve real-time performance in this very
challenging setting, we propose LeTS-Drive, which integrates online POMDP
planning and deep learning. It consists of two phases. In the offline phase, we
learn a policy and the corresponding value function by imitating the belief
tree search. In the online phase, the learned policy and value function guide
the belief tree search. LeTS-Drive leverages the robustness of planning and the
runtime efficiency of learning to enhance the performance of both. Experimental
results in simulation show that LeTS-Drive outperforms either planning or
imitation learning alone and develops sophisticated driving skills.
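As a rough illustration of learning-guided belief-tree search, here is a minimal PUCT-style sketch: the learned policy biases which branches get expanded, and the learned value function replaces long rollouts at the leaves. The policy_net, value_net, and step interfaces are hypothetical stand-ins, and PUCT is a generic guidance rule rather than the authors' exact integration:

```python
import math

class Node:
    def __init__(self, prior):
        self.prior = prior      # learned policy probability for this action
        self.visits = 0
        self.value_sum = 0.0
        self.children = {}      # action -> Node

    def value(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_action(node, c_puct=1.5):
    """PUCT rule: trade off the search's value estimate against the
    learned policy prior and the visit counts."""
    total = sum(ch.visits for ch in node.children.values())
    def score(child):
        return child.value() + c_puct * child.prior * math.sqrt(total + 1) / (1 + child.visits)
    return max(node.children, key=lambda a: score(node.children[a]))

def simulate(node, belief, step, policy_net, value_net, depth=0, max_depth=20):
    """One guided search iteration: descend with PUCT, expand a leaf, and
    bootstrap the leaf's value with the learned value network."""
    if depth == max_depth or not node.children:
        if depth < max_depth:
            for a, p in policy_net(belief).items():   # expand with policy priors
                node.children[a] = Node(p)
        return value_net(belief)                      # learned value, no rollout
    a = select_action(node)
    next_belief, reward = step(belief, a)             # hypothetical belief update
    g = reward + 0.95 * simulate(node.children[a], next_belief, step,
                                 policy_net, value_net, depth + 1, max_depth)
    node.children[a].visits += 1
    node.children[a].value_sum += g
    return g
```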
RoboBrain: Large-Scale Knowledge Engine for Robots
In this paper we introduce a knowledge engine, which learns and shares
knowledge representations, for robots to carry out a variety of tasks. Building
such an engine brings with it the challenge of dealing with multiple data
modalities including symbols, natural language, haptic senses, robot
trajectories, visual features, and many others. The knowledge stored in
the engine comes from multiple sources including physical interactions that
robots have while performing tasks (perception, planning and control),
knowledge bases from the Internet and learned representations from several
robotics research groups.
We discuss various technical aspects and associated challenges such as
modeling the correctness of knowledge, inferring latent information and
formulating different robotic tasks as queries to the knowledge engine. We
describe the system architecture and how it supports different mechanisms for
users and robots to interact with the engine. Finally, we demonstrate its use
in three important research areas: grounding natural language, perception, and
planning, which are the key building blocks for many robotic tasks. This
knowledge engine is a collaborative effort, and we call it RoboBrain.
Comment: 10 pages, 9 figures
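The abstract mentions formulating robotic tasks as queries to the engine; a toy illustration of what a query over a small knowledge graph might look like is below. The graph fragment, relation names, and inheritance rule are invented for illustration and are not RoboBrain's actual representation or query language:

```python
# A toy knowledge-graph fragment: (entity, relation) -> list of values.
knowledge = {
    ("Mug", "is_a"): ["Container"],
    ("Mug", "has_affordance"): ["graspable", "pourable-into"],
    ("Container", "used_for"): ["holding-liquid"],
}

def query(entity, relation, kb=knowledge, seen=None):
    """Return facts for (entity, relation), following is_a links so that a
    mug inherits whatever holds for containers in general."""
    seen = seen or set()
    if entity in seen:
        return []
    seen.add(entity)
    results = list(kb.get((entity, relation), []))
    for parent in kb.get((entity, "is_a"), []):
        results += query(parent, relation, kb, seen)
    return results

# Grounding "fetch something that holds coffee" might reduce to queries like:
print(query("Mug", "used_for"))        # ['holding-liquid']
print(query("Mug", "has_affordance"))  # ['graspable', 'pourable-into']
```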
ALAN: Adaptive Learning for Multi-Agent Navigation
In multi-agent navigation, agents need to move towards their goal locations
while avoiding collisions with other agents and static obstacles, often without
communication with each other. Existing methods compute motions that are
optimal locally but do not account for the aggregated motions of all agents,
producing inefficient global behavior, especially when agents move in a crowded
space. In this work, we develop methods to allow agents to dynamically adapt
their behavior to their local conditions. We accomplish this by formulating the
multi-agent navigation problem as an action-selection problem, and propose an
approach, ALAN, that allows agents to compute time-efficient and collision-free
motions. ALAN is highly scalable because each agent makes its own decisions on
how to move using a set of velocities optimized for a variety of navigation
tasks. Experimental results show that agents using ALAN generally reach their
destinations faster than with ORCA, a state-of-the-art collision avoidance
framework; the Social Forces model for pedestrian navigation; or a predictive
collision avoidance model.
Comment: Submitted to the Autonomous Robots Journal, Special Issue on Distributed Robot
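To make the action-selection formulation concrete, here is a minimal sketch in which each agent keeps a running reward per candidate velocity, blending goal progress with a politeness term, and chooses epsilon-greedily. The reward shape, weights, and update rule are illustrative assumptions, not ALAN's published implementation:

```python
import math
import random

def choose_velocity(candidates, rewards, epsilon=0.1):
    """Epsilon-greedy selection over a discrete set of candidate velocities
    (2D tuples), each with a running reward estimate."""
    if random.random() < epsilon:
        return random.choice(candidates)
    return max(candidates, key=lambda v: rewards.get(v, 0.0))

def update_reward(rewards, chosen_v, executed_v, goal_dir, w_goal=0.6, lr=0.3):
    """Blend a goal-progress term (alignment of the executed, collision-free
    velocity with the goal direction) with a politeness term (how little the
    collision-avoidance layer had to alter the chosen velocity)."""
    dot = lambda a, b: a[0] * b[0] + a[1] * b[1]
    norm = lambda a: math.hypot(a[0], a[1]) or 1.0
    goal_term = dot(executed_v, goal_dir) / (norm(executed_v) * norm(goal_dir))
    polite_term = dot(executed_v, chosen_v) / (norm(executed_v) * norm(chosen_v))
    r = w_goal * goal_term + (1.0 - w_goal) * polite_term
    rewards[chosen_v] = (1.0 - lr) * rewards.get(chosen_v, r) + lr * r

# Per time step: pick v, hand it to a collision-avoidance layer such as ORCA
# to obtain executed_v, then update the running reward for v.
```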
Design Challenges of Multi-UAV Systems in Cyber-Physical Applications: A Comprehensive Survey, and Future Directions
Unmanned Aerial Vehicles (UAVs) have recently grown rapidly to facilitate a
wide range of innovative applications that can fundamentally change the way
cyber-physical systems (CPSs) are designed. CPSs are a modern generation of
systems built on synergic cooperation between computational and physical
capabilities that can interact with humans through several new mechanisms. The
main advantages of using UAVs in CPS applications are their exceptional
features, including mobility, dynamism, effortless deployment, adaptive
altitude, agility, adjustability, and effective appraisal of real-world
functions anytime
and anywhere. Furthermore, from the technology perspective, UAVs are predicted
to be a vital element of the development of advanced CPSs. Therefore, in this
survey, we aim to pinpoint the most fundamental and important design challenges
of multi-UAV systems for CPS applications. We highlight key and versatile
aspects that span the coverage and tracking of targets and infrastructure
objects, energy-efficient navigation, and image analysis using machine learning
for fine-grained CPS applications. Key prototypes and testbeds are also
investigated to show how these practical technologies can facilitate CPS
applications. We present and propose state-of-the-art algorithms to address
design challenges with both quantitative and qualitative methods and map these
challenges with important CPS applications to draw insightful conclusions on
the challenges of each application. Finally, we summarize potential new
directions and ideas that could shape future research in these areas.
Modeling and Inferring Human Intents and Latent Functional Objects for Trajectory Prediction
This paper is about detecting functional objects and inferring human
intentions in surveillance videos of public spaces. People in the videos are
expected to intentionally take the shortest paths, subject to obstacles,
toward functional objects where they can satisfy certain needs (e.g., a
vending machine can quench thirst), following one of three possible intent
behaviors: reach
a single functional object and stop, or sequentially visit several functional
objects, or initially start moving toward one goal but then change the intent
to move toward another. Since detecting functional objects in low-resolution
surveillance videos is typically unreliable, we call them "dark matter",
characterized by their functionality of attracting people. We formulate the
Agent-based Lagrangian Mechanics wherein human trajectories are
probabilistically modeled as motions of agents in many layers of "dark-energy"
fields, where each agent can select a particular force field to affect its
motions, and thus define the minimum-energy Dijkstra path toward the
corresponding source "dark matter". For evaluation, we compiled and annotated a
new dataset. The results demonstrate the effectiveness of our approach in
predicting human intent behaviors and trajectories and localizing functional
objects, as well as in discovering distinct functional classes of objects by
clustering human motion behavior in their vicinity.
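One small, concrete piece of such a pipeline, turning the extra energy a trajectory would need to continue toward each candidate "dark matter" source into a posterior over intents, might look like the softmax sketch below. The interface and the inverse temperature beta are assumptions; the paper's full generative model is considerably richer:

```python
import math

def intent_posterior(extra_energy, beta=1.0):
    """Map each candidate source's extra energy (continuation cost beyond
    the minimum-energy Dijkstra path) to a softmax posterior over intents."""
    scores = {s: math.exp(-beta * e) for s, e in extra_energy.items()}
    z = sum(scores.values())
    return {s: v / z for s, v in scores.items()}

# A person nearly aligned with the vending machine's minimum-energy path
# gets most of the posterior mass:
print(intent_posterior({"vending_machine": 0.3, "exit": 2.1, "bench": 3.4}))
```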
A Survey of Data Fusion in Smart City Applications
The advancement of various research sectors such as the Internet of Things
(IoT), Machine Learning, Data Mining, Big Data, and Communication Technology
has paved the way for transforming urban areas that integrate the
aforementioned techniques into what is commonly known as a Smart City. With
the emergence of the smart city, a plethora of data sources has become
available for a wide variety of applications. The common technique for
handling multiple data sources is data fusion, which improves data output
quality or extracts knowledge from the raw data. To cater to ever-growing,
highly complex applications, studies in the smart city domain have to utilize
data from various sources and evaluate their performance from multiple
aspects. To this end, we introduce a multi-perspective classification of data
fusion for evaluating smart city applications. Moreover, we apply the proposed
multi-perspective classification to evaluate selected applications in each
domain of the smart city. We conclude the paper by discussing potential future
directions and challenges of data fusion integration.
Comment: Accepted and to be published in Elsevier Information Fusion
Robobarista: Learning to Manipulate Novel Objects via Deep Multimodal Embedding
There is a large variety of objects and appliances in human environments,
such as stoves, coffee dispensers, juice extractors, and so on. It is
challenging for a roboticist to program a robot for each of these object types
and for each of their instantiations. In this work, we present a novel approach
to manipulation planning based on the idea that many household objects share
similarly-operated object parts. We formulate the manipulation planning as a
structured prediction problem and learn to transfer manipulation strategy
across different objects by embedding point-cloud, natural language, and
manipulation trajectory data into a shared embedding space using a deep neural
network. In order to learn semantically meaningful spaces throughout our
network, we introduce a method for pre-training its lower layers for multimodal
feature embedding and a method for fine-tuning this embedding space using a
loss-based margin. In order to collect a large number of manipulation
demonstrations for different objects, we develop a new crowd-sourcing platform
called Robobarista. We test our model on our dataset consisting of 116 objects
and appliances with 249 parts along with 250 language instructions, for which
there are 1225 crowd-sourced manipulation demonstrations. We further show that
our robot, using our model, can even prepare a latte with appliances it has
never seen before.
Comment: Journal Version
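A minimal sketch of a margin-based multimodal embedding loss of the kind described, pulling each (point-cloud, language) embedding toward its matching trajectory embedding and pushing it away from non-matching ones in the batch, is shown below in PyTorch. The fixed margin is a simplification (the paper's margin is loss-based, i.e., scaled by how dissimilar two trajectories actually are), and the encoder names are hypothetical:

```python
import torch
import torch.nn.functional as F

def multimodal_margin_loss(pc_lang_emb, traj_emb, margin=1.0):
    """pc_lang_emb, traj_emb: (B, D) batches of embeddings where row i of
    each tensor comes from the same demonstration."""
    # Cosine similarity of every (point-cloud, language) pair vs. every trajectory.
    sim = F.normalize(pc_lang_emb, dim=1) @ F.normalize(traj_emb, dim=1).T
    pos = sim.diag().unsqueeze(1)          # matching pairs sit on the diagonal
    # Hinge: each non-matching trajectory should score at least `margin` lower.
    off_diag = 1.0 - torch.eye(sim.size(0), device=sim.device)
    loss = F.relu(margin - pos + sim) * off_diag
    return loss.sum() / off_diag.sum()

# Usage with hypothetical encoders f_pc_lang and f_traj:
#   loss = multimodal_margin_loss(f_pc_lang(pointclouds, instructions),
#                                 f_traj(trajectories))
```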
Robobarista: Object Part based Transfer of Manipulation Trajectories from Crowd-sourcing in 3D Pointclouds
There is a large variety of objects and appliances in human environments,
such as stoves, coffee dispensers, juice extractors, and so on. It is
challenging for a roboticist to program a robot for each of these object types
and for each of their instantiations. In this work, we present a novel approach
to manipulation planning based on the idea that many household objects share
similarly-operated object parts. We formulate the manipulation planning as a
structured prediction problem and design a deep learning model that handles
large noise in the manipulation demonstrations and learns features from three
different modalities: point-clouds, language, and trajectories. In order to
collect a large number of manipulation demonstrations for different objects, we
developed a new crowd-sourcing platform called Robobarista. We test our model
on our dataset consisting of 116 objects with 249 parts along with 250 language
instructions, for which there are 1225 crowd-sourced manipulation
demonstrations. We further show that our robot can even manipulate objects it
has never seen before.
Comment: In International Symposium on Robotics Research (ISRR) 201
Discovering Underlying Plans Based on Distributed Representations of Actions
Plan recognition aims to discover target plans (i.e., sequences of actions)
behind observed actions, with history plan libraries or domain models in hand.
Previous approaches either discover plans by maximally "matching" observed
actions to plan libraries, assuming target plans are from plan libraries, or
infer plans by executing domain models to best explain the observed actions,
assuming complete domain models are available. In real world applications,
however, target plans are often not from plan libraries, and complete domain
models are often not available, since building complete sets of plans and
complete domain models is difficult and expensive. In this paper we view
plan libraries as corpora and learn vector representations of actions using the
corpora; we then discover target plans based on the vector representations. Our
approach is capable of discovering underlying plans that are not from plan
libraries, without requiring domain models to be provided. We empirically
demonstrate the effectiveness of our approach by comparing its performance to
traditional plan recognition approaches in three planning domains.
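To illustrate the corpus view of plan libraries, the sketch below (assuming gensim >= 4) trains word2vec-style action embeddings on plans treated as sentences, then scores candidate actions for an unobserved step by their similarity to the observed actions. The toy corpus and the similarity-based scoring are illustrative simplifications of the paper's approach:

```python
from gensim.models import Word2Vec

# Treat each plan in the library as a "sentence" of action tokens.
plan_corpus = [
    ["pick-up", "stack", "pick-up", "stack"],
    ["unstack", "put-down", "pick-up", "stack"],
    # ... the full plan library would go here
]

# Learn distributed representations of actions from the corpus.
model = Word2Vec(plan_corpus, vector_size=32, window=2, min_count=1, epochs=200)

def fill_missing(observed, candidates, model):
    """Score each candidate action for an unobserved step by its average
    similarity to the observed actions; the recognized plan need not
    appear verbatim in the library."""
    def score(action):
        return sum(model.wv.similarity(action, o) for o in observed) / len(observed)
    return max(candidates, key=score)

print(fill_missing(["pick-up", "stack"], ["put-down", "stack", "unstack"], model))
```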