Active Perception in Adversarial Scenarios using Maximum Entropy Deep Reinforcement Learning
We pose an active perception problem where an autonomous agent actively
interacts with a second agent with potentially adversarial behaviors. Given the
uncertainty in the intent of the other agent, the objective is to collect
further evidence to help discriminate potential threats. The main technical
challenges are the partial observability of the agent intent, the adversary
modeling, and the corresponding uncertainty modeling. Note that an adversary
agent may act to mislead the autonomous agent by using a deceptive strategy
that is learned from past experiences. We propose an approach that combines
belief space planning, generative adversary modeling, and maximum entropy
reinforcement learning to obtain a stochastic belief space policy. By
accounting for various adversarial behaviors in the simulation framework and
minimizing the predictability of the autonomous agent's action, the resulting
policy is more robust to unmodeled adversarial strategies. This improved
robustness is empirically shown against an adversary that adapts to and
exploits the autonomous agent's policy when compared with a standard
Chance-Constraint Partially Observable Markov Decision Process robust approach.
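The role of the maximum entropy term can be illustrated with a one-step sketch. It is not the belief space method of the abstract above: it only assumes a fixed list of action values (the hypothetical `q_values`) and a hypothetical `temperature` weight on the entropy bonus. Under those assumptions, the policy that maximizes expected value plus temperature times entropy is a softmax over the action values, so a larger temperature gives a less predictable action choice.

```python
import math

def max_entropy_policy(q_values, temperature):
    """Return pi(a) proportional to exp(Q(a) / temperature).

    This is the maximizer of E_pi[Q] + temperature * H(pi) for one decision.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    top = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - top) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]

def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical action values, used only for illustration.
q = [1.0, 0.5, 0.0]
sharp = max_entropy_policy(q, 0.1)   # nearly deterministic
soft = max_entropy_policy(q, 10.0)   # close to uniform, harder to predict
```

Comparing `entropy(sharp)` with `entropy(soft)` shows how the temperature trades expected value against predictability.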
Approximate Decentralized Bayesian Inference
This paper presents an approximate method for performing Bayesian inference
in models with conditional independence over a decentralized network of
learning agents. The method first employs variational inference on each
individual learning agent to generate a local approximate posterior, the agents
transmit their local posteriors to other agents in the network, and finally
each agent combines its set of received local posteriors. The key insight in
this work is that, for many Bayesian models, approximate inference schemes
destroy symmetry and dependencies in the model that are crucial to the correct
application of Bayes' rule when combining the local posteriors. The proposed
method addresses this issue by including an additional optimization step in the
combination procedure that accounts for these broken dependencies. Experiments
on synthetic and real data demonstrate that the decentralized method provides
advantages in computational performance and predictive test likelihood over
previous batch and distributed methods.
Comment: This paper was presented at UAI 2014. Please use the following BibTeX
citation: @inproceedings{Campbell14_UAI, Author = {Trevor Campbell and
Jonathan P. How}, Title = {Approximate Decentralized Bayesian Inference},
Booktitle = {Uncertainty in Artificial Intelligence (UAI)}, Year = {2014}}
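The combination step that the abstract above builds on can be sketched for the simplest exact case. When the data held by N agents are conditionally independent given the parameter, Bayes' rule gives a global posterior proportional to the product of the local posteriors divided by the prior raised to the power N - 1. The sketch below assumes a scalar Gaussian mean with known noise variance and exact local posteriors, so it does not show the extra optimization step the paper proposes for approximate posteriors; the function names and data are hypothetical.

```python
def local_posterior(prior, data, noise_var):
    """Exact Gaussian posterior over a scalar mean.

    Distributions are stored in natural parameters (eta, lam), where lam is
    the precision and eta is the precision-weighted mean.
    """
    eta0, lam0 = prior
    return (eta0 + sum(data) / noise_var, lam0 + len(data) / noise_var)

def combine(prior, local_posteriors):
    """Multiply local posteriors and divide out the N - 1 extra prior copies."""
    eta0, lam0 = prior
    n = len(local_posteriors)
    eta = sum(e for e, _ in local_posteriors) - (n - 1) * eta0
    lam = sum(l for _, l in local_posteriors) - (n - 1) * lam0
    return eta, lam

# Hypothetical setup: prior N(0, 1), unit noise variance, two agents.
prior = (0.0, 1.0)
agent_data = [[1.0, 2.0], [3.0]]
locals_ = [local_posterior(prior, d, 1.0) for d in agent_data]
combined = combine(prior, locals_)

# Reference answer: a single batch update on all of the data.
batch = local_posterior(prior, [x for d in agent_data for x in d], 1.0)
combined_mean = combined[0] / combined[1]
```

In this exact conjugate case the combined and batch posteriors coincide; the abstract's point is that this simple rule can fail once the local posteriors are only approximate.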
Transferable Pedestrian Motion Prediction Models at Intersections
One desirable capability of autonomous cars is to accurately predict the
pedestrian motion near intersections for safe and efficient trajectory
planning. We are interested in developing transfer learning algorithms that can
be trained on the pedestrian trajectories collected at one intersection and yet
still provide accurate predictions of the trajectories at another, previously
unseen intersection. We first discuss feature selection for transferable
pedestrian motion models in general. Following this discussion, we develop
a transferable pedestrian motion prediction algorithm based on Inverse
Reinforcement Learning (IRL) that infers pedestrian intentions and predicts
future trajectories from the observed trajectory. We evaluate our algorithm on
a dataset collected at two intersections, training at one intersection and
testing at the other. As a baseline, we use the accuracy of augmented
semi-nonnegative sparse coding (ASNSC), trained and tested at the same
intersection. The results show that the proposed algorithm
improves on the baseline accuracy by 40% in the non-transfer task and 16% in the
transfer task.
FASTER: Fast and Safe Trajectory Planner for Flights in Unknown Environments
High-speed trajectory planning through unknown environments requires
algorithmic techniques that enable fast reaction times while maintaining safety
as new information about the operating environment is obtained. The requirement
of computational tractability typically leads to optimization problems that do
not include the obstacle constraints (collision checks are done on the
solutions) or use a convex decomposition of the free space and then impose an
ad-hoc time allocation scheme for each interval of the trajectory. Moreover,
safety guarantees are usually obtained by having a local planner that plans a
trajectory with a final "stop" condition in the free-known space. However,
these two decisions typically lead to slow and conservative trajectories. We
propose FASTER (Fast and Safe Trajectory Planner) to overcome these issues.
FASTER obtains high-speed trajectories by enabling the local planner to
optimize in both the free-known and unknown spaces. Safety guarantees are
ensured by always having a feasible, safe back-up trajectory in the free-known
space at the start of each replanning step. Furthermore, we present a Mixed
Integer Quadratic Program formulation in which the solver can choose the
trajectory interval allocation, and where a time allocation heuristic is
computed efficiently using the result of the previous replanning iteration.
This proposed algorithm is tested extensively both in simulation and in real
hardware, showing agile flights in unknown cluttered environments with
velocities up to 3.6 m/s.
Comment: IROS 201
Motion Planning Among Dynamic, Decision-Making Agents with Deep Reinforcement Learning
Robots that navigate among pedestrians use collision avoidance algorithms to
enable safe and efficient operation. Recent works present deep reinforcement
learning as a framework to model the complex interactions and cooperation.
However, they are implemented using key assumptions about other agents'
behavior that deviate from reality as the number of agents in the environment
increases. This work extends our previous approach to develop an algorithm that
learns collision avoidance among a variety of types of dynamic agents without
assuming they follow any particular behavior rules. This work also introduces a
strategy using an LSTM that enables the algorithm to use observations of an
arbitrary number of other agents, unlike previous methods, which have a fixed
observation size. The proposed algorithm outperforms our previous approach in
simulation as the number of agents increases, and the algorithm is demonstrated
on a fully autonomous robotic vehicle traveling at human walking speed, without
the use of a 3D Lidar.
Estimation-based synthesis of H∞-optimal adaptive FIR filters for filtered-LMS problems
This paper presents a systematic synthesis procedure for H∞-optimal adaptive FIR filters in the context of an active noise cancellation (ANC) problem. An estimation interpretation of the adaptive control problem is introduced first. Based on this interpretation, an H∞ estimation problem is formulated, and its finite horizon prediction (filtering) solution is discussed. The solution minimizes the maximum energy gain from the disturbances to the predicted (filtered) estimation error and serves as the adaptation criterion for the weight vector in the adaptive FIR filter. We refer to this adaptation scheme as estimation-based adaptive filtering (EBAF). We show that the steady-state gain vector in the EBAF algorithm approaches that of the classical (normalized) filtered-X LMS algorithm. The error terms, however, are shown to be different. Thus, these classical algorithms can be considered to be approximations of our algorithm. We examine the performance of the proposed EBAF algorithm (both experimentally and in simulation) in an active noise cancellation problem of a one-dimensional (1-D) acoustic duct for both narrowband and broadband cases. Comparisons to the results from a conventional filtered-LMS (FxLMS) algorithm show faster convergence without compromising steady-state performance and/or robustness of the algorithm to feedback contamination of the reference signal.
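For readers unfamiliar with the classical baseline named in the abstract above, the following is a minimal sketch of a plain normalized LMS weight update for an adaptive FIR filter. It is neither the EBAF algorithm nor a full filtered-X LMS: there is no secondary acoustic path, and the task (identifying a hypothetical 2-tap filter from noiseless data), the step size, and all names are assumptions chosen for illustration.

```python
import random

def nlms_identify(true_taps, n_samples=2000, step=0.5, eps=1e-8, seed=0):
    """Estimate FIR taps with the normalized LMS update.

    Update rule: w <- w + step * error * x / (eps + x . x),
    where x is the vector of the most recent input samples.
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    n = len(true_taps)
    weights = [0.0] * n
    window = [0.0] * n  # most recent input sample first
    for _ in range(n_samples):
        window = [rng.uniform(-1.0, 1.0)] + window[:-1]
        desired = sum(h * x for h, x in zip(true_taps, window))
        error = desired - sum(w * x for w, x in zip(weights, window))
        power = eps + sum(x * x for x in window)
        weights = [w + step * error * x / power
                   for w, x in zip(weights, window)]
    return weights

# Hypothetical unknown filter to identify.
estimated = nlms_identify([0.8, -0.3])
```

Dividing by the input power makes the effective step size independent of the signal level, which is the sense of "normalized" here.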
Quantifying Nonlocal Informativeness in High-Dimensional, Loopy Gaussian Graphical Models
We consider the problem of selecting informative observations in Gaussian graphical models containing both cycles and nuisances. More specifically, we consider the subproblem of quantifying conditional mutual information measures that are nonlocal on such graphs. The ability to efficiently quantify the information content of observations is crucial for resource-constrained data acquisition (adaptive sampling) and data processing (active learning) systems. While closed-form expressions for Gaussian mutual information exist, standard linear algebraic techniques, with complexity cubic in the network size, are intractable for high-dimensional distributions. We investigate the use of embedded trees for computing nonlocal pairwise mutual information and demonstrate through numerical simulations that the presented approach achieves a significant reduction in computational cost over inversion-based methods.
United States. Defense Advanced Research Projects Agency (Mathematics of Sensing, Exploitation and Execution
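The closed-form expression mentioned in the abstract above is easy to state for a pair of scalar variables. For jointly Gaussian X and Y with correlation coefficient rho, the mutual information is -0.5 * ln(1 - rho^2) nats. The sketch below assumes the 2x2 marginal covariance is already available; obtaining those entries from a large graphical model is the expensive part that the paper addresses, and the embedded-trees method itself is not shown. The function name and inputs are hypothetical.

```python
import math

def gaussian_pairwise_mi(var_x, var_y, cov_xy):
    """Mutual information (in nats) between two jointly Gaussian scalars.

    Uses I(X; Y) = 0.5 * ln(var_x * var_y / det(Sigma)), which equals
    -0.5 * ln(1 - rho^2) for correlation coefficient rho.
    """
    det = var_x * var_y - cov_xy ** 2
    if var_x <= 0 or var_y <= 0 or det <= 0:
        raise ValueError("covariance must be positive definite")
    return 0.5 * math.log(var_x * var_y / det)

# Hypothetical marginal covariances, for illustration only.
independent = gaussian_pairwise_mi(1.0, 1.0, 0.0)
correlated = gaussian_pairwise_mi(1.0, 1.0, 0.5)
```

Uncorrelated Gaussian variables carry no information about each other, and the value grows without bound as the correlation approaches plus or minus one.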
Gradient Projection Anti-windup Scheme on Constrained Planar LTI Systems
The gradient projection anti-windup (GPAW) scheme was recently proposed as an anti-windup method for nonlinear multi-input-multi-output systems/controllers, the solution of which was recognized as a largely open problem in a recent survey paper. This report analyzes the properties of the GPAW scheme applied to an input constrained first order linear time invariant (LTI) system driven by a first order LTI controller, where the objective is to regulate the system state about the origin. We show that the GPAW compensated system is in fact a projected dynamical system (PDS), and use results in the PDS literature to assert existence and uniqueness of its solutions. The main result is that the GPAW scheme can only maintain/enlarge the exact region of attraction of the uncompensated system. We illustrate the qualitative weaknesses of some results in establishing true advantages of anti-windup methods, and propose a new paradigm to address the anti-windup problem, where results relative to the uncompensated system are sought.
DSO National Laboratories, Singapore and AFOSR grant FA9550-08-1-008