Hierarchical Decomposition of Nonlinear Dynamics and Control for System Identification and Policy Distillation
The control of nonlinear dynamical systems remains a major challenge for
autonomous agents. Current trends in reinforcement learning (RL) focus on
complex representations of dynamics and policies, which have yielded impressive
results in solving a variety of hard control tasks. However, this new
sophistication and the use of extremely over-parameterized models have come at the cost
of an overall reduction in our ability to interpret the resulting policies. In
this paper, we take inspiration from the control community and apply the
principles of hybrid switching systems in order to break down complex dynamics
into simpler components. We exploit the rich representational power of
probabilistic graphical models and derive an expectation-maximization (EM)
algorithm for learning a sequence model to capture the temporal structure of
the data and automatically decompose nonlinear dynamics into stochastic
switching linear dynamical systems. Moreover, we show how this framework of
switching models enables extracting hierarchies of Markovian and
auto-regressive locally linear controllers from nonlinear experts in an
imitation learning scenario.
Comment: 2nd Annual Conference on Learning for Dynamics and Control
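To make the idea of a stochastic switching linear dynamical system concrete, the following is a minimal generative sketch, not the paper's EM algorithm. It assumes a scalar state, a fixed Markov chain over discrete modes, and Gaussian noise; the function name `simulate_slds` and all parameters are hypothetical and chosen for illustration only.

```python
import random


def simulate_slds(a, b, sigma, trans, x0, steps, seed=0):
    """Simulate a scalar switching linear dynamical system (illustrative).

    The discrete mode z_t follows a Markov chain whose rows are in `trans`.
    Within mode z_t the continuous state evolves linearly:
        x_{t+1} = a[z_t] * x_t + b[z_t] + sigma * noise
    All names and parameters here are assumptions, not the paper's notation.
    """
    rng = random.Random(seed)
    z, x = 0, x0                      # start in mode 0 at state x0
    modes, states = [z], [x]
    for _ in range(steps):
        # Locally linear dynamics of the currently active mode.
        x = a[z] * x + b[z] + sigma * rng.gauss(0.0, 1.0)
        # Sample the next mode from the row of the transition matrix.
        z = rng.choices(range(len(trans)), weights=trans[z])[0]
        modes.append(z)
        states.append(x)
    return modes, states


# Two modes: one contracting toward 2.0, one oscillating around 0.
modes, states = simulate_slds(
    a=[0.5, -0.5], b=[1.0, 0.0], sigma=0.1,
    trans=[[0.9, 0.1], [0.2, 0.8]], x0=0.0, steps=50,
)
```

A learning procedure such as the one described in the abstract would work in the opposite direction: given only the observed states, it would estimate the per-mode parameters and the mode sequence.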
Human Motion Trajectory Prediction: A Survey
With growing numbers of intelligent autonomous systems in human environments,
the ability of such systems to perceive, understand and anticipate human
behavior becomes increasingly important. Specifically, predicting future
positions of dynamic agents and planning considering such predictions are key
tasks for self-driving vehicles, service robots and advanced surveillance
systems. This paper provides a survey of human motion trajectory prediction. We
review, analyze and structure a large selection of work from different
communities and propose a taxonomy that categorizes existing methods based on
the motion modeling approach and level of contextual information used. We
provide an overview of the existing datasets and performance metrics. We
discuss limitations of the state of the art and outline directions for further
research.
Comment: Submitted to the International Journal of Robotics Research (IJRR), 37 pages
Hybrid Reinforcement Learning with Expert State Sequences
Existing imitation learning approaches often require that the complete
demonstration data, including sequences of actions and states, are available.
In this paper, we consider a more realistic and difficult scenario where a
reinforcement learning agent only has access to the state sequences of an
expert, while the expert actions are unobserved. We propose a novel
tensor-based model to infer the unobserved actions of the expert state
sequences. The policy of the agent is then optimized via a hybrid objective
combining reinforcement learning and imitation learning. We evaluated our
hybrid approach on an illustrative domain and Atari games. The empirical
results show that (1) the agents are able to leverage expert state sequences to
learn faster than pure reinforcement learning baselines, (2) our tensor-based
action inference model is advantageous compared to standard deep neural
networks in inferring expert actions, and (3) the hybrid policy optimization
objective is robust against noise in expert state sequences.
Comment: AAAI 2019; https://github.com/XiaoxiaoGuo/tensor4r
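One plausible form of a hybrid objective combining reinforcement learning and imitation learning is a weighted sum of a policy-gradient term and a likelihood term on the inferred expert actions. The sketch below assumes exactly that form; the abstract does not specify the objective, so the function `hybrid_loss`, the weight `lam`, and the choice of loss terms are all assumptions made for illustration.

```python
def hybrid_loss(logp_taken, advantages, logp_inferred_expert, lam=0.5):
    """Weighted sum of an RL term and an imitation term (assumed form).

    logp_taken:           log-probabilities of the actions the agent took
    advantages:           advantage estimates for those actions
    logp_inferred_expert: log-probabilities the policy assigns to the
                          expert actions inferred from state sequences
    lam:                  hypothetical trade-off weight
    """
    # Policy-gradient surrogate: raise log-prob of high-advantage actions.
    rl = -sum(lp * adv for lp, adv in zip(logp_taken, advantages)) / len(logp_taken)
    # Imitation term: negative log-likelihood of the inferred expert actions.
    il = -sum(logp_inferred_expert) / len(logp_inferred_expert)
    return rl + lam * il


loss = hybrid_loss([-1.0, -2.0], [1.0, 0.5], [-0.5], lam=2.0)
```

Setting `lam` to zero recovers a pure reinforcement learning loss under these assumptions, which makes the contribution of the expert state sequences easy to ablate.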
Online Predictive Optimization Framework for Stochastic Demand-Responsive Transit Services
This study develops an online predictive optimization framework for
dynamically operating a transit service in an area of crowd movements. The
proposed framework integrates demand prediction and supply optimization to
periodically redesign the service routes based on recently observed demand. To
predict demand for the service, we use Quantile Regression to estimate the
marginal distribution of movement counts between each pair of serviced
locations. The framework then combines these marginals into a joint demand
distribution by constructing a Gaussian copula, which captures the structure of
correlation between the marginals. For supply optimization, we devise a linear
programming model, which simultaneously determines the route structure and the
service frequency according to the predicted demand. Importantly, our framework
both preserves the uncertainty structure of future demand and leverages this
for robust route optimization, while keeping both components decoupled. We
evaluate our framework using a real-world case study of autonomous mobility on
a university campus in Denmark. The results show that our framework often
obtains the ground truth optimal solution, and can outperform conventional
methods for route optimization, which do not leverage full predictive
distributions.
Comment: 34 pages, 12 figures, 5 tables
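The step of combining quantile-based marginals into a joint distribution through a Gaussian copula can be sketched as follows. This is a minimal illustration for two location pairs, not the authors' implementation: the correlation `rho`, the quantile tables, the linear interpolation between quantiles, and the clamping of the tails are all assumptions.

```python
import math
import random
from statistics import NormalDist


def interp_quantile(u, levels, values):
    """Piecewise-linear inverse CDF built from predicted quantiles.

    Tails outside the lowest and highest level are clamped (an assumption).
    """
    if u <= levels[0]:
        return values[0]
    for i in range(1, len(levels)):
        if u <= levels[i]:
            frac = (u - levels[i - 1]) / (levels[i] - levels[i - 1])
            return values[i - 1] + (values[i] - values[i - 1]) * frac
    return values[-1]


def sample_joint_demand(rho, marginals, n, seed=0):
    """Draw n joint demand samples for two location pairs.

    rho:       assumed correlation of the latent Gaussian variables
    marginals: two (levels, values) quantile tables, e.g. from
               quantile regression (hypothetical inputs here)
    """
    rng = random.Random(seed)
    std = NormalDist()
    samples = []
    for _ in range(n):
        # Correlated standard normals via a 2x2 Cholesky factor.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Map to uniforms, then through each marginal's inverse CDF.
        u = (std.cdf(z1), std.cdf(z2))
        samples.append(tuple(
            interp_quantile(ui, levels, values)
            for ui, (levels, values) in zip(u, marginals)
        ))
    return samples


levels = [0.1, 0.5, 0.9]
draws = sample_joint_demand(
    rho=0.7, marginals=[(levels, [2, 5, 9]), (levels, [1, 4, 12])], n=1000,
)
```

Because the copula only fixes the dependence structure, the marginal models can be refit or replaced without touching the sampling code, which matches the decoupling between components that the abstract emphasizes.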