14,549 research outputs found

    Hierarchical Decomposition of Nonlinear Dynamics and Control for System Identification and Policy Distillation

    Full text link
    The control of nonlinear dynamical systems remains a major challenge for autonomous agents. Current trends in reinforcement learning (RL) focus on complex representations of dynamics and policies, which have yielded impressive results in solving a variety of hard control tasks. However, this new sophistication and extremely over-parameterized models have come with the cost of an overall reduction in our ability to interpret the resulting policies. In this paper, we take inspiration from the control community and apply the principles of hybrid switching systems in order to break down complex dynamics into simpler components. We exploit the rich representational power of probabilistic graphical models and derive an expectation-maximization (EM) algorithm for learning a sequence model to capture the temporal structure of the data and automatically decompose nonlinear dynamics into stochastic switching linear dynamical systems. Moreover, we show how this framework of switching models enables extracting hierarchies of Markovian and auto-regressive locally linear controllers from nonlinear experts in an imitation learning scenario.Comment: 2nd Annual Conference on Learning for Dynamics and Contro

    Human Motion Trajectory Prediction: A Survey

    Full text link
    With growing numbers of intelligent autonomous systems in human environments, the ability of such systems to perceive, understand and anticipate human behavior becomes increasingly important. Specifically, predicting future positions of dynamic agents and planning considering such predictions are key tasks for self-driving vehicles, service robots and advanced surveillance systems. This paper provides a survey of human motion trajectory prediction. We review, analyze and structure a large selection of work from different communities and propose a taxonomy that categorizes existing methods based on the motion modeling approach and level of contextual information used. We provide an overview of the existing datasets and performance metrics. We discuss limitations of the state of the art and outline directions for further research.Comment: Submitted to the International Journal of Robotics Research (IJRR), 37 page

    Hybrid Reinforcement Learning with Expert State Sequences

    Full text link
    Existing imitation learning approaches often require that the complete demonstration data, including sequences of actions and states, are available. In this paper, we consider a more realistic and difficult scenario where a reinforcement learning agent only has access to the state sequences of an expert, while the expert actions are unobserved. We propose a novel tensor-based model to infer the unobserved actions of the expert state sequences. The policy of the agent is then optimized via a hybrid objective combining reinforcement learning and imitation learning. We evaluated our hybrid approach on an illustrative domain and Atari games. The empirical results show that (1) the agents are able to leverage state expert sequences to learn faster than pure reinforcement learning baselines, (2) our tensor-based action inference model is advantageous compared to standard deep neural networks in inferring expert actions, and (3) the hybrid policy optimization objective is robust against noise in expert state sequences.Comment: AAAI 2019; https://github.com/XiaoxiaoGuo/tensor4r

    Online Predictive Optimization Framework for Stochastic Demand-Responsive Transit Services

    Full text link
    This study develops an online predictive optimization framework for dynamically operating a transit service in an area of crowd movements. The proposed framework integrates demand prediction and supply optimization to periodically redesign the service routes based on recently observed demand. To predict demand for the service, we use Quantile Regression to estimate the marginal distribution of movement counts between each pair of serviced locations. The framework then combines these marginals into a joint demand distribution by constructing a Gaussian copula, which captures the structure of correlation between the marginals. For supply optimization, we devise a linear programming model, which simultaneously determines the route structure and the service frequency according to the predicted demand. Importantly, our framework both preserves the uncertainty structure of future demand and leverages this for robust route optimization, while keeping both components decoupled. We evaluate our framework using a real-world case study of autonomous mobility in a university campus in Denmark. The results show that our framework often obtains the ground truth optimal solution, and can outperform conventional methods for route optimization, which do not leverage full predictive distributions.Comment: 34 pages, 12 figures, 5 table
    • …
    corecore