6,384 research outputs found

    Hierarchical Decomposition of Nonlinear Dynamics and Control for System Identification and Policy Distillation

    Full text link
    The control of nonlinear dynamical systems remains a major challenge for autonomous agents. Current trends in reinforcement learning (RL) focus on complex representations of dynamics and policies, which have yielded impressive results in solving a variety of hard control tasks. However, this new sophistication and extremely over-parameterized models have come with the cost of an overall reduction in our ability to interpret the resulting policies. In this paper, we take inspiration from the control community and apply the principles of hybrid switching systems in order to break down complex dynamics into simpler components. We exploit the rich representational power of probabilistic graphical models and derive an expectation-maximization (EM) algorithm for learning a sequence model to capture the temporal structure of the data and automatically decompose nonlinear dynamics into stochastic switching linear dynamical systems. Moreover, we show how this framework of switching models enables extracting hierarchies of Markovian and auto-regressive locally linear controllers from nonlinear experts in an imitation learning scenario.Comment: 2nd Annual Conference on Learning for Dynamics and Contro

    A Learning-Based Framework for Two-Dimensional Vehicle Maneuver Prediction over V2V Networks

    Full text link
    Situational awareness in vehicular networks could be substantially improved utilizing reliable trajectory prediction methods. More precise situational awareness, in turn, results in notably better performance of critical safety applications, such as Forward Collision Warning (FCW), as well as comfort applications like Cooperative Adaptive Cruise Control (CACC). Therefore, vehicle trajectory prediction problem needs to be deeply investigated in order to come up with an end to end framework with enough precision required by the safety applications' controllers. This problem has been tackled in the literature using different methods. However, machine learning, which is a promising and emerging field with remarkable potential for time series prediction, has not been explored enough for this purpose. In this paper, a two-layer neural network-based system is developed which predicts the future values of vehicle parameters, such as velocity, acceleration, and yaw rate, in the first layer and then predicts the two-dimensional, i.e. longitudinal and lateral, trajectory points based on the first layer's outputs. The performance of the proposed framework has been evaluated in realistic cut-in scenarios from Safety Pilot Model Deployment (SPMD) dataset and the results show a noticeable improvement in the prediction accuracy in comparison with the kinematics model which is the dominant employed model by the automotive industry. Both ideal and nonideal communication circumstances have been investigated for our system evaluation. For non-ideal case, an estimation step is included in the framework before the parameter prediction block to handle the drawbacks of packet drops or sensor failures and reconstruct the time series of vehicle parameters at a desirable frequency

    Inferring Latent States and Refining Force Estimates via Hierarchical Dirichlet Process Modeling in Single Particle Tracking Experiments

    Get PDF
    Optical microscopy provides rich spatio-temporal information characterizing in vivo molecular motion. However, effective forces and other parameters used to summarize molecular motion change over time in live cells due to latent state changes, e.g., changes induced by dynamic micro-environments, photobleaching, and other heterogeneity inherent in biological processes. This study focuses on techniques for analyzing Single Particle Tracking (SPT) data experiencing abrupt state changes. We demonstrate the approach on GFP tagged chromatids experiencing metaphase in yeast cells and probe the effective forces resulting from dynamic interactions that reflect the sum of a number of physical phenomena. State changes are induced by factors such as microtubule dynamics exerting force through the centromere, thermal polymer fluctuations, etc. Simulations are used to demonstrate the relevance of the approach in more general SPT data analyses. Refined force estimates are obtained by adopting and modifying a nonparametric Bayesian modeling technique, the Hierarchical Dirichlet Process Switching Linear Dynamical System (HDP-SLDS), for SPT applications. The HDP-SLDS method shows promise in systematically identifying dynamical regime changes induced by unobserved state changes when the number of underlying states is unknown in advance (a common problem in SPT applications). We expand on the relevance of the HDP-SLDS approach, review the relevant background of Hierarchical Dirichlet Processes, show how to map discrete time HDP-SLDS models to classic SPT models, and discuss limitations of the approach. In addition, we demonstrate new computational techniques for tuning hyperparameters and for checking the statistical consistency of model assumptions directly against individual experimental trajectories; the techniques circumvent the need for "ground-truth" and subjective information.Comment: 25 pages, 6 figures. Differs only typographically from PLoS One publication available freely as an open-access article at http://journals.plos.org/plosone/article?id=10.1371/journal.pone.013763

    Speech Synthesis Based on Hidden Markov Models

    Get PDF

    Real-Time Predictive Modeling and Robust Avoidance of Pedestrians with Uncertain, Changing Intentions

    Full text link
    To plan safe trajectories in urban environments, autonomous vehicles must be able to quickly assess the future intentions of dynamic agents. Pedestrians are particularly challenging to model, as their motion patterns are often uncertain and/or unknown a priori. This paper presents a novel changepoint detection and clustering algorithm that, when coupled with offline unsupervised learning of a Gaussian process mixture model (DPGP), enables quick detection of changes in intent and online learning of motion patterns not seen in prior training data. The resulting long-term movement predictions demonstrate improved accuracy relative to offline learning alone, in terms of both intent and trajectory prediction. By embedding these predictions within a chance-constrained motion planner, trajectories which are probabilistically safe to pedestrian motions can be identified in real-time. Hardware experiments demonstrate that this approach can accurately predict pedestrian motion patterns from onboard sensor/perception data and facilitate robust navigation within a dynamic environment.Comment: Submitted to 2014 International Workshop on the Algorithmic Foundations of Robotic
    corecore