Gaussian Processes for Data-Efficient Learning in Robotics and Control.
Autonomous learning has been a promising direction in control and robotics for more than a decade, since data-driven learning reduces the amount of engineering knowledge that is otherwise required. However, autonomous reinforcement learning (RL) approaches typically require many interactions with the system to learn controllers, which is a practical limitation in real systems, such as robots, where many interactions can be impractical and time consuming. To address this problem, current learning approaches typically require task-specific knowledge in the form of expert demonstrations, realistic simulators, pre-shaped policies, or specific knowledge about the underlying dynamics. In this paper, we follow a different approach and speed up learning by extracting more information from data. In particular, we learn a probabilistic, non-parametric Gaussian process transition model of the system. By explicitly incorporating model uncertainty into long-term planning and controller learning, our approach reduces the effects of model errors, a key problem in model-based learning. Compared to state-of-the-art RL, our model-based policy search method achieves an unprecedented speed of learning. We demonstrate its applicability to autonomous learning in real robot and control tasks. The research leading to these results has received funding from the EC’s Seventh Framework Programme (FP7/2007-2013) under grant agreement #270327, ONR MURI grant N00014-09-1-1052, Intel Labs, and the Department of Computing, Imperial College London. This is the author accepted manuscript. The final version is available from IEEE via http://dx.doi.org/10.1109/TPAMI.2013.21
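As a rough illustration of what a probabilistic Gaussian process transition model provides, the sketch below fits a GP to toy one-step transitions and returns both a predictive mean and a predictive variance. It is not the paper's implementation: the squared-exponential kernel, the fixed hyperparameters, the toy dynamics, the choice of state-action inputs with state differences as targets, and the names `se_kernel` and `gp_predict` are all assumptions made for this example.

```python
import numpy as np

def se_kernel(A, B, lengthscale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return signal_var * np.exp(-0.5 * d2 / lengthscale**2)

def gp_predict(X, y, Xs, noise_var=1e-4):
    """Standard GP regression posterior mean and variance at test inputs Xs."""
    K = se_kernel(X, X) + noise_var * np.eye(len(X))
    Ks = se_kernel(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.diag(se_kernel(Xs, Xs)) - (v**2).sum(0)
    return mean, var

# Toy data (assumed): inputs are (state, action) pairs, targets are state changes.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(30, 2))
y = 0.1 * np.sin(X[:, 0]) + 0.05 * X[:, 1]

# Query one point seen during training and one far from all training data.
Xs = np.vstack([X[:1], [[10.0, 10.0]]])
mean, var = gp_predict(X, y, Xs)
```

The predictive variance is small at the training input and returns to the prior variance far from the data. That growing uncertainty away from observed transitions is the quantity a planner can take into account to limit the effect of model errors.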
A deep learning framework based on Koopman operator for data-driven modeling of vehicle dynamics
Autonomous vehicles and driving technologies have received notable attention
in the past decades. In autonomous driving systems, information about
vehicle dynamics is required in most cases for designing
motion planning and control algorithms. However, identifying a global
model of vehicle dynamics is nontrivial because of strong
nonlinearity and uncertainty. Many efforts have resorted to machine learning
techniques for building data-driven models, but these may lack
interpretability and result in a complex nonlinear representation. In this
paper, we propose a deep learning framework relying on an interpretable Koopman
operator to build a data-driven predictor of the vehicle dynamics. The main
idea is to use the Koopman operator for representing the nonlinear dynamics in
a linear lifted feature space. The approach results in a global model that
integrates the dynamics in both longitudinal and lateral directions. As the
core contribution, we propose a deep learning-based extended dynamic mode
decomposition (Deep EDMD) algorithm to learn a finite approximation of the
Koopman operator. Different from other machine learning-based approaches, deep
neural networks play the role of learning feature representations for EDMD in
the framework of the Koopman operator. Simulation results in a high-fidelity
CarSim environment are reported, which show the capability of the Deep EDMD
approach in multi-step prediction of vehicle dynamics over a wide operating
range. Also, the proposed approach outperforms the EDMD method, the multi-layer
perceptron (MLP) method, and the Extreme Learning Machines-based EDMD
(ELM-EDMD) method in terms of modeling performance. Finally, we design a linear
MPC with Deep EDMD (DE-MPC) for realizing reference tracking and test the
controller in the CarSim environment.
Comment: 12 pages, 10 figures, 1 table, and 2 algorithm
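The idea of representing nonlinear dynamics linearly in a lifted feature space can be sketched with plain EDMD. This is a minimal example under stated assumptions, not the Deep EDMD algorithm from the abstract: the dictionary `lift` is fixed by hand where the paper learns features with a neural network, and the toy system `step` and all function names are hypothetical.

```python
import numpy as np

def lift(x):
    """Hand-picked dictionary (assumed): state, its squares, and a constant."""
    x = np.atleast_2d(x)
    return np.hstack([x, x**2, np.ones((x.shape[0], 1))])

def fit_edmd(X, Y):
    """Least-squares Koopman approximation K such that lift(Y) ~ lift(X) @ K."""
    K, *_ = np.linalg.lstsq(lift(X), lift(Y), rcond=None)
    return K

def predict(x0, K, steps, n_state):
    """Multi-step prediction: iterate linearly in lifted space, read off the state."""
    z = lift(x0)
    traj = []
    for _ in range(steps):
        z = z @ K
        traj.append(z[0, :n_state].copy())
    return np.array(traj)

def step(x):
    """Toy nonlinear system (assumed): x2 is driven by the square of x1."""
    x1, x2 = x[..., 0], x[..., 1]
    return np.stack([0.9 * x1, 0.5 * x2 + 0.3 * x1**2], axis=-1)

# Snapshot pairs (x_k, x_{k+1}) sampled from the toy system.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
Y = step(X)
K = fit_edmd(X, Y)

# Compare a 5-step linear prediction in lifted space with the true rollout.
x0 = np.array([0.8, -0.4])
pred = predict(x0, K, steps=5, n_state=2)
true, x = [], x0
for _ in range(5):
    x = step(x)
    true.append(x)
true = np.array(true)
```

For this toy system the state and the square of its first component evolve linearly among themselves, so the linear prediction of the state matches the true rollout up to numerical error. For general dynamics a fixed dictionary only gives an approximation, which is the motivation for learning the features.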
Learning to Recognize Actions from Limited Training Examples Using a Recurrent Spiking Neural Model
A fundamental challenge in machine learning today is to build a model that
can learn from few examples. Here, we describe a reservoir based spiking neural
model for learning to recognize actions with a limited number of labeled
videos. First, we propose a novel encoding, inspired by how microsaccades
influence visual perception, to extract spike information from raw video data
while preserving the temporal correlation across different frames. Using this
encoding, we show that the reservoir generalizes its rich dynamical activity
toward signature action/movements enabling it to learn from few training
examples. We evaluate our approach on the UCF-101 dataset. Our experiments
demonstrate that our proposed reservoir achieves 81.3%/87% Top-1/Top-5
accuracy, respectively, on the 101-class data while requiring just 8 video
examples per class for training. Our results establish a new benchmark for
action recognition from limited video examples for spiking neural models while
yielding competitive accuracy with respect to state-of-the-art non-spiking
neural models.
Comment: 13 figures (includes supplementary information
The Green Choice: Learning and Influencing Human Decisions on Shared Roads
Autonomous vehicles have the potential to increase the capacity of roads via
platooning, even when human drivers and autonomous vehicles share roads.
However, when users of a road network choose their routes selfishly, the
resulting traffic configuration may be very inefficient. Because of this, we
consider how to influence human decisions so as to decrease congestion on these
roads. We consider a network of parallel roads with two modes of
transportation: (i) human drivers who will choose the quickest route available
to them, and (ii) a ride-hailing service that provides an array of autonomous
vehicle ride options, each with different prices, to users. In this work, we
seek to design these prices so that when autonomous service users choose from
these options and human drivers selfishly choose their resulting routes, road
usage is maximized and transit delay is minimized. To do so, we formalize a
model of how autonomous service users make choices between routes with
different price/delay values. Developing a preference-based algorithm to learn
the preferences of the users, and using a vehicle flow model related to the
Fundamental Diagram of Traffic, we formulate a planning optimization to
maximize a social objective and demonstrate the benefit of the proposed routing
and learning scheme.
Comment: Submitted to CDC 201