525 research outputs found

    Automating Vehicles by Deep Reinforcement Learning using Task Separation with Hill Climbing

    Full text link
    Within the context of autonomous driving, a model-based reinforcement learning algorithm is proposed for the design of neural network-parameterized controllers. Classical model-based control methods, which include sampling- and lattice-based algorithms and model predictive control, suffer from a trade-off between model complexity and the computational burden of solving expensive optimization or search problems online at every short sampling time. To circumvent this trade-off, a two-step procedure is motivated: a controller is first learned offline from an arbitrarily complex mathematical system model, and is then evaluated online by fast feedforward passes. The contribution of this paper is a simple, gradient-free, model-based algorithm for deep reinforcement learning using task separation with hill climbing (TSHC). In particular, the paper advocates (i) simultaneous training on separate deterministic tasks to encode many motion primitives in a neural network, and (ii) maximally sparse rewards combined with virtual velocity constraints (VVCs) in setpoint proximity.
    Comment: 10 pages, 6 figures, 1 table
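    To make the training scheme concrete, here is a minimal sketch of gradient-free hill climbing over the weights of a small feedforward controller, trained jointly on several deterministic tasks with a maximally sparse reward. The `simulate` rollout function, the tiny policy architecture, and the acceptance rule are illustrative assumptions, not the paper's exact TSHC procedure.

```python
# Minimal sketch of gradient-free hill climbing over controller weights,
# in the spirit of TSHC; `simulate`, the policy size, and the sparse
# reward are illustrative assumptions, not the paper's exact setup.
import numpy as np

def sparse_reward(final_state, goal, tol=0.1):
    """Maximally sparse reward: success counts only in setpoint proximity."""
    return 1.0 if np.linalg.norm(final_state - goal) < tol else 0.0

def policy(theta, state, hidden=8):
    """Tiny feedforward controller parameterized by a flat weight vector."""
    n = state.size
    W1 = theta[:n * hidden].reshape(hidden, n)
    W2 = theta[n * hidden:].reshape(1, hidden)
    return np.tanh(W2 @ np.tanh(W1 @ state))

def hill_climb(simulate, tasks, theta, sigma=0.05, iters=500, seed=0):
    """Keep a random weight perturbation only if the summed task reward
    does not decrease; no gradients are ever computed."""
    rng = np.random.default_rng(seed)
    best = sum(simulate(policy, theta, t) for t in tasks)
    for _ in range(iters):
        cand = theta + sigma * rng.standard_normal(theta.shape)
        score = sum(simulate(policy, cand, t) for t in tasks)
        if score >= best:  # greedy, gradient-free acceptance
            theta, best = cand, score
    return theta
```

    Here `simulate(policy, theta, task)` would roll out the deterministic system model for one task and return its sparse reward; training on all tasks simultaneously is what encodes several motion primitives in one network.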

    Contrasting Views of Complexity and Their Implications For Network-Centric Infrastructures

    Get PDF
    There exists a widely recognized need to better understand and manage complex “systems of systems,” ranging from biology, ecology, and medicine to network-centric technologies. This is motivating the search for universal laws of highly evolved systems and driving demand for new mathematics and methods that are consistent, integrative, and predictive. However, the theoretical frameworks available today are not merely fragmented but sometimes contradictory and incompatible. We argue that complexity arises in highly evolved biological and technological systems primarily to provide mechanisms that create robustness. Yet this complexity can itself be a source of new fragility, leading to “robust yet fragile” tradeoffs in system design. We focus on the role of robustness and architecture in networked infrastructures, and we highlight recent advances in the theory of distributed control driven by network technologies. This view of complexity in highly organized technological and biological systems is fundamentally different from the dominant perspective in the mainstream sciences, which downplays function, constraints, and tradeoffs, and tends to minimize the role of organization and design.

    Recurrent neural networks: methods and applications to non-linear predictions

    Get PDF
    This thesis deals with recurrent neural networks, a class of artificial neural networks that can learn a generative model of input sequences. The input is mapped, through a feedback loop and a non-linear activation function, into a hidden state, which is then projected into the output space, yielding either a probability distribution or the input for the next time-step. The work consists of two main parts: a theoretical study that aids understanding of the recurrent neural network framework, which is not yet deeply investigated, and applications to non-linear prediction problems, since recurrent neural networks are powerful models suited to several practical tasks in different fields.

    In the theoretical part, we analyse the weaknesses of state-of-the-art models and address them in order to improve the performance of recurrent neural networks. First, we contribute to the understanding of their dynamical properties, highlighting the close relation between the definition of stable limit cycles and the echo state property of an echo state network. We provide sufficient conditions for the convergence of the hidden state to a trajectory that is uniquely determined by the input signal, independently of the initial state. This may help extend the memory of the network and increase its design options. Moreover, we develop a novel approach to the main problem in training recurrent neural networks, the so-called vanishing gradient problem. Our method allows a very simple recurrent neural network to be trained without the gradient vanishing, even after many time-steps. Exploiting the singular value decomposition of the vanishing factors in the gradient together with random matrix theory, we find that the singular values must be confined to a narrow interval, and we derive conditions on their root mean square value. We also improve the efficiency of training by defining a new method to speed up this process: thanks to a least-squares regularization, we can initialize the parameters of the network closer to the minimum, so that fewer epochs of classical training algorithms are needed. It is also possible to train the network entirely with this initialization method, running more iterations of it without losing performance with respect to classical training algorithms, or to use it as a real-time learning algorithm, adjusting the parameters to new data through a single iteration.

    In the last part of this thesis, we apply recurrent neural networks to non-linear prediction problems. We consider the prediction of numerical sequences, estimating the next input by choosing it from a probability distribution. We study an automatic text generation problem, where the next character must be predicted in order to compose words and sentences, and the path prediction of walking mobile users in the central area of a city, expressed as a sequence of crossroads. We then analyse the prediction of video frames, which opens a wide range of applications related to the prediction of movements. We study the collision problem of bouncing balls, taking into account only the sequence of video frames without any knowledge of the physical characteristics of the problem, and the distribution of mobile users over days in a city and in a whole region. Finally, we address the state-of-the-art problem of missing data imputation, analysing the incomplete spectrograms of audio signals. We restore audio signals with missing time-frequency data, demonstrating via numerical experiments that a performance improvement can be achieved using recurrent neural networks.
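    The vanishing gradient analysis above hinges on the per-step Jacobian factors of the recurrence. A small sketch below builds the hidden-state update the abstract describes and inspects the singular values of these factors; the dimensions, weight scales, and random data are illustrative assumptions, not the thesis' experimental setup.

```python
# Sketch of the recurrence h_t = tanh(W h_{t-1} + U x_t) and a check on
# the singular values of the per-step Jacobian factors whose product
# drives vanishing gradients; all sizes and data are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_h, T = 3, 16, 50
W = rng.normal(scale=1.0 / np.sqrt(n_h), size=(n_h, n_h))  # recurrent weights
U = rng.normal(scale=0.5, size=(n_h, n_in))                # input weights

h = np.zeros(n_h)
jacobian_svals = []
for t in range(T):
    x = rng.normal(size=n_in)
    h = np.tanh(W @ h + U @ x)
    # Jacobian of h_t w.r.t. h_{t-1}: diag(1 - h_t^2) @ W
    J = np.diag(1.0 - h ** 2) @ W
    jacobian_svals.append(np.linalg.svd(J, compute_uv=False))

# If the root mean square singular value sits well below 1, gradient
# contributions from distant time-steps shrink geometrically.
rms = np.sqrt(np.mean(np.concatenate(jacobian_svals) ** 2))
print(f"RMS singular value of per-step Jacobians: {rms:.3f}")
```

    Backpropagated gradients contain products of these Jacobians, so confining their singular values to a narrow interval around 1, as the thesis' conditions require, is what keeps long-range gradient signal alive.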

    Latent data augmentation and modular structure for improved generalization

    Full text link
    This thesis explores the nature of generalization in deep learning and several settings in which it fails. In particular, deep neural networks can struggle to generalize in settings with limited data, insufficient supervision, challenging long-range dependencies, or complex structure and subsystems. The thesis examines the nature of these challenges and presents several algorithms that seek to address them. In the first article, we show how training with interpolated hidden states can improve generalization and calibration in deep learning. We also introduce a theory showing how our algorithm, which we call Manifold Mixup, leads to a flattening of the per-class hidden representations, which can be seen as a compression of the information in the hidden states. The second article is related to the first and shows how interpolated examples can be used for semi-supervised learning. In addition to interpolating the input examples, the model’s interpolated predictions are used as targets for these examples. This improves results on standard benchmarks as well as classic 2D toy problems for semi-supervised learning. The third article studies how a recurrent neural network can be divided into multiple modules with different parameters and well-separated hidden states, together with a competition mechanism restricting updates of the hidden states to the subset of modules most relevant at a given time-step. This improves systematic generalization when the pattern distribution changes between the training and evaluation phases, and it also improves generalization in reinforcement learning. In the fourth article, we show that attention can be used to control the flow of information between successive layers in deep networks. This allows each layer to process only the subset of the previously computed layers’ outputs that is most relevant, improving generalization on relational reasoning tasks as well as standard benchmark classification tasks.
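    The hidden-state interpolation of the first article is compact enough to sketch. Below is a minimal Manifold Mixup forward pass in Python/PyTorch: hidden states of two batch permutations are mixed with a Beta-distributed coefficient and the one-hot targets are mixed the same way. The layer choice, architecture, and alpha value are common defaults, assumed here rather than taken from the thesis.

```python
# Minimal sketch of Manifold Mixup: interpolate hidden states of two
# mini-batch permutations and mix the targets identically. Architecture
# and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixupMLP(nn.Module):
    def __init__(self, d_in=20, d_h=64, n_classes=5):
        super().__init__()
        self.f1 = nn.Linear(d_in, d_h)        # layers below the mixing point
        self.f2 = nn.Linear(d_h, n_classes)   # layers above it

    def forward(self, x, y_onehot=None, alpha=2.0):
        h = torch.relu(self.f1(x))
        if y_onehot is None:                  # plain inference path
            return self.f2(h), None
        lam = torch.distributions.Beta(alpha, alpha).sample().item()
        perm = torch.randperm(x.size(0))
        h_mix = lam * h + (1 - lam) * h[perm]           # interpolated hidden states
        y_mix = lam * y_onehot + (1 - lam) * y_onehot[perm]  # interpolated targets
        return self.f2(h_mix), y_mix

model = MixupMLP()
x = torch.randn(32, 20)
y = F.one_hot(torch.randint(0, 5, (32,)), 5).float()
logits, y_mix = model(x, y)
loss = torch.sum(-y_mix * F.log_softmax(logits, dim=1), dim=1).mean()
loss.backward()
```

    Training against the mixed targets encourages linear behaviour between hidden representations of different classes, which is the flattening effect the article's theory describes.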

    Stochastic Motion Planning as Gaussian Variational Inference: Theory and Algorithms

    Full text link
    We consider the motion planning problem under uncertainty and address it using probabilistic inference. A collision-free motion plan with linear stochastic dynamics is modeled by a posterior distribution. Gaussian variational inference optimizes over path distributions to infer this posterior within the family of Gaussian distributions. We propose the Gaussian Variational Inference Motion Planner (GVI-MP) algorithm to solve this Gaussian inference, where a natural gradient paradigm is used to iteratively update the Gaussian distribution and the factorized structure of the joint distribution is leveraged. We show that the direct optimization over the state distributions in GVI-MP is equivalent to solving a stochastic control problem that has a closed-form solution. Starting from this observation, we propose our second algorithm, the Proximal Gradient Covariance Steering Motion Planner (PGCS-MP), to solve the same inference problem in its stochastic control form with terminal constraints. We use a proximal gradient paradigm to solve the linear stochastic control problem with a nonlinear collision cost, where the nonlinear cost is iteratively approximated by quadratic functions and a closed-form solution is obtained by solving a linear covariance steering problem at each iteration. We evaluate the effectiveness and performance of the proposed approaches through extensive experiments on various robot models. The code for this paper can be found at https://github.com/hzyu17/VIMP.
    Comment: 19 pages
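    To illustrate the core idea of fitting a Gaussian to a planning posterior exp(-V), here is a deliberately simplified toy: a Gaussian over a single 2D waypoint is fitted by reparameterization-gradient variational inference against a cost combining goal attraction and a soft obstacle. This is not the paper's natural-gradient GVI-MP over full trajectories; the cost terms, constants, and update rule are illustrative assumptions.

```python
# Toy sketch of Gaussian variational inference for planning: fit
# q = N(mu, diag(sig^2)) over one 2D waypoint to the posterior exp(-V),
# minimizing E_q[V] - entropy(q) by stochastic gradient descent.
# A simplified stand-in for GVI-MP; all constants are illustrative.
import numpy as np

goal = np.array([2.0, 0.0])
obstacle, radius, weight = np.array([1.0, 0.1]), 0.4, 8.0

def grad_V(x):
    """Gradient of V(x) = 0.5||x-goal||^2 + weight*exp(-||x-obs||^2/(2 r^2))."""
    d = x - obstacle
    bump = weight * np.exp(-(d @ d) / (2 * radius**2))
    return (x - goal) - bump * d / radius**2

rng = np.random.default_rng(0)
mu, sig = np.zeros(2), np.ones(2)      # mean and per-axis std of q
for _ in range(2000):
    eps = rng.standard_normal((64, 2))               # reparameterization noise
    g = np.array([grad_V(mu + sig * e) for e in eps])
    grad_mu = g.mean(axis=0)                         # d E_q[V] / d mu
    grad_sig = (g * eps).mean(axis=0) - 1.0 / sig    # includes -entropy term
    mu -= 0.05 * grad_mu
    sig = np.maximum(sig - 0.05 * grad_sig, 1e-3)

print("posterior mean waypoint:", mu, "std:", sig)
```

    The fitted mean is pushed toward the goal while bending away from the obstacle, and the covariance shrinks where the cost is sharply curved; GVI-MP applies this principle jointly over an entire discretized trajectory with natural-gradient updates.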
    • 

    corecore