Automating Vehicles by Deep Reinforcement Learning using Task Separation with Hill Climbing
Within the context of autonomous driving a model-based reinforcement learning
algorithm is proposed for the design of neural network-parameterized
controllers. Classical model-based control methods, which include sampling- and
lattice-based algorithms and model predictive control, suffer from the
trade-off between model complexity and computational burden required for the
online solution of expensive optimization or search problems at every short
sampling time. To circumvent this trade-off, a 2-step procedure is motivated:
first, a controller is learned offline from an arbitrarily complicated
mathematical system model; then, the trained controller is evaluated online
as a fast feedforward mapping. The contribution of this paper is the
proposition of a simple gradient-free and model-based algorithm for deep
reinforcement learning using task separation with hill climbing (TSHC). In
particular, we advocate (i) simultaneous training on separate deterministic
tasks, with the purpose of encoding many motion primitives in a neural
network, and (ii) the use of maximally sparse rewards in combination with
virtual velocity constraints (VVCs) in setpoint proximity.
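As a rough illustration of the training loop the abstract describes, the following is a minimal sketch of gradient-free hill climbing over flat controller parameters, with task separation expressed as a sum of per-task returns. The function `evaluate(params, task)` and all names are hypothetical placeholders, not the paper's implementation:

```python
import numpy as np

def hill_climb(evaluate, tasks, dim, iters=1000, sigma=0.1, seed=0):
    """Gradient-free hill climbing over flat controller parameters.

    evaluate(params, task) -> scalar return (sparse in the paper's setting).
    Training is 'task-separated': a candidate is kept only if it improves
    the summed return over all deterministic tasks simultaneously.
    """
    rng = np.random.default_rng(seed)
    best = rng.normal(0.0, 0.1, size=dim)
    best_score = sum(evaluate(best, t) for t in tasks)
    for _ in range(iters):
        cand = best + sigma * rng.normal(size=dim)   # random perturbation
        score = sum(evaluate(cand, t) for t in tasks)
        if score > best_score:                        # greedy acceptance
            best, best_score = cand, score
    return best, best_score
```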
Contrasting Views of Complexity and Their Implications For Network-Centric Infrastructures
There exists a widely recognized need to better understand
and manage complex “systems of systems,” ranging from
biology, ecology, and medicine to network-centric technologies.
This is motivating the search for universal laws of highly evolved
systems and driving demand for new mathematics and methods
that are consistent, integrative, and predictive. However, the theoretical
frameworks available today are not merely fragmented
but sometimes contradictory and incompatible. We argue that
complexity arises in highly evolved biological and technological
systems primarily to provide mechanisms to create robustness.
However, this complexity itself can be a source of new fragility,
leading to “robust yet fragile” tradeoffs in system design. We
focus on the role of robustness and architecture in networked
infrastructures, and we highlight recent advances in the theory
of distributed control driven by network technologies. This view
of complexity in highly organized technological and biological systems
is fundamentally different from the dominant perspective in
the mainstream sciences, which downplays function, constraints,
and tradeoffs, and tends to minimize the role of organization and
design.
Recurrent neural networks: methods and applications to non-linear predictions
This thesis deals with recurrent neural networks, a particular class of artificial neural networks which can learn a generative model of input sequences. The input is mapped, through a feedback loop and a non-linear activation function, into a hidden state, which is then projected into the output space, obtaining either a probability distribution or the new input for the next time-step.
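To make the recurrence concrete, here is a minimal vanilla-RNN step matching the description above; the weight names (`W_in`, `W_rec`, `W_out`) are illustrative, not the thesis's notation:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_in, W_rec, W_out, b):
    """One step of a vanilla RNN: the input and previous hidden state are
    combined through the feedback loop and a non-linear activation, and
    the hidden state is then projected into the output space."""
    h_t = np.tanh(W_in @ x_t + W_rec @ h_prev + b)  # hidden-state update
    y_t = W_out @ h_t                                # output projection
    return h_t, y_t
```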
This work consists mainly of two parts: a theoretical study that aids understanding of the recurrent neural network framework, which is not yet deeply investigated, and an application to non-linear prediction problems, since recurrent neural networks are powerful models suited to a range of practical tasks in different fields.
In the theoretical part, we analyse the weaknesses of state-of-the-art models and address them in order to improve the performance of a recurrent neural network. First, we contribute to the understanding of the dynamical properties of a recurrent neural network, highlighting the close relation between the definition of stable limit cycles and the echo state property of an echo state network. We provide sufficient conditions for the convergence of the hidden state to a trajectory that is uniquely determined by the input signal, independently of the initial state. This may help extend the memory of the network and broaden its design options.
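As a hedged illustration of the echo state property discussed here, a standard sufficient-condition proxy (not the thesis's own conditions) is to scale a random reservoir so that its spectral radius is below one, which makes the hidden state forget its initial condition and converge to an input-driven trajectory:

```python
import numpy as np

def make_reservoir(n, spectral_radius=0.9, density=0.1, seed=0):
    """Sparse random reservoir rescaled so its spectral radius is below 1,
    a common sufficient-condition proxy for the echo state property: the
    hidden state forgets its initial condition and follows a trajectory
    uniquely determined by the input signal."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n, n)) * (rng.random((n, n)) < density)
    rho = max(abs(np.linalg.eigvals(W)))          # current spectral radius
    return W * (spectral_radius / rho)            # rescale below 1
```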
Moreover, we develop a novel approach to the main problem in training recurrent neural networks, the so-called vanishing gradient problem. Our new method allows us to train a very simple recurrent neural network so that the gradient does not vanish even after many time-steps. Exploiting the singular value decomposition of the vanishing factors in the gradient, together with random matrix theory, we find that the singular values must be confined to a narrow interval, and we derive conditions on their root mean square value.
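A small sketch of the quantity this analysis studies: the singular values of the product of per-step gradient factors of a tanh RNN, whose drift away from one signals vanishing or exploding gradients. The function below is illustrative only:

```python
import numpy as np

def gradient_factor_singular_values(W_rec, hidden_states):
    """Singular values of the product of per-step Jacobians
    J_t = diag(1 - h_t**2) @ W_rec for a tanh RNN.  If they shrink toward 0
    (or blow up) over many time-steps, gradients vanish (or explode);
    keeping them in a narrow interval around 1 preserves the gradient."""
    J = np.eye(W_rec.shape[0])
    for h in hidden_states:
        J = (np.diag(1.0 - h**2) @ W_rec) @ J   # accumulate Jacobians
    return np.linalg.svd(J, compute_uv=False)
```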
Then, we improve the efficiency of training a recurrent neural network by defining a new method that speeds up the process. Using a least-squares regularization, we initialize the parameters of the network closer to the minimum, so that fewer epochs of classical training algorithms are needed. Moreover, the network can be trained entirely with our initialization method by running more iterations of it, without losing performance with respect to classical training algorithms. Finally, it can also serve as a real-time learning algorithm, adjusting the parameters to new data through a single iteration of our initialization.
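A minimal sketch of the least-squares idea, assuming for illustration that the regularized closed-form solve is applied to a readout from collected hidden states (the thesis's method is more general than this):

```python
import numpy as np

def ridge_init(H, Y, lam=1e-2):
    """Ridge-regularized least-squares fit of readout weights from
    collected hidden states H (T x n) to targets Y (T x m).  One such
    closed-form solve places the parameters near a minimum before any
    iterative training, so fewer epochs are needed afterwards."""
    n = H.shape[1]
    return np.linalg.solve(H.T @ H + lam * np.eye(n), H.T @ Y)
```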
In the last part of this thesis, we apply recurrent neural networks to non-linear prediction problems. We consider the prediction of numerical sequences, estimating the next input by drawing it from a probability distribution. We study an automatic text generation problem, where the next character must be predicted in order to compose words and sentences, and the path prediction of mobile users walking in the central area of a city, represented as a sequence of crossroads.
Then, we analyse the prediction of video frames, which opens a wide range of applications related to the prediction of movement. We study the collision problem of bouncing balls, taking into account only the sequence of video frames without any knowledge of the physical characteristics of the problem, and the distribution over days of mobile users in a city and in a whole region.
Finally, we address the state-of-the-art problem of missing data imputation, analysing the incomplete spectrograms of audio signals. We restore audio signals with missing time-frequency data, demonstrating via numerical experiments that recurrent neural networks can deliver a performance improvement.
Latent data augmentation and modular structure for improved generalization
This thesis explores the nature of generalization in deep learning and several settings in which it fails. In particular, deep neural networks can struggle to generalize in settings with limited data, insufficient supervision, challenging long-range dependencies, or complex structure and subsystems. This thesis explores the nature of these challenges for generalization in deep learning and presents several algorithms which seek to address them. In the first article, we show how training with interpolated hidden states can improve generalization and calibration in deep learning. We also introduce a theory showing how our algorithm, which we call Manifold Mixup, leads to a flattening of the per-class hidden representations, which can be seen as a compression of the information in the hidden states. The second article is related to the first and shows how interpolated examples can be used for semi-supervised learning. In addition to interpolating the input examples, the model's interpolated predictions are used as targets for these examples. This improves results on standard benchmarks as well as classic 2D toy problems for semi-supervised learning. The third article studies how a recurrent neural network can be divided into multiple modules with different parameters and well separated hidden states, together with a competition mechanism restricting updates of the hidden states to a subset of the most relevant modules at a specific time-step. This improves systematic generalization when the pattern distribution changes between the training and evaluation phases. It also improves generalization in reinforcement learning. In the fourth article, we show that attention can be used to control the flow of information between successive layers in deep networks. This allows each layer to process only the subset of the previously computed layers' outputs which are most relevant. This improves generalization on relational reasoning tasks as well as standard benchmark classification tasks.
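As a hedged sketch of the interpolation at the heart of the first article, here is the basic Manifold Mixup mixing step applied to a batch of hidden states and one-hot labels; the surrounding training loop and the random choice of layer are omitted:

```python
import numpy as np

def manifold_mixup(h, y, alpha=2.0, rng=None):
    """Mix a batch of hidden states h (B x d) and one-hot labels y (B x c)
    with a Beta(alpha, alpha) coefficient, as in Manifold Mixup.  The mixed
    states are fed through the remaining layers and trained against the
    mixed labels."""
    rng = rng or np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(h))      # random pairing within the batch
    h_mix = lam * h + (1 - lam) * h[perm]
    y_mix = lam * y + (1 - lam) * y[perm]
    return h_mix, y_mix
```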
Stochastic Motion Planning as Gaussian Variational Inference: Theory and Algorithms
We consider the motion planning problem under uncertainty and address it
using probabilistic inference. A collision-free motion plan with linear
stochastic dynamics is modeled by a posterior distribution. Gaussian
variational inference is an optimization over the path distributions to infer
this posterior within the scope of Gaussian distributions. We propose the
Gaussian Variational Inference Motion Planner (GVI-MP) algorithm to solve this
Gaussian
inference, where a natural gradient paradigm is used to iteratively update the
Gaussian distribution, and the factorized structure of the joint distribution
is leveraged. We show that the direct optimization over the state distributions
in GVI-MP is equivalent to solving a stochastic control problem that has a
closed-form
solution. Starting from this observation, we propose our second algorithm,
Proximal Gradient Covariance Steering Motion Planner (PGCS-MP), to solve the
same inference problem in its stochastic control form with terminal
constraints. We use a proximal gradient paradigm to solve the linear stochastic
control with nonlinear collision cost, where the nonlinear cost is iteratively
approximated using quadratic functions and a closed-form solution can be
obtained by solving a linear covariance steering problem at each iteration. We evaluate
the effectiveness and the performance of the proposed approaches through
extensive experiments on various robot models. The code for this paper can be
found at https://github.com/hzyu17/VIMP.
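To illustrate the kind of iteration the abstract describes, here is a generic sketch of one natural-gradient step of Gaussian variational inference, with `grad_V` and `hess_V` as hypothetical callables returning the gradient and Hessian of the negative log-posterior V (the control and collision cost). This is a textbook-style update, not the factorized GVI-MP implementation:

```python
import numpy as np

def gvi_step(mu, Sigma, grad_V, hess_V, step=0.1, n_samples=64, seed=0):
    """One natural-gradient step of Gaussian variational inference for
    q = N(mu, Sigma) approximating p(x) proportional to exp(-V(x)).
    Expectations under q are estimated by Monte Carlo; assumes the
    averaged Hessian keeps the precision matrix positive definite."""
    rng = np.random.default_rng(seed)
    xs = rng.multivariate_normal(mu, Sigma, size=n_samples)
    g = np.mean([grad_V(x) for x in xs], axis=0)   # E_q[grad V]
    H = np.mean([hess_V(x) for x in xs], axis=0)   # E_q[hess V]
    prec = np.linalg.inv(Sigma)
    prec = (1 - step) * prec + step * H            # precision update
    Sigma_new = np.linalg.inv(prec)
    mu_new = mu - step * Sigma_new @ g             # mean update
    return mu_new, Sigma_new
```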
When Can Nonconvex Optimization Problems be Solved with Gradient Descent? A Few Case Studies
Gradient descent and related algorithms are ubiquitously used to solve optimization problems arising in machine learning and signal processing. In many cases, these problems are nonconvex, yet such simple algorithms are still effective. In an attempt to better understand this phenomenon, we study a number of nonconvex problems, proving that they can be solved efficiently with gradient descent. We consider complete, orthogonal dictionary learning and present a geometric analysis that yields efficient convergence rates for gradient descent which hold with high probability. We also show that similar geometric structure is present in other nonconvex problems, such as generalized phase retrieval.
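As an illustration of the generalized phase retrieval case, here is a hedged sketch of gradient descent on the standard intensity loss with a spectral initialization; the step size and iteration count are placeholders, not the dissertation's constants:

```python
import numpy as np

def phase_retrieval_gd(A, y, iters=500, step=0.05):
    """Gradient descent on the nonconvex loss
    f(x) = (1/4m) * sum_i ((a_i^T x)^2 - y_i)^2 for real-valued
    generalized phase retrieval, with a spectral initialization; under
    suitable statistical assumptions on A this recovers the signal up
    to a global sign."""
    m, n = A.shape
    # Spectral init: top eigenvector of (1/m) * sum_i y_i a_i a_i^T,
    # scaled by the norm estimate sqrt(mean(y)).
    Y = (A.T * y) @ A / m
    _, V = np.linalg.eigh(Y)
    x = V[:, -1] * np.sqrt(np.mean(y))
    for _ in range(iters):
        r = (A @ x) ** 2 - y                  # intensity residuals
        grad = (A.T @ (r * (A @ x))) / m      # gradient of f at x
        x = x - step * grad
    return x
```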
Turning next to neural networks, we calculate conditions on certain classes of networks under which signals and gradients propagate through the network in a stable manner during the initial stages of training. Initialization schemes derived from these calculations allow training recurrent networks on long-sequence tasks, and in the case of networks with low-precision activation functions they make explicit a tradeoff between the reduction in precision and the maximal depth of a model that can be trained with gradient descent.
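A minimal sketch of one initialization in this spirit: an orthogonal recurrent matrix has all singular values equal to one, so signal and gradient norms stay roughly constant across time-steps at the start of training. This is a common instance of such a scheme, not necessarily the one derived in the dissertation:

```python
import numpy as np

def stable_rnn_init(n_hidden, n_in, seed=0):
    """Orthogonal recurrent matrix (all singular values equal to 1) plus
    variance-preserving input weights, so that signals and gradients
    propagate with roughly constant norm early in training."""
    rng = np.random.default_rng(seed)
    W_rec, _ = np.linalg.qr(rng.normal(size=(n_hidden, n_hidden)))
    W_in = rng.normal(size=(n_hidden, n_in)) / np.sqrt(n_in)
    return W_rec, W_in
```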
We finally consider manifold classification with a deep feed-forward neural network, for a particularly simple configuration of the manifolds. We provide an end-to-end analysis of the training process, proving that under certain conditions on the architectural hyperparameters of the network, it can successfully classify any point on the manifolds with high probability, given a sufficient number of independent samples from the manifolds, in a timely manner. Our analysis relates the depth and width of the network to its fitting capacity and statistical regularity, respectively, in the early stages of training.