10,620 research outputs found

Importance mixing: Improving sample reuse in evolutionary policy search methods

Authors: Perrin Nicolas, Pourchot Aloïs, Sigaud Olivier
Publication date: 17/08/2018

Deep neuroevolution, that is, evolutionary policy search methods based on deep neural networks, has recently emerged as a competitor to deep reinforcement learning algorithms due to its better parallelization capabilities. However, these methods still suffer from far worse sample efficiency. In this paper we investigate whether a mechanism known as "importance mixing" can significantly improve their sample efficiency. We provide a didactic presentation of importance mixing and explain how it can be extended to reuse more samples. Then, from an empirical comparison based on a simple benchmark, we show that, although importance mixing does provide better sample efficiency, it remains far from the sample efficiency of deep reinforcement learning, while being more stable.

arXiv.org e-Print Archive

Efficient Optimization of Loops and Limits with Randomized Telescoping Sums

Authors: Adams Ryan P., Beatson Alex
Publication date: 01/01/2019

We consider optimization problems in which the objective requires an inner loop with many steps or is the limit of a sequence of increasingly costly approximations. Meta-learning, training recurrent neural networks, and optimization of the solutions to differential equations are all examples of optimization problems with this character. In such problems, it can be expensive to compute the objective function value and its gradient, but truncating the loop or using less accurate approximations can induce biases that damage the overall solution. We propose randomized telescope (RT) gradient estimators, which represent the objective as the sum of a telescoping series and sample linear combinations of terms to provide cheap unbiased gradient estimates. We identify conditions under which RT estimators achieve optimization convergence rates independent of the length of the loop or the required accuracy of the approximation. We also derive a method for tuning RT estimators online to maximize a lower bound on the expected decrease in loss per unit of computation. We evaluate our adaptive RT estimators on a range of applications including meta-optimization of learning rates, variational inference of ODE parameters, and training an LSTM to model long sequences.

arXiv.org e-Print Archive

Princeton University Open Access Repository
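
The telescoping idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it estimates a scalar value rather than a gradient, it uses only a single sampled term, and the names `partial_sum`, `rt_single_sample`, and `normalized`, the toy series, and the sampling weights are all assumptions made for illustration.

```python
import random


def partial_sum(n):
    """Toy 'costly approximation' L_n: the n-term partial sum of sum_k 1/k^2.

    Hypothetical stand-in for an objective whose cost grows with n.
    """
    return sum(1.0 / (k * k) for k in range(1, n + 1))


def rt_single_sample(approx, horizon, probs, rng):
    """Single-term randomized telescope estimate of approx(horizon).

    Uses approx(H) = sum_{n=1..H} delta_n, delta_n = approx(n) - approx(n-1),
    with the convention approx(0) = 0. Draws N with probability probs[N-1]
    and returns delta_N / probs[N-1], whose expectation is approx(H).
    """
    n = rng.choices(range(1, horizon + 1), weights=probs, k=1)[0]
    previous = approx(n - 1) if n > 1 else 0.0
    return (approx(n) - previous) / probs[n - 1]


def normalized(weights):
    """Scale positive weights so they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    horizon = 50
    # Assumed sampling distribution: favors cheap, early terms.
    probs = normalized([n ** -1.5 for n in range(1, horizon + 1)])
    rng = random.Random(0)
    draws = [rt_single_sample(partial_sum, horizon, probs, rng) for _ in range(20000)]
    print("Monte Carlo mean:", sum(draws) / len(draws))
    print("Full computation:", partial_sum(horizon))
```

Because each difference is divided by the probability of drawing it, the average over many draws should approach the full 50-term value, while most individual draws only evaluate short partial sums. How well this works depends on the sampling weights, which are arbitrary here.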

Practical recommendations for gradient-based training of deep architectures

Author: Bengio Yoshua
Publication date: 16/09/2012

Learning algorithms related to artificial neural networks, and in particular those for Deep Learning, may seem to involve many bells and whistles, called hyper-parameters. This chapter is meant as a practical guide with recommendations for some of the most commonly used hyper-parameters, in particular in the context of learning algorithms based on back-propagated gradient and gradient-based optimization. It also discusses how to deal with the fact that more interesting results can be obtained when allowing one to adjust many hyper-parameters. Overall, it describes elements of the practice used to successfully and efficiently train and debug large-scale and often deep multi-layer neural networks. It closes with open questions about the training difficulties observed with deeper architectures.

arXiv.org e-Print Archive

CiteSeerX

Trust-Region Variational Inference with Gaussian Mixture Models

Authors: Arenz O., Neumann G., Zhong M.
Publication venue: Journal of Machine Learning Research
Publication date: 04/08/2020

Many methods for machine learning rely on approximate inference from intractable probability distributions. Variational inference approximates such distributions by tractable models that can be subsequently used for approximate inference. Learning sufficiently accurate approximations requires a rich model family and careful exploration of the relevant modes of the target distribution. We propose a method for learning accurate GMM approximations of intractable probability distributions based on insights from policy search by using information-geometric trust regions for principled exploration. For efficient improvement of the GMM approximation, we derive a lower bound on the corresponding optimization objective enabling us to update the components independently. Our use of the lower bound ensures convergence to a stationary point of the original objective. The number of components is adapted online by adding new components in promising regions and by deleting components with negligible weight. We demonstrate on several domains that we can learn approximations of complex, multimodal distributions with a quality that is unmet by previous variational inference methods, and that the GMM approximation can be used for drawing samples that are on par with samples created by state-of-the-art MCMC samplers while requiring up to three orders of magnitude less computational resources.

arXiv.org e-Print Archive

KITopen

Lagrangian Based Methods for Coherent Structure Detection

Authors: Ma T., Michael R. Allshouse, Tang W., Thomas Peacock, Ulam S.
Publication date: 01/09/2015

There has been a proliferation in the development of Lagrangian analytical methods for detecting coherent structures in fluid flow transport, yielding a variety of qualitatively different approaches. We present a review of four approaches and demonstrate the utility of these methods via their application to the same sample analytic model, the canonical double-gyre flow, highlighting the pros and cons of each approach. Two of the methods, the geometric and probabilistic approaches, are well established and require velocity field data over the time interval of interest to identify particularly important material lines and surfaces, and influential regions, respectively. The other two approaches, implementing tools from cluster and braid theory, seek coherent structures based on limited trajectory data, attempting to partition the flow transport into distinct regions. All four of these approaches share the common trait that they are objective methods, meaning that their results do not depend on the frame of reference used. For each method, we also present a number of example applications ranging from blood flow and chemical reactions to ocean and atmospheric flows. © 2015 AIP Publishing LLC. ONR N000141210665. Center for Nonlinear Dynamic

Crossref

Texas ScholarWorks