
    Importance mixing: Improving sample reuse in evolutionary policy search methods

    Deep neuroevolution, that is, evolutionary policy search based on deep neural networks, has recently emerged as a competitor to deep reinforcement learning algorithms due to its better parallelization capabilities. However, these methods still suffer from far worse sample efficiency. In this paper we investigate whether a mechanism known as "importance mixing" can significantly improve their sample efficiency. We provide a didactic presentation of importance mixing and explain how it can be extended to reuse more samples. Then, through an empirical comparison on a simple benchmark, we show that although importance mixing does improve sample efficiency and makes the method more stable, the result is still far from the sample efficiency of deep reinforcement learning.
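
    Importance mixing, as described in earlier work on natural evolution strategies (Sun et al.), keeps the samples of the previous generation that are still likely under the updated search distribution and replaces only the remainder with fresh draws. The sketch below is a minimal illustration of that acceptance rule, assuming isotropic Gaussian search distributions; the function names and the minimal refresh rate alpha are illustrative choices, and the extended variant studied in the paper is not reproduced.

```python
import numpy as np


def log_gaussian(x, mean, sigma):
    """Log-density of an isotropic Gaussian N(mean, sigma^2 * I) at x."""
    d = x.shape[-1]
    return -0.5 * (np.sum((x - mean) ** 2, axis=-1) / sigma ** 2
                   + d * np.log(2 * np.pi * sigma ** 2))


def importance_mixing(old_pop, old_mean, old_sigma, new_mean, new_sigma,
                      pop_size, alpha=0.01, rng=None):
    """Reuse previous-generation samples that remain likely under the new
    search distribution and fill the rest of the population with fresh draws.
    alpha is a minimal refresh rate that forces some resampling even when the
    distribution has barely moved."""
    rng = np.random.default_rng() if rng is None else rng
    reused = []
    for z in old_pop:
        # Keep an old sample with probability min(1, (1 - alpha) * p_new(z) / p_old(z)).
        ratio = np.exp(log_gaussian(z, new_mean, new_sigma)
                       - log_gaussian(z, old_mean, old_sigma))
        if rng.random() < min(1.0, (1.0 - alpha) * ratio) and len(reused) < pop_size:
            reused.append(z)
    fresh = []
    while len(reused) + len(fresh) < pop_size:
        z = new_mean + new_sigma * rng.standard_normal(new_mean.shape)
        # Accept a fresh sample with probability max(alpha, 1 - p_old(z) / p_new(z)),
        # so regions already covered by reused samples are not over-represented.
        ratio = np.exp(log_gaussian(z, old_mean, old_sigma)
                       - log_gaussian(z, new_mean, new_sigma))
        if rng.random() < max(alpha, 1.0 - ratio):
            fresh.append(z)
    return np.array(reused + fresh), len(fresh)
```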

    Increasing pattern recognition accuracy for chemical sensing by evolutionary based drift compensation

    Artificial olfaction systems, which mimic human olfaction by using arrays of gas chemical sensors combined with pattern recognition methods, represent a potentially low-cost tool in many areas of industry such as perfumery, food and drink production, clinical diagnosis, health and safety, environmental monitoring, and process control. However, successful applications of these systems are still largely limited to specialized laboratories. Sensor drift, i.e., the lack of a sensor's stability over time, still limits deployment in real industrial setups. This paper presents and discusses an evolutionary-based adaptive drift-correction method designed to work with state-of-the-art classification systems. The proposed approach exploits a cutting-edge evolution strategy to iteratively tune the coefficients of a linear transformation that transparently corrects raw sensor measurements, thus mitigating the negative effects of drift. The method learns the optimal correction strategy without relying on models or other assumptions about the behavior of the physical chemical sensors.
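
    As a rough illustration of the idea of correcting the measurements rather than retraining the classifier, the sketch below evolves a linear correction x -> A x + b with a plain (1+1) evolution strategy so that a fixed, pre-trained classifier keeps its accuracy on recently labelled (drifted) samples. The classifier interface (a scikit-learn-style predict), the recent labelled window, and the fitness definition are assumptions made for illustration, not the paper's exact setup.

```python
import numpy as np


def evolve_drift_correction(classifier, X_recent, y_recent,
                            n_iters=500, step=0.05, seed=0):
    """Evolve a linear correction x -> A @ x + b that maximizes the accuracy of
    a fixed, already-trained classifier on recently labelled (drifted) samples.
    A plain (1+1) evolution strategy is used purely for illustration."""
    rng = np.random.default_rng(seed)
    d = X_recent.shape[1]
    A, b = np.eye(d), np.zeros(d)  # start from the identity, i.e. no correction

    def fitness(A_cand, b_cand):
        corrected = X_recent @ A_cand.T + b_cand
        return np.mean(classifier.predict(corrected) == y_recent)

    best = fitness(A, b)
    for _ in range(n_iters):
        A_try = A + step * rng.standard_normal((d, d))
        b_try = b + step * rng.standard_normal(d)
        f_try = fitness(A_try, b_try)
        if f_try >= best:  # keep the mutant if it is at least as good
            A, b, best = A_try, b_try, f_try
    return A, b, best
```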

    Variable Metric Random Pursuit

    We consider unconstrained randomized optimization of smooth convex objective functions in the gradient-free setting. We analyze Random Pursuit (RP) algorithms with fixed (F-RP) and variable metric (V-RP). The algorithms only use zeroth-order information about the objective function and compute an approximate solution by repeated optimization over randomly chosen one-dimensional subspaces. The distribution of search directions is dictated by the chosen metric. Variable Metric RP uses novel variants of a randomized zeroth-order Hessian approximation scheme recently introduced by Leventhal and Lewis (D. Leventhal and A. S. Lewis, Optimization 60(3), 329--345, 2011). Here we present (i) a refined analysis of the expected single-step progress of RP algorithms and their global convergence on (strictly) convex functions and (ii) novel convergence bounds for V-RP on strongly convex functions. We also quantify how well the employed metric needs to match the local geometry of the function in order for the RP algorithms to converge at the best possible rate. Our theoretical results are accompanied by numerical experiments comparing V-RP with the derivative-free schemes CMA-ES, Implicit Filtering, Nelder-Mead, NEWUOA, Pattern-Search, and Nesterov's gradient-free algorithms. Comment: 42 pages, 6 figures, 15 tables, submitted to journal; Version 3: majorly revised second part, i.e., Section 5 and Appendix.
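
    Fixed-metric Random Pursuit is simple to state: repeatedly draw a random direction and approximately minimize the objective along that line, using only function values. The sketch below illustrates that loop, using scipy's scalar minimizer for the line search; the variable-metric version analyzed in the paper additionally shapes the direction distribution with a zeroth-order Hessian estimate and is not reproduced here.

```python
import numpy as np
from scipy.optimize import minimize_scalar


def random_pursuit(f, x0, n_iters=1000, seed=0):
    """Fixed-metric Random Pursuit: at each step, draw a uniformly random unit
    direction and move to the approximate minimizer of f along that
    one-dimensional subspace. Only zeroth-order information about f is used."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        u = rng.standard_normal(x.shape)
        u /= np.linalg.norm(u)                               # random direction on the unit sphere
        t_star = minimize_scalar(lambda t: f(x + t * u)).x   # approximate line search
        x = x + t_star * u
    return x


# Example: an ill-conditioned convex quadratic.
# f = lambda x: x[0] ** 2 + 100.0 * x[1] ** 2
# x_min = random_pursuit(f, x0=np.array([5.0, 5.0]))
```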

    Policy Search in Continuous Action Domains: an Overview

    Continuous action policy search is currently the focus of intensive research, driven both by the recent success of deep reinforcement learning algorithms and by the emergence of competitors based on evolutionary algorithms. In this paper, we present a broad survey of policy search methods, providing a unified perspective on very different approaches, also including Bayesian optimization and directed exploration methods. The main message of this overview concerns the relationships between these families of methods, but we also outline some of the factors underlying the sample efficiency of the various approaches. Comment: Accepted in the Neural Networks journal (Volume 113, May 2019).

    Transfer Learning for Improving Model Predictions in Highly Configurable Software

    Modern software systems are built to be used in dynamic environments, using configuration capabilities to adapt to changes and external uncertainties. In a self-adaptation context, we are often interested in reasoning about the performance of the systems under different configurations. Usually, we learn a black-box model based on real measurements to predict the performance of the system given a specific configuration. However, as modern systems become more complex, there are many configuration parameters that may interact, and we end up having to learn an exponentially large configuration space. Naturally, this does not scale when relying on real measurements in the actual changing environment. We propose a different solution: instead of taking the measurements from the real system, we learn the model using samples from other sources, such as simulators that approximate the performance of the real system at low cost. We define a cost model that transforms the traditional view of model learning into a multi-objective problem that takes into account not only model accuracy but also measurement effort. We evaluate our cost-aware transfer learning solution using real-world configurable software, including (i) a robotic system, (ii) three different stream processing applications, and (iii) a NoSQL database system. The experimental results demonstrate that our approach can achieve (a) high prediction accuracy as well as (b) high model reliability. Comment: To be published in the proceedings of the 12th International Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS'17).
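
    One generic way to realize the idea of learning from cheap simulator samples plus a handful of expensive real measurements is sketched below: fit a performance model on the simulator data, learn a simple correction from the few real measurements, and score the result with a scalarized cost that trades prediction error against measurement effort. This is only an illustration of the cost-aware view; the regressors, the linear correction, and the penalty weight lam are assumptions, not the authors' cost model.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression


def cost_aware_transfer(X_sim, y_sim, X_real, y_real,
                        cost_per_measurement, lam=1.0):
    """Fit a performance model on cheap simulator samples, learn a linear
    correction from a few expensive real measurements, and report a scalarized
    cost that combines prediction error and measurement effort."""
    source = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_sim, y_sim)
    # Learn how the real system deviates from the simulator on the few real points.
    correction = LinearRegression().fit(
        source.predict(X_real).reshape(-1, 1), y_real)

    def predict(X):
        return correction.predict(source.predict(X).reshape(-1, 1))

    # Multi-objective view collapsed to a single score: prediction error on the
    # real samples plus a penalty proportional to the measurement effort spent.
    error = np.mean((predict(X_real) - y_real) ** 2)
    effort = lam * cost_per_measurement * len(X_real)
    return predict, error + effort
```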

    Trust-Region Variational Inference with Gaussian Mixture Models

    Many methods for machine learning rely on approximate inference from intractable probability distributions. Variational inference approximates such distributions by tractable models that can subsequently be used for approximate inference. Learning sufficiently accurate approximations requires a rich model family and careful exploration of the relevant modes of the target distribution. We propose a method for learning accurate Gaussian mixture model (GMM) approximations of intractable probability distributions based on insights from policy search, using information-geometric trust regions for principled exploration. For efficient improvement of the GMM approximation, we derive a lower bound on the corresponding optimization objective that enables us to update the components independently. Our use of the lower bound ensures convergence to a stationary point of the original objective. The number of components is adapted online by adding new components in promising regions and by deleting components with negligible weight. We demonstrate on several domains that we can learn approximations of complex, multimodal distributions with a quality that is unmet by previous variational inference methods, and that the GMM approximation can be used for drawing samples that are on par with samples created by state-of-the-art MCMC samplers while requiring up to three orders of magnitude less computation.
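
    For context, the variational family here is a full Gaussian mixture, and the learned approximation is ultimately used as a cheap sampler for the intractable target. The sketch below only illustrates that representation, i.e., evaluating the mixture log-density and drawing samples once weights, means, and covariances have been learned; the trust-region updates and the online component adaptation from the paper are not reproduced.

```python
import numpy as np
from scipy.stats import multivariate_normal


class GMMApproximation:
    """A learned Gaussian mixture q(x) = sum_k w_k N(x; mu_k, Sigma_k), usable
    both for evaluating an approximate density and for cheap i.i.d. sampling."""

    def __init__(self, weights, means, covs):
        self.weights = np.asarray(weights)   # mixture weights, summing to one
        self.means = [np.asarray(m) for m in means]
        self.covs = [np.asarray(c) for c in covs]

    def log_density(self, x):
        comp = [np.log(w) + multivariate_normal.logpdf(x, m, c)
                for w, m, c in zip(self.weights, self.means, self.covs)]
        return np.logaddexp.reduce(comp)     # log sum_k w_k N(x; mu_k, Sigma_k)

    def sample(self, n, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        ks = rng.choice(len(self.weights), size=n, p=self.weights)
        return np.array([rng.multivariate_normal(self.means[k], self.covs[k])
                         for k in ks])
```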