
    Stochastic Majorization-Minimization Algorithms for Large-Scale Optimization

    Majorization-minimization algorithms consist of iteratively minimizing a majorizing surrogate of an objective function. Because of its simplicity and its wide applicability, this principle has been very popular in statistics and in signal processing. In this paper, we intend to make this principle scalable. We introduce a stochastic majorization-minimization scheme that is able to deal with large-scale or possibly infinite data sets. When applied to convex optimization problems under suitable assumptions, we show that it achieves an expected convergence rate of $O(1/\sqrt{n})$ after $n$ iterations, and of $O(1/n)$ for strongly convex functions. Equally important, our scheme almost surely converges to stationary points for a large class of non-convex problems. We develop several efficient algorithms based on our framework. First, we propose a new stochastic proximal gradient method, which experimentally matches state-of-the-art solvers for large-scale $\ell_1$-logistic regression. Second, we develop an online DC programming algorithm for non-convex sparse estimation. Finally, we demonstrate the effectiveness of our approach for solving large-scale structured matrix factorization problems.
    Comment: accepted for publication at Neural Information Processing Systems (NIPS) 2013. This is the 9-page version followed by 16 pages of appendices. The title has changed compared to the first technical report.
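    A toy sketch may make the scheme concrete. The snippet below, written in the spirit of the abstract, averages quadratic surrogates of the logistic loss over a stream of samples and minimizes the running averaged surrogate in closed form at each step; the data generator, the weight sequence rho_n = 1/n, and the Lipschitz constant L are illustrative assumptions, not the paper's exact choices.

```python
# Minimal sketch of a stochastic majorization-minimization loop with
# quadratic surrogates (illustrative assumptions throughout).
import numpy as np

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])

def draw_sample():
    """Toy data stream: features and a +/-1 label from a noisy linear model."""
    x = rng.normal(size=3)
    y = 1.0 if x @ w_true + 0.1 * rng.normal() > 0 else -1.0
    return x, y

def surrogate_params(w, x, y):
    """Quadratic majorizer of the logistic loss f(w) = log(1 + exp(-y x.w)):
    f(v) <= f(w) + g.(v - w) + (L/2) ||v - w||^2 whenever L >= ||x||^2 / 4."""
    g = -y * x / (1.0 + np.exp(y * (x @ w)))  # gradient of the loss at w
    return g, w.copy()

def stochastic_mm(n_iters=2000, dim=3, L=2.0):
    w = np.zeros(dim)
    g_bar = np.zeros(dim)   # running average of surrogate gradients
    a_bar = np.zeros(dim)   # running average of surrogate anchor points
    for n in range(1, n_iters + 1):
        x, y = draw_sample()
        g, a = surrogate_params(w, x, y)
        rho = 1.0 / n       # weight of the newest surrogate in the average
        g_bar = (1 - rho) * g_bar + rho * g
        a_bar = (1 - rho) * a_bar + rho * a
        # The averaged surrogate is quadratic, so its minimizer is closed form.
        w = a_bar - g_bar / L
    return w

print(stochastic_mm())  # should roughly align with the direction of w_true
```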

    A Stochastic Majorize-Minimize Subspace Algorithm for Online Penalized Least Squares Estimation

    Stochastic approximation techniques play an important role in solving many problems encountered in machine learning or adaptive signal processing. In these contexts, the statistics of the data are often unknown a priori, or their direct computation is too intensive, and they must thus be estimated online from the observed signals. For batch optimization of an objective function that is the sum of a data fidelity term and a penalization (e.g. a sparsity-promoting function), Majorize-Minimize (MM) methods have recently attracted much interest since they are fast, highly flexible, and effective in ensuring convergence. The goal of this paper is to show how these methods can be successfully extended to the case when the data fidelity term corresponds to a least squares criterion and the cost function is replaced by a sequence of stochastic approximations of it. In this context, we propose an online version of an MM subspace algorithm and study its convergence using suitable probabilistic tools. Simulation results illustrate the good practical performance of the proposed algorithm, associated with a memory gradient subspace, when applied to both non-adaptive and adaptive filter identification problems.
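    To make the subspace idea concrete, here is a minimal batch sketch of a single Majorize-Minimize step restricted to a memory-gradient subspace, for a least squares data fidelity term plus a smooth sparsity-promoting penalty. The hyperbolic penalty, its half-quadratic majorant, and all constants are assumptions for illustration; this is not the paper's online algorithm.

```python
# One MM step over the memory-gradient subspace span{-grad f, x - x_prev}
# for f(x) = ||y - Hx||^2 + lam * sum_i sqrt(x_i^2 + delta^2).
import numpy as np

def mm_memory_gradient_step(x, x_prev, H, y, lam, delta=1e-3):
    r = H @ x - y
    w = 1.0 / np.sqrt(x**2 + delta**2)        # curvature weights of the majorant
    grad = 2.0 * H.T @ r + lam * x * w        # gradient of f at x
    # A quadratic majorant of f at x has Hessian A = 2 H^T H + lam * diag(w);
    # we minimize it only over x + span(D), a 2-D problem.
    D = np.column_stack([-grad, x - x_prev])  # memory-gradient subspace
    AD = 2.0 * H.T @ (H @ D) + lam * (w[:, None] * D)
    # lstsq copes with the singular first step, where x - x_prev = 0.
    u = np.linalg.lstsq(D.T @ AD, -D.T @ grad, rcond=None)[0]
    return x + D @ u

# Tiny demo on an ill-conditioned sparse regression-style problem.
H = np.vander(np.linspace(0.0, 1.0, 20), 8)
y = H @ np.array([1, 0, 0, -2, 0, 0, 0, 3.0])
x = np.zeros(8)
x_prev = x.copy()
for _ in range(200):
    x, x_prev = mm_memory_gradient_step(x, x_prev, H, y, lam=0.1), x
print(np.round(x, 2))
```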

    Classical Optimizers for Noisy Intermediate-Scale Quantum Devices

    We present a collection of optimizers tuned for use on Noisy Intermediate-Scale Quantum (NISQ) devices. Optimizers have a range of applications in quantum computing, including the Variational Quantum Eigensolver (VQE) and Quantum Approximate Optimization Algorithm (QAOA), as well as calibration tasks, hyperparameter tuning, and machine learning. We analyze the efficiency and effectiveness of different optimizers in a VQE case study. VQE is a hybrid algorithm, with a classical minimizer step driving the next evaluation on the quantum processor. While most results to date have concentrated on tuning the quantum VQE circuit, we show that, in the presence of quantum noise, the classical minimizer step needs to be carefully chosen to obtain correct results. We explore state-of-the-art gradient-free optimizers capable of handling noisy black-box cost functions and stress-test them using a quantum circuit simulation environment with noise injection capabilities on individual gates. Our results indicate that specifically tuned optimizers are crucial to obtaining valid science results on NISQ hardware, and will likely remain necessary even for future fault-tolerant circuits.
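    The hybrid loop itself is compact: a gradient-free classical optimizer repeatedly queries a noisy black-box energy. In the sketch below the quantum expectation value is mocked by a toy landscape plus Gaussian shot noise, and SciPy's COBYLA stands in for the tuned optimizers; the landscape, the noise model, and the optimizer settings are assumptions for illustration, not the paper's setup.

```python
# Sketch of a VQE-style hybrid loop: classical minimizer + noisy black box.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

def noisy_energy(theta, shots=1024):
    """Stand-in for the quantum estimate of <psi(theta)|H|psi(theta)>:
    an exact toy landscape corrupted by finite-shot sampling noise."""
    exact = 1.0 - np.cos(theta[0]) * np.cos(theta[1])
    return exact + rng.normal(scale=1.0 / np.sqrt(shots))

# A gradient-free method such as COBYLA tolerates the stochastic cost;
# finite-difference gradients (e.g. in BFGS) would amplify the shot noise.
result = minimize(noisy_energy, x0=np.array([0.8, -0.7]), method="COBYLA",
                  options={"maxiter": 200, "rhobeg": 0.5})
print(result.x, result.fun)  # minimizer near theta = (0, 0), energy near 0
```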

    Dagstuhl Reports: Volume 1, Issue 2, February 2011

    Online Privacy: Towards Informational Self-Determination on the Internet (Dagstuhl Perspectives Workshop 11061): Simone Fischer-Hübner, Chris Hoofnagle, Kai Rannenberg, Michael Waidner, Ioannis Krontiris and Michael Marhöfer
    Self-Repairing Programs (Dagstuhl Seminar 11062): Mauro Pezzé, Martin C. Rinard, Westley Weimer and Andreas Zeller
    Theory and Applications of Graph Searching Problems (Dagstuhl Seminar 11071): Fedor V. Fomin, Pierre Fraigniaud, Stephan Kreutzer and Dimitrios M. Thilikos
    Combinatorial and Algorithmic Aspects of Sequence Processing (Dagstuhl Seminar 11081): Maxime Crochemore, Lila Kari, Mehryar Mohri and Dirk Nowotka
    Packing and Scheduling Algorithms for Information and Communication Services (Dagstuhl Seminar 11091): Klaus Jansen, Claire Mathieu, Hadas Shachnai and Neal E. Young

    Multi-objective optimization involving function approximators via Gaussian processes and hybrid algorithms that employ direct hypervolume optimization

    Advisor: Fernando José Von Zuben. Doctoral thesis, Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
    The main purpose of this thesis is to bridge the gap between single-objective and multi-objective optimization and to show that connecting techniques from both ends can lead to improved results. To reach this goal, we provide contributions in three directions. First, we show the connection between optimality of a mean loss and of the hypervolume when evaluating a single solution, proving optimality bounds when the solution from one is applied to the other. Furthermore, an evaluation of the gradient of the hypervolume shows that it can be interpreted as a particular case of the weighted mean loss, where the weights increase as their associated losses increase. We hypothesize that this can help to train machine learning models, since samples with high error will also have high weight. An experiment with a neural network validates the hypothesis, showing improved performance. Second, we evaluate previous attempts at using gradient-based hypervolume optimization to solve multi-objective problems and analyze why they failed. Based on this analysis, we propose a hybrid algorithm that combines gradient-based and evolutionary optimization. Experiments on the ZDT benchmark functions show improved performance and faster convergence compared with reference evolutionary algorithms.
    Finally, we prove necessary and sufficient conditions for a function to describe a valid Pareto frontier. Based on this result, we adapt a Gaussian process to penalize violations of the conditions and show that it provides better estimates than other approximation algorithms. In particular, it creates a curve that violates the constraints less than algorithms that do not consider the conditions do, making it a more reliable performance indicator. We also show that a common optimization metric, when approximating functions with Gaussian processes, is a good indicator of the regions an algorithm should explore to find the Pareto frontier.
    Doctorate, Computer Engineering; Doctor in Electrical Engineering. Funding: grant 2015/09199-0 (FAPESP), CAPES.
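    One way to read the "hypervolume gradient as a weighted mean loss" observation is through the single-solution case, where the dominated hypervolume factorizes across objectives. The sketch below treats per-sample losses as the objectives and differentiates the negative log-hypervolume; the reference point, the normalization of losses to [0, 1), and the log transform are assumptions made for illustration, not the thesis's derivation.

```python
# Hypervolume of a single solution and the per-sample weights induced by
# its (log-)gradient: high-loss samples receive the largest weights.
import numpy as np

def hypervolume_single(losses, r):
    """Volume dominated by one solution: prod_i (r_i - losses_i)."""
    return np.prod(r - losses)

def log_hv_weights(losses, r):
    """Gradient of -log HV w.r.t. the losses is 1 / (r - losses), so each
    sample's weight grows as its loss approaches the reference point."""
    return 1.0 / (r - losses)

losses = np.array([0.1, 0.5, 0.9])    # per-sample losses of one model
r = np.ones_like(losses)              # reference point (assumed)
print(hypervolume_single(losses, r))  # 0.9 * 0.5 * 0.1 = 0.045
print(log_hv_weights(losses, r))      # [1.11, 2.0, 10.0]
```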

    A Survey of Contextual Optimization Methods for Decision Making under Uncertainty

    Recently there has been a surge of interest in the operations research (OR) and machine learning (ML) communities in combining prediction algorithms and optimization techniques to solve decision-making problems in the face of uncertainty. This gave rise to the field of contextual optimization, under which data-driven procedures are developed to prescribe actions to the decision-maker that make the best use of the most recently updated information. A large variety of models and methods have been presented in both the OR and ML literature under a variety of names, including data-driven optimization, prescriptive optimization, predictive stochastic programming, policy optimization, (smart) predict/estimate-then-optimize, decision-focused learning, and (task-based) end-to-end learning/forecasting/optimization. Focusing on single-stage and two-stage stochastic programming problems, this review article identifies three main frameworks for learning policies from data and discusses their strengths and limitations. We present the existing models and methods under a uniform notation and terminology and classify them according to the three frameworks identified. Our objective with this survey is both to strengthen the general understanding of this active field of research and to stimulate further theoretical and algorithmic advancements in integrating ML and stochastic programming.
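    For concreteness, a minimal two-step estimate-then-optimize pipeline is sketched below on a toy newsvendor problem: a regression model predicts demand from context, and the prediction is plugged into the downstream ordering decision. The linear model, the cost parameters, and the known residual scale are assumptions for illustration; the survey's point is precisely that such decoupled pipelines can be suboptimal relative to decision-focused (end-to-end) training.

```python
# Two-step estimate-then-optimize on a contextual newsvendor problem.
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Stage 1: learn to predict the uncertain parameter (demand) from context.
X = rng.normal(size=(500, 3))
demand = 10.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=500)
model = LinearRegression().fit(X, demand)

# Stage 2: plug the prediction into the decision problem. With holding cost
# h and shortage cost b, the optimal order is the b/(b+h) quantile of the
# demand distribution, so a decision-aware rule shifts the point forecast.
def order_quantity(x_new, h=1.0, b=4.0, sigma=1.0):
    critical_ratio = b / (b + h)                     # here 0.8
    point = model.predict(x_new[None, :])[0]
    return point + sigma * norm.ppf(critical_ratio)  # assumed Gaussian residual

print(order_quantity(np.zeros(3)))  # about 10 + 0.84
```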

    Machine Learning for Fluid Mechanics

    The field of fluid mechanics is rapidly advancing, driven by unprecedented volumes of data from field measurements, experiments, and large-scale simulations at multiple spatiotemporal scales. Machine learning offers a wealth of techniques to extract information from data that could be translated into knowledge about the underlying fluid mechanics. Moreover, machine learning algorithms can augment domain knowledge and automate tasks related to flow control and optimization. This article presents an overview of the history, current developments, and emerging opportunities of machine learning for fluid mechanics. It outlines fundamental machine learning methodologies and discusses their uses for understanding, modeling, optimizing, and controlling fluid flows. The strengths and limitations of these methods are addressed from the perspective of scientific inquiry that considers data as an inherent part of modeling, experimentation, and simulation. Machine learning provides a powerful information processing framework that can enrich, and possibly even transform, current lines of fluid mechanics research and industrial applications.
    Comment: To appear in the Annual Review of Fluid Mechanics, 2020.