    Sensitivity of trust-region algorithms to their parameters

    In this paper, we examine the sensitivity of trust-region algorithms to the parameters governing step acceptance and the update of the trust region. We show, in the context of unconstrained programming, that the numerical efficiency of these algorithms can easily be improved by choosing appropriate parameters. Recommended ranges of values for these parameters, based on extensive numerical tests, are exhibited. © Springer-Verlag 2005
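    The acceptance test and radius update that the paper varies are controlled by a handful of threshold and scaling constants. As a minimal sketch (the parameter names and default values below are common textbook choices, not the paper's recommended ranges), one iteration of such an algorithm might look like:

    ```python
    import numpy as np

    def trust_region_iteration(f, grad, hess, x, delta,
                               eta1=0.1, eta2=0.75, gamma1=0.5, gamma2=2.0):
        """One trust-region iteration. eta1/eta2 (acceptance thresholds) and
        gamma1/gamma2 (radius scaling) are the kind of parameters whose
        sensitivity the paper studies; defaults are generic textbook values."""
        g, H = grad(x), hess(x)
        gnorm = np.linalg.norm(g)
        # Cauchy-point step: minimize the quadratic model along -g within the
        # region (real solvers use dogleg or near-exact subproblem solves).
        gHg = g @ H @ g
        tau = 1.0 if gHg <= 0 else min(1.0, gnorm**3 / (delta * gHg))
        s = -tau * (delta / gnorm) * g
        pred = -(g @ s + 0.5 * s @ H @ s)   # reduction predicted by the model
        rho = (f(x) - f(x + s)) / pred      # agreement between model and f
        if rho >= eta1:                     # acceptable step: move
            x = x + s
            if rho >= eta2:                 # very good step: enlarge region
                delta *= gamma2
        else:                               # poor step: reject and shrink
            delta *= gamma1
        return x, delta
    ```

    The ratio rho drives both the acceptance decision and the radius update, which is why the four constants interact and why tuning them can pay off.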

    Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study

    While first-order optimization methods such as stochastic gradient descent (SGD) are popular in machine learning (ML), they come with well-known deficiencies, including relatively slow convergence, sensitivity to hyper-parameter settings such as the learning rate, stagnation at high training errors, and difficulty in escaping flat regions and saddle points. These issues are particularly acute in highly non-convex settings such as those arising in neural networks. Motivated by this, there has been recent interest in second-order methods that aim to alleviate these shortcomings by capturing curvature information. In this paper, we report detailed empirical evaluations of a class of Newton-type methods, namely sub-sampled variants of trust region (TR) and adaptive regularization with cubics (ARC) algorithms, for non-convex ML problems. In doing so, we demonstrate that these methods are not only computationally competitive with hand-tuned SGD with momentum, obtaining comparable or better generalization performance, but also highly robust to hyper-parameter settings. Further, in contrast to SGD with momentum, we show that the manner in which these Newton-type methods employ curvature information allows them to seamlessly escape flat regions and saddle points.
    Comment: 21 pages, 11 figures.
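    The sub-sampling at the heart of these TR/ARC variants replaces full-data curvature with Hessian-vector products computed on a random minibatch. A minimal sketch, assuming a finite-sum loss and a hypothetical per-example routine `hess_vec_i` (not from the paper):

    ```python
    import numpy as np

    def subsampled_hvp(hess_vec_i, params, v, n, batch=256, rng=None):
        """Hessian-vector product estimated on a random subsample, the core
        ingredient of sub-sampled TR and ARC. `hess_vec_i(i, params, v)` is a
        hypothetical routine returning H_i(params) @ v for training example i."""
        rng = rng or np.random.default_rng()
        idx = rng.choice(n, size=min(batch, n), replace=False)
        return sum(hess_vec_i(i, params, v) for i in idx) / len(idx)
    ```

    A TR or ARC subproblem solver (e.g., CG or Lanczos) only ever queries such products, so the full Hessian is never formed.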

    Discretizing Continuous Action Space for On-Policy Optimization

    In this work, we show that discretizing the action space for continuous control is a simple yet powerful technique for on-policy optimization. The explosion in the number of discrete actions can be efficiently addressed by a policy with a factorized distribution across action dimensions. We show that the discrete policy achieves significant performance gains with state-of-the-art on-policy optimization algorithms (PPO, TRPO, ACKTR), especially on high-dimensional tasks with complex dynamics. Additionally, we show that an ordinal parameterization of the discrete distribution can introduce an inductive bias that encodes the natural ordering between discrete actions. This ordinal architecture yields further significant improvements for PPO/TRPO.
    Comment: Accepted at the AAAI Conference on Artificial Intelligence (2020) in New York, NY, USA. An open-source implementation can be found at https://github.com/robintyh1/onpolicybaseline
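    A factorized discrete policy keeps D independent K-way heads instead of one K**D-way joint distribution, so joint log-probabilities simply sum across action dimensions. A minimal PyTorch sketch (the architecture sizes and the [-1, 1] bin grid are assumptions, not the paper's exact setup):

    ```python
    import torch
    from torch import nn
    from torch.distributions import Categorical

    class FactorizedDiscretePolicy(nn.Module):
        """Continuous control via per-dimension discretization: each action
        dimension gets its own K-way categorical head, so parameters grow
        as D*K rather than K**D."""
        def __init__(self, obs_dim, act_dim, bins=11, hidden=64):
            super().__init__()
            self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh())
            self.heads = nn.ModuleList(nn.Linear(hidden, bins)
                                       for _ in range(act_dim))
            # Map bin index -> continuous action in [-1, 1].
            self.register_buffer("grid", torch.linspace(-1.0, 1.0, bins))

        def forward(self, obs):
            h = self.body(obs)
            dists = [Categorical(logits=head(h)) for head in self.heads]
            idx = torch.stack([d.sample() for d in dists], dim=-1)
            # Joint log-prob factorizes into a sum over dimensions.
            logp = sum(d.log_prob(i) for d, i in zip(dists, idx.unbind(-1)))
            action = self.grid[idx]   # continuous action to execute
            return action, logp      # logp feeds the PPO/TRPO objective
    ```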

    Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics

    Value-based reinforcement-learning algorithms provide state-of-the-art results in model-free discrete-action settings, and tend to outperform actor-critic algorithms. We argue that actor-critic algorithms are limited by their need for an on-policy critic. We propose Bootstrapped Dual Policy Iteration (BDPI), a novel model-free reinforcement-learning algorithm for continuous states and discrete actions, with an actor and several off-policy critics. Off-policy critics are compatible with experience replay, ensuring high sample-efficiency without the need for off-policy corrections. By slowly imitating the average greedy policy of the critics, the actor achieves high-quality, state-specific exploration, which we compare to Thompson sampling. Because the actor and critics are fully decoupled, BDPI is remarkably stable and unusually robust to its hyper-parameters. BDPI is significantly more sample-efficient than Bootstrapped DQN, PPO, and ACKTR on discrete, continuous, and pixel-based tasks. Source code: https://github.com/vub-ai-lab/bdpi.
    Comment: Accepted at the European Conference on Machine Learning 2019 (ECML).
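    The actor update described above can be pictured, for a single state, as a slow interpolation toward the critics' average greedy policy. The sketch below is only in the spirit of BDPI; the array shapes, learning rate, and plain convex-combination rule are illustrative, not the paper's exact update:

    ```python
    import numpy as np

    def actor_imitation_step(actor_probs, critics_q, lr=0.05):
        """Move the actor's action distribution (for one state) toward the
        average greedy policy of the off-policy critics."""
        # critics_q: [n_critics, n_actions]; one-hot greedy policy per critic.
        greedy = np.zeros_like(critics_q)
        greedy[np.arange(len(critics_q)), critics_q.argmax(axis=1)] = 1.0
        target = greedy.mean(axis=0)                  # average greedy policy
        new_probs = (1 - lr) * actor_probs + lr * target
        return new_probs / new_probs.sum()            # stays a distribution
    ```

    Disagreement among the critics leaves probability mass spread over several actions, which is the Thompson-sampling-like exploration the abstract alludes to.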

    Classical Optimizers for Noisy Intermediate-Scale Quantum Devices

    We present a collection of optimizers tuned for usage on Noisy Intermediate-Scale Quantum (NISQ) devices. Optimizers have a range of applications in quantum computing, including the Variational Quantum Eigensolver (VQE) and Quantum Approximate Optimization Algorithm (QAOA). They are also used for calibration tasks, for hyperparameter tuning in machine learning, and more. We analyze the efficiency and effectiveness of different optimizers in a VQE case study. VQE is a hybrid algorithm, with a classical minimizer step driving the next evaluation on the quantum processor. While most results to date have concentrated on tuning the quantum VQE circuit, we show that, in the presence of quantum noise, the classical minimizer step needs to be carefully chosen to obtain correct results. We explore state-of-the-art gradient-free optimizers capable of handling noisy, black-box cost functions and stress-test them using a quantum circuit simulation environment with noise injection capabilities on individual gates. Our results indicate that specifically tuned optimizers are crucial to obtaining valid science results on NISQ hardware, and will likely remain necessary even for future fault-tolerant circuits.
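    The stress-test setting can be reproduced in miniature with any gradient-free optimizer and an artificially noisy cost function; the toy landscape and noise model below are illustrative, not the paper's benchmark:

    ```python
    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(0)

    def noisy_energy(theta, sigma=0.02):
        """Stand-in for a VQE cost: a smooth landscape plus shot/gate noise."""
        clean = np.sum(np.sin(theta) ** 2)          # toy 'energy' surface
        return clean + sigma * rng.standard_normal()

    # Gradient-free methods such as COBYLA tolerate noisy, black-box costs,
    # whereas finite-difference gradients are corrupted by the noise term.
    theta0 = rng.uniform(-np.pi, np.pi, size=4)
    result = minimize(noisy_energy, theta0, method="COBYLA",
                      options={"maxiter": 200, "rhobeg": 0.5})
    print(result.x, result.fun)
    ```

    Sweeping sigma in this setup shows how quickly an untuned optimizer starts returning invalid minima, which is the paper's central point.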

    Static Pricing Problems under Mixed Multinomial Logit Demand

    Price differentiation is a common strategy for many transport operators. In this paper, we study a static multiproduct price optimization problem with demand given by a continuous mixed multinomial logit model. To solve this new problem, we design an efficient iterative optimization algorithm that asymptotically converges to the optimal solution. To this end, a linear optimization (LO) problem is formulated, based on the trust-region approach, to find a "good" feasible solution and approximate the problem from below. Another LO problem is designed using piecewise linear relaxations to approximate the optimization problem from above. Then, we develop a new branching method to tighten the optimality gap. Numerical experiments show the effectiveness of our method on a published, non-trivial parking choice model.
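    The objective being bounded from below and above is the expected revenue under a mixed logit, which has no closed form and is typically approximated by Monte Carlo over taste draws. A minimal sketch of that objective (the array layout and names are assumptions, not the paper's notation):

    ```python
    import numpy as np

    def mixed_mnl_revenue(prices, utility_draws, price_sens_draws):
        """Monte Carlo estimate of expected revenue under a continuous mixed
        multinomial logit. utility_draws[r, j] is the non-price utility of
        product j under taste draw r; price_sens_draws[r] is that draw's
        price sensitivity."""
        v = utility_draws - price_sens_draws[:, None] * prices[None, :]
        expv = np.exp(v)
        # Include the no-purchase option with utility normalized to 0.
        probs = expv / (1.0 + expv.sum(axis=1, keepdims=True))
        # Revenue per draw, averaged over the taste distribution.
        return (probs * prices[None, :]).sum(axis=1).mean()
    ```

    The paper's contribution lies in the LO bounds and branching scheme around this non-concave objective; the sketch only shows what a candidate price vector is scored against.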