53,119 research outputs found
Sensitivity of trust-region algorithms to their parameters
In this paper, we examine the sensitivity of trust-region algorithms on the parameters related to the step acceptance and update of the trust region. We show, in the context of unconstrained programming, that the numerical efficiency of these algorithms can easily be improved by choosing appropriate parameters. Recommended ranges of values for these parameters are exhibited on the basis of extensive numerical tests. © Springer-Verlag 2005
Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study
While first-order optimization methods such as stochastic gradient descent
(SGD) are popular in machine learning (ML), they come with well-known
deficiencies, including relatively-slow convergence, sensitivity to the
settings of hyper-parameters such as learning rate, stagnation at high training
errors, and difficulty in escaping flat regions and saddle points. These issues
are particularly acute in highly non-convex settings such as those arising in
neural networks. Motivated by this, there has been recent interest in
second-order methods that aim to alleviate these shortcomings by capturing
curvature information. In this paper, we report detailed empirical evaluations
of a class of Newton-type methods, namely sub-sampled variants of trust region
(TR) and adaptive regularization with cubics (ARC) algorithms, for non-convex
ML problems. In doing so, we demonstrate that these methods not only can be
computationally competitive with hand-tuned SGD with momentum, obtaining
comparable or better generalization performance, but also they are highly
robust to hyper-parameter settings. Further, in contrast to SGD with momentum,
we show that the manner in which these Newton-type methods employ curvature
information allows them to seamlessly escape flat regions and saddle points.Comment: 21 pages, 11 figures. Restructure the paper and add experiment
Discretizing Continuous Action Space for On-Policy Optimization
In this work, we show that discretizing action space for continuous control
is a simple yet powerful technique for on-policy optimization. The explosion in
the number of discrete actions can be efficiently addressed by a policy with
factorized distribution across action dimensions. We show that the discrete
policy achieves significant performance gains with state-of-the-art on-policy
optimization algorithms (PPO, TRPO, ACKTR) especially on high-dimensional tasks
with complex dynamics. Additionally, we show that an ordinal parameterization
of the discrete distribution can introduce the inductive bias that encodes the
natural ordering between discrete actions. This ordinal architecture further
significantly improves the performance of PPO/TRPO.Comment: Accepted at AAAI Conference on Artificial Intelligence (2020) in New
York, NY, USA. An open source implementation can be found at
https://github.com/robintyh1/onpolicybaseline
Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics
Value-based reinforcement-learning algorithms provide state-of-the-art
results in model-free discrete-action settings, and tend to outperform
actor-critic algorithms. We argue that actor-critic algorithms are limited by
their need for an on-policy critic. We propose Bootstrapped Dual Policy
Iteration (BDPI), a novel model-free reinforcement-learning algorithm for
continuous states and discrete actions, with an actor and several off-policy
critics. Off-policy critics are compatible with experience replay, ensuring
high sample-efficiency, without the need for off-policy corrections. The actor,
by slowly imitating the average greedy policy of the critics, leads to
high-quality and state-specific exploration, which we compare to Thompson
sampling. Because the actor and critics are fully decoupled, BDPI is remarkably
stable, and unusually robust to its hyper-parameters. BDPI is significantly
more sample-efficient than Bootstrapped DQN, PPO, and ACKTR, on discrete,
continuous and pixel-based tasks. Source code:
https://github.com/vub-ai-lab/bdpi.Comment: Accepted at the European Conference on Machine Learning 2019 (ECML
Classical Optimizers for Noisy Intermediate-Scale Quantum Devices
We present a collection of optimizers tuned for usage on Noisy Intermediate-Scale Quantum (NISQ) devices. Optimizers have a range of applications in quantum computing, including the Variational Quantum Eigensolver (VQE) and Quantum Approximate Optimization (QAOA) algorithms. They are also used for calibration tasks, hyperparameter tuning, in machine learning, etc. We analyze the efficiency and effectiveness of different optimizers in a VQE case study. VQE is a hybrid algorithm, with a classical minimizer step driving the next evaluation on the quantum processor. While most results to date concentrated on tuning the quantum VQE circuit, we show that, in the presence of quantum noise, the classical minimizer step needs to be carefully chosen to obtain correct results. We explore state-of-the-art gradient-free optimizers capable of handling noisy, black-box, cost functions and stress-test them using a quantum circuit simulation environment with noise injection capabilities on individual gates. Our results indicate that specifically tuned optimizers are crucial to obtaining valid science results on NISQ hardware, and will likely remain necessary even for future fault tolerant circuits
Static Pricing Problems under Mixed Multinomial Logit Demand
Price differentiation is a common strategy for many transport operators. In
this paper, we study a static multiproduct price optimization problem with
demand given by a continuous mixed multinomial logit model. To solve this new
problem, we design an efficient iterative optimization algorithm that
asymptotically converges to the optimal solution. To this end, a linear
optimization (LO) problem is formulated, based on the trust-region approach, to
find a "good" feasible solution and approximate the problem from below. Another
LO problem is designed using piecewise linear relaxations to approximate the
optimization problem from above. Then, we develop a new branching method to
tighten the optimality gap. Numerical experiments show the effectiveness of our
method on a published, non-trivial, parking choice model
- …