GLOBAL OPTIMIZATION METHODS
Training a neural network is a difficult optimization problem because of its numerous local minima. Many global search algorithms have been used to train neural networks. However, local search algorithms use computational resources more efficiently, so numerous random restarts of a local algorithm may be more effective than a global algorithm. This study uses Monte Carlo simulations to determine the efficiency of a local search algorithm relative to nine stochastic global algorithms. The computational requirements of the global algorithms are several times higher than those of the local algorithm, and there is little gain in using the global algorithms to train neural networks.
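The best-of-n-restarts idea evaluated here can be sketched in a few lines. The hill-climbing routine and the test function below are illustrative stand-ins, not the algorithms benchmarked in the study:

```python
import math
import random

def local_search(f, x0, step=0.1, iters=200):
    """Naive stochastic hill climbing: accept only improving moves."""
    x, fx = x0, f(x0)
    for _ in range(iters):
        cand = x + random.uniform(-step, step)
        fc = f(cand)
        if fc < fx:
            x, fx = cand, fc
    return x, fx

def random_restarts(f, n, lo, hi, seed=0):
    """Best-of-n restarts: rerun the local search from random starting
    points and keep the best result found across all runs."""
    random.seed(seed)
    best_x, best_fx = None, float("inf")
    for _ in range(n):
        x, fx = local_search(f, random.uniform(lo, hi))
        if fx < best_fx:
            best_x, best_fx = x, fx
    return best_x, best_fx

# Multimodal 1-D test function (Rastrigin-like), global minimum at x = 0.
f = lambda x: x * x + 10 * (1 - math.cos(2 * math.pi * x))
x, fx = random_restarts(f, n=20, lo=-5, hi=5)
```

A single run of `local_search` gets stuck in whichever basin its starting point lands in; the restarts trade extra (cheap) local runs for coverage of many basins, which is exactly the trade-off the study quantifies.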
On-line Search History-assisted Restart Strategy for Covariance Matrix Adaptation Evolution Strategy
A restart strategy helps the covariance matrix adaptation evolution strategy
(CMA-ES) increase the probability of finding the global optimum, whereas a
single-run CMA-ES is easily trapped in local optima.
In this paper, the continuous non-revisiting genetic algorithm (cNrGA) is used
to help CMA-ES to achieve multiple restarts from different sub-regions of the
search space. The CMA-ES with on-line search history-assisted restart strategy
(HR-CMA-ES) is proposed. The entire on-line search history of cNrGA is stored
in a binary space partitioning (BSP) tree, which is effective for performing
local search. A frequently sampled sub-region is reflected by a deep position
in the BSP tree. When a leaf node lies deeper than a threshold, the
corresponding sub-region is considered a region of interest (ROI). In
HR-CMA-ES, cNrGA is responsible for global exploration and for suggesting ROIs,
within or around which CMA-ES performs exploitation. CMA-ES restarts
independently in each suggested ROI. The non-revisiting mechanism of cNrGA
avoids suggesting the same ROI twice. Experimental results on the
CEC 2013 and 2017 benchmark suites show that HR-CMA-ES performs better than
both CMA-ES and cNrGA, and a positive synergy is observed from the memetic
cooperation of the two algorithms.
Comment: 8 pages, 9 figures
A note on ‘good starting values’ in numerical optimisation
Many optimisation problems in finance and economics have multiple local optima or discontinuities in their objective functions. In such cases it is often stressed that ‘good starting points are important’. We look into a particular example: calibrating a yield-curve model. We find that while the ‘good starting values’ suggested in the literature produce parameters that are indeed ‘good’, a simple best-of-N restarts strategy with random starting points gives results that are never worse, and in many cases better.
The Potential of Restarts for ProbSAT
This work analyses the potential of restarts for probSAT, a quite successful
algorithm for k-SAT, by estimating its runtime distributions on random 3-SAT
instances that are close to the phase transition. We estimate an optimal
restart time from empirical data, reaching a potential speedup factor of 1.39.
Calculating restart times from fitted probability distributions reduces this
factor to at most 1.30. A spin-off result is that the Weibull distribution
approximates the runtime distribution well for over 93% of the instances used.
A machine learning pipeline is presented to compute a restart time for a
fixed-cutoff strategy to exploit this potential. The main components of the
pipeline are a random forest for determining the distribution type and a neural
network for the distribution's parameters. With the presented approach,
probSAT performs statistically significantly better than with Luby's restart
strategy or with no restarts at all. The approach is particularly
advantageous on hard problems.
Comment: Eurocast 201
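The fixed-cutoff strategy can also be evaluated directly on empirical runtime data. The sketch below uses illustrative names and toy data, not the paper's pipeline (which fits parametric distributions via a random forest and a neural network); it simply tries every observed runtime as a candidate cutoff:

```python
# With cutoff t, a run is aborted and restarted after t steps. Estimating
# p = P(T <= t) from the sample, the expected total runtime is
#   (1 - p) / p * t + mean(T | T <= t).

def expected_runtime_with_cutoff(runtimes, t):
    finished = [r for r in runtimes if r <= t]
    p = len(finished) / len(runtimes)
    if p == 0:
        return float("inf")    # cutoff too small: no run ever finishes
    return (1 - p) / p * t + sum(finished) / len(finished)

def best_cutoff(runtimes):
    """Evaluate every observed runtime as a candidate cutoff."""
    return min(sorted(set(runtimes)),
               key=lambda t: expected_runtime_with_cutoff(runtimes, t))

# Heavy-tailed toy sample: most runs are short, a few are very long.
sample = [1, 2, 2, 3, 3, 3, 4, 5, 50, 200]
t_star = best_cutoff(sample)
mean_no_restart = sum(sample) / len(sample)
speedup = mean_no_restart / expected_runtime_with_cutoff(sample, t_star)
```

On this toy sample, cutting off at `t_star = 5` avoids the two long runs entirely, which is where the speedup comes from; heavy-tailed runtime distributions are precisely the regime where restarts pay.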
Runtime Distributions and Criteria for Restarts
Randomized algorithms sometimes employ a restart strategy. After a certain
number of steps, the current computation is aborted and restarted with a new,
independent random seed. In some cases, this results in an improved overall
expected runtime. This work introduces properties of the underlying runtime
distribution which determine whether restarts are advantageous. The most
commonly used probability distributions admit the use of a scale and a location
parameter. Location parameters shift the density function to the right, while
scale parameters affect the spread of the distribution. It is shown that for
all distributions scale parameters do not influence the usefulness of restarts
and that location parameters only have a limited influence. This result
simplifies the analysis of the usefulness of restarts. The most important
runtime probability distributions are the log-normal, the Weibull, and the
Pareto distribution. In this work, these distributions are analyzed for the
usefulness of restarts. Second, a condition for the optimal restart time (if
it exists) is provided, and the log-normal, the Weibull, and the generalized
Pareto distributions are analyzed in this respect. Moreover, it is shown that
the optimal restart time is likewise not influenced by scale parameters and
that the influence of location parameters is only linear.
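For reference, the fixed-cutoff analysis underlying these results can be written down explicitly (a standard identity from the restart literature, not quoted from the abstract). With runtime CDF $F$, density $f$, and cutoff $t$:

```latex
% Expected total runtime when every run is aborted and restarted after t:
E[T_t] = \frac{1}{F(t)} \int_0^t \bigl(1 - F(s)\bigr)\,ds
% Restarts are useful iff  \inf_t E[T_t] < E[T].
% An interior optimal cutoff t^* satisfies the first-order condition
E[T_{t^*}] = \frac{1 - F(t^*)}{f(t^*)}
```

Substituting $cT$ for $T$ scales both sides of each identity by $c$, which is one way to see why scale parameters leave the usefulness of restarts unchanged and move the optimal restart time only proportionally.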
Alternating Synthetic and Real Gradients for Neural Language Modeling
Training recurrent neural networks (RNNs) with backpropagation through time
(BPTT) has known drawbacks, such as difficulty capturing long-term
dependencies in sequences. Successful alternatives to BPTT have not yet been
discovered. Recently, backpropagation with synthetic gradients produced by a
decoupled neural interface module has been proposed to replace BPTT for
training RNNs. On the
other hand, it has been shown that the representations learned with synthetic
and real gradients are different though they are functionally identical. In
this project, we explore ways of combining synthetic and real gradients with
application to neural language modeling tasks. Empirically, we demonstrate the
effectiveness of alternating training with synthetic and real gradients after
periodic warm restarts on language modeling tasks.
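The "periodic warm restarts" here presumably follow an SGDR-style cosine-annealed learning-rate schedule (an assumption; the abstract does not spell out the schedule). A minimal sketch:

```python
import math

def warm_restart_lr(step, period, lr_max=1.0, lr_min=0.0):
    """Cosine-annealed learning rate with periodic warm restarts:
    within each period the rate decays from lr_max to lr_min along a
    half cosine, then jumps back to lr_max at the next restart."""
    t = step % period
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / period))
```

At each restart the learning rate jumps back to `lr_max`, perturbing the iterate; alternating the gradient source at these restart boundaries is the combination the project explores.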