Optimization with Discrete Simultaneous Perturbation Stochastic Approximation Using Noisy Loss Function Measurements
Discrete stochastic optimization considers the problem of minimizing (or
maximizing) loss functions defined on discrete sets, where only noisy
measurements of the loss functions are available. The discrete stochastic
optimization problem is widely applicable in practice, and many algorithms
have been proposed to solve this kind of problem. Motivated by the
efficient algorithm of simultaneous perturbation stochastic approximation
(SPSA) for continuous stochastic optimization problems, we introduce the middle
point discrete simultaneous perturbation stochastic approximation (DSPSA)
algorithm for the stochastic optimization of a loss function defined on a
p-dimensional grid of points in Euclidean space. We show that the sequence
generated by DSPSA converges to the optimal point under some conditions.
Consistent with other stochastic approximation methods, DSPSA formally
accommodates noisy measurements of the loss function. We also analyze the rate
of convergence of DSPSA by deriving an upper bound on the mean squared
error of the generated sequence. In order to compare the performance of DSPSA
with the other algorithms such as the stochastic ruler algorithm (SR) and the
stochastic comparison algorithm (SC), we set up a bridge between DSPSA and the
other two algorithms by comparing, in a big-O sense, the probability of not
achieving the optimal solution. We show the theoretical and numerical
comparison results of DSPSA, SR, and SC. In addition, we consider an
application of DSPSA towards developing optimal public health strategies for
containing the spread of influenza given limited societal resources.
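The abstract's core idea can be illustrated with a minimal sketch of a middle-point DSPSA iteration: round the running iterate to the middle point of its unit grid cell, perturb all coordinates simultaneously with Rademacher (+/-1) offsets, and form a pseudo-gradient from two noisy loss measurements at the resulting integer points. The gain-sequence constants and function names below are illustrative assumptions, not the tuning from the paper.

```python
import numpy as np

def dspsa_minimize(loss, theta0, iters=2000, a=0.5, A=100, alpha=0.602, rng=None):
    """Sketch of middle-point DSPSA on the integer grid Z^p.

    `loss(x)` returns a noisy measurement of the loss at an integer point x.
    The gain sequence a_k = a / (k + 1 + A)**alpha is a standard SPSA-style
    choice; the constants here are illustrative, not from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta0, dtype=float)
    for k in range(iters):
        a_k = a / (k + 1 + A) ** alpha
        # Round each coordinate to the middle point of its unit cell.
        pi = np.floor(theta) + 0.5
        # Rademacher (+/-1) simultaneous perturbation of all coordinates.
        delta = rng.choice([-1.0, 1.0], size=theta.shape)
        # The two evaluation points pi +/- delta/2 are integer grid points.
        y_plus = loss((pi + delta / 2).astype(int))
        y_minus = loss((pi - delta / 2).astype(int))
        # Since delta_i = +/-1, dividing by delta_i equals multiplying by it.
        g_hat = (y_plus - y_minus) * delta
        theta = theta - a_k * g_hat
    return np.round(theta).astype(int)
```

Only two noisy loss measurements are needed per iteration regardless of the dimension p, which is the efficiency property inherited from continuous SPSA.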
Feature Selection via Binary Simultaneous Perturbation Stochastic Approximation
Feature selection (FS) has become an indispensable task in dealing with
today's highly complex pattern recognition problems with a massive number of
features. In this study, we propose a new wrapper approach for FS based on
binary simultaneous perturbation stochastic approximation (BSPSA). This
pseudo-gradient descent stochastic algorithm starts with an initial feature
vector and moves toward the optimal feature vector via successive iterations.
In each iteration, the current feature vector's individual components are
perturbed simultaneously by random offsets from a qualified probability
distribution. We present computational experiments on datasets with numbers of
features ranging from a few dozen to thousands using three widely used
classifiers as wrappers: nearest neighbor, decision tree, and linear support
vector machine. We compare our methodology against the full set of features as
well as a binary genetic algorithm and sequential FS methods using
cross-validated classification error rate and AUC as the performance criteria.
Our results indicate that features selected by BSPSA compare favorably to
alternative methods in general and BSPSA can yield superior feature sets for
datasets with tens of thousands of features by examining an extremely small
fraction of the solution space. We are not aware of any other wrapper FS
methods that are computationally feasible with good convergence properties for
such large datasets.
Comment: This is the Istanbul Sehir University Technical Report
#SHR-ISE-2016.01. A short version of this report has been accepted for
publication at Pattern Recognition Letters.
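The wrapper scheme described above can be sketched as follows: maintain relaxed feature weights in [0, 1], perturb all components simultaneously, evaluate the wrapper's cross-validated error at the two rounded (binary) perturbed masks, and take a pseudo-gradient step. The gains, the clipping, and the rounding threshold below are illustrative assumptions, not the exact scheme from the report.

```python
import numpy as np

def bspsa_select(cv_error, p, iters=300, a=0.05, c=0.05, rng=None):
    """Sketch of a binary-SPSA style wrapper for feature selection.

    `cv_error(mask)` returns a (noisy) cross-validated error estimate for a
    boolean feature mask of length p. Gain constants a and c are illustrative.
    """
    rng = np.random.default_rng() if rng is None else rng
    w = np.full(p, 0.5)  # relaxed feature weights in [0, 1]
    for k in range(iters):
        # Simultaneous +/-1 perturbation of every component.
        delta = rng.choice([-1.0, 1.0], size=p)
        # Evaluate the wrapper at the two rounded (binary) perturbed points.
        y_plus = cv_error(np.round(np.clip(w + c * delta, 0, 1)).astype(bool))
        y_minus = cv_error(np.round(np.clip(w - c * delta, 0, 1)).astype(bool))
        # Pseudo-gradient from two measurements, regardless of p.
        g_hat = (y_plus - y_minus) / (2 * c) * delta
        w = np.clip(w - a * g_hat, 0, 1)
    return np.round(w).astype(bool)
```

As with continuous SPSA, each iteration costs only two wrapper evaluations, which is what makes the approach feasible when p reaches the tens of thousands.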