Feature Selection via Binary Simultaneous Perturbation Stochastic Approximation
Feature selection (FS) has become an indispensable task in dealing with
today's highly complex pattern recognition problems with massive numbers of
features. In this study, we propose a new wrapper approach for FS based on
binary simultaneous perturbation stochastic approximation (BSPSA). This
pseudo-gradient descent stochastic algorithm starts with an initial feature
vector and moves toward the optimal feature vector via successive iterations.
In each iteration, the current feature vector's individual components are
perturbed simultaneously by random offsets from a qualified probability
distribution. We present computational experiments on datasets with numbers of
features ranging from a few dozen to thousands using three widely-used
classifiers as wrappers: nearest neighbor, decision tree, and linear support
vector machine. We compare our methodology against the full set of features as
well as a binary genetic algorithm and sequential FS methods using
cross-validated classification error rate and AUC as the performance criteria.
Our results indicate that features selected by BSPSA compare favorably to
alternative methods in general, and that BSPSA can yield superior feature sets for
datasets with tens of thousands of features by examining an extremely small
fraction of the solution space. We are not aware of any other wrapper FS
methods that are computationally feasible with good convergence properties for
such large datasets.
Comment: This is Istanbul Sehir University Technical Report #SHR-ISE-2016.01.
A short version of this report has been accepted for publication at Pattern
Recognition Letters.
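The perturbation-and-update loop described in the abstract can be sketched as follows. This is a minimal illustration of a binary SPSA-style wrapper, not the paper's implementation: the gain `a`, perturbation size `c`, the 0.5 rounding threshold, and the `loss` interface are all illustrative assumptions, and a real wrapper would plug in a cross-validated classifier error as `loss`.

```python
import numpy as np

def bspsa(loss, p, iters=100, a=0.05, c=0.05, seed=None):
    """Sketch of binary SPSA feature selection.

    loss: maps a binary mask of length p to a scalar (e.g. CV error rate).
    Returns the best binary feature mask encountered.
    """
    rng = np.random.default_rng(seed)
    w = np.full(p, 0.5)                 # continuous relaxation of the mask
    best_mask, best_loss = np.ones(p, dtype=int), np.inf
    for _ in range(iters):
        # Perturb ALL components simultaneously with a Rademacher vector.
        delta = rng.choice([-1.0, 1.0], size=p)
        y_plus = loss((np.clip(w + c * delta, 0, 1) > 0.5).astype(int))
        y_minus = loss((np.clip(w - c * delta, 0, 1) > 0.5).astype(int))
        # Two evaluations give a pseudo-gradient for every component at once.
        ghat = (y_plus - y_minus) / (2.0 * c * delta)
        w = np.clip(w - a * ghat, 0.0, 1.0)
        mask = (w > 0.5).astype(int)
        cur = loss(mask)
        if cur < best_loss:
            best_mask, best_loss = mask.copy(), cur
    return best_mask
```

Note that each iteration costs only two perturbed evaluations regardless of the number of features, which is what makes the approach feasible when p is large.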
Curvature Aligned Simplex Gradient: Principled Sample Set Construction For Numerical Differentiation
The simplex gradient, a popular numerical differentiation method due to its
flexibility, lacks a principled method by which to construct the sample set,
specifically the location of function evaluations. Such evaluations, especially
from real-world systems, are often noisy and expensive to obtain, making it
essential that each evaluation is carefully chosen to reduce cost and increase
accuracy. This paper introduces the curvature aligned simplex gradient (CASG),
which provably selects the optimal sample set under a mean squared error
objective. As CASG requires function-dependent information often not available
in practice, we additionally introduce a framework which exploits a history of
function evaluations often present in practical applications. Our numerical
results, focusing on applications in sensitivity analysis and derivative-free
optimization, show that our methodology significantly outperforms or matches
the benchmark gradient estimator given by forward differences (FD), even when
FD is supplied exact function-dependent information that is not available in
practice. Furthermore, our methodology is comparable in performance to central
differences (CD), which requires twice the number of function evaluations.
Comment: 31 pages, 5 figures. Submitted to IMA Numerical Analysis.
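For context, the plain simplex gradient that CASG builds on can be sketched in a few lines: given a sample set of directions (the columns of S), it solves a small linear system relating function differences to the gradient. CASG's actual contribution, choosing S to minimize mean squared error under noise and curvature, is not shown here; this is only the generic estimator it improves upon.

```python
import numpy as np

def simplex_gradient(f, x0, S):
    """Generic simplex gradient estimate at x0.

    S is an n x n matrix whose columns are the sample directions s_i.
    Solves S^T g = d, where d_i = f(x0 + s_i) - f(x0).
    """
    x0 = np.asarray(x0, dtype=float)
    fx0 = f(x0)
    d = np.array([f(x0 + S[:, i]) - fx0 for i in range(S.shape[1])])
    return np.linalg.solve(S.T, d)
```

For a linear function the estimate is exact for any nonsingular S; for nonlinear or noisy functions the placement of the columns of S drives the error, which is precisely the sample-set construction problem the abstract addresses.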
An algorithm for maximum likelihood estimation using an efficient method for approximating sensitivities
An algorithm for maximum likelihood (ML) estimation is developed primarily for multivariable dynamic systems. The algorithm relies on a new optimization method referred to as a modified Newton-Raphson with estimated sensitivities (MNRES). The method determines sensitivities by using slope information from local surface approximations of each output variable in parameter space. The fitted surface allows sensitivity information to be updated at each iteration with a significant reduction in computational effort compared with integrating the analytically determined sensitivity equations or using a finite-difference method. Different surface-fitting methods are discussed and demonstrated. Aircraft estimation problems are solved by using both simulated and real flight data to compare MNRES with commonly used methods; in these solutions MNRES is found to be equally accurate and substantially faster. MNRES eliminates the need to derive sensitivity equations, thus producing a more generally applicable algorithm.
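The surface-fit idea behind MNRES can be illustrated with a local linear fit: regress the outputs visited so far against the corresponding parameter vectors, and read the sensitivities off the fitted slopes. This is a sketch of the concept under a linear-surface assumption, not the report's implementation, and the function and variable names are illustrative.

```python
import numpy as np

def fitted_sensitivities(thetas, outputs):
    """Estimate output sensitivities d(output)/d(theta) from a history of
    evaluations via a local linear surface fit.

    thetas:  (m, p) parameter vectors visited in earlier iterations.
    outputs: (m, n) corresponding model outputs.
    Returns S of shape (n, p) with S[j, i] ~ d(output_j)/d(theta_i).
    """
    A = thetas - thetas.mean(axis=0)     # centered parameters
    B = outputs - outputs.mean(axis=0)   # centered outputs
    # Least-squares fit B ~ A @ S.T; the slopes are the sensitivities.
    St, *_ = np.linalg.lstsq(A, B, rcond=None)
    return St.T
```

Because the fit reuses evaluations already produced by the optimization, no extra model integrations are needed to refresh the sensitivities at each iteration, which is the source of the speedup the abstract reports.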