Fast Cross-Validation via Sequential Testing
With the increasing size of today's data sets, finding the right parameter
configuration in model selection via cross-validation can be an extremely
time-consuming task. In this paper we propose an improved cross-validation
procedure which uses nonparametric testing coupled with sequential analysis to
determine the best parameter set on linearly increasing subsets of the data. By
eliminating underperforming candidates quickly and keeping promising candidates
as long as possible, the method speeds up the computation while preserving the
capability of the full cross-validation. Theoretical considerations underline
the statistical power of our procedure. The experimental evaluation shows that
our method reduces the computation time by a factor of up to 120 compared to a
full cross-validation, with negligible impact on accuracy.
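The elimination idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: a fixed tolerance and a loss counter stand in for the nonparametric and sequential tests, and the names `sequential_elimination`, `mean_squared_error`, `tolerance`, and `max_losses` are hypothetical. Candidates are scored on linearly growing subsets and dropped once they have clearly underperformed often enough.

```python
def mean_squared_error(constant, subset):
    # Toy "model": predict the same constant for every point (assumption for illustration).
    return sum((x - constant) ** 2 for x in subset) / len(subset)


def sequential_elimination(candidates, data, evaluate,
                           steps=5, tolerance=0.05, max_losses=2):
    """Score candidates on linearly increasing subsets and drop repeated losers.

    A fixed tolerance replaces the statistical tests described in the abstract.
    """
    active = list(candidates)
    losses = {c: 0 for c in active}
    for step in range(1, steps + 1):
        subset = data[: len(data) * step // steps]  # linearly increasing subset
        scores = {c: evaluate(c, subset) for c in active}
        best = min(scores.values())
        for c in active:
            if scores[c] > best + tolerance:
                losses[c] += 1  # clearly worse than the current best
        active = [c for c in active if losses[c] < max_losses]
        if len(active) == 1:
            break  # a single candidate remains, stop early
    return active


data = [float(i % 5) for i in range(100)]
survivors = sequential_elimination([0.0, 2.0, 2.1, 4.0], data, mean_squared_error)
print(survivors)  # prints [2.0, 2.1]
```

The two poor candidates are discarded after the second subset, so only the promising ones are evaluated on the larger subsets, which is where the time saving comes from.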
Approximate Bayesian Computation for a Class of Time Series Models
In the following article we consider approximate Bayesian computation (ABC)
for certain classes of time series models. In particular, we focus upon
scenarios where the likelihoods of the observations and parameter are
intractable, by which we mean that one cannot evaluate the likelihood even
up to a positive unbiased estimate. This paper reviews and develops a class of
approximation procedures based upon the idea of ABC but which specifically
maintain the probabilistic structure of the original statistical model. This
idea is useful, in that it can facilitate an analysis of the bias of the
approximation and the adaptation of established computational methods for
parameter inference. Several existing results in the literature are surveyed
and novel developments with regard to computation are given.
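For readers unfamiliar with the basic idea of ABC, a generic sketch may help. It is plain nearest-draw ABC on a toy AR(1) series, not the structure-preserving scheme the abstract describes, and the AR(1) likelihood is in fact tractable, so the model serves only as an illustration. The function names, the uniform prior on `phi`, and the lag-1 autocorrelation summary are all assumptions.

```python
import random


def simulate_ar1(phi, n, rng):
    # x_t = phi * x_{t-1} + standard normal noise, started at zero.
    x, series = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        series.append(x)
    return series


def lag1_autocorr(series):
    # Summary statistic: sample lag-1 autocorrelation.
    mean = sum(series) / len(series)
    num = sum((a - mean) * (b - mean) for a, b in zip(series, series[1:]))
    den = sum((a - mean) ** 2 for a in series)
    return num / den


def abc_nearest(observed, n_draws=2000, keep=50, seed=0):
    """Keep the prior draws whose simulated summary is closest to the observed one."""
    rng = random.Random(seed)
    target = lag1_autocorr(observed)
    draws = []
    for _ in range(n_draws):
        phi = rng.uniform(-0.9, 0.9)  # assumed prior
        stat = lag1_autocorr(simulate_ar1(phi, len(observed), rng))
        draws.append((abs(stat - target), phi))
    draws.sort()
    return [phi for _, phi in draws[:keep]]


observed = simulate_ar1(0.7, 200, random.Random(1))
posterior = abc_nearest(observed)
print(sum(posterior) / len(posterior))  # approximate posterior mean of phi
```

No likelihood is ever evaluated: the model is only simulated, and the retained draws approximate the posterior.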
Feature Selection via Binary Simultaneous Perturbation Stochastic Approximation
Feature selection (FS) has become an indispensable task in dealing with
today's highly complex pattern recognition problems with a massive number of
features. In this study, we propose a new wrapper approach for FS based on
binary simultaneous perturbation stochastic approximation (BSPSA). This
pseudo-gradient descent stochastic algorithm starts with an initial feature
vector and moves toward the optimal feature vector via successive iterations.
In each iteration, the current feature vector's individual components are
perturbed simultaneously by random offsets from a qualified probability
distribution. We present computational experiments on datasets with numbers of
features ranging from a few dozen to thousands using three widely-used
classifiers as wrappers: nearest neighbor, decision tree, and linear support
vector machine. We compare our methodology against the full set of features as
well as a binary genetic algorithm and sequential FS methods using
cross-validated classification error rate and AUC as the performance criteria.
Our results indicate that features selected by BSPSA compare favorably to
alternative methods in general and BSPSA can yield superior feature sets for
datasets with tens of thousands of features by examining an extremely small
fraction of the solution space. We are not aware of any other wrapper FS
methods that are computationally feasible with good convergence properties for
such large datasets.
Comment: This is the Istanbul Sehir University Technical Report
#SHR-ISE-2016.01. A short version of this report has been accepted for
publication in Pattern Recognition Letters.
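The iteration described in this abstract can be sketched roughly as follows. This is a simplified illustration, not the report's implementation: the gain constants `a` and `c`, the rounding threshold, the best-so-far bookkeeping, and the toy loss (a mismatch fraction standing in for cross-validated classification error) are all assumptions.

```python
import random


def clip01(x):
    return min(1.0, max(0.0, x))


def to_binary(weights):
    # Threshold continuous weights into a 0/1 feature mask.
    return [1 if w >= 0.5 else 0 for w in weights]


def bspsa_sketch(loss, n_features, iters=200, a=0.1, c=0.05, seed=0):
    """Binary SPSA-style search: perturb all components at once, two loss calls per step."""
    rng = random.Random(seed)
    w = [0.5] * n_features
    best = to_binary(w)
    best_loss = loss(best)
    for _ in range(iters):
        # Simultaneous random +/-1 perturbation of every component.
        delta = [rng.choice((-1, 1)) for _ in range(n_features)]
        plus = to_binary([clip01(x + c * d) for x, d in zip(w, delta)])
        minus = to_binary([clip01(x - c * d) for x, d in zip(w, delta)])
        l_plus, l_minus = loss(plus), loss(minus)
        for cand, cand_loss in ((plus, l_plus), (minus, l_minus)):
            if cand_loss < best_loss:
                best, best_loss = cand, cand_loss
        # Pseudo-gradient step from the two noisy loss evaluations.
        w = [clip01(x - a * (l_plus - l_minus) / (2 * c * d))
             for x, d in zip(w, delta)]
    return best, best_loss


target = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical "ideal" feature mask


def toy_loss(mask):
    # Fraction of positions that differ from the target mask.
    return sum(m != t for m, t in zip(mask, target)) / len(target)


best_mask, best_value = bspsa_sketch(toy_loss, len(target))
print(best_mask, best_value)
```

Each iteration needs only two loss evaluations regardless of the number of features, which is why this kind of method can remain feasible when the feature count is large.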
Compression for Smooth Shape Analysis
Most 3D shape analysis methods use triangular meshes to discretize both the
shape and functions on it as piecewise linear functions. With this
representation, shape analysis requires fine meshes to represent smooth shapes
and geometric operators like normals, curvatures, or Laplace-Beltrami
eigenfunctions at large computational and memory costs.
We avoid this bottleneck with a compression technique that represents a
smooth shape as subdivision surfaces and exploits the subdivision scheme to
parametrize smooth functions on that shape with a few control parameters. This
compression does not affect the accuracy of the Laplace-Beltrami operator and
its eigenfunctions and allows us to compute shape descriptors and shape
matchings at an accuracy comparable to triangular meshes but at a fraction of the
computational cost.
Our framework can also compress surfaces represented by point clouds to do
shape analysis of 3D scanning data.