46,092 research outputs found

    Fast Cross-Validation via Sequential Testing

    Full text link
    With the increasing size of today's data sets, finding the right parameter configuration in model selection via cross-validation can be an extremely time-consuming task. In this paper we propose an improved cross-validation procedure which uses nonparametric testing coupled with sequential analysis to determine the best parameter set on linearly increasing subsets of the data. By eliminating underperforming candidates quickly and keeping promising candidates as long as possible, the method speeds up the computation while preserving the capability of the full cross-validation. Theoretical considerations underline the statistical power of our procedure. The experimental evaluation shows that our method reduces the computation time by a factor of up to 120 compared to a full cross-validation with a negligible impact on the accuracy

    Approximate Bayesian Computation for a Class of Time Series Models

    Full text link
    In the following article we consider approximate Bayesian computation (ABC) for certain classes of time series models. In particular, we focus upon scenarios where the likelihoods of the observations and parameter are intractable, by which we mean that one cannot evaluate the likelihood even up-to a positive unbiased estimate. This paper reviews and develops a class of approximation procedures based upon the idea of ABC, but, specifically maintains the probabilistic structure of the original statistical model. This idea is useful, in that it can facilitate an analysis of the bias of the approximation and the adaptation of established computational methods for parameter inference. Several existing results in the literature are surveyed and novel developments with regards to computation are given

    Feature Selection via Binary Simultaneous Perturbation Stochastic Approximation

    Full text link
    Feature selection (FS) has become an indispensable task in dealing with today's highly complex pattern recognition problems with massive number of features. In this study, we propose a new wrapper approach for FS based on binary simultaneous perturbation stochastic approximation (BSPSA). This pseudo-gradient descent stochastic algorithm starts with an initial feature vector and moves toward the optimal feature vector via successive iterations. In each iteration, the current feature vector's individual components are perturbed simultaneously by random offsets from a qualified probability distribution. We present computational experiments on datasets with numbers of features ranging from a few dozens to thousands using three widely-used classifiers as wrappers: nearest neighbor, decision tree, and linear support vector machine. We compare our methodology against the full set of features as well as a binary genetic algorithm and sequential FS methods using cross-validated classification error rate and AUC as the performance criteria. Our results indicate that features selected by BSPSA compare favorably to alternative methods in general and BSPSA can yield superior feature sets for datasets with tens of thousands of features by examining an extremely small fraction of the solution space. We are not aware of any other wrapper FS methods that are computationally feasible with good convergence properties for such large datasets.Comment: This is the Istanbul Sehir University Technical Report #SHR-ISE-2016.01. A short version of this report has been accepted for publication at Pattern Recognition Letter

    Compression for Smooth Shape Analysis

    Full text link
    Most 3D shape analysis methods use triangular meshes to discretize both the shape and functions on it as piecewise linear functions. With this representation, shape analysis requires fine meshes to represent smooth shapes and geometric operators like normals, curvatures, or Laplace-Beltrami eigenfunctions at large computational and memory costs. We avoid this bottleneck with a compression technique that represents a smooth shape as subdivision surfaces and exploits the subdivision scheme to parametrize smooth functions on that shape with a few control parameters. This compression does not affect the accuracy of the Laplace-Beltrami operator and its eigenfunctions and allow us to compute shape descriptors and shape matchings at an accuracy comparable to triangular meshes but a fraction of the computational cost. Our framework can also compress surfaces represented by point clouds to do shape analysis of 3D scanning data
    corecore