
    Improving Feature Extraction by Replacing the Fisher Criterion by an Upper Error Bound

    Many alternatives and constraints have been proposed to improve the Fisher criterion, but most of them are not linked to the error rate, the primary quantity of interest in many classification applications. By introducing an upper bound on the error rate, a criterion is developed that can improve classification performance. Keywords: Fisher criterion, linear discriminant analysis, feature extraction.
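
    The sketch below (Python with NumPy, an illustration rather than the paper's method) contrasts the two kinds of criterion the abstract discusses: the classical Fisher ratio, and an error-rate upper bound evaluated along the same projection direction. The Bhattacharyya bound used here is one well-known upper bound on the Bayes error for Gaussian classes, not necessarily the bound derived in this paper.

```python
# Compare the Fisher ratio with the Bhattacharyya upper bound on the Bayes
# error for a 1-D projection of two classes. Illustrative only: assumes
# Gaussian class-conditional densities and equal priors.
import numpy as np

def fisher_criterion(w, X1, X2):
    """Fisher ratio: between-class over within-class scatter along w."""
    p1, p2 = X1 @ w, X2 @ w
    return (p1.mean() - p2.mean()) ** 2 / (p1.var() + p2.var())

def bhattacharyya_error_bound(w, X1, X2):
    """Upper bound on the Bayes error of the projected classes:
    P_err <= 0.5 * exp(-B), with B the Bhattacharyya distance."""
    p1, p2 = X1 @ w, X2 @ w
    v1, v2 = p1.var(), p2.var()
    vbar = 0.5 * (v1 + v2)
    B = 0.125 * (p1.mean() - p2.mean()) ** 2 / vbar \
        + 0.5 * np.log(vbar / np.sqrt(v1 * v2))
    return 0.5 * np.exp(-B)

rng = np.random.default_rng(0)
X1 = rng.normal([0.0, 0.0], [1.0, 0.2], size=(500, 2))
X2 = rng.normal([2.0, 0.0], [1.0, 2.0], size=(500, 2))
w = np.array([1.0, 0.1])
print("Fisher ratio:     ", fisher_criterion(w, X1, X2))
print("Error upper bound:", bhattacharyya_error_bound(w, X1, X2))
```

    Unlike the Fisher ratio, the bound accounts for the difference between the class variances, which is one way a bound-based criterion can rank projections differently.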

    Design of an Adaptive Classification Procedure for the Analysis of High-Dimensional Data with Limited Training Samples

    In a typical supervised classification procedure, the availability of training samples has a fundamental effect on classifier performance. For a fixed number of training samples, classifier performance degrades as the number of dimensions (features) increases. This phenomenon has a significant influence on the analysis of hyperspectral data sets, where the ratio of training samples to dimensionality is small. The objectives of this research are to develop novel methods for mitigating the detrimental effects arising from this small ratio and to reduce the effort required of an analyst in terms of training sample selection. An iterative method is developed in which semi-labeled samples (classification outputs) are used together with the original training samples to estimate parameters, establishing a positive feedback procedure wherein parameter estimation and classification enhance each other iteratively. This work comprises four phases. First, the role of semi-labeled samples in parameter estimation is investigated, and it is demonstrated that an iterative procedure based on positive feedback is achievable. Second, a maximum-likelihood pixel-wise adaptive classifier is designed. Third, a family of adaptive covariance estimators is developed that combines the adaptive classifier with covariance estimation to handle cases where the training sample set is extremely small. Finally, to fully exploit the rich spectral and spatial information contained in hyperspectral data and to enhance the performance and robustness of the proposed adaptive classifier, an adaptive Bayesian contextual classifier based on a Markov random field is developed.
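
    The positive-feedback loop the abstract describes can be sketched as self-training of a maximum-likelihood Gaussian classifier. The equal weighting of semi-labeled samples, the ridge regularization, and the fixed iteration count below are illustrative assumptions, not the paper's exact estimators, and degenerate cases (e.g. a class losing all its samples) are ignored.

```python
# Self-training sketch: re-estimate per-class Gaussian parameters from the
# original training samples plus the semi-labeled samples produced by the
# previous classification pass.
import numpy as np

def fit_gaussians(X, y, n_classes, reg=1e-3):
    """ML mean/covariance per class, with a small ridge term for stability."""
    params = []
    for c in range(n_classes):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        cov = np.cov(Xc.T) + reg * np.eye(X.shape[1])
        params.append((mu, cov))
    return params

def classify(X, params):
    """Assign each sample to the class with the highest Gaussian log-likelihood."""
    scores = []
    for mu, cov in params:
        diff = X - mu
        inv = np.linalg.inv(cov)
        logdet = np.linalg.slogdet(cov)[1]
        scores.append(-0.5 * (np.einsum('ij,jk,ik->i', diff, inv, diff) + logdet))
    return np.argmax(scores, axis=0)

def adaptive_classify(X_train, y_train, X_unlabeled, n_classes, n_iter=5):
    params = fit_gaussians(X_train, y_train, n_classes)
    for _ in range(n_iter):
        y_semi = classify(X_unlabeled, params)           # semi-labeled samples
        X_all = np.vstack([X_train, X_unlabeled])
        y_all = np.concatenate([y_train, y_semi])
        params = fit_gaussians(X_all, y_all, n_classes)  # re-estimate
    return classify(X_unlabeled, params), params
```

    Hard labels are used here for simplicity; soft, posterior-weighted updates in the style of EM are a natural variant of the same feedback idea.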

    Fast Selection of Spectral Variables with B-Spline Compression

    The large number of spectral variables in most data sets encountered in spectral chemometrics often makes predicting a dependent variable difficult. The number of variables can hopefully be reduced by using either projection techniques or selection methods; the latter allow for the interpretation of the selected variables. Since the optimal approach of testing all possible subsets of variables with the prediction model is intractable, an incremental selection approach using a nonparametric statistic is a good option, as it avoids the computationally intensive use of the model itself. However, it has two drawbacks: the number of groups of variables to test is still huge, and collinearities can make the results unstable. To overcome these limitations, this paper presents a method to select groups of spectral variables. It consists of a forward-backward procedure applied to the coefficients of a B-spline representation of the spectra. The criterion used in the forward-backward procedure is the mutual information, which makes it possible to find nonlinear dependencies between variables, unlike the commonly used correlation. The spline representation ensures interpretability of the results, as groups of consecutive spectral variables are selected. Experiments conducted on NIR spectra from fescue grass and diesel fuels show that the method provides clearly identified groups of selected variables, making interpretation easy, while keeping a low computational load. The prediction performance obtained using the selected coefficients is higher than that obtained by the same method applied directly to the original variables, and similar to that obtained using traditional models, while significantly fewer spectral variables are used.
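
    A condensed sketch of the pipeline: compress each spectrum onto a B-spline basis, then select coefficients by mutual information with the dependent variable. The clamped uniform knot vector and the univariate, forward-only MI step are simplifying assumptions; the paper uses a multivariate MI estimate inside a full forward-backward search. BSpline.design_matrix requires SciPy >= 1.8.

```python
# B-spline compression of spectra followed by greedy MI-based selection of
# the spline coefficients. Simplified stand-in for the paper's procedure.
import numpy as np
from scipy.interpolate import BSpline
from sklearn.feature_selection import mutual_info_regression

def bspline_coefficients(spectra, n_coef, k=3):
    """Least-squares fit of each spectrum on a cubic B-spline basis.
    spectra: (n_samples, n_vars). Returns (n_samples, n_coef) coefficients."""
    n_vars = spectra.shape[1]
    x = np.linspace(0.0, 1.0, n_vars)
    # Clamped uniform knot vector (an illustrative choice).
    inner = np.linspace(0.0, 1.0, n_coef - k + 1)
    t = np.concatenate([[0.0] * k, inner, [1.0] * k])
    B = BSpline.design_matrix(x, t, k).toarray()      # (n_vars, n_coef)
    coef, *_ = np.linalg.lstsq(B, spectra.T, rcond=None)
    return coef.T

def forward_mi_selection(C, y, n_select):
    """Greedy forward selection of coefficients by univariate MI with y.
    With a multivariate MI estimate (as in the paper), each step would also
    depend on the coefficients already selected."""
    selected, remaining = [], list(range(C.shape[1]))
    for _ in range(n_select):
        mi = mutual_info_regression(C[:, remaining], y)
        selected.append(remaining.pop(int(np.argmax(mi))))
    return selected
```

    Because each selected coefficient spans k+1 knot intervals, a selected coefficient maps back to a contiguous band of original spectral variables, which is what makes the result interpretable.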

    Kernel methods in machine learning

    We review machine learning methods employing positive definite kernels. These methods formulate learning and estimation problems in a reproducing kernel Hilbert space (RKHS) of functions defined on the data domain, expanded in terms of a kernel. Working in linear spaces of functions has the benefit of facilitating the construction and analysis of learning algorithms while at the same time allowing large classes of functions, including nonlinear functions as well as functions defined on nonvectorial data. We cover a wide range of methods, from binary classifiers to sophisticated techniques for estimation with structured data. Published in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org); DOI: http://dx.doi.org/10.1214/009053607000000677.
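
    A minimal sketch of the RKHS expansion the review describes: by the representer theorem, the learned function takes the form f(x) = sum_i alpha_i k(x_i, x). Kernel ridge regression with a Gaussian (RBF) kernel is shown as one concrete instance; it is an illustrative example, not a method specific to this paper.

```python
# Kernel ridge regression from scratch: fit expansion coefficients alpha and
# evaluate f(x) = sum_i alpha_i k(x_i, x) at new points.
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kernel_ridge_fit(X, y, lam=1e-2, gamma=1.0):
    """Solve (K + lam * I) alpha = y for the expansion coefficients."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def kernel_ridge_predict(X_train, alpha, X_new, gamma=1.0):
    """Evaluate f at the new points via the kernel expansion."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)
alpha = kernel_ridge_fit(X, y)
print(kernel_ridge_predict(X, alpha, np.array([[0.0], [1.5]])))
```

    The same expansion underlies the other kernel methods surveyed; only the loss and regularizer used to determine alpha change.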