    Robust linear discriminant analysis for multiple groups: influence and classification efficiencies.

    Linear discriminant analysis for multiple groups is typically carried out using Fisher's method. This method relies on the sample averages and covariance matrices computed from the different groups constituting the training sample. Since sample averages and covariance matrices are not robust, it is proposed to use robust estimators of location and covariance instead, yielding a robust version of Fisher's method. In this paper expressions are derived for the influence that an observation in the training set has on the error rate of the Fisher method for multiple linear discriminant analysis. These influence functions on the error rate turn out to be unbounded for the classical rule, but bounded when using a robust approach. Using these influence functions, we compute relative classification efficiencies of the robust procedures with respect to the classical method. It is shown that, by using an appropriate robust estimator, the loss in classification efficiency at the normal model remains limited. These findings are confirmed by finite sample simulations.
    Keywords: Classification; Covariance; Discriminant analysis; Efficiency; Error rate; Estimator; Fisher rule; Functions; Influence function; Model; Multiple groups; Research; Robustness; Simulation; Training
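    The plug-in construction described above is easy to sketch. The minimal version below substitutes minimum covariance determinant (MCD) estimates for the sample averages and pooled covariance matrix; using scikit-learn's MinCovDet is an illustrative assumption, since the paper treats robust location and scatter estimators in general.

```python
# Minimal sketch of the plug-in idea: swap the sample mean and pooled
# covariance for robust MCD estimates. MinCovDet is an assumption here;
# the paper studies robust location/scatter estimators in general. MCD
# also assumes each group has comfortably more samples than dimensions.
import numpy as np
from sklearn.covariance import MinCovDet

def fit_robust_fisher(X, y):
    """Robust location per group plus a pooled robust scatter matrix."""
    classes = np.unique(y)
    locations, scatters, sizes = {}, [], []
    for c in classes:
        mcd = MinCovDet().fit(X[y == c])
        locations[c] = mcd.location_
        scatters.append(mcd.covariance_)
        sizes.append(int((y == c).sum()))
    pooled = sum(n * S for n, S in zip(sizes, scatters)) / sum(sizes)
    return classes, locations, np.linalg.inv(pooled)

def predict(X, classes, locations, precision):
    """Fisher rule under equal priors: assign each point to the group
    with the smallest (robust) Mahalanobis distance."""
    dists = np.stack(
        [np.einsum('ij,jk,ik->i', X - locations[c], precision, X - locations[c])
         for c in classes], axis=1)
    return classes[np.argmin(dists, axis=1)]
```

    Replacing MinCovDet with the classical sample mean and covariance recovers ordinary multi-group Fisher discrimination, which is what makes the efficiency comparison between the two rules natural.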

    Classification efficiencies for robust linear discriminant analysis.

    Linear discriminant analysis is typically carried out using Fisher’s method. This method relies on the sample averages and covariance matrices computed from the different groups constituting the training sample. Since sample averages and covariance matrices are not robust, it has been proposed to use robust estimators of location and covariance instead, yielding a robust version of Fisher’s method. In this paper relative classification efficiencies of the robust procedures with respect to the classical method are computed. Second-order influence functions appear to be useful for computing these classification efficiencies. It turns out that, when using an appropriate robust estimator, the loss in classification efficiency at the normal model remains limited. These findings are confirmed by finite sample simulations.
    Keywords: Classification efficiency; Discriminant analysis; Error rate; Fisher rule; Influence function; Robustness
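    A finite-sample simulation in the spirit of the ones reported here can be sketched as follows. Relative classification efficiency is taken to be the ratio of average excess error rates (error above the Bayes error) at the normal model; the two-group Gaussian setup and MCD as the robust estimator are illustrative assumptions, and the paper's exact definitions and estimators may differ.

```python
# Finite-sample efficiency comparison at the normal model (a sketch).
# Efficiency of the robust rule relative to the classical one is
# estimated as the ratio of the average excess error rates.
import numpy as np
from scipy.stats import norm
from sklearn.covariance import MinCovDet
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
p, n, reps, delta = 2, 50, 100, 2.0      # dimension, group size, runs, mean shift
mu = [np.zeros(p), np.r_[delta, np.zeros(p - 1)]]
bayes = norm.cdf(-delta / 2)             # Bayes error under unit covariance

# One large test set, reused across replications.
Xt = np.vstack([rng.normal(m, 1.0, (5000, p)) for m in mu])
yt = np.repeat([0, 1], 5000)

excess_classical, excess_robust = [], []
for _ in range(reps):
    X = np.vstack([rng.normal(m, 1.0, (n, p)) for m in mu])
    y = np.repeat([0, 1], n)
    lda = LinearDiscriminantAnalysis().fit(X, y)        # classical rule
    # Robust plug-in rule: per-group MCD, pooled scatter.
    mcd = [MinCovDet(random_state=0).fit(X[y == c]) for c in (0, 1)]
    prec = np.linalg.inv((mcd[0].covariance_ + mcd[1].covariance_) / 2)
    d = np.stack([np.einsum('ij,jk,ik->i', Xt - g.location_, prec,
                            Xt - g.location_) for g in mcd], axis=1)
    excess_classical.append(np.mean(lda.predict(Xt) != yt) - bayes)
    excess_robust.append(np.mean(np.argmin(d, axis=1) != yt) - bayes)

print("efficiency of robust rule vs classical:",
      np.mean(excess_classical) / np.mean(excess_robust))
```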

    How to Solve Classification and Regression Problems on High-Dimensional Data with a Supervised Extension of Slow Feature Analysis

    Supervised learning from high-dimensional data, e.g., multimedia data, is a challenging task. We propose an extension of slow feature analysis (SFA) for supervised dimensionality reduction called graph-based SFA (GSFA). The algorithm extracts a label-predictive low-dimensional set of features that can be post-processed by typical supervised algorithms to generate the final label or class estimate. GSFA is trained with a so-called training graph, in which the vertices are the samples and the edges represent similarities of the corresponding labels. A new weighted SFA optimization problem is introduced, generalizing the notion of slowness from sequences of samples to such training graphs. We show that GSFA computes an optimal solution to this problem in the considered function space, and we propose several types of training graphs. For classification, the most straightforward graph yields features equivalent to those of (nonlinear) Fisher discriminant analysis. The emphasis is on regression, where four different graphs were evaluated experimentally on a subproblem of face detection in photographs. The proposed method is particularly promising when linear models are insufficient, as well as when feature selection is difficult.
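    In the linear case, the weighted-slowness objective that GSFA optimizes reduces to a generalized eigenvalue problem, which gives a compact sketch of the algorithm's core. The Laplacian formulation, the variable names, and the normalization below are illustrative assumptions rather than the paper's exact formulation.

```python
# Linear GSFA sketch: the slowest directions on a training graph come
# from a generalized eigenproblem. Gamma (edge weights) and v (node
# weights) encode label similarity; B is assumed positive definite
# (n > p, data of full rank).
import numpy as np
from scipy.linalg import eigh

def linear_gsfa(X, Gamma, v, n_features=2):
    """X: (n, p) samples; Gamma: (n, n) symmetric edge weights;
    v: (n,) positive node weights. Returns a (p, n_features) projection."""
    mean = np.average(X, axis=0, weights=v)
    Xc = X - mean                       # weighted centering
    # Sum of gamma_{nn'} (x_n - x_{n'})(x_n - x_{n'})^T, written
    # compactly via the graph Laplacian of Gamma.
    L = np.diag(Gamma.sum(axis=1)) - Gamma
    A = Xc.T @ L @ Xc                   # "speed" (slowness numerator)
    B = Xc.T @ (v[:, None] * Xc)        # weighted variance constraint
    # Smallest generalized eigenvectors give the slowest features.
    _, W = eigh(A, B, subset_by_index=[0, n_features - 1])
    return W                            # project with (X - mean) @ W
```

    Setting the edge weights to 1 within a class and 0 across classes corresponds to the "most straightforward graph" mentioned above, whose features are equivalent to those of Fisher discriminant analysis.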

    A system identification based approach for pulsed eddy current non-destructive evaluation

    This paper develops a new system-identification-based approach for pulsed eddy current non-destructive evaluation and applies it in experimental studies to verify its effectiveness and demonstrate its potential in engineering applications.

    A CASE STUDY ON SUPPORT VECTOR MACHINES VERSUS ARTIFICIAL NEURAL NETWORKS

    The capability of artificial neural networks for pattern recognition of real world problems is well known. In recent years, the support vector machine has been advocated for its structural risk minimization, which leads to tolerance margins around decision boundaries. The structures and performances of these pattern classifiers depend on the feature dimension and training data size. The objective of this research is to compare these pattern recognition systems in a case study: the classification of hypertensive and normotensive right ventricle (RV) shapes obtained from Magnetic Resonance Image (MRI) sequences. In this case, the feature dimension is reasonable, but the available training data set is small and the decision surface is highly nonlinear. For diagnosis of congenital heart defects, especially those associated with pressure and volume overload problems, a reliable pattern classifier for determining right ventricle function is needed. The RV's global and regional surface-to-volume ratios are assessed from an individual's MRI heart images and used as features for the pattern classifiers. We first considered two linear classification methods: the Fisher linear discriminant and the linear classifier trained by the Ho-Kashyap algorithm. Since the data are not linearly separable, artificial neural networks with back-propagation training and radial basis function networks were then considered, providing nonlinear decision surfaces. Thirdly, a support vector machine was trained, which gives tolerance margins on both sides of the decision surface. We found in this case study that the back-propagation training of an artificial neural network depends heavily on the selection of initial weights, even though they are randomized. The support vector machine with radial basis function kernels is easily trained and provides decision tolerance margins, even if the margins are small.
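    The line-up of classifiers in this case study is straightforward to reproduce in outline. In the sketch below, the two-moons toy data stands in for the (not publicly available) RV surface-to-volume features, and only the methods with stock scikit-learn implementations are included.

```python
# Outline of the case-study comparison: a linear rule vs. nonlinear
# classifiers on a small, nonlinearly separable data set. The two-moons
# toy data is a stand-in for the RV surface-to-volume features.
from sklearn.datasets import make_moons
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_moons(n_samples=60, noise=0.25, random_state=0)  # small sample

models = {
    "Fisher linear discriminant": LinearDiscriminantAnalysis(),
    "MLP (back-propagation)": MLPClassifier(hidden_layer_sizes=(10,),
                                            max_iter=5000, random_state=0),
    "SVM (RBF kernel)": SVC(kernel="rbf", C=1.0, gamma="scale"),
}
for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5)
    print(f"{name:28s} accuracy: {acc.mean():.2f} +/- {acc.std():.2f}")
```

    Re-running the MLP with different random_state values is a quick way to observe the sensitivity to initial weights that the study reports.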

    The Cumulative Distribution Transform and Linear Pattern Classification

    Discriminating data classes emanating from sensors is an important problem with many applications in science and technology. We describe a new transform for pattern identification that interprets patterns as probability density functions and has special properties with regard to classification. The transform, which we denote the Cumulative Distribution Transform (CDT), is invertible, with well-defined forward and inverse operations. We show that it can be useful in "parsing out" variations (confounds) that are "Lagrangian" (displacement and intensity variations) by converting them to "Eulerian" (intensity variations) in transform space. This conversion is the basis for our main result, which describes when the CDT allows linear classification to be possible in transform space. We also describe several properties of the transform and show, with computational experiments that used both real and simulated data, that the CDT can help render a variety of real-world problems simpler to solve.
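    The forward transform can be sketched numerically in a few lines. The uniform reference density and the (f - x)·sqrt(i0) form below follow one common formulation of the CDT and should be read as assumptions rather than this paper's exact definitions.

```python
# Numerical CDT of a nonnegative 1-D signal against a uniform reference
# density on the same grid. The (f - x) * sqrt(i0) form is one common
# formulation, used here as an assumption.
import numpy as np

def cdt(signal, x):
    # Trapezoid-rule CDF of the signal, normalized to end at 1.
    c = np.concatenate([[0.0],
                        np.cumsum(0.5 * (signal[1:] + signal[:-1]) * np.diff(x))])
    F = c / c[-1]
    F0 = (x - x[0]) / (x[-1] - x[0])       # CDF of the uniform reference
    i0 = 1.0 / (x[-1] - x[0])              # its density
    f = np.interp(F0, F, x)                # f = F^{-1} o F0 (mass matching)
    return (f - x) * np.sqrt(i0)

x = np.linspace(0.0, 1.0, 513)
bump = lambda c: np.exp(-0.5 * ((x - c) / 0.05) ** 2)
# A pure displacement of the signal becomes a near-constant offset in
# transform space: the "Lagrangian" confound turns "Eulerian".
d = cdt(bump(0.6), x) - cdt(bump(0.4), x)
print(d[100:-100].std(), d[100:-100].mean())  # tiny spread, offset ~ 0.2
```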

    Sparse multinomial kernel discriminant analysis (sMKDA)

    Dimensionality reduction via canonical variate analysis (CVA) is important for pattern recognition and has been variously extended to permit more flexibility, e.g. by "kernelizing" the formulation. This can lead to over-fitting, usually ameliorated by regularization. Here, a method for sparse multinomial kernel discriminant analysis (sMKDA) is proposed, using a sparse basis to control complexity. It is based on the connection between CVA and least squares, and uses forward selection via orthogonal least squares to approximate a basis, generalizing a similar approach for binomial problems. Classification can be performed directly via minimum Mahalanobis distance in the canonical variates. sMKDA achieves state-of-the-art performance in terms of accuracy and sparseness on 11 benchmark datasets.
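    The least-squares route described above can be sketched as follows. Plain greedy forward selection stands in for the paper's orthogonal least-squares procedure, and a nearest-class-mean rule in score space stands in for minimum-Mahalanobis-distance classification in the canonical variates; the RBF kernel and all names are assumptions.

```python
# Sketch of the sMKDA recipe via the CVA/least-squares connection:
# regress one-hot class indicators on kernel columns, picking a sparse
# basis greedily. Forward selection and the nearest-mean rule are
# simplified stand-ins for the paper's orthogonal least squares and
# minimum-Mahalanobis classification.
import numpy as np

def rbf(X, Z, gamma=1.0):
    return np.exp(-gamma * ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1))

def smkda_fit(X, y, n_basis=10, gamma=1.0):
    Y = np.eye(y.max() + 1)[y]          # one-hot class indicators
    K = rbf(X, X, gamma)
    selected, R = [], Y.copy()          # R: residual of the fit so far
    for _ in range(n_basis):
        # Score each kernel column by the residual energy it explains.
        score = ((K.T @ R) ** 2).sum(axis=1) / (K ** 2).sum(axis=0)
        score[selected] = -np.inf       # never pick a column twice
        selected.append(int(np.argmax(score)))
        B, *_ = np.linalg.lstsq(K[:, selected], Y, rcond=None)
        R = Y - K[:, selected] @ B
    scores = K[:, selected] @ B
    means = np.stack([scores[y == c].mean(axis=0)
                      for c in range(Y.shape[1])])
    return X[selected], B, means, gamma

def smkda_predict(Xt, basis, B, means, gamma):
    s = rbf(Xt, basis, gamma) @ B
    return np.argmin(((s[:, None, :] - means[None]) ** 2).sum(-1), axis=1)
```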