Some intriguing properties of Tukey's half-space depth
For multivariate data, Tukey's half-space depth is one of the most popular
depth functions available in the literature. It is conceptually simple and
satisfies several desirable properties of depth functions. The Tukey median,
the multivariate median associated with the half-space depth, is also a
well-known measure of center for multivariate data with several interesting
properties. In this article, we derive and investigate some interesting
properties of half-space depth and its associated multivariate median. These
properties, some of which are counterintuitive, have important statistical
consequences in multivariate analysis. We also investigate a natural extension
of Tukey's half-space depth and the related median for probability
distributions on any Banach space (which may be finite- or
infinite-dimensional) and prove some results that demonstrate anomalous
behavior of half-space depth in infinite-dimensional spaces.Comment: Published in at http://dx.doi.org/10.3150/10-BEJ322 the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm
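For intuition (a sketch, not taken from the article): the empirical half-space depth of a point x is the smallest fraction of data points contained in any closed half-space whose boundary hyperplane passes through x. It can be approximated by minimizing over a random sample of directions; exact algorithms exist in low dimensions.

```python
import numpy as np

def halfspace_depth(x, data, n_dirs=1000, seed=0):
    """Approximate Tukey's half-space depth of point x w.r.t. data.

    Depth(x) = min over unit directions u of the fraction of points
    with u . (X - x) >= 0; here the min is taken over a random sample
    of directions, so the result is an upper-bound approximation.
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(size=(n_dirs, data.shape[1]))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    proj = (data - x) @ u.T          # shape (n_points, n_dirs)
    frac = (proj >= 0).mean(axis=0)  # fraction on one side, per direction
    return frac.min()

# The centre of a roughly symmetric cloud should have depth near 1/2,
# while a far-away point should have depth near 0.
rng = np.random.default_rng(1)
cloud = rng.normal(size=(500, 2))
print(halfspace_depth(np.zeros(2), cloud))           # close to 0.5
print(halfspace_depth(np.array([5.0, 5.0]), cloud))  # close to 0.0
```

The Tukey median mentioned in the abstract is then any point maximizing this depth over x.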
DEPTH-BASED CLASSIFICATION FOR FUNCTIONAL DATA
Classification is an important task when data are curves. Recently, the notion of statistical depth has been extended to deal with functional observations. In this paper, we propose robust procedures based on the concept of depth to classify curves. These techniques are applied to a real data example. An extensive simulation study with contaminated models illustrates the good robustness properties of these depth-based classification methods.
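A common depth-based classification rule (a generic illustration, not the specific functional procedures proposed in this paper) assigns a new observation to the class in which it is deepest. A minimal finite-dimensional sketch using approximate half-space depth:

```python
import numpy as np

def approx_depth(x, data, n_dirs=500, seed=0):
    """Approximate half-space depth: min over random directions of the
    fraction of data points on one side of a hyperplane through x."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=(n_dirs, data.shape[1]))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    return ((data - x) @ u.T >= 0).mean(axis=0).min()

def max_depth_classify(x, samples):
    """Maximum-depth rule: assign x to the class (dict key) whose
    training sample gives x the largest depth."""
    return max(samples, key=lambda label: approx_depth(x, samples[label]))

rng = np.random.default_rng(2)
samples = {"A": rng.normal(0, 1, size=(200, 2)),
           "B": rng.normal(4, 1, size=(200, 2))}
print(max_depth_classify(np.array([0.1, -0.2]), samples))  # "A"
print(max_depth_classify(np.array([3.8, 4.1]), samples))   # "B"
```

Because depth is an ordering rather than a density, such rules are insensitive to outlying training points, which is the robustness property the simulation study examines.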
Nonparametrically consistent depth-based classifiers
We introduce a class of depth-based classification procedures that are of a
nearest-neighbor nature. Depth, after symmetrization, indeed provides the
center-outward ordering that is necessary and sufficient to define nearest
neighbors. Like all their depth-based competitors, the resulting classifiers
are affine-invariant, hence in particular are insensitive to unit changes.
Unlike the former, however, the latter achieve Bayes consistency under
virtually any absolutely continuous distribution - a concept we call
nonparametric consistency, to stress the difference with the stronger universal
consistency of the standard NN classifiers. We investigate the finite-sample
performances of the proposed classifiers through simulations and show that they
outperform affine-invariant nearest-neighbor classifiers obtained through an
obvious standardization construction. We illustrate the practical value of our
classifiers on two real data examples. Finally, we shortly discuss the possible
uses of our depth-based neighbors in other inference problems.
Comment: Published at http://dx.doi.org/10.3150/13-BEJ561 in the Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
Weighted distance based discriminant analysis: the R package WeDiBaDis
The WeDiBaDis package provides a user-friendly environment for performing discriminant analysis (supervised classification). WeDiBaDis is an easy-to-use package addressed to the biological and medical communities and, in general, to researchers interested in applied studies. It is suitable when the user wants to construct a discriminant rule on the basis of distances between a relatively small number of instances or units of known, unbalanced-class membership measured on many (possibly thousands of) features of any type. This is a common situation when analyzing genetic biomedical data. The discriminant rule can then be used both as a means of explaining differences among classes and for the important task of assigning class membership to new unlabeled units. Our package implements two discriminant analysis procedures in an R environment: the well-known distance-based discriminant analysis (DB-discriminant) and a weighted-distance-based discriminant (WDB-discriminant), a novel classifier rule that we introduce. The new procedure improves on the DB rule by taking into account the statistical depth of the units. This article presents both classifying procedures and describes the implementation of each in detail. We illustrate the use of the package with an ecological and a genetic experimental example. Finally, we illustrate the effectiveness of the new procedure (WDB) compared with DB. This comparison is carried out using thirty-eight high-dimensional, class-unbalanced cancer data sets, three of which include clinical features.
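As a rough illustration of the distance-based idea the package builds on (a sketch in Python rather than R, and not the package's exact implementation), the DB rule assigns a unit to the class with the smallest proximity value: the mean squared distance from the unit to the class members, minus the class's geometric variability (half the mean of all pairwise squared distances within the class).

```python
import numpy as np

def proximity(x, class_data):
    """Proximity of x to a class: mean squared Euclidean distance from x
    to the class units, minus half the mean of all pairwise squared
    distances within the class (the class's geometric variability)."""
    d2_to_class = ((class_data - x) ** 2).sum(axis=1)
    pair = class_data[:, None, :] - class_data[None, :, :]
    d2_within = (pair ** 2).sum(axis=2)
    return d2_to_class.mean() - 0.5 * d2_within.mean()

def db_classify(x, samples):
    """Distance-based discriminant rule: assign x to the class (dict key)
    with the smallest proximity value."""
    return min(samples, key=lambda label: proximity(x, samples[label]))

rng = np.random.default_rng(3)
groups = {"healthy": rng.normal(0, 1, size=(30, 4)),
          "disease": rng.normal(3, 1, size=(30, 4))}
print(db_classify(np.full(4, 0.2), groups))  # "healthy"
```

With Euclidean distances this reduces to a nearest-centroid rule; the interest of the distance-based formulation is that any dissimilarity appropriate to the feature types can be substituted, which is what makes the approach workable for mixed-type genetic data.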
FEMDA: a unified framework for discriminant analysis
Although linear and quadratic discriminant analysis are widely recognized
classical methods, they can encounter significant challenges when dealing with
non-Gaussian distributions or contaminated datasets. This is primarily due to
their reliance on the Gaussian assumption, which lacks robustness. We first
explain and review the classical methods to address this limitation and then
present a novel approach that overcomes these issues. In this new approach, the
model considered is an arbitrary Elliptically Symmetrical (ES) distribution per
cluster with its own arbitrary scale parameter. This flexible model allows for
potentially diverse and independent samples that may not follow identical
distributions. By deriving a new decision rule, we demonstrate that
maximum-likelihood parameter estimation and classification are simple,
efficient, and robust compared to state-of-the-art methods.
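For context, the classical plug-in quadratic discriminant rule that the abstract contrasts with assigns x to the class k maximizing log pi_k - (1/2) log|Sigma_k| - (1/2)(x - mu_k)^T Sigma_k^{-1} (x - mu_k). A minimal sketch of that Gaussian baseline (not the FEMDA rule itself) makes the reliance on estimated per-class means and covariances explicit:

```python
import numpy as np

def qda_fit(X, y):
    """Estimate per-class mean, covariance, and prior for plug-in QDA."""
    params = {}
    for k in np.unique(y):
        Xk = X[y == k]
        params[k] = (Xk.mean(axis=0),
                     np.cov(Xk, rowvar=False),
                     len(Xk) / len(X))
    return params

def qda_predict(params, x):
    """Assign x to the class maximizing the Gaussian log-discriminant score."""
    def score(mu, S, prior):
        diff = x - mu
        return (np.log(prior) - 0.5 * np.linalg.slogdet(S)[1]
                - 0.5 * diff @ np.linalg.solve(S, diff))
    return max(params, key=lambda k: score(*params[k]))

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(100, 2)),
               rng.normal(5, 1, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)
p = qda_fit(X, y)
print(qda_predict(p, np.array([0.0, 0.0])))  # 0
print(qda_predict(p, np.array([5.0, 5.0])))  # 1
```

The sample mean and covariance used here are exactly the non-robust estimates the abstract criticizes: a few contaminated points can shift them arbitrarily, which motivates replacing the Gaussian model with per-cluster elliptically symmetric distributions with their own scale parameters.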