Pattern Recognition for Conditionally Independent Data
In this work we consider the task of relaxing the i.i.d. assumption in pattern
recognition (or classification), aiming to make existing learning algorithms
applicable to a wider range of tasks. Pattern recognition is guessing a
discrete label of some object based on a set of given examples (pairs of
objects and labels). We consider the case of deterministically defined labels.
Traditionally, this task is studied under the assumption that examples are
independent and identically distributed. However, it turns out that many
results of pattern recognition theory carry over to a weaker assumption:
the objects are conditionally independent and identically distributed given
the labels, while the only assumption on the distribution of labels is that
the rate of occurrence of each label is bounded below by some positive
threshold. We find a broad class of learning algorithms for which estimates
of the probability of a classification error obtained under the classical
i.i.d. assumption can be generalised to similar estimates for the case of
conditionally i.i.d. examples.
Comment: parts of results published at ALT'04 and ICML'0
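To make the relaxed assumption concrete, here is a minimal sketch of a conditionally i.i.d. data source: the labels follow a dependent (Markov) process whose long-run label rates stay above a positive threshold, while objects are drawn i.i.d. given their labels. The two-state Gaussian model and all names here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_labels(n, p_stay=0.8):
    """Two-state Markov chain: labels are dependent (not i.i.d.), but the
    long-run rate of each label stays above a positive threshold."""
    y = np.empty(n, dtype=int)
    y[0] = rng.integers(2)
    for t in range(1, n):
        y[t] = y[t - 1] if rng.random() < p_stay else 1 - y[t - 1]
    return y

def sample_objects(y, means=(-1.0, 1.0), sigma=1.0):
    """Given the label sequence, each object is drawn independently from
    its class-conditional distribution: conditionally i.i.d."""
    return rng.normal(loc=np.take(means, y), scale=sigma)

y = sample_labels(10_000)
x = sample_objects(y)
# The (x, y) pairs are not i.i.d. (the labels are Markov), yet the
# conditional i.i.d. assumption of the paper holds.
```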
Exact Distribution-Free Hypothesis Tests for the Regression Function of Binary Classification via Conditional Kernel Mean Embeddings
In this paper we suggest two statistical hypothesis tests for the regression
function of binary classification based on conditional kernel mean embeddings.
The regression function is a fundamental object in classification as it
determines both the Bayes optimal classifier and the misclassification
probabilities. A resampling based framework is presented and combined with
consistent point estimators of the conditional kernel mean map, in order to
construct distribution-free hypothesis tests. These tests are introduced in a
flexible manner allowing us to control the exact probability of type I error
for any sample size. We also prove that both proposed techniques are consistent
under weak statistical assumptions, i.e., the type II error probabilities
converge pointwise to zero.
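For context, the regression function referred to above and the Bayes classifier it determines admit the standard textbook definitions (written here in generic notation for labels in {0, 1}; the paper's own notation may differ):

```latex
r(x) \,=\, \mathbb{P}(Y = 1 \mid X = x),
\qquad
g^{*}(x) \,=\, \mathbb{I}\{\, r(x) \geq \tfrac{1}{2} \,\}.
```

The misclassification probability of any classifier is likewise determined by r, since the Bayes error equals \(\mathbb{E}[\min(r(X),\, 1 - r(X))]\), which is why tests about r are informative for the whole classification problem.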
Optimal Clustering under Uncertainty
Classical clustering algorithms typically either lack an underlying
probability framework to make them predictive or focus on parameter estimation
rather than defining and minimizing a notion of error. Recent work addresses
these issues by developing a probabilistic framework based on the theory of
random labeled point processes and characterizing a Bayes clusterer that
minimizes the number of misclustered points. The Bayes clusterer is analogous
to the Bayes classifier. Whereas determining a Bayes classifier requires full
knowledge of the feature-label distribution, deriving a Bayes clusterer
requires full knowledge of the point process. When uncertain of the point
process, one would like to find a robust clusterer that is optimal over the
uncertainty, just as one may find optimal robust classifiers with uncertain
feature-label distributions. Herein, we derive an optimal robust clusterer by
first finding an effective random point process that incorporates all
randomness within its own probabilistic structure and from which a Bayes
clusterer can be derived that provides an optimal robust clusterer relative to
the uncertainty. This is analogous to the use of effective class-conditional
distributions in robust classification. After evaluating the performance of
robust clusterers in synthetic mixtures of Gaussians models, we apply the
framework to granular imaging, where we make use of the asymptotic
granulometric moment theory for granular images to relate robust clustering
theory to the application.
Comment: 19 pages, 5 eps figures, 1 table
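As a toy illustration of the "effective" construction (an assumed two-component Gaussian model with a prior over unknown means; none of this code comes from the paper), one can average each class-conditional density over the uncertainty and then cluster points against the resulting effective densities:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Uncertainty model (assumed for illustration): the two cluster means are
# unknown, so we place a prior over them and draw Monte Carlo samples.
prior_means = rng.normal(loc=[-2.0, 2.0], scale=0.5, size=(500, 2))

def effective_density(x, component):
    """Effective class-conditional density: the component density averaged
    over the prior on its mean, echoing the effective random point process
    described above."""
    mus = prior_means[:, component]
    return norm.pdf(x[:, None], loc=mus[None, :], scale=1.0).mean(axis=1)

x = np.concatenate([rng.normal(-2, 1, 100), rng.normal(2, 1, 100)])
labels = (effective_density(x, 1) > effective_density(x, 0)).astype(int)
# Each point is assigned to the component with the larger effective
# density: a point-wise surrogate for robust clustering under uncertainty.
```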
Discrimination on the Grassmann Manifold: Fundamental Limits of Subspace Classifiers
We present fundamental limits on the reliable classification of linear and
affine subspaces from noisy, linear features. Drawing an analogy between
discrimination among subspaces and communication over vector wireless channels,
we propose two Shannon-inspired measures to characterize asymptotic classifier
performance. First, we define the classification capacity, which characterizes
necessary and sufficient conditions for the misclassification probability to
vanish as the signal dimension, the number of features, and the number of
subspaces to be discerned all approach infinity. Second, we define the
diversity-discrimination tradeoff which, by analogy with the
diversity-multiplexing tradeoff of fading vector channels, characterizes
relationships between the number of discernible subspaces and the
misclassification probability as the noise power approaches zero. We derive
upper and lower bounds on these measures which are tight in many regimes.
Numerical results, including a face recognition application, validate the
results in practice.
Comment: 19 pages, 4 figures. Revised submission to IEEE Transactions on Information Theory
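One way to write the underlying observation model and the flavor of the capacity definition (a schematic in our own notation, not the paper's exact formulation) is:

```latex
y = \Phi x + n, \qquad x \in \mathcal{S}_{\ell}
\ \ \text{for some unknown } \ell \in \{1, \dots, L\},
```

where the \(\mathcal{S}_{\ell}\) are the candidate subspaces, \(\Phi\) is the linear feature map, and \(n\) is additive noise. The classification capacity then measures the largest exponential growth rate of \(L\) in the number of features for which the misclassification probability \(\Pr[\hat{\ell} \neq \ell]\) can still be driven to zero, in analogy with the capacity of a vector channel.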