14,431 research outputs found

Adaptive kNN using Expected Accuracy for Classification of Geo-Spatial Data

Authors: Atzmueller Martin, Becker Martin, Hotho Andreas, Kibanov Mark, Mueller Juergen, Stumme Gerd
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 14/12/2017

The k-Nearest Neighbor (kNN) classification approach is conceptually simple, yet widely applied since it often performs well in practical applications. However, using a global constant k does not always provide an optimal solution, e.g., for datasets with an irregular density distribution of data points. This paper proposes an adaptive kNN classifier where k is chosen dynamically for each instance (point) to be classified, such that the expected accuracy of classification is maximized. We define the expected accuracy as the accuracy of a set of structurally similar observations. An arbitrary similarity function can be used to find these observations. We introduce and evaluate different similarity functions. For the evaluation, we use five different classification tasks based on geo-spatial data. Each classification task consists of (tens of) thousands of items. We demonstrate that the presented expected accuracy measures can be a good estimator for kNN performance, and that the proposed adaptive kNN classifier outperforms common kNN and previously introduced adaptive kNN algorithms. We also show that the range of considered k can be significantly reduced to speed up the algorithm without negative influence on classification accuracy.
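
The idea of choosing k per instance can be illustrated with a minimal sketch. It is not the authors' implementation: the similarity function here is assumed to be plain Euclidean proximity (the m training points closest to the query), the expected accuracy of each candidate k is estimated by leave-one-out kNN on those points, and the names `knn_predict`, `adaptive_knn_predict`, `candidate_ks`, and `m` are hypothetical.

```python
from collections import Counter
import math


def knn_predict(train, query, k, skip=None):
    """Majority vote among the k nearest training points (Euclidean distance).

    `train` is a list of (point, label) pairs; `skip` optionally excludes one
    training index, which allows leave-one-out evaluation.
    """
    neighbors = sorted(
        (math.dist(x, query), i, label)
        for i, (x, label) in enumerate(train)
        if i != skip
    )[:k]
    votes = Counter(label for _, _, label in neighbors)
    return votes.most_common(1)[0][0]


def adaptive_knn_predict(train, query, candidate_ks=(1, 3, 5), m=5):
    """Pick the k with the highest estimated accuracy, then classify the query.

    Assumption: "structurally similar observations" are approximated by the
    m training points nearest to the query.
    """
    similar = sorted(
        range(len(train)), key=lambda i: math.dist(train[i][0], query)
    )[:m]
    best_k, best_acc = candidate_ks[0], -1.0
    for k in candidate_ks:
        # Leave-one-out accuracy of kNN with this k on the similar points.
        hits = sum(
            knn_predict(train, train[i][0], k, skip=i) == train[i][1]
            for i in similar
        )
        acc = hits / len(similar)
        if acc > best_acc:  # strict comparison keeps the smallest k on ties
            best_k, best_acc = k, acc
    return knn_predict(train, query, best_k), best_k


train = [
    ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"), ((1, 1), "a"),
    ((10, 10), "b"), ((10, 11), "b"), ((11, 10), "b"), ((11, 11), "b"),
]
label, chosen_k = adaptive_knn_predict(train, (0.5, 0.5), m=4)
```

Restricting `candidate_ks` to a short range corresponds to the abstract's remark that the range of considered k can be reduced to speed up the algorithm.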

arXiv.org e-Print Archive

Tilburg University Repository

Maximum Margin Multiclass Nearest Neighbors

Authors: Kontorovich Aryeh, Weiss Roi
Publication date: 01/01/2014

We develop a general framework for margin-based multicategory classification in metric spaces. The basic work-horse is a margin-regularized version of the nearest-neighbor classifier. We prove generalization bounds that match the state of the art in sample size $n$ and significantly improve the dependence on the number of classes $k$. Our point of departure is a nearly Bayes-optimal finite-sample risk bound independent of $k$. Although $k$-free, this bound is unregularized and non-adaptive, which motivates our main result: Rademacher and scale-sensitive margin bounds with a logarithmic dependence on $k$. As the best previous risk estimates in this setting were of order $\sqrt{k}$, our bound is exponentially sharper. From the algorithmic standpoint, in doubling metric spaces our classifier may be trained on $n$ examples in $O(n^2 \log n)$ time and evaluated on new points in $O(\log n)$ time.

arXiv.org e-Print Archive

CiteSeerX

Adaptive imputation of missing values for incomplete pattern classification

Authors: Dezert Jean, Liu Zhun-Ga, Martin Arnaud, Pan Quan
Publication venue: 'Elsevier BV'
Publication date: 08/02/2016

In the classification of incomplete patterns, the missing values can either play a crucial role in the class determination, or have only little influence (or possibly none) on the classification results, depending on the context. We propose a credal classification method for incomplete patterns with adaptive imputation of missing values based on belief function theory. First, we try to classify the object (incomplete pattern) based only on the available attribute values. As an underlying principle, we assume that the missing information is not crucial for the classification if a specific class for the object can be found using only the available information. In this case, the object is committed to this particular class. However, if the object cannot be classified without ambiguity, it means that the missing values play a major role in achieving an accurate classification. In this case, the missing values are imputed based on the K-nearest neighbor (K-NN) and self-organizing map (SOM) techniques, and the edited pattern with the imputation is then classified. The (original or edited) pattern is classified with respect to each training class, and the classification results, represented by basic belief assignments, are fused with proper combination rules to make the credal classification. The object is allowed to belong with different masses of belief to the specific classes and meta-classes (which are particular disjunctions of several single classes). The credal classification captures well the uncertainty and imprecision of classification, and effectively reduces the rate of misclassifications thanks to the introduction of meta-classes. The effectiveness of the proposed method with respect to other classical methods is demonstrated based on several experiments using artificial and real data sets.
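
The "classify first, impute only when ambiguous" control flow described in this abstract can be sketched as follows. This is a heavily simplified illustration and not the authors' method: belief functions, meta-classes, and the SOM step are all omitted, ambiguity is decided by a hypothetical distance-ratio rule (`ratio`), imputation is a plain mean over the k nearest patterns, and the training patterns are assumed to be complete. All function and parameter names are hypothetical.

```python
import math


def partial_dist(x, y):
    """Euclidean distance over the attributes that are present (not None) in x."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y) if a is not None))


def knn_impute(train, x, k=3):
    """Fill each missing attribute of x with the mean over its k nearest patterns.

    Assumption: every training pattern is complete.
    """
    nearest = sorted(train, key=lambda item: partial_dist(x, item[0]))[:k]
    return tuple(
        a if a is not None else sum(p[j] for p, _ in nearest) / len(nearest)
        for j, a in enumerate(x)
    )


def class_distances(train, x):
    """Distance from x to the nearest training pattern of each class."""
    best = {}
    for p, label in train:
        best[label] = min(partial_dist(x, p), best.get(label, math.inf))
    return best


def classify_adaptive(train, x, ratio=0.5, k=3):
    """Return (label, imputed). Impute only if the available attributes leave
    the class ambiguous under the assumed distance-ratio rule."""
    ranked = sorted(class_distances(train, x).items(), key=lambda kv: kv[1])
    clear = len(ranked) == 1 or ranked[0][1] <= ratio * ranked[1][1]
    if None not in x or clear:
        return ranked[0][0], False
    filled = knn_impute(train, x, k)
    ranked = sorted(class_distances(train, filled).items(), key=lambda kv: kv[1])
    return ranked[0][0], True


train = [
    ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
    ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b"),
]
clear_case = classify_adaptive(train, (0.2, None))      # first attribute suffices
ambiguous_case = classify_adaptive(train, (2.9, None))  # triggers imputation
```

In the first call the available attribute alone separates the classes, so no imputation takes place; in the second the two classes are almost equally close, so the missing value is filled in before the final decision.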

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Rennes 1