30,622 research outputs found

Weighted k-Nearest-Neighbor Techniques and Ordinal Classification

Author: Hechenbichler K.
Schliep K.
Publication date: 01/01/2004

In the field of statistical discrimination, k-nearest neighbor classification is a well-known, easy, and successful method. In this paper we present an extended version of this technique in which the distances of the nearest neighbors are taken into account. In this sense there is a close connection to LOESS, a local regression technique. In addition, we show how nearest neighbor methods can be used for classification when the classes have an ordinal structure. Empirical studies show the advantages of the new techniques.

Open Access LMU
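
The abstract above describes letting neighbor distances influence the vote. A minimal Python sketch of that idea follows. The triangular kernel and the scaling of distances by the (k+1)-th neighbor are assumptions made for illustration, not details stated in this listing, and `weighted_knn_predict` is a hypothetical name.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train_x, train_y, query, k=3):
    """Predict a class by summing kernel weights over the k nearest neighbors.

    Assumptions (not taken from the abstract): Euclidean distance,
    distances scaled by the (k+1)-th neighbor, triangular kernel 1 - d.
    """
    if len(train_x) < k + 1:
        raise ValueError("need at least k + 1 training points")

    # Sort training points by distance to the query.
    ranked = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    dists = [math.dist(x, query) for x, _ in ranked]

    # The (k+1)-th neighbor's distance normalizes the first k distances.
    scale = dists[k]

    scores = defaultdict(float)
    for (_, label), d in zip(ranked[:k], dists[:k]):
        weight = 1.0 if scale == 0 else 1.0 - d / scale
        scores[label] += weight

    # The class with the largest total weight wins.
    return max(scores, key=scores.get)


train_x = [(0.0, 0.0), (3.0, 0.0), (3.1, 0.0), (4.0, 0.0)]
train_y = ["a", "b", "b", "b"]
prediction = weighted_knn_predict(train_x, train_y, (0.5, 0.0), k=3)
```

In this toy data an unweighted 3-nearest-neighbor vote would favor "b" two to one, while the distance weights favor the single, much closer "a" point.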

Nearest Neighbor and Kernel Survival Analysis: Nonasymptotic Error Bounds and Strong Consistency Rates

Author: Chen George H.
Publication date: 13/05/2019

We establish the first nonasymptotic error bounds for Kaplan-Meier-based nearest neighbor and kernel survival probability estimators where feature vectors reside in metric spaces. Our bounds imply rates of strong consistency for these nonparametric estimators and, up to a log factor, match an existing lower bound for conditional CDF estimation. Our proof strategy also yields nonasymptotic guarantees for nearest neighbor and kernel variants of the Nelson-Aalen cumulative hazards estimator. We experimentally compare these methods on four datasets. We find that for the kernel survival estimator, a good choice of kernel is one learned using random survival forests. Comment: International Conference on Machine Learning (ICML 2019)

arXiv.org e-Print Archive
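
As a rough illustration of a Kaplan-Meier-based nearest neighbor survival estimator, the sketch below applies the standard Kaplan-Meier product formula to the k training points closest to a query. The function name, the Euclidean metric, and the toy data are assumptions for illustration and are not taken from the paper.

```python
import math


def knn_kaplan_meier(train_x, times, events, query, k, t):
    """Estimate S(t | query) with Kaplan-Meier on the k nearest neighbors.

    times[i]  : observed time for subject i (event or censoring time)
    events[i] : 1 if the event was observed, 0 if the subject was censored
    """
    # Indices of the k training points closest to the query.
    order = sorted(
        range(len(train_x)),
        key=lambda i: math.dist(train_x[i], query),
    )
    neighbors = order[:k]

    # Distinct event times among the neighbors, up to and including t.
    event_times = sorted(
        {times[i] for i in neighbors if events[i] and times[i] <= t}
    )

    survival = 1.0
    for s in event_times:
        at_risk = sum(1 for i in neighbors if times[i] >= s)
        deaths = sum(1 for i in neighbors if events[i] and times[i] == s)
        survival *= 1.0 - deaths / at_risk
    return survival


train_x = [(0.0,), (0.1,), (0.2,), (0.3,), (10.0,)]
times = [2, 3, 4, 5, 1]
events = [1, 0, 1, 1, 1]
estimate = knn_kaplan_meier(train_x, times, events, (0.0,), k=4, t=4)
```

Here the far-away fifth point is excluded, and the censored subject at time 3 only reduces the at-risk count for later event times.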

A Novel Clustering Algorithm Based on Quantum Games

Author: Asuncion A Newman D J
Aïmeur E Brassard G Gambs S
Barabási A L
Du J
Du J Li H Xu X Shi M Zhou X Han R
Erkan G
Fudenberg D
Jing-ping Jiang
MacQueen J
Nielsen M A
Prevedel R
Qiang Li
Schmid C Flitney A P Wieczorek W Kiesel N Weinfurter H Hollenberg L C L
Yan He
Publication venue: 'IOP Publishing'
Publication date: 01/01/2008

Quantum algorithms have achieved enormous successes during the last decade. In this paper, we combine quantum games with the problem of data clustering and develop a quantum-game-based clustering algorithm, in which the data points in a dataset are treated as players who can make decisions and implement quantum strategies in quantum games. After each round of a quantum game, each player's expected payoff is calculated. The player then uses a link-removing-and-rewiring (LRR) function to change its neighbors and adjust the strength of the links connecting to them in order to maximize its payoff. The algorithms are discussed and analyzed for two cases of strategies, two payoff matrices, and two LRR functions. The simulation results demonstrate that data points are clustered reasonably and efficiently, and that the clustering algorithms converge quickly. A comparison with other algorithms also indicates the effectiveness of the proposed approach. Comment: 19 pages, 5 figures, 5 tables

arXiv.org e-Print Archive

CiteSeerX

Crossref

Methods to integrate a language model with semantic information for a word prediction component

Author: Antoine Jean-Yves
Wandmacher Tonio
Publication date: 01/06/2007

Most current word prediction systems use n-gram language models (LM) to estimate the probability of the following word in a phrase. In recent years there have been many attempts to enrich such language models with further syntactic or semantic information. We want to explore the predictive power of Latent Semantic Analysis (LSA), a method that has been shown to provide reliable information on long-distance semantic dependencies between words in a context. We present and evaluate several methods that integrate LSA-based information with a standard language model: a semantic cache, partial reranking, and different forms of interpolation. We found that all methods show significant improvements compared to the 4-gram baseline, and most of them compared to a simple cache model as well. Comment: 10 pages; EMNLP'2007 Conference (Prague)

arXiv.org e-Print Archive

HAL Université de Tours
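
One of the integration methods named in the abstract above is interpolation. The sketch below shows plain linear interpolation between an n-gram distribution and an LSA-derived distribution. The conversion from cosine similarities to probabilities (clip negatives, then normalize), the weight `lam`, and all names are assumptions for illustration; the paper may use a different scheme.

```python
def similarities_to_probs(sims):
    """Turn word-to-context similarities into a probability distribution.

    Assumption: negative similarities are clipped to zero before
    normalizing; this is one simple choice among several.
    """
    clipped = {w: max(s, 0.0) for w, s in sims.items()}
    total = sum(clipped.values())
    if total == 0:
        # No positive evidence: fall back to a uniform distribution.
        return {w: 1.0 / len(sims) for w in sims}
    return {w: s / total for w, s in clipped.items()}


def interpolate(p_ngram, p_lsa, lam=0.7):
    """Linear interpolation: lam * P_ngram(w) + (1 - lam) * P_lsa(w)."""
    vocab = set(p_ngram) | set(p_lsa)
    return {
        w: lam * p_ngram.get(w, 0.0) + (1.0 - lam) * p_lsa.get(w, 0.0)
        for w in vocab
    }


p_ngram = {"bank": 0.5, "river": 0.25, "money": 0.25}
lsa_sims = {"bank": 0.5, "river": -0.2, "money": 0.5}
combined = interpolate(p_ngram, similarities_to_probs(lsa_sims), lam=0.5)
```

With these toy numbers, "money" gains probability mass relative to the n-gram model alone because the LSA similarities favor it over "river".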

Feature Selection and Weighting by Nearest Neighbor Ensembles

Author: Gertheiss Jan
Tutz Gerhard
Publication venue: 'Elsevier BV'
Publication date: 19/06/2008

In the field of statistical discrimination, nearest neighbor methods are a well-known, quite simple but successful nonparametric classification tool. In higher dimensions, however, predictive power normally deteriorates. In general, if some covariates are assumed to be noise variables, variable selection is a promising approach. The paper’s main focus is on the development and evaluation of a nearest neighbor ensemble with implicit variable selection. In contrast to other nearest neighbor approaches, we are not primarily interested in classification but in estimating the (posterior) class probabilities. In simulation studies and on real-world data, the proposed nearest neighbor ensemble is compared to an extended forward/backward variable selection procedure for nearest neighbor classifiers and to some alternative well-established classification tools (that offer probability estimates as well). Despite its simple structure, the proposed method performs quite well, especially if relevant covariates can be separated from noise variables. Another advantage of the presented ensemble is the easy identification of interactions that are usually hard to detect, so what is performed is not simply variable selection but rather a kind of feature selection. The paper is a preprint of an article published in Chemometrics and Intelligent Laboratory Systems. Please use the journal version for citation.

CiteSeerX

Open Access LMU
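
To make the idea of a nearest neighbor ensemble that estimates class probabilities more concrete, here is a minimal Python sketch. It assumes each ensemble member is a k-nearest-neighbor estimator on a single covariate and that members are combined with nonnegative weights, where a zero weight drops the covariate. These are illustrative assumptions, the weights are given by hand rather than estimated, and all names are hypothetical.

```python
from collections import Counter


def single_feature_knn_probs(train_x, train_y, query, feature, k, classes):
    """Class frequencies among the k nearest neighbors on one covariate."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: abs(train_x[i][feature] - query[feature]),
    )
    counts = Counter(train_y[i] for i in order[:k])
    return {c: counts[c] / k for c in classes}


def ensemble_probs(train_x, train_y, query, weights, k, classes):
    """Weighted average of single-covariate k-NN probability estimates."""
    total = sum(weights)
    probs = {c: 0.0 for c in classes}
    for feature, w in enumerate(weights):
        if w == 0:
            continue  # a zero weight removes this covariate entirely
        member = single_feature_knn_probs(
            train_x, train_y, query, feature, k, classes
        )
        for c in classes:
            probs[c] += (w / total) * member[c]
    return probs


# Covariate 0 separates the classes; covariate 1 is noise.
train_x = [(0.0, 5.0), (0.1, 1.0), (1.0, 5.1), (1.1, 0.9)]
train_y = ["a", "a", "b", "b"]
classes = ["a", "b"]
query = (0.05, 5.05)
selected = ensemble_probs(train_x, train_y, query, [1, 0], 2, classes)
unselected = ensemble_probs(train_x, train_y, query, [1, 1], 2, classes)
```

Giving the noise covariate zero weight sharpens the estimated probability for class "a", which is the effect that implicit variable selection aims for.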