A Regression-based K nearest neighbor algorithm for gene function prediction from heterogeneous data

Author: A Enright
A Gavin
A Grigoriev
A Hoerl
AJ Dobson
EG WS Cleveland
G GH
GRG Lanckriet
H Ge
M Deng
M Eisen
M Fellenberg
MPS Brown
O Troyanskaya
P Liang
P Pavlidis
R Overbeek
R Tibshirani
Walter L Ruzzo
WS Noble
Y Zheng
Zizhen Yao
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: As a variety of functional genomic and proteomic techniques become available, there is an increasing need for functional analysis methodologies that integrate heterogeneous data sources. METHODS: In this paper, we address this issue by proposing a general framework for gene function prediction based on the k-nearest-neighbor (KNN) algorithm. The choice of KNN is motivated by its simplicity, its flexibility to incorporate different data types, and its adaptability to irregular feature spaces. A weakness of traditional KNN methods, especially when handling heterogeneous data, is that performance is subject to the often ad hoc choice of similarity metric. To address this weakness, we apply regression methods to infer a similarity metric as a weighted combination of a set of base similarity measures, which helps to locate the neighbors that are most likely to be in the same class as the target gene. We also suggest a novel voting scheme to generate confidence scores that estimate the accuracy of predictions. The method gracefully extends to multi-way classification problems. RESULTS: We apply this technique to gene function prediction according to three well-known Escherichia coli classification schemes suggested by biologists, using information derived from microarray and genome sequencing data. We demonstrate that our algorithm dramatically outperforms the naive KNN methods and is competitive with support vector machine (SVM) algorithms for integrating heterogeneous data. We also show that by combining different data sources, prediction accuracy can improve significantly. CONCLUSION: Our extension of KNN with automatic feature weighting, multi-class prediction, and probabilistic inference enhances prediction accuracy significantly while remaining efficient, intuitive and flexible. This general framework can also be applied to similar classification problems involving heterogeneous datasets.
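
The idea in the METHODS part of this abstract, a similarity metric learned by regression as a weighted combination of base similarity measures and then used for KNN voting, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say which regression model or voting scheme is used, so ordinary least squares and a plain vote fraction stand in for them here, and the names `learn_weights` and `knn_predict` are hypothetical.

```python
import numpy as np

def learn_weights(base_sims, labels):
    """Fit one weight per base similarity measure by least squares.

    base_sims: list of (n, n) similarity matrices over the training items.
    The regression target for a pair (i, j) is 1 if the two items share a
    label and 0 otherwise (an assumed target, not taken from the paper).
    """
    labels = np.asarray(labels)
    i, j = np.triu_indices(len(labels), k=1)      # all unordered pairs
    X = np.stack([s[i, j] for s in base_sims], axis=1)
    X = np.column_stack([X, np.ones(len(i))])     # intercept column
    y = (labels[i] == labels[j]).astype(float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[:-1]                              # drop the intercept

def knn_predict(sims_to_train, weights, labels, k=3):
    """Vote among the k training items most similar to a query.

    sims_to_train: list of 1-D arrays, one per base measure, holding the
    query's similarity to every training item.
    Returns (predicted label, fraction of the k votes it received).
    """
    labels = np.asarray(labels)
    combined = sum(w * s for w, s in zip(weights, sims_to_train))
    nearest = np.argsort(-combined)[:k]           # highest similarity first
    classes, counts = np.unique(labels[nearest], return_counts=True)
    best = np.argmax(counts)
    return classes[best], counts[best] / k
```

In this sketch an uninformative base measure receives a weight near zero, so it has little influence on which neighbors are selected; the vote fraction is only a crude placeholder for the confidence scores the abstract mentions.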

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Weighted k-Nearest-Neighbor Techniques and Ordinal Classification

Author: Hechenbichler K.
Schliep K.
Publication date: 01/01/2004

In the field of statistical discrimination, k-nearest neighbor classification is a well-known, easy and successful method. In this paper we present an extended version of this technique, where the distances of the nearest neighbors can be taken into account. In this sense there is a close connection to LOESS, a local regression technique. In addition, we show possibilities to use nearest neighbor methods for classification in the case of an ordinal class structure. Empirical studies show the advantages of the new techniques.
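
A distance-weighted vote of the kind this abstract describes might look like the following Python sketch. The specifics are assumptions made for illustration and are not stated in the abstract: Euclidean distance, a triangular kernel, and scaling by the distance to the (k+1)-th neighbor. The function name `weighted_knn_predict` is hypothetical, and the training set is assumed to hold at least k+1 points.

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x, k=3):
    """Classify x by a kernel-weighted vote among its k nearest neighbors."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    d = np.linalg.norm(X_train - np.asarray(x, dtype=float), axis=1)
    order = np.argsort(d)[:k + 1]        # k neighbors plus one for scaling
    scale = d[order[-1]]
    if scale == 0:                       # all k+1 neighbors coincide with x
        scale = 1.0
    neighbors = order[:k]
    w = 1.0 - d[neighbors] / scale       # triangular kernel, weights in [0, 1]
    classes = np.unique(y_train)
    scores = np.array([w[y_train[neighbors] == c].sum() for c in classes])
    return classes[np.argmax(scores)]
```

With training points 0.0 (A), 1.0 (B), 1.1 (B), 1.2 (A) and a query at 0.1, an unweighted 3-nearest-neighbor vote favors B by two to one, while the weighted vote favors A because the single A neighbor is far closer than the two B neighbors.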

Open Access LMU

Low-shot learning with large-scale diffusion

Author: Douze Matthijs
Hariharan Bharath
Jégou Hervé
Szlam Arthur
Publication date: 15/06/2018

This paper considers the problem of inferring image labels when only a few annotated examples are available at training time. This setup is often referred to as low-shot learning, where a standard approach is to re-train the last few layers of a convolutional neural network learned on separate classes for which training examples are abundant. We consider a semi-supervised setting based on a large collection of images to support label propagation. This is possible by leveraging recent advances in large-scale similarity graph construction. We show that, despite its conceptual simplicity, scaling label propagation up to hundreds of millions of images leads to state-of-the-art accuracy in the low-shot learning regime.
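
Label propagation over a similarity graph, the mechanism this abstract relies on, can be sketched at toy scale in Python. This is a generic dense-matrix variant written under stated assumptions (cosine similarity, a symmetrized k-nearest-neighbor graph, seed labels clamped after each step, nonzero feature vectors); it is not the paper's large-scale method, and `propagate_labels` is a hypothetical name.

```python
import numpy as np

def propagate_labels(features, seed_labels, n_classes, k=2, n_iter=20):
    """Spread a few seed labels over a k-nearest-neighbor similarity graph.

    seed_labels: integer array with a class index for annotated items
    and -1 for unlabeled items.
    """
    X = np.asarray(features, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)   # unit-length rows
    sim = X @ X.T                                      # cosine similarity
    np.fill_diagonal(sim, -np.inf)                     # no self-edges
    n = len(X)
    nn = np.argsort(-sim, axis=1)[:, :k]               # k nearest per item
    rows = np.repeat(np.arange(n), k)
    W = np.zeros((n, n))
    W[rows, nn.ravel()] = np.maximum(sim[rows, nn.ravel()], 0.0)
    W = np.maximum(W, W.T)                             # symmetrize the graph
    W = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)  # row-normalize

    seed_labels = np.asarray(seed_labels)
    seeded = seed_labels >= 0
    L0 = np.zeros((n, n_classes))
    L0[seeded, seed_labels[seeded]] = 1.0
    L = L0.copy()
    for _ in range(n_iter):
        L = W @ L                # each item averages its neighbors' scores
        L[seeded] = L0[seeded]   # keep the annotated items fixed
    return L.argmax(axis=1)
```

The dense n-by-n matrix here is only workable for small collections; at the scale the abstract mentions, the graph would have to be sparse and built with approximate nearest-neighbor search.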

arXiv.org e-Print Archive

Crossref

Combining inertial and visual sensing for human action recognition in tennis

Author: Connaghan Damien
Kelly Philip
O'Connor Noel E.
Ó Conaire Ciarán
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2010

In this paper, we present a framework for both the automatic extraction of the temporal location of tennis strokes within a match and the subsequent classification of these strokes as a serve, forehand or backhand. We use low-cost visual sensing and low-cost inertial sensing to achieve these aims: a single modality can be used alone, or the two classification strategies can be fused when both modalities are available within a given capture scenario. This flexibility allows the framework to be applied to a variety of user scenarios and hardware infrastructures. Our proposed approach is quantitatively evaluated using data captured from elite tennis players. Results point to the extremely accurate performance of the proposed approach irrespective of input modality configuration.

Crossref

Irish Universities

DCU Online Research Access Service