Shortest path distance in random k-nearest neighbor graphs
Consider a weighted or unweighted k-nearest neighbor graph that has been
built on n data points drawn randomly according to some density p on R^d. We
study the convergence of the shortest path distance in such graphs as the
sample size tends to infinity. We prove that for unweighted kNN graphs, this
distance converges to an unpleasant distance function on the underlying space
whose properties are detrimental to machine learning. We also study the
behavior of the shortest path distance in weighted kNN graphs.
Comment: Appears in Proceedings of the 29th International Conference on
Machine Learning (ICML 2012)
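To make the object of study concrete, the following is a minimal sketch of the shortest path (hop count) distance in an unweighted kNN graph. It is an illustration under stated assumptions, not the paper's construction: the uniform density on the unit square, the sample size of 200, k = 8, the symmetric neighbor rule, and the helper names `knn_graph` and `hop_distances` are all chosen here for the example.

```python
import math
import random
from collections import deque


def knn_graph(points, k):
    """Symmetric unweighted kNN graph (assumed rule: i and j are adjacent
    if either one is among the other's k nearest neighbors)."""
    n = len(points)
    adj = [set() for _ in range(n)]
    for i in range(n):
        # Sort all other points by Euclidean distance to point i.
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        for j in others[:k]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def hop_distances(adj, source):
    """Breadth-first search: number of edges on a shortest path from source.
    Vertices unreachable from source are absent from the result."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Hypothetical sample: 200 points drawn uniformly from the unit square.
random.seed(0)
points = [(random.random(), random.random()) for _ in range(200)]
adj = knn_graph(points, k=8)
dist = hop_distances(adj, source=0)
```

In the unweighted graph every edge counts as one hop regardless of its Euclidean length. A weighted variant would replace the breadth-first search with Dijkstra's algorithm over edge lengths.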
Semi-supervised transductive speaker identification
We present an application of transductive semi-supervised learning to the problem of speaker identification. Formulating this problem as one of transduction is the most natural choice in some scenarios, such as when annotating archived speech data. Experiments with the CHAINS corpus show that, using the basic MFCC encoding of recorded utterances, a well-known simple semi-supervised algorithm, label spread, can solve this problem well. With only a small number of labelled utterances, the semi-supervised algorithm drastically outperforms a state-of-the-art supervised support vector machine algorithm. Although we restrict ourselves to the transductive setting in this paper, the results encourage future work on semi-supervised learning for inductive speaker identification.
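As a rough illustration of the transductive setting, here is a minimal sketch that assumes "label spread" refers to the label spreading iteration F ← αSF + (1 − α)Y of Zhou et al. The toy 2-D points stand in for MFCC feature vectors, and the function name `label_spreading` and the values of `sigma`, `alpha`, and `n_iter` are hypothetical choices for this example, not settings reported in the paper.

```python
import math


def label_spreading(points, labels, n_classes, sigma=0.5, alpha=0.9, n_iter=50):
    """Propagate labels over a Gaussian affinity graph.
    labels[i] is a class index, or None for an unlabeled point."""
    n = len(points)
    # Gaussian affinity matrix with zero diagonal.
    W = [[0.0 if i == j else
          math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
          for j in range(n)] for i in range(n)]
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    d = [sum(row) for row in W]
    S = [[W[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]
    # One-hot initial label matrix Y; rows of unlabeled points are all zero.
    Y = [[1.0 if labels[i] == c else 0.0 for c in range(n_classes)]
         for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(n_iter):
        # F <- alpha * S F + (1 - alpha) * Y
        F = [[alpha * sum(S[i][j] * F[j][c] for j in range(n))
              + (1 - alpha) * Y[i][c]
              for c in range(n_classes)] for i in range(n)]
    # Each point takes the class with the largest score.
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]


# Two well-separated toy clusters with one labeled point in each.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (3.0, 3.0), (3.1, 2.9), (2.9, 3.2)]
labels = [0, None, None, 1, None, None]
predicted = label_spreading(points, labels, n_classes=2)
```

All points, labeled and unlabeled, are available when the graph is built, which is what makes the procedure transductive: it assigns labels to this fixed set rather than learning a classifier for unseen inputs.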
Semi-Supervised Radio Signal Identification
Radio emitter recognition in dense multi-user environments is an important
tool for optimizing spectrum utilization, identifying and minimizing
interference, and enforcing spectrum policy. Radio data is readily available
and easy to obtain from an antenna, but labeled and curated data is often
scarce, making supervised learning strategies difficult and time-consuming in
practice. We demonstrate that semi-supervised learning techniques can be used
to scale learning beyond supervised datasets, allowing for discerning and
recalling new radio signals by using sparse signal representations based on
both unsupervised and supervised methods for nonlinear feature learning and
clustering methods.