Search CORE

13,172 research outputs found

S2: An Efficient Graph Based Active Learning Algorithm with Application to Nonparametric Classification

Author: Dasarathy Gautam
Nowak Robert
Zhu Xiaojin
Publication venue
Publication date: 29/06/2015
Field of study

This paper investigates the problem of active learning for binary label prediction on a graph. We introduce a simple and label-efficient algorithm called S2 for this task. At each step, S2 selects the vertex to be labeled based on the structure of the graph and all previously gathered labels. Specifically, S2 queries for the label of the vertex that bisects the *shortest shortest* path between any pair of oppositely labeled vertices. We present a theoretical estimate of the number of queries S2 needs in terms of a novel parametrization of the complexity of binary functions on graphs. We also present experimental results demonstrating the performance of S2 on both real and synthetic data. While other graph-based active learning algorithms have shown promise in practice, our algorithm is the first with both good performance and theoretical guarantees. Finally, we demonstrate the implications of the S2 algorithm to the theory of nonparametric active learning. In particular, we show that S2 achieves near minimax optimal excess risk for an important class of nonparametric classification problems.Comment: A version of this paper appears in the Conference on Learning Theory (COLT) 201

arXiv.org e-Print Archive

CiteSeerX

Density-sensitive semisupervised inference

Author: Azizyan Martin
Singh Aarti
Wasserman Larry
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 24/05/2013
Field of study

Semisupervised methods are techniques for using labeled data

(X_1,Y_1),\ldots,(X_n,Y_n)

together with unlabeled data

X_{n+1},\ldots,X_N

to make predictions. These methods invoke some assumptions that link the marginal distribution

P_X

of X to the regression function f(x). For example, it is common to assume that f is very smooth over high density regions of

P_X

. Many of the methods are ad-hoc and have been shown to work in specific examples but are lacking a theoretical foundation. We provide a minimax framework for analyzing semisupervised methods. In particular, we study methods based on metrics that are sensitive to the distribution

P_X

. Our model includes a parameter

\alpha

that controls the strength of the semisupervised assumption. We then use the data to adapt to

\alpha

.Comment: Published in at http://dx.doi.org/10.1214/13-AOS1092 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

Crossref

Bitwise Source Separation on Hashed Spectra: An Efficient Posterior Estimation Scheme Using Partial Rank Order Metrics

Author: Guo Lijiang
Kim Minje
Publication venue
Publication date: 01/12/2017
Field of study

This paper proposes an efficient bitwise solution to the single-channel source separation task. Most dictionary-based source separation algorithms rely on iterative update rules during the run time, which becomes computationally costly especially when we employ an overcomplete dictionary and sparse encoding that tend to give better separation results. To avoid such cost we propose a bitwise scheme on hashed spectra that leads to an efficient posterior probability calculation. For each source, the algorithm uses a partial rank order metric to extract robust features that form a binarized dictionary of hashed spectra. Then, for a mixture spectrum, its hash code is compared with each source's hashed dictionary in one pass. This simple voting-based dictionary search allows a fast and iteration-free estimation of ratio masking at each bin of a signal spectrogram. We verify that the proposed BitWise Source Separation (BWSS) algorithm produces sensible source separation results for the single-channel speech denoising task, with 6-8 dB mean SDR. To our knowledge, this is the first dictionary based algorithm for this task that is completely iteration-free in both training and testing

arXiv.org e-Print Archive

Crossref

Generative Supervised Classification Using Dirichlet Process Priors.

Author: Davy Manuel
Tourneret Jean-Yves
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/2010
Field of study

Choosing the appropriate parameter prior distributions associated to a given Bayesian model is a challenging problem. Conjugate priors can be selected for simplicity motivations. However, conjugate priors can be too restrictive to accurately model the available prior information. This paper studies a new generative supervised classifier which assumes that the parameter prior distributions conditioned on each class are mixtures of Dirichlet processes. The motivations for using mixtures of Dirichlet processes is their known ability to model accurately a large class of probability distributions. A Monte Carlo method allowing one to sample according to the resulting class-conditional posterior distributions is then studied. The parameters appearing in the class-conditional densities can then be estimated using these generated samples (following Bayesian learning). The proposed supervised classifier is applied to the classification of altimetric waveforms backscattered from different surfaces (oceans, ices, forests, and deserts). This classification is a first step before developing tools allowing for the extraction of useful geophysical information from altimetric waveforms backscattered from nonoceanic surfaces

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte