16,909 research outputs found

    When can unlabeled data improve the learning rate?

    Göpfert C, Ben-David S, Bousquet O, Gelly S, Tolstikhin I, Urner R. When can unlabeled data improve the learning rate? In: Conference on Learning Theory (COLT), 2019.

    A Convex Formulation for Mixed Regression with Two Components: Minimax Optimal Rates

    We consider the mixed regression problem with two components, under adversarial and stochastic noise. We give a convex optimization formulation that provably recovers the true solution, and we provide upper bounds on the recovery errors in both the adversarial and the stochastic noise settings. We also give matching minimax lower bounds (up to log factors), showing that under certain assumptions our algorithm is information-theoretically optimal. Ours is the first tractable algorithm that guarantees successful recovery with tight bounds on recovery errors and sample complexity.
    Comment: Added results on minimax lower bounds, which match our upper bounds on recovery errors up to log factors. Appeared in the Conference on Learning Theory (COLT), 2014. JMLR W&CP 35:560-604, 2014.
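    To make the model concrete, the sketch below simulates two-component mixed linear regression and fits it with a simple alternating-minimization baseline. This is not the paper's convex formulation; the baseline, dimensions, and noise level are assumptions chosen only to illustrate the setup.

```python
import numpy as np

# Illustrative only: a toy two-component mixed linear regression.
# The paper analyzes a convex formulation; this uses a plain
# alternating-minimization baseline just to make the model tangible.
rng = np.random.default_rng(0)
n, d = 400, 5

beta1, beta2 = rng.normal(size=d), rng.normal(size=d)
X = rng.normal(size=(n, d))
z = rng.integers(0, 2, size=n)  # hidden component of each sample
y = np.where(z == 0, X @ beta1, X @ beta2) + 0.1 * rng.normal(size=n)

# Alternate: assign each point to the component that fits it best,
# then refit each component by least squares.
b1, b2 = rng.normal(size=d), rng.normal(size=d)
for _ in range(50):
    assign = (y - X @ b1) ** 2 <= (y - X @ b2) ** 2
    if assign.all() or (~assign).all():
        break  # degenerate assignment; keep current estimates
    b1, *_ = np.linalg.lstsq(X[assign], y[assign], rcond=None)
    b2, *_ = np.linalg.lstsq(X[~assign], y[~assign], rcond=None)

# Recovery error, up to swapping the two components.
err = min(np.linalg.norm(b1 - beta1) + np.linalg.norm(b2 - beta2),
          np.linalg.norm(b1 - beta2) + np.linalg.norm(b2 - beta1))
print(f"recovery error (up to swap): {err:.3f}")
```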

    Truthful Linear Regression

    We consider the problem of fitting a linear model to data held by individuals who are concerned about their privacy. Incentivizing most players to truthfully report their data to the analyst constrains our design to mechanisms that provide a privacy guarantee to the participants; we use differential privacy to model individuals' privacy losses. This immediately poses a problem: differentially private computation of a linear model necessarily produces a biased estimate, and existing approaches to designing mechanisms that elicit data from privacy-sensitive individuals do not generalize well to biased estimators. We overcome this challenge through an appropriate design of the computation and payment scheme.
    Comment: To appear in Proceedings of the 28th Annual Conference on Learning Theory (COLT 2015).
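    To see the privacy ingredient in isolation, here is a minimal sketch of a standard differentially private estimator for linear regression: output perturbation applied to a ridge solution. It is not the paper's mechanism (which couples the computation with a payment scheme), and the noise calibration below is an assumed sensitivity bound for bounded data. Note how both the regularization and the added noise bias the estimate, which is exactly the difficulty the paper must work around.

```python
import numpy as np

def private_ridge(X, y, lam=1.0, epsilon=1.0, seed=None):
    """Output-perturbation sketch: ridge estimate plus Laplace noise.
    The noise scale assumes an L2-sensitivity of order 1/(n*lam) for
    data with bounded features and responses; this calibration is an
    assumption for illustration, not the paper's analysis."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    beta = np.linalg.solve(X.T @ X + lam * n * np.eye(d), X.T @ y)
    scale = 2.0 / (n * lam * epsilon)
    return beta + rng.laplace(scale=scale, size=d)

rng = np.random.default_rng(1)
n, d = 2000, 3
beta_true = np.array([0.5, -0.3, 0.2])
X = np.clip(rng.normal(size=(n, d)), -1, 1)   # bounded features
y = np.clip(X @ beta_true + 0.1 * rng.normal(size=n), -1, 1)

print("non-private ridge:", np.linalg.solve(X.T @ X + n * np.eye(d), X.T @ y))
print("private ridge:    ", private_ridge(X, y, lam=1.0, epsilon=1.0, seed=2))
```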

    S2: An Efficient Graph Based Active Learning Algorithm with Application to Nonparametric Classification

    This paper investigates the problem of active learning for binary label prediction on a graph. We introduce a simple and label-efficient algorithm called S2 for this task. At each step, S2 selects the vertex to be labeled based on the structure of the graph and all previously gathered labels. Specifically, S2 queries the label of the vertex that bisects the "shortest shortest" path between any pair of oppositely labeled vertices. We present a theoretical estimate of the number of queries S2 needs in terms of a novel parametrization of the complexity of binary functions on graphs, along with experimental results demonstrating the performance of S2 on both real and synthetic data. While other graph-based active learning algorithms have shown promise in practice, ours is the first with both good performance and theoretical guarantees. Finally, we demonstrate the implications of the S2 algorithm for the theory of nonparametric active learning; in particular, we show that S2 achieves near minimax optimal excess risk for an important class of nonparametric classification problems.
    Comment: A version of this paper appears in the Conference on Learning Theory (COLT), 2015.
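    The query rule is concrete enough to sketch. Assuming an unweighted graph given as an adjacency dict, the snippet below finds the shortest among all shortest paths between oppositely labeled vertices via BFS and queries its midpoint; the full algorithm's random-sampling phase, tie-breaking, and stopping rule are omitted.

```python
from collections import deque

def bfs_path(adj, src, dst):
    """Shortest path from src to dst in an unweighted graph, or None."""
    prev = {src: None}
    q = deque([src])
    while q:
        u = q.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for v in adj[u]:
            if v not in prev:
                prev[v] = u
                q.append(v)
    return None

def s2_query(adj, labels):
    """Next vertex to query: the midpoint of the shortest shortest
    path between any oppositely labeled pair (sketch only)."""
    pos = [v for v, y in labels.items() if y == 1]
    neg = [v for v, y in labels.items() if y == 0]
    best = None
    for p in pos:
        for m in neg:
            path = bfs_path(adj, p, m)
            if path and (best is None or len(path) < len(best)):
                best = path
    if best is None or len(best) <= 2:
        return None  # no interior vertex left to bisect
    mid = best[len(best) // 2]
    return mid if mid not in labels else None

# Path graph 0-1-2-3-4 with oppositely labeled endpoints:
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(s2_query(adj, {0: 1, 4: 0}))  # -> 2, the bisecting vertex
```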

    Generalization bounds for learning the kernel

    22nd Annual Conference on Learning Theory (COLT 2009), Montreal, Canada, 18-21 June 2009.
    We develop a generalization bound for the problem of learning the kernel. First, we show that the generalization analysis of kernel learning algorithms reduces to investigating the suprema of the Rademacher chaos process of order two over candidate kernels, which we refer to as Rademacher chaos complexity. Next, we show how to estimate the empirical Rademacher chaos complexity by well-established metric entropy integrals and the pseudo-dimension of the set of candidate kernels. Our new methodology mainly depends on the theory of U-processes. Finally, we establish satisfactory excess generalization bounds and misclassification error rates for learning Gaussian kernels and general radial basis kernels.
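    For intuition, an order-two Rademacher chaos complexity can also be estimated directly by Monte Carlo. The minimal sketch below does so under two simplifying assumptions: the candidate class is a finite grid of Gaussian kernel widths, and the normalization 1/n over pairs i < j is one common convention (the paper instead bounds this quantity via metric entropy integrals and pseudo-dimension).

```python
import numpy as np

# Monte Carlo estimate of an order-two Rademacher chaos complexity,
#   E_eps sup_K | (1/n) * sum_{i<j} eps_i * eps_j * K(x_i, x_j) |,
# over a finite grid of Gaussian kernel widths. The finite grid and
# the normalization are assumptions made for illustration.
rng = np.random.default_rng(0)
n, d = 100, 2
X = rng.normal(size=(n, d))

sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
sigmas = [0.1, 0.5, 1.0, 2.0, 5.0]            # candidate kernel widths
kernels = [np.exp(-sq_dists / (2 * s**2)) for s in sigmas]
iu = np.triu_indices(n, k=1)                   # pairs with i < j

def chaos_draw():
    eps = rng.choice([-1.0, 1.0], size=n)      # Rademacher signs
    outer = np.outer(eps, eps)
    return max(abs((outer * K)[iu].sum()) / n for K in kernels)

estimate = np.mean([chaos_draw() for _ in range(200)])
print(f"empirical Rademacher chaos complexity: {estimate:.3f}")
```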