Learning Output Kernels for Multi-Task Problems
Simultaneously solving multiple related learning tasks is beneficial under a
variety of circumstances, but the prior knowledge necessary to correctly model
task relationships is rarely available in practice. In this paper, we develop a
novel kernel-based multi-task learning technique that automatically reveals
structural inter-task relationships. Building on the framework of output
kernel learning (OKL), we introduce a method that jointly learns multiple
functions and a low-rank multi-task kernel by solving a non-convex
regularization problem. Optimization is carried out via a block coordinate
descent strategy, where each subproblem is solved using suitable conjugate
gradient (CG) type iterative methods for linear operator equations. The
effectiveness of the proposed approach is demonstrated on pharmacological and
collaborative filtering data.
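
To make the alternating scheme concrete, here is a minimal sketch of an OKL-style block coordinate descent in NumPy (illustrative, not the paper's exact algorithm): the C-step solves the linear operator equation K C L + lam C = Y that characterizes the coefficient subproblem, here via eigendecompositions in place of the matrix-free CG iterations the abstract describes, and the L-step takes a gradient step on a low-rank factor B with L = B B^T. The function name, step size, and regularization values are assumptions for illustration.

```python
import numpy as np

def okl_bcd(K, Y, lam=1.0, rank=2, n_iters=20, step=1e-3):
    """K: (n, n) input Gram matrix; Y: (n, T) outputs for T tasks."""
    n, T = Y.shape
    rng = np.random.default_rng(0)
    B = 0.1 * rng.standard_normal((T, rank))   # low-rank factor, L = B B^T
    C = np.zeros((n, T))
    for _ in range(n_iters):
        L = B @ B.T
        # C-step: solve K C L + lam * C = Y, the stationarity condition of the
        # coefficient subproblem. The eigendecomposition solve below stands in
        # for the matrix-free CG iterations used at scale.
        dk, Uk = np.linalg.eigh(K)
        dl, Ul = np.linalg.eigh(L)
        Yt = Uk.T @ Y @ Ul
        C = Uk @ (Yt / (np.outer(dk, dl) + lam)) @ Ul.T
        # L-step: gradient step on 0.5 * ||K C L - Y||_F^2 w.r.t. the factor B,
        # which keeps L positive semidefinite and low rank by construction.
        R = K @ C @ L - Y
        B -= step * (C.T @ K @ R + R.T @ K @ C) @ B
    return C, B @ B.T
```

Parameterizing L through a factor B is one simple way to enforce the low-rank structure; CG is preferred at scale because it never materializes the full n*T by n*T linear system.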
Gradient descent for sparse rank-one matrix completion for crowd-sourced aggregation of sparsely interacting workers
We consider worker skill estimation for the single-coin Dawid-Skene crowdsourcing model. In practice, skill estimation is challenging because worker assignments are sparse and irregular due to the arbitrary and uncontrolled availability of workers. We formulate skill estimation as a rank-one correlation-matrix completion problem, where the observed components correspond to observed label correlations between workers. We show that the correlation matrix can be successfully recovered and the skills are identifiable if and only if the sampling matrix (the pattern of observed components) is irreducible and aperiodic. We then propose an efficient gradient descent scheme and show that the skill estimates converge to the desired global optimum for such sampling matrices. Our proof is original, and the result is surprising in light of the fact that even the weighted rank-one matrix factorization problem is NP-hard in general. Next, we derive sample complexity bounds for the noisy case in terms of spectral properties of the signless Laplacian of the sampling matrix. Our proposed scheme achieves state-of-the-art performance on a number of real-world datasets.
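
A minimal sketch of the completion step (the function name, step size, and random sampling pattern are illustrative assumptions, not taken from the paper): gradient descent on the squared error between s_i s_j and the observed correlations, restricted to the observed off-diagonal entries, with the diagonal treated as unobserved as in the crowdsourcing setting.

```python
import numpy as np

def rank_one_complete(C, mask, n_iters=2000, lr=0.05):
    """C: (m, m) observed correlations; mask: boolean (m, m), True where observed."""
    m = C.shape[0]
    rng = np.random.default_rng(0)
    s = rng.uniform(0.1, 0.9, size=m)          # skill estimates
    for _ in range(n_iters):
        R = (np.outer(s, s) - C) * mask        # residual on observed entries only
        s -= lr * (R + R.T) @ s                # gradient of 0.5*sum_Omega (s_i s_j - C_ij)^2
    return s

# Toy usage: true skills and a sparse symmetric sampling pattern. A random
# pattern this dense is almost surely irreducible and aperiodic, so the
# estimates should match the true skills up to a global sign.
rng = np.random.default_rng(1)
s_true = rng.uniform(0.2, 0.9, size=8)
mask = rng.random((8, 8)) < 0.5
mask = np.triu(mask, 1)
mask = mask | mask.T                           # symmetric, zero diagonal
s_hat = rank_one_complete(np.outer(s_true, s_true), mask)
print(np.max(np.abs(np.abs(s_hat) - s_true)))  # small recovery error
```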
Large-scale Multi-label Learning with Missing Labels
The multi-label classification problem has generated significant interest in
recent years. However, existing approaches do not adequately address two key
challenges: (a) the ability to tackle problems with a large number (say
millions) of labels, and (b) the ability to handle data with missing labels. In
this paper, we directly address both these problems by studying the multi-label
problem in a generic empirical risk minimization (ERM) framework. Our
framework, despite being simple, is surprisingly able to encompass several
recent label-compression based methods which can be derived as special cases of
our method. To optimize the ERM problem, we develop techniques that exploit the
structure of specific loss functions - such as the squared loss function - to
offer efficient algorithms. We further show that our learning framework admits
formal excess risk bounds even in the presence of missing labels. Our risk
bounds are tight and demonstrate better generalization performance for low-rank
promoting trace-norm regularization when compared to (rank insensitive)
Frobenius norm regularization. Finally, we present extensive empirical results
on a variety of benchmark datasets and show that our methods perform
significantly better than existing label compression based methods and can
scale up to very large datasets such as the Wikipedia dataset.
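
For intuition, a hedged sketch of the low-rank ERM idea with squared loss and missing labels (an illustrative alternating scheme, not the paper's exact solver): the score matrix is factored as X W H^T with rank k far below the number of labels, only observed label entries contribute to the loss, and a rank constraint plus a ridge penalty stands in for the trace-norm regularizer discussed in the abstract. Names, step sizes, and penalty values are assumptions.

```python
import numpy as np

def lowrank_multilabel(X, Y, mask, k=10, lam=0.1, n_iters=10):
    """X: (n, d) features; Y: (n, L) labels; mask: (n, L) True where Y observed."""
    n, d = X.shape
    L = Y.shape[1]
    rng = np.random.default_rng(0)
    W = 0.01 * rng.standard_normal((d, k))
    H = 0.01 * rng.standard_normal((L, k))
    for _ in range(n_iters):
        # H-step: per-label ridge regression over the rows where that label
        # is observed; the squared loss makes this an exact closed-form solve.
        P = X @ W                                   # (n, k) shared representation
        for l in range(L):
            idx = mask[:, l]
            A = P[idx].T @ P[idx] + lam * np.eye(k)
            H[l] = np.linalg.solve(A, P[idx].T @ Y[idx, l])
        # W-step: gradient step on the masked squared loss (the paper exploits
        # the squared-loss structure for exact solves; a step keeps this short).
        R = (X @ W @ H.T - Y) * mask
        W -= 1e-3 * (X.T @ R @ H + lam * W)
    return W, H
```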
Learning with the Weighted Trace-norm under Arbitrary Sampling Distributions
We provide rigorous guarantees on learning with the weighted trace-norm under
arbitrary sampling distributions. We show that the standard weighted trace-norm
might fail when the sampling distribution is not a product distribution (i.e.
when row and column indexes are not selected independently), present a
corrected variant for which we establish strong learning guarantees, and
demonstrate that it works better in practice. We provide guarantees when
weighting by either the true or empirical sampling distribution, and suggest
that even if the true distribution is known (or is uniform), weighting by the
empirical distribution may be beneficial.
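
As a notational reference for the norm the abstract discusses (the definition of the weighted trace norm is standard; the exact mixing weight in the paper's corrected variant is an assumption here):

```latex
% Weighted trace norm of an n x m matrix X, with row marginals p and
% column marginals q (diag(p), diag(q) are diagonal weighting matrices):
\| X \|_{\mathrm{tr}(p,q)}
  = \left\| \operatorname{diag}(p)^{1/2} \, X \, \operatorname{diag}(q)^{1/2} \right\|_{*}
% A "smoothed" correction mixes each marginal with the uniform distribution
% before weighting; the mixing weight 1/2 below is an illustrative choice:
\tilde{p}_i = \tfrac{1}{2} p_i + \tfrac{1}{2} \cdot \tfrac{1}{n},
\qquad
\tilde{q}_j = \tfrac{1}{2} q_j + \tfrac{1}{2} \cdot \tfrac{1}{m}
```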