Learning Output Kernels for Multi-Task Problems
Simultaneously solving multiple related learning tasks is beneficial under a
variety of circumstances, but the prior knowledge necessary to correctly model
task relationships is rarely available in practice. In this paper, we develop a
novel kernel-based multi-task learning technique that automatically reveals
structural inter-task relationships. Building on the framework of output
kernel learning (OKL), we introduce a method that jointly learns multiple
functions and a low-rank multi-task kernel by solving a non-convex
regularization problem. Optimization is carried out via a block coordinate
descent strategy, where each subproblem is solved using suitable conjugate
gradient (CG) type iterative methods for linear operator equations. The
effectiveness of the proposed approach is demonstrated on pharmacological and
collaborative filtering data.
Fixed-point and coordinate descent algorithms for regularized kernel methods
In this paper, we study two general classes of optimization algorithms for
kernel methods with convex loss function and quadratic norm regularization, and
analyze their convergence. The first approach, based on fixed-point iterations,
is simple to implement and analyze, and can be easily parallelized. The second,
based on coordinate descent, exploits the structure of additively separable
loss functions to compute solutions of line searches in closed form. Instances
of these general classes of algorithms are already incorporated into
state-of-the-art machine learning software for large-scale problems. We start from a
solution characterization of the regularized problem, obtained using
sub-differential calculus and resolvents of monotone operators, that holds for
general convex loss functions regardless of differentiability. The two
methodologies described in the paper can be regarded as instances of non-linear
Jacobi and Gauss-Seidel algorithms, and both are well suited to solving
large-scale problems.
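As a rough illustration of the Gauss-Seidel viewpoint, the sketch below treats only the squared-loss special case, where the regularized problem reduces to the linear system (K + lam * I) c = y. It is not the paper's algorithm: the Gaussian kernel, the toy data, the value of `lam`, and all function names are assumptions made for this example. Each inner step solves one coordinate exactly, which is the closed-form line search that additive separability makes possible.

```python
import math


def gaussian_kernel(a, b, width=1.0):
    """Gaussian (RBF) kernel on scalars; the width is an illustrative choice."""
    return math.exp(-((a - b) ** 2) / (2.0 * width ** 2))


def regularized_system(xs, lam):
    """Build A = K + lam * I, where K is the kernel matrix on the inputs xs."""
    n = len(xs)
    return [
        [gaussian_kernel(xs[i], xs[j]) + (lam if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]


def gauss_seidel(A, y, sweeps=500):
    """Coordinate descent on 0.5 * c'Ac - y'c.

    Each step minimizes exactly over one coordinate, using the most
    recent values of all the other coordinates.
    """
    n = len(y)
    c = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            off_diag = sum(A[i][j] * c[j] for j in range(n) if j != i)
            c[i] = (y[i] - off_diag) / A[i][i]  # closed-form coordinate update
    return c


def residual_norm(A, c, y):
    """Euclidean norm of A c - y, used here to check convergence."""
    n = len(y)
    return math.sqrt(
        sum((sum(A[i][j] * c[j] for j in range(n)) - y[i]) ** 2 for i in range(n))
    )


# Toy data (hypothetical values chosen only for the example).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 0.8, 0.9, 0.1]
A = regularized_system(xs, lam=0.5)
c = gauss_seidel(A, ys)
print(residual_norm(A, c, ys))
```

A Jacobi-style variant would compute every coordinate update from the previous sweep's values, which makes the updates independent and therefore easy to parallelize.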
The representer theorem for Hilbert spaces: a necessary and sufficient condition
A family of regularization functionals is said to admit a linear representer
theorem if every member of the family admits minimizers that lie in a fixed
finite dimensional subspace. A recent characterization states that a general
class of regularization functionals with differentiable regularizer admits a
linear representer theorem if and only if the regularization term is a
non-decreasing function of the norm. In this report, we improve on this
result by replacing the differentiability assumption with lower semi-continuity
and deriving a proof that is independent of the dimensionality of the space.
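For orientation, the statement can be written out roughly as follows. The notation is introduced here for illustration and is not taken from the report: a Hilbert space H, data-dependent elements w_1, ..., w_m, an error functional f, a regularizer Omega, and a scalar function h. The fixed finite-dimensional subspace is assumed to be the span of the w_i.

```latex
% Regularization functional on a Hilbert space H (assumed notation)
J(w) = f\bigl(\langle w, w_1\rangle, \dots, \langle w, w_m\rangle\bigr) + \Omega(w),
\qquad w \in \mathcal{H}.

% Linear representer theorem: some minimizer lies in the span of the w_i
\hat{w} = \sum_{i=1}^{m} c_i \, w_i, \qquad c_i \in \mathbb{R}.

% Condition from the abstract: the regularizer is a
% non-decreasing function of the norm
\Omega(w) = h\bigl(\lVert w \rVert\bigr), \qquad h \text{ non-decreasing}.
```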
Client-server multi-task learning from distributed datasets
A client-server architecture to simultaneously solve multiple learning tasks
from distributed datasets is described. In such an architecture, each client is
associated with an individual learning task and the associated dataset of
examples. The goal of the architecture is to perform information fusion from
multiple datasets while preserving privacy of individual data. The role of the
server is to collect data in real-time from the clients and codify the
information in a common database. The information coded in this database can be
used by all the clients to solve their individual learning task, so that each
client can exploit the informative content of all the datasets without actually
having access to private data of others. The proposed algorithmic framework,
based on regularization theory and kernel methods, uses a suitable class of
mixed effect kernels. The new method is illustrated through a simulated music
recommendation system.
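A minimal sketch of what a mixed effect kernel can look like is given below. It assumes one common form: a convex combination of a part shared by all tasks and a part that is active only when both inputs belong to the same task. The Gaussian base kernel, the weight `alpha`, and the function names are assumptions made for this example, not details from the paper.

```python
import math


def rbf(x, z, width=1.0):
    """Gaussian base kernel on scalars (an illustrative choice)."""
    return math.exp(-((x - z) ** 2) / (2.0 * width ** 2))


def mixed_effect_kernel(x, task_x, z, task_z, alpha=0.5):
    """Kernel on (input, task) pairs.

    alpha weights the component shared across tasks; (1 - alpha) weights
    the task-specific component, which is zero across different tasks.
    """
    shared = rbf(x, z)
    specific = rbf(x, z) if task_x == task_z else 0.0
    return alpha * shared + (1.0 - alpha) * specific


# Same input, same task: both components contribute.
print(mixed_effect_kernel(1.0, 0, 1.0, 0, alpha=0.3))
# Same input, different tasks: only the shared component contributes.
print(mixed_effect_kernel(1.0, 0, 1.0, 1, alpha=0.3))
```

With `alpha` close to 1 the tasks are treated as nearly identical, while `alpha` close to 0 makes each task learn almost independently.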
Learning from Distributions via Support Measure Machines
This paper presents a kernel-based discriminative learning framework on
probability measures. Rather than relying on large collections of vectorial
training examples, our framework learns using a collection of probability
distributions that have been constructed to meaningfully represent training
data. By representing these probability distributions as mean embeddings in the
reproducing kernel Hilbert space (RKHS), we are able to apply many standard
kernel-based learning techniques in a straightforward fashion. To accomplish
this, we construct a generalization of the support vector machine (SVM) called
a support measure machine (SMM). Our analysis of SMMs provides several insights
into their relationship to traditional SVMs. Based on such insights, we propose
a flexible SVM (Flex-SVM) that places different kernel functions on each
training example. Experimental results on both synthetic and real-world data
demonstrate the effectiveness of our proposed framework.
Comment: Advances in Neural Information Processing Systems 2
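To make the mean-embedding idea concrete, the sketch below computes a kernel between two empirical distributions as the inner product of their mean embeddings, which equals the average of a base kernel over all cross pairs of sample points. The Gaussian base kernel, the toy samples, and the function names are assumptions made for this example; it does not reproduce the SMM training procedure.

```python
import math


def rbf(x, z, width=1.0):
    """Gaussian base kernel on scalars (an illustrative choice)."""
    return math.exp(-((x - z) ** 2) / (2.0 * width ** 2))


def mean_embedding_kernel(sample_p, sample_q, base_kernel=rbf):
    """Inner product of two empirical mean embeddings in the RKHS.

    Equals the average of base_kernel(x, z) over every x in sample_p
    and every z in sample_q.
    """
    total = sum(base_kernel(x, z) for x in sample_p for z in sample_q)
    return total / (len(sample_p) * len(sample_q))


# Toy samples standing in for three distributions (hypothetical values).
near = [0.0, 0.1, -0.1]
also_near = [0.05, -0.05]
far = [5.0, 5.2]

print(mean_embedding_kernel(near, also_near))
print(mean_embedding_kernel(near, far))
```

A matrix of such values over all pairs of training distributions can then be passed to any learner that accepts a precomputed kernel matrix.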