9,057 research outputs found
Two-Stage Metric Learning
In this paper, we present a novel two-stage metric learning algorithm. We
first map each learning instance to a probability distribution by computing its
similarities to a set of fixed anchor points. Then, we define the distance in
the input data space as the Fisher information distance on the associated
statistical manifold. This induces in the input data space a new family of
distance metric with unique properties. Unlike kernelized metric learning, we
do not require the similarity measure to be positive semi-definite. Moreover,
it can also be interpreted as a local metric learning algorithm with well
defined distance approximation. We evaluate its performance on a number of
datasets. It outperforms significantly other metric learning methods and SVM.Comment: Accepted for publication in ICML 201
Structured Learning of Tree Potentials in CRF for Image Segmentation
We propose a new approach to image segmentation, which exploits the
advantages of both conditional random fields (CRFs) and decision trees. In the
literature, the potential functions of CRFs are mostly defined as a linear
combination of some pre-defined parametric models, and then methods like
structured support vector machines (SSVMs) are applied to learn those linear
coefficients. We instead formulate the unary and pairwise potentials as
nonparametric forests---ensembles of decision trees, and learn the ensemble
parameters and the trees in a unified optimization problem within the
large-margin framework. In this fashion, we easily achieve nonlinear learning
of potential functions on both unary and pairwise terms in CRFs. Moreover, we
learn class-wise decision trees for each object that appears in the image. Due
to the rich structure and flexibility of decision trees, our approach is
powerful in modelling complex data likelihoods and label relationships. The
resulting optimization problem is very challenging because it can have
exponentially many variables and constraints. We show that this challenging
optimization can be efficiently solved by combining a modified column
generation and cutting-planes techniques. Experimental results on both binary
(Graz-02, Weizmann horse, Oxford flower) and multi-class (MSRC-21, PASCAL VOC
2012) segmentation datasets demonstrate the power of the learned nonlinear
nonparametric potentials.Comment: 10 pages. Appearing in IEEE Transactions on Neural Networks and
Learning System
Consistent Multitask Learning with Nonlinear Output Relations
Key to multitask learning is exploiting relationships between different tasks
to improve prediction performance. If the relations are linear, regularization
approaches can be used successfully. However, in practice assuming the tasks to
be linearly related might be restrictive, and allowing for nonlinear structures
is a challenge. In this paper, we tackle this issue by casting the problem
within the framework of structured prediction. Our main contribution is a novel
algorithm for learning multiple tasks which are related by a system of
nonlinear equations that their joint outputs need to satisfy. We show that the
algorithm is consistent and can be efficiently implemented. Experimental
results show the potential of the proposed method.Comment: 25 pages, 1 figure, 2 table
Bayesian Approximate Kernel Regression with Variable Selection
Nonlinear kernel regression models are often used in statistics and machine
learning because they are more accurate than linear models. Variable selection
for kernel regression models is a challenge partly because, unlike the linear
regression setting, there is no clear concept of an effect size for regression
coefficients. In this paper, we propose a novel framework that provides an
effect size analog of each explanatory variable for Bayesian kernel regression
models when the kernel is shift-invariant --- for example, the Gaussian kernel.
We use function analytic properties of shift-invariant reproducing kernel
Hilbert spaces (RKHS) to define a linear vector space that: (i) captures
nonlinear structure, and (ii) can be projected onto the original explanatory
variables. The projection onto the original explanatory variables serves as an
analog of effect sizes. The specific function analytic property we use is that
shift-invariant kernel functions can be approximated via random Fourier bases.
Based on the random Fourier expansion we propose a computationally efficient
class of Bayesian approximate kernel regression (BAKR) models for both
nonlinear regression and binary classification for which one can compute an
analog of effect sizes. We illustrate the utility of BAKR by examining two
important problems in statistical genetics: genomic selection (i.e. phenotypic
prediction) and association mapping (i.e. inference of significant variants or
loci). State-of-the-art methods for genomic selection and association mapping
are based on kernel regression and linear models, respectively. BAKR is the
first method that is competitive in both settings.Comment: 22 pages, 3 figures, 3 tables; theory added; new simulations
presented; references adde
- …