Tensor Regression Meets Gaussian Processes
Low-rank tensor regression, a new model class that learns high-order
correlation from data, has recently received considerable attention. At the
same time, Gaussian processes (GP) are well-studied machine learning models for
structure learning. In this paper, we demonstrate interesting connections
between the two, especially for multi-way data analysis. We show that low-rank
tensor regression is essentially learning a multi-linear kernel in Gaussian
processes, and the low-rank assumption translates to the constrained Bayesian
inference problem. We prove the oracle inequality and derive the average case
learning curve for the equivalent GP model. Our finding implies that low-rank
tensor regression, though empirically successful, is highly dependent on the
eigenvalues of covariance functions as well as variable correlations.
Comment: 17 page
Infinite Shift-invariant Grouped Multi-task Learning for Gaussian Processes
Multi-task learning leverages shared information among data sets to improve
the learning performance of individual tasks. The paper applies this framework
to data where each task is a phase-shifted periodic time series. In
particular, we develop a novel Bayesian nonparametric model capturing a mixture
of Gaussian processes where each task is a sum of a group-specific function and
a component capturing individual variation, in addition to each task being
phase shifted. We develop an efficient EM algorithm to learn the
parameters of the model. As a special case we obtain the Gaussian mixture model
and EM algorithm for phase-shifted periodic time series. Furthermore,
we extend the proposed model by using a Dirichlet Process prior, thereby
obtaining an infinite mixture model that is capable of automatic model
selection. A Variational Bayesian approach is developed for inference in this
model. Experiments in regression, classification and class discovery
demonstrate the performance of the proposed models using both synthetic data
and real-world time series data from astrophysics. Our methods are particularly
useful when the time series are sparsely and non-synchronously sampled.
Comment: This is an extended version of our ECML 2010 paper entitled
"Shift-invariant Grouped Multi-task Learning for Gaussian Processes"; ECML
PKDD'10 Proceedings of the 2010 European conference on Machine learning and
knowledge discovery in databases: Part II
Bayesian Optimization for Policy Search via Online-Offline Experimentation
Online field experiments are the gold-standard way of evaluating changes to
real-world interactive machine learning systems. Yet our ability to explore
complex, multi-dimensional policy spaces - such as those found in
recommendation and ranking problems - is often constrained by the limited
number of experiments that can be run simultaneously. To alleviate these
constraints, we augment online experiments with an offline simulator and apply
multi-task Bayesian optimization to tune live machine learning systems. We
describe practical issues that arise in these types of applications, including
biases that arise from using a simulator and assumptions for the multi-task
kernel. We measure empirical learning curves which show substantial gains from
including data from biased offline experiments, and show how these learning
curves are consistent with theoretical results for multi-task Gaussian process
generalization. We find that improved kernel inference is a significant driver
of multi-task generalization. Finally, we show several examples of Bayesian
optimization efficiently tuning a live machine learning system by combining
offline and online experiments
Leveraging Robotic Prior Tactile Exploratory Action Experiences For Learning New Objects's Physical Properties
Reusing the tactile knowledge of some previously-explored objects helps us
humans to easily recognize the tactual properties of new objects. In this
master's thesis, we enable a robotic arm equipped with multi-modal artificial
skin, like humans, to actively transfer the prior tactile exploratory action
experiences when it learns the detailed physical properties of new objects.
These prior tactile experiences are built when the robot applies the pressing,
sliding and static contact movements on objects with different action
parameters and perceives the tactile feedback from multiple sensory
modalities. Our method was systematically evaluated by several experiments.
Results show that the robot could consistently improve the discrimination
accuracy by over 10% when it exploited the prior tactile knowledge compared
with using no transfer method, and 25% when it used only one training sample.
The results also show that the proposed method was robust against transferring
irrelevant prior tactile knowledge.
Comment: Master's thesis in the Faculty of Electrical and Computer
Engineering, Technical University of Munic
Bayesian Inference of Individualized Treatment Effects using Multi-task Gaussian Processes
Predicated on the increasing abundance of electronic health records, we
investigate the problem of inferring individualized treatment effects using
observational data. Stemming from the potential outcomes model, we propose a
novel multi-task learning framework in which factual and counterfactual
outcomes are modeled as the outputs of a function in a vector-valued
reproducing kernel Hilbert space (vvRKHS). We develop a nonparametric Bayesian
method for learning the treatment effects using a multi-task Gaussian process
(GP) with a linear coregionalization kernel as a prior over the vvRKHS. The
Bayesian approach allows us to compute individualized measures of confidence in
our estimates via pointwise credible intervals, which are crucial for realizing
the full potential of precision medicine. The impact of selection bias is
alleviated via a risk-based empirical Bayes method for adapting the multi-task
GP prior, which jointly minimizes the empirical error in factual outcomes and
the uncertainty in (unobserved) counterfactual outcomes. We conduct
experiments on observational datasets for an interventional social program
applied to premature infants, and a left ventricular assist device applied to
cardiac patients wait-listed for a heart transplant. In both experiments, we
show that our method significantly outperforms the state-of-the-art
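As an illustration of the coregionalization idea mentioned in this abstract, the following is a minimal sketch of a single-term coregionalization kernel (the intrinsic coregionalization model), in which the covariance between two (input, task) pairs is a task-similarity entry multiplied by an input kernel. It is not the paper's implementation: the RBF input kernel, the rank-one form of B, and all function names are assumptions chosen for illustration. Task 0 and task 1 could stand for the factual and counterfactual outcomes.

```python
import math

def rbf(x, y, lengthscale=1.0):
    # Squared-exponential kernel on feature vectors (assumed input kernel).
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq / lengthscale ** 2)

def coregionalization_matrix(a, kappa):
    # B = a a^T + diag(kappa); positive semi-definite when kappa >= 0.
    n = len(a)
    return [[a[i] * a[j] + (kappa[i] if i == j else 0.0) for j in range(n)]
            for i in range(n)]

def icm_kernel(x, task_x, y, task_y, B, lengthscale=1.0):
    # Covariance between output task_x at input x and output task_y at input y.
    return B[task_x][task_y] * rbf(x, y, lengthscale)

# Hypothetical two-task setup: task 0 = factual, task 1 = counterfactual.
B = coregionalization_matrix([1.0, 0.5], [0.1, 0.2])
cross_cov = icm_kernel([0.2, 1.0], 0, [0.3, 0.9], 1, B)
```

A linear coregionalization kernel in general sums several such terms, each with its own B and input kernel; the single-term case is shown only to keep the sketch short.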
Hyper-Process Model: A Zero-Shot Learning algorithm for Regression Problems based on Shape Analysis
Zero-shot learning (ZSL) can be defined as correctly solving a task for which no
training data is available, based on previously acquired knowledge from
different, but related tasks. So far, this area has mostly drawn the attention
of the computer vision community, where a new unseen image needs to be correctly
classified, assuming the target class was not used in the training procedure.
Apart from image classification, only a couple of generic methods have been proposed
that are applicable to both classification and regression. These learn the
relation among model coefficients so new ones can be predicted according to
provided conditions. To our knowledge, no methods exist that are
applicable only to regression and take advantage of such a setting. Therefore,
the present work proposes a novel algorithm for regression problems that uses
data drawn from trained models, instead of model coefficients. In this case, a
shape analysis of the data is performed to create a statistical shape model and
generate new shapes to train new models. The proposed algorithm is tested in a
theoretical setting using the beta distribution, where the main problem to solve is
to estimate a function that predicts curves, based on already learned
different, but related ones.
Comment: 36 pages, 4 figures, 2 tables, submitted to JML
The trace norm constrained matrix-variate Gaussian process for multitask bipartite ranking
We propose a novel hierarchical model for multitask bipartite ranking. The
proposed approach combines a matrix-variate Gaussian process with a generative
model for task-wise bipartite ranking. In addition, we employ a novel trace
constrained variational inference approach to impose low rank structure on the
posterior matrix-variate Gaussian process. The resulting posterior covariance
function is derived in closed form, and the posterior mean function is the
solution to a matrix-variate regression with a novel spectral elastic net
regularizer. Further, we show that variational inference for the trace
constrained matrix-variate Gaussian process combined with maximum likelihood
parameter estimation for the bipartite ranking model is jointly convex. Our
motivating application is the prioritization of candidate disease genes. The
goal of this task is to aid the identification of unobserved associations
between human genes and diseases using a small set of observed associations as
well as kernels induced by gene-gene interaction networks and disease
ontologies. Our experimental results illustrate the performance of the proposed
model on real world datasets. Moreover, we find that the resulting low rank
solution improves the computational scalability of training and testing as
compared to baseline models.
Comment: 14 pages, 9 figures, 5 table
Multifidelity Bayesian Optimization for Binomial Output
The key idea of Bayesian optimization is replacing an expensive target
function with a cheap surrogate model. By selection of an acquisition function
for Bayesian optimization, we trade off between exploration and exploitation.
The acquisition function typically depends on the mean and the variance of the
surrogate model at a given point.
The most common Gaussian process-based surrogate model assumes that the
target with fixed parameters is a realization of a Gaussian process. However,
often the target function doesn't satisfy this assumption. Here we consider
target functions that come from the binomial distribution with the parameter
that depends on inputs. Typically we can vary how many Bernoulli samples we
obtain during each evaluation.
We propose a general Gaussian process model that takes into account Bernoulli
outputs. To make things work we consider a simple acquisition function based on
Expected Improvement and a heuristic strategy to choose the number of samples
at each point, thus taking into account the precision of the obtained output
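For readers unfamiliar with the acquisition function named in this abstract, here is a minimal sketch of the standard Expected Improvement for maximization, computed from the surrogate's predictive mean and standard deviation at a point. It is the textbook Gaussian form, not the paper's binomial-specific variant; the parameter names and the `xi` margin are assumptions. The hypothetical helper `binomial_std_error` only illustrates why the number of Bernoulli samples affects the precision of an observed success rate.

```python
import math

def expected_improvement(mean, std, best, xi=0.0):
    # Standard EI for maximization under a Gaussian predictive distribution.
    gain = mean - best - xi
    if std <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(gain, 0.0)
    z = gain / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return gain * cdf + std * pdf

def binomial_std_error(p, n):
    # Standard error of a success rate estimated from n Bernoulli samples.
    return math.sqrt(p * (1.0 - p) / n)

# EI grows with predictive uncertainty when the mean is below the incumbent.
ei_low = expected_improvement(0.0, 1.0, 0.5)
ei_high = expected_improvement(0.0, 2.0, 0.5)
```

The first term rewards points whose predicted mean exceeds the incumbent (exploitation), and the second rewards points with large predictive uncertainty (exploration).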
Beyond expectation: Deep joint mean and quantile regression for spatio-temporal problems
Spatio-temporal problems are ubiquitous and of vital importance in many
research fields. Despite the potential already demonstrated by deep learning
methods in modeling spatio-temporal data, typical approaches tend to focus
solely on conditional expectations of the output variables being modeled. In
this paper, we propose a multi-output multi-quantile deep learning approach for
jointly modeling several conditional quantiles together with the conditional
expectation as a way to provide a more complete "picture" of the predictive
density in spatio-temporal problems. Using two large-scale datasets from the
transportation domain, we empirically demonstrate that, by approaching the
quantile regression problem from a multi-task learning perspective, it is
possible to solve the embarrassing quantile crossings problem, while
simultaneously significantly outperforming state-of-the-art quantile regression
methods. Moreover, we show that jointly modeling the mean and several
conditional quantiles not only provides a rich description of the predictive
density that can capture heteroscedastic properties at a negligible
computational overhead, but also leads to improved predictions of the
conditional expectation due to the extra information and a regularization
effect induced by the added quantiles.
Comment: 12 pages, 9 figure
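To make the quantile regression terms in this abstract concrete, the sketch below shows the standard pinball (quantile) loss and a simple count of quantile crossings, that is, cases where a predicted lower quantile exceeds a predicted higher one. This is a generic illustration under assumed names, not the paper's deep multi-output model or its training objective.

```python
def pinball_loss(y_true, y_pred, q):
    # Pinball loss for quantile level q in (0, 1): under-prediction is
    # weighted by q, over-prediction by (1 - q).
    diff = y_true - y_pred
    return max(q * diff, (q - 1.0) * diff)

def mean_pinball_loss(ys, preds, q):
    # Average pinball loss over paired observations and predictions.
    return sum(pinball_loss(y, p, q) for y, p in zip(ys, preds)) / len(ys)

def count_crossings(lower_preds, upper_preds):
    # Number of points where the lower-quantile prediction exceeds the
    # higher-quantile prediction (a quantile crossing).
    return sum(1 for lo, hi in zip(lower_preds, upper_preds) if lo > hi)

# Hypothetical predictions for the 0.1 and 0.9 quantiles at two points.
crossings = count_crossings([1.0, 2.0], [1.5, 1.0])
```

Minimizing the pinball loss at level q yields an estimate of the q-th conditional quantile; fitting each level independently gives no guarantee that the estimates stay ordered, which is why crossings are worth checking.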
A Robust t-process Regression Model with Independent Errors
The Gaussian process regression (GPR) model is well known to be susceptible to
outliers. Robust process regression models based on t-process or other
heavy-tailed processes have been developed to address the problem. However, due
to the nature of the current definition for heavy-tailed processes, the unknown
process regression function and the random errors are always defined jointly
and thus dependently. This definition, mainly owing to the dependence
assumption involved, is not justified in many practical problems and thus
limits the application of those robust approaches. It also results in a
limitation of the theory of robust analysis. In this paper, we propose a new
robust process regression model enabling independent random errors. An
efficient estimation procedure is developed. Statistical properties, such as
unbiasedness and information consistency, are provided. Numerical studies show
that the proposed method is robust against outliers and has a better
performance in prediction compared with the existing models. We illustrate that
the estimated random-effects are useful in detecting outlying curves.
Comment: 27 pages, 3 figure