
    Tensor Regression Meets Gaussian Processes

    Low-rank tensor regression, a new model class that learns high-order correlation from data, has recently received considerable attention. At the same time, Gaussian processes (GP) are well-studied machine learning models for structure learning. In this paper, we demonstrate interesting connections between the two, especially for multi-way data analysis. We show that low-rank tensor regression is essentially learning a multi-linear kernel in Gaussian processes, and the low-rank assumption translates to a constrained Bayesian inference problem. We prove the oracle inequality and derive the average case learning curve for the equivalent GP model. Our finding implies that low-rank tensor regression, though empirically successful, is highly dependent on the eigenvalues of covariance functions as well as variable correlations. Comment: 17 pages

    Infinite Shift-invariant Grouped Multi-task Learning for Gaussian Processes

    Multi-task learning leverages shared information among data sets to improve the learning performance of individual tasks. The paper applies this framework to data where each task is a phase-shifted periodic time series. In particular, we develop a novel Bayesian nonparametric model capturing a mixture of Gaussian processes where each task is a sum of a group-specific function and a component capturing individual variation, in addition to each task being phase shifted. We develop an efficient EM algorithm to learn the parameters of the model. As a special case we obtain the Gaussian mixture model and EM algorithm for phase-shifted periodic time series. Furthermore, we extend the proposed model by using a Dirichlet Process prior, thereby obtaining an infinite mixture model that is capable of automatic model selection. A Variational Bayesian approach is developed for inference in this model. Experiments in regression, classification and class discovery demonstrate the performance of the proposed models using both synthetic data and real-world time series data from astrophysics. Our methods are particularly useful when the time series are sparsely and non-synchronously sampled. Comment: This is an extended version of our ECML 2010 paper entitled "Shift-invariant Grouped Multi-task Learning for Gaussian Processes"; ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part II

    Bayesian Optimization for Policy Search via Online-Offline Experimentation

    Online field experiments are the gold-standard way of evaluating changes to real-world interactive machine learning systems. Yet our ability to explore complex, multi-dimensional policy spaces - such as those found in recommendation and ranking problems - is often constrained by the limited number of experiments that can be run simultaneously. To alleviate these constraints, we augment online experiments with an offline simulator and apply multi-task Bayesian optimization to tune live machine learning systems. We describe practical issues that arise in these types of applications, including biases introduced by the simulator and assumptions for the multi-task kernel. We measure empirical learning curves which show substantial gains from including data from biased offline experiments, and show how these learning curves are consistent with theoretical results for multi-task Gaussian process generalization. We find that improved kernel inference is a significant driver of multi-task generalization. Finally, we show several examples of Bayesian optimization efficiently tuning a live machine learning system by combining offline and online experiments.

    Leveraging Robotic Prior Tactile Exploratory Action Experiences For Learning New Objects's Physical Properties

    Reusing the tactile knowledge of previously explored objects helps humans to easily recognize the tactile properties of new objects. In this master's thesis, we enable a robotic arm equipped with multi-modal artificial skin to actively transfer, as humans do, its prior tactile exploratory action experiences when it learns the detailed physical properties of new objects. These prior tactile experiences are built when the robot applies pressing, sliding and static contact movements to objects with different action parameters and perceives the tactile feedback from multiple sensory modalities. Our method was systematically evaluated in several experiments. Results show that the robot could consistently improve the discrimination accuracy by over 10% when it exploited the prior tactile knowledge compared with using no transfer method, and by 25% when it used only one training sample. The results also show that the proposed method was robust against transferring irrelevant prior tactile knowledge. Comment: Master's thesis in the Faculty of Electrical and Computer Engineering, Technical University of Munich

    Bayesian Inference of Individualized Treatment Effects using Multi-task Gaussian Processes

    Predicated on the increasing abundance of electronic health records, we investigate the problem of inferring individualized treatment effects using observational data. Stemming from the potential outcomes model, we propose a novel multi-task learning framework in which factual and counterfactual outcomes are modeled as the outputs of a function in a vector-valued reproducing kernel Hilbert space (vvRKHS). We develop a nonparametric Bayesian method for learning the treatment effects using a multi-task Gaussian process (GP) with a linear coregionalization kernel as a prior over the vvRKHS. The Bayesian approach allows us to compute individualized measures of confidence in our estimates via pointwise credible intervals, which are crucial for realizing the full potential of precision medicine. The impact of selection bias is alleviated via a risk-based empirical Bayes method for adapting the multi-task GP prior, which jointly minimizes the empirical error in factual outcomes and the uncertainty in (unobserved) counterfactual outcomes. We conduct experiments on observational datasets for an interventional social program applied to premature infants, and a left ventricular assist device applied to cardiac patients wait-listed for a heart transplant. In both experiments, we show that our method significantly outperforms the state-of-the-art.
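
To make the "linear coregionalization kernel" in the abstract above more concrete, here is a minimal sketch of its simplest special case, the intrinsic coregionalization model, in which the covariance between two (input, task) pairs is a task-similarity entry times an input kernel. Everything specific here is an assumption for illustration and is not taken from the paper: the RBF input kernel, the rank-1 form of the task matrix, and the names `rbf`, `coregionalization_matrix` and `icm_kernel`.

```python
import math


def rbf(x, y, lengthscale=1.0):
    """Squared-exponential kernel between two feature vectors (assumed choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq_dist / lengthscale ** 2)


def coregionalization_matrix(w, kappa):
    """Task covariance B = w w^T + diag(kappa), a rank-1 plus diagonal form."""
    n = len(w)
    return [
        [w[i] * w[j] + (kappa[i] if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]


def icm_kernel(x, task_x, y, task_y, B, lengthscale=1.0):
    """Covariance between output task_x at input x and output task_y at input y."""
    return B[task_x][task_y] * rbf(x, y, lengthscale)


# Hypothetical two-task setup: task 0 = untreated outcome, task 1 = treated outcome.
B = coregionalization_matrix(w=[1.0, 0.5], kappa=[0.1, 0.2])
cross_cov = icm_kernel([0.0, 1.0], 0, [0.0, 1.0], 1, B)
```

With two tasks standing for the factual and counterfactual outcomes, the off-diagonal entry of `B` controls how much an observation of one outcome informs the other at nearby inputs.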

    Hyper-Process Model: A Zero-Shot Learning algorithm for Regression Problems based on Shape Analysis

    Zero-shot learning (ZSL) can be defined as correctly solving a task for which no training data is available, based on previously acquired knowledge from different but related tasks. So far, this area has mostly drawn the attention of the computer vision community, where a new unseen image needs to be correctly classified, assuming the target class was not used in the training procedure. Apart from image classification, only a couple of generic methods have been proposed that are applicable to both classification and regression. These learn the relation among model coefficients so that new ones can be predicted according to provided conditions. To our knowledge, no methods exist so far that are applicable only to regression and take advantage of this setting. Therefore, the present work proposes a novel algorithm for regression problems that uses data drawn from trained models, instead of model coefficients. In this case, a shape analysis of the data is performed to create a statistical shape model and generate new shapes to train new models. The proposed algorithm is tested in a theoretical setting using the beta distribution, where the main problem to solve is to estimate a function that predicts curves, based on already learned different but related ones. Comment: 36 pages, 4 figures, 2 tables, submitted to JML

    The trace norm constrained matrix-variate Gaussian process for multitask bipartite ranking

    We propose a novel hierarchical model for multitask bipartite ranking. The proposed approach combines a matrix-variate Gaussian process with a generative model for task-wise bipartite ranking. In addition, we employ a novel trace constrained variational inference approach to impose low rank structure on the posterior matrix-variate Gaussian process. The resulting posterior covariance function is derived in closed form, and the posterior mean function is the solution to a matrix-variate regression with a novel spectral elastic net regularizer. Further, we show that variational inference for the trace constrained matrix-variate Gaussian process combined with maximum likelihood parameter estimation for the bipartite ranking model is jointly convex. Our motivating application is the prioritization of candidate disease genes. The goal of this task is to aid the identification of unobserved associations between human genes and diseases using a small set of observed associations as well as kernels induced by gene-gene interaction networks and disease ontologies. Our experimental results illustrate the performance of the proposed model on real world datasets. Moreover, we find that the resulting low rank solution improves the computational scalability of training and testing as compared to baseline models. Comment: 14 pages, 9 figures, 5 tables

    Multifidelity Bayesian Optimization for Binomial Output

    The key idea of Bayesian optimization is replacing an expensive target function with a cheap surrogate model. By selecting an acquisition function for Bayesian optimization, we trade off between exploration and exploitation. The acquisition function typically depends on the mean and the variance of the surrogate model at a given point. The most common Gaussian process-based surrogate model assumes that the target with fixed parameters is a realization of a Gaussian process. However, the target function often does not satisfy this assumption. Here we consider target functions that come from the binomial distribution with a parameter that depends on the inputs. Typically we can vary how many Bernoulli samples we obtain during each evaluation. We propose a general Gaussian process model that takes into account Bernoulli outputs. To make this work, we consider a simple acquisition function based on Expected Improvement and a heuristic strategy to choose the number of samples at each point, thus taking into account the precision of the obtained output.
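
The abstract above notes that an acquisition function typically depends on the surrogate's mean and variance at a point. As a minimal sketch of that idea, the snippet below implements the standard closed-form Expected Improvement for a Gaussian predictive distribution, written for maximization. It is not the binomial-output variant the paper proposes, which the abstract does not spell out; the function names and the `xi` exploration margin are assumptions made for illustration.

```python
import math


def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean, std, best, xi=0.0):
    """Expected improvement over `best` (maximization) at one candidate point.

    mean, std: surrogate predictive mean and standard deviation at the point.
    best:      best objective value observed so far.
    xi:        optional margin that encourages exploration (assumed parameter).
    """
    gain = mean - best - xi
    if std <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(gain, 0.0)
    z = gain / std
    return gain * normal_cdf(z) + std * normal_pdf(z)
```

For a fixed mean, the value grows with `std`, which is how the function rewards exploring uncertain regions; for a fixed `std`, it grows with `mean`, which rewards exploiting promising ones.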

    Beyond expectation: Deep joint mean and quantile regression for spatio-temporal problems

    Spatio-temporal problems are ubiquitous and of vital importance in many research fields. Despite the potential already demonstrated by deep learning methods in modeling spatio-temporal data, typical approaches tend to focus solely on conditional expectations of the output variables being modeled. In this paper, we propose a multi-output multi-quantile deep learning approach for jointly modeling several conditional quantiles together with the conditional expectation as a way to provide a more complete "picture" of the predictive density in spatio-temporal problems. Using two large-scale datasets from the transportation domain, we empirically demonstrate that, by approaching the quantile regression problem from a multi-task learning perspective, it is possible to solve the embarrassing quantile crossings problem, while simultaneously significantly outperforming state-of-the-art quantile regression methods. Moreover, we show that jointly modeling the mean and several conditional quantiles not only provides a rich description of the predictive density that can capture heteroscedastic properties at a negligible computational overhead, but also leads to improved predictions of the conditional expectation due to the extra information and a regularization effect induced by the added quantiles. Comment: 12 pages, 9 figures
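
One common way to train a model on the mean and several quantiles jointly, as the abstract above describes, is to add a squared-error term for the mean to a pinball (quantile) loss for each quantile level. The sketch below shows that combination for a single observation. The abstract does not give the paper's exact objective, so the unweighted sum and the names `pinball_loss` and `joint_loss` are assumptions for illustration.

```python
def pinball_loss(y_true, y_pred, tau):
    """Pinball loss for quantile level tau in (0, 1).

    Under-prediction is weighted by tau and over-prediction by (1 - tau).
    """
    diff = y_true - y_pred
    return max(tau * diff, (tau - 1.0) * diff)


def joint_loss(y_true, mean_pred, quantile_preds):
    """Squared error on the mean plus pinball losses on each quantile.

    quantile_preds maps a quantile level tau to the predicted quantile value.
    The unweighted sum is an assumed choice, not the paper's stated objective.
    """
    loss = (y_true - mean_pred) ** 2
    for tau, q_pred in quantile_preds.items():
        loss += pinball_loss(y_true, q_pred, tau)
    return loss


# Hypothetical predictions for one observation with true value 10.0.
example = joint_loss(10.0, 9.0, {0.1: 7.0, 0.5: 9.5, 0.9: 12.0})
```

Because every output shares the same target observation, the quantile terms act as auxiliary tasks alongside the mean, which is the multi-task view the abstract refers to.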

    A Robust t-process Regression Model with Independent Errors

    The Gaussian process regression (GPR) model is well known to be susceptible to outliers. Robust process regression models based on the t-process or other heavy-tailed processes have been developed to address the problem. However, due to the nature of the current definition of heavy-tailed processes, the unknown process regression function and the random errors are always defined jointly and thus dependently. This definition, mainly owing to the dependence assumption involved, is not justified in many practical problems and thus limits the application of those robust approaches. It also limits the theory of robust analysis. In this paper, we propose a new robust process regression model enabling independent random errors. An efficient estimation procedure is developed. Statistical properties, such as unbiasedness and information consistency, are provided. Numerical studies show that the proposed method is robust against outliers and has better prediction performance than the existing models. We illustrate that the estimated random effects are useful in detecting outlying curves. Comment: 27 pages, 3 figures