87 research outputs found
Gaussian Process Conditional Copulas with Applications to Financial Time Series
The estimation of dependencies between multiple variables is a central
problem in the analysis of financial time series. A common approach is to
express these dependencies in terms of a copula function. Typically the copula
function is assumed to be constant but this may be inaccurate when there are
covariates that could have a large influence on the dependence structure of the
data. To account for this, a Bayesian framework for the estimation of
conditional copulas is proposed. In this framework the parameters of a copula
are non-linearly related to some arbitrary conditioning variables. We evaluate
the ability of our method to predict time-varying dependencies on several
equities and currencies and observe consistent performance gains compared to
static copula models and other time-varying copula methods.
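The idea of a copula whose parameter depends on a conditioning variable can be sketched as follows. This is only an illustration under stated assumptions, not the method of the paper: the copula family (Gaussian), the link function `copula_correlation`, and the use of time as the only covariate are all hypothetical choices. In the abstract's framework the latent function would be learned in a Bayesian way; here it is fixed by hand.

```python
import math
import numpy as np

def normal_cdf(z):
    """Standard normal CDF applied elementwise (uses math.erf, no SciPy needed)."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(np.asarray(z) / math.sqrt(2.0)))

def copula_correlation(t):
    """Hypothetical non-linear link from a covariate t in [0, 1] to a
    correlation in (-1, 1). It stands in for a learned latent function."""
    return np.tanh(3.0 * (t - 0.5))

rng = np.random.default_rng(0)
n = 5000
t = np.linspace(0.0, 1.0, n)          # conditioning variable (e.g. time)
rho = copula_correlation(t)           # one copula parameter per observation

# Sample from a Gaussian copula whose correlation changes with t.
z1 = rng.standard_normal(n)
z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
u1, u2 = normal_cdf(z1), normal_cdf(z2)   # uniform marginals

# Dependence differs between the start and the end of the series.
early, late = t < 0.2, t > 0.8
early_corr = np.corrcoef(u1[early], u2[early])[0, 1]
late_corr = np.corrcoef(u1[late], u2[late])[0, 1]
print(f"early correlation: {early_corr:.2f}, late correlation: {late_corr:.2f}")
```

A static copula fitted to all of this data would average over the negative and positive regimes, which is the kind of inaccuracy the abstract attributes to the constant-copula assumption.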
Learning feature selection dependencies in multi-task learning
This is an electronic version of the paper presented at the 27th Annual Conference on Neural Information Processing Systems, held in Lake Tahoe in 2013. A probabilistic model based on the horseshoe prior is proposed for learning dependencies in the process of identifying relevant features for prediction. Exact inference is intractable in this model. However, expectation propagation offers an approximate alternative. Because the process of estimating feature selection dependencies may suffer from over-fitting in the model proposed, additional data from a multi-task learning scenario are considered for induction. The same model can be used in this setting with few modifications. Furthermore, the assumptions made are less restrictive than in other multi-task methods: the different tasks must share feature selection dependencies, but can have different relevant features and model coefficients. Experiments with real and synthetic data show that this model performs better than other multi-task alternatives from the literature. The experiments also show that the model is able to induce suitable feature selection dependencies for the problems considered, using only the training data.
Dealing with Integer-valued Variables in Bayesian Optimization with Gaussian Processes
Bayesian optimization (BO) methods are useful for optimizing functions that
are expensive to evaluate, lack an analytical expression and whose evaluations
can be contaminated by noise. These methods rely on a probabilistic model of
the objective function, typically a Gaussian process (GP), upon which an
acquisition function is built. This function guides the optimization process
and measures the expected utility of performing an evaluation of the objective
at a new point. GPs assume continuous input variables. When this is not the
case, such as when some of the input variables take integer values, one has to
introduce extra approximations. A common approach is to round the suggested
variable value to the closest integer before doing the evaluation of the
objective. We show that this can lead to problems in the optimization process
and describe a more principled approach to account for input variables that are
integer-valued. We illustrate in both synthetic and real experiments the
utility of our approach, which significantly improves the results of standard
BO methods on problems involving integer-valued variables.
Comment: 7 page
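The rounding issue described above can be illustrated with a small sketch. The abstract does not spell out the more principled approach, so the `rounded_rbf_kernel` below is an assumption made for illustration: it rounds the inputs inside the covariance function, so that the model treats all points that map to the same integer as identical. All function names are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    a = np.asarray(a, dtype=float)[:, None]
    b = np.asarray(b, dtype=float)[None, :]
    return np.exp(-0.5 * ((a - b) / lengthscale) ** 2)

def rounded_rbf_kernel(a, b, lengthscale=1.0):
    """Assumed variant: round inputs to the closest integer before computing
    the covariance, so the GP is constant between consecutive integers."""
    return rbf_kernel(np.round(a), np.round(b), lengthscale)

# Three distinct continuous suggestions from an acquisition function.
suggestions = np.array([1.8, 2.2, 2.4])

# Naive approach: round only before evaluating the objective.
# All three suggestions end up evaluating the same integer point.
evaluated_at = np.round(suggestions)

# The plain kernel still treats 1.8 and 2.2 as different inputs,
# so the model is unaware that they give the same evaluation.
k_plain = rbf_kernel([1.8], [2.2])[0, 0]

# With rounding inside the kernel the two inputs are perfectly correlated.
k_rounded = rounded_rbf_kernel([1.8], [2.2])[0, 0]

print(evaluated_at, k_plain, k_rounded)
```

With rounding applied only at evaluation time, the model can keep suggesting new real values that all collapse onto an integer that has already been evaluated. Rounding inside the covariance function removes that mismatch between the model and the objective.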
A Probabilistic Model for Dirty Multi-task Feature Selection
Multi-task feature selection methods often make
the hypothesis that learning tasks share relevant
and irrelevant features. However, this hypothesis
may be too restrictive in practice. For example,
there may be a few tasks with specific relevant
and irrelevant features (outlier tasks). Similarly,
a few of the features may be relevant for
only some of the tasks (outlier features). To account
for this, we propose a model for multi-task
feature selection based on a robust prior distribution
that introduces a set of binary latent variables
to identify outlier tasks and outlier features.
Expectation propagation can be used for efficient
approximate inference under the proposed prior.
Several experiments show that a model based on
the new robust prior provides better predictive
performance than other benchmark methods.
Daniel Hernández-Lobato gratefully acknowledges the use of the facilities of Centro de Computación Científica (CCC) at Universidad Autónoma de Madrid. This author also acknowledges financial support from Spanish Plan Nacional I+D+i, Grant TIN2013-42351-P, and from Comunidad de Madrid, Grant S2013/ICE-2845 CASI-CAM-CM. José Miguel Hernández-Lobato acknowledges financial support from the Rafael del Pino Foundation.
Non-linear Causal Inference using Gaussianity Measures
We provide theoretical and empirical evidence for a type of asymmetry between
causes and effects that is present when these are related via linear models
contaminated with additive non-Gaussian noise. Assuming that the causes and the
effects have the same distribution, we show that the distribution of the
residuals of a linear fit in the anti-causal direction is closer to a Gaussian
than the distribution of the residuals in the causal direction. This
Gaussianization effect is characterized by a reduction in the magnitude of the
high-order cumulants and by an increase in the differential entropy of the
residuals. The problem of non-linear causal inference is addressed by
performing an embedding in an expanded feature space, in which the relation
between causes and effects can be assumed to be linear. The effectiveness of a
method to discriminate between causes and effects based on this type of
asymmetry is illustrated in a variety of experiments using different measures
of Gaussianity. The proposed method is shown to be competitive with
state-of-the-art techniques for causal inference.
Comment: 35 pages, 9 figure
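The asymmetry described in this abstract can be checked numerically in the simplest linear case. The sketch below is illustrative only and rests on assumptions not taken from the paper: uniform distributions for the cause and the noise, a coefficient `rho = 0.7`, and excess kurtosis (a fourth-order cumulant measure) as the Gaussianity measure. The cause and effect here share the same mean and variance, which only approximates the abstract's assumption that they have the same distribution.

```python
import numpy as np

def excess_kurtosis(r):
    """Sample excess kurtosis; it is 0 for a Gaussian distribution."""
    r = r - r.mean()
    return np.mean(r**4) / np.mean(r**2) ** 2 - 3.0

def linear_residuals(inp, out):
    """Residuals of a least-squares linear fit of `out` on `inp`."""
    inp_c = inp - inp.mean()
    out_c = out - out.mean()
    slope = np.dot(inp_c, out_c) / np.dot(inp_c, inp_c)
    return out_c - slope * inp_c

rng = np.random.default_rng(0)
n = 200_000
half_width = np.sqrt(3.0)                      # uniform with unit variance
x = rng.uniform(-half_width, half_width, n)    # cause (non-Gaussian)
noise = rng.uniform(-half_width, half_width, n)
rho = 0.7
y = rho * x + np.sqrt(1.0 - rho**2) * noise    # effect, also unit variance

k_causal = excess_kurtosis(linear_residuals(x, y))      # fit y from x
k_anticausal = excess_kurtosis(linear_residuals(y, x))  # fit x from y

# The direction whose residuals are less Gaussian is taken as causal.
inferred = "x -> y" if abs(k_causal) > abs(k_anticausal) else "y -> x"
print(f"causal: {k_causal:.2f}, anti-causal: {k_anticausal:.2f}, inferred: {inferred}")
```

In the causal direction the residual is just the rescaled uniform noise, while in the anti-causal direction it is a mixture of two independent uniform variables, which is closer to Gaussian. This matches the reduction in the magnitude of high-order cumulants mentioned in the abstract.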
- …