Bayesian Forecasting in Economics and Finance: A Modern Review
The Bayesian statistical paradigm provides a principled and coherent approach to probabilistic forecasting. Uncertainty about all unknowns that characterize a forecasting problem -- model, parameters, latent states -- can be quantified explicitly and factored into the forecast distribution via integration, or averaging. Allied with the elegance of the method, Bayesian forecasting is now underpinned by the burgeoning field of Bayesian computation, which enables Bayesian forecasts to be produced for virtually any problem, no matter how large or complex. The current state of play in Bayesian forecasting in economics and finance is the subject of this review. The aim is to provide the reader with an overview of modern approaches to the field, set in some historical context, and with sufficient computational detail to assist with implementation.
Comment: The paper is now published online at: https://doi.org/10.1016/j.ijforecast.2023.05.00
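As a concrete illustration of the integration step described above, the following is a minimal Monte Carlo sketch of a forecast distribution obtained by averaging the conditional predictive density over posterior draws. The model (i.i.d. Gaussian data with known variance and a conjugate prior) and all variable names are illustrative assumptions, not an example taken from the review itself.

```python
# Minimal sketch of a Bayesian forecast distribution obtained by averaging the
# conditional predictive density over posterior draws (illustrative model only:
# i.i.d. Gaussian data with known variance and a conjugate Gaussian prior).
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=1.5, scale=1.0, size=50)   # observed series (toy data)
sigma = 1.0                                    # known observation std dev
m0, s0 = 0.0, 10.0                             # prior mean and std dev for mu

# Conjugate posterior for the unknown mean mu
post_var = 1.0 / (1.0 / s0**2 + len(y) / sigma**2)
post_mean = post_var * (m0 / s0**2 + y.sum() / sigma**2)

# Forecast distribution: integrate p(y_new | mu) over the posterior p(mu | y)
# by Monte Carlo: draw mu, then draw y_new given mu.
mu_draws = rng.normal(post_mean, np.sqrt(post_var), size=10_000)
y_new = rng.normal(mu_draws, sigma)

print("posterior-predictive mean:", y_new.mean())
print("80% forecast interval:", np.quantile(y_new, [0.1, 0.9]))
```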
Data Science: Measuring Uncertainties
With the increase in data processing and storage capacity, a large amount of data is available. Data without analysis has little value, so the demand for data analysis grows daily, and the consequence is a large number of new jobs and published articles. Data science has emerged as a multidisciplinary field to support data-driven activities, integrating and developing ideas, methods, and processes to extract information from data. It draws on methods from several knowledge areas: Statistics, Computer Science, Mathematics, Physics, Information Science, and Engineering. This mixture of areas has given rise to what we call Data Science. New problems are appearing rapidly as large volumes of data are generated, and they demand new solutions. Current and future challenges require greater care in creating solutions that suit the reasoning behind each type of problem. Labels such as Big Data, Data Science, Machine Learning, Statistical Learning, and Artificial Intelligence demand more sophistication in their foundations and in how they are applied. This highlights the importance of building the foundations of Data Science. This book is dedicated to solutions for, and discussions of, measuring uncertainties in data analysis problems.
Probabilistic multiple kernel learning
The integration of multiple, possibly heterogeneous information sources into an overall decision-making process has been an open research direction in computing science since its beginning. This thesis addresses parts of that direction by proposing probabilistic data integration algorithms for multiclass decisions, in which an observation of interest is assigned to one of many categories based on a plurality of information channels.
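A minimal sketch of one common, non-probabilistic way such channels are combined in multiple kernel learning: a convex combination of per-channel Gram matrices fed to a standard kernel classifier. The channel weights, kernels, and toy binary labels are assumptions for illustration only; the thesis's probabilistic algorithms are not reproduced here.

```python
# Illustrative sketch (not the thesis's algorithm): combining heterogeneous
# information channels by a convex combination of per-channel kernels, then
# classifying with a standard kernel method on the combined Gram matrix.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel, polynomial_kernel

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                     # toy observations
y = (X[:, 0] + X[:, 1] ** 2 > 1).astype(int)      # toy labels

# Two "channels": an RBF view and a polynomial view of the same features.
K_rbf = rbf_kernel(X, gamma=0.5)
K_poly = polynomial_kernel(X, degree=2)

beta = np.array([0.7, 0.3])                       # fixed channel weights (assumed)
K = beta[0] * K_rbf + beta[1] * K_poly            # combined Gram matrix

clf = SVC(kernel="precomputed").fit(K, y)
print("training accuracy:", clf.score(K, y))
```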
Probabilistic Modelling of Uncertainty with Bayesian nonparametric Machine Learning
This thesis addresses the use of probabilistic predictive modelling and machine learning for quantifying uncertainties. Predictive modelling makes inferences about a process from observations obtained by computational modelling, simulation, or experimentation. This is often achieved using statistical machine learning models which predict the outcome as a function of variable predictors and given process observations. Towards this end, Bayesian nonparametric regression is used: a highly flexible, probabilistic type of statistical model that provides a natural framework in which uncertainties can be included.
The contributions of this thesis are threefold. Firstly, a novel approach to quantify parametric uncertainty in the Gaussian process latent variable model is presented, which is shown to improve predictive performance when compared with the commonly used variational expectation maximisation approach. Secondly, an emulator using manifold learning (local tangent space alignment) is developed for the purpose of dealing with problems where outputs lie in a high dimensional manifold.
Using this, a framework is proposed to solve the forward problem for uncertainty quantification and applied to two fluid dynamics simulations. Finally, an enriched clustering model for generalised mixtures of Gaussian process experts is presented, which improves clustering, scaling with the number of covariates, and prediction compared with the alternative model. This is then applied to a study of Alzheimer's disease, with the aim of improving prediction of disease progression.
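For readers unfamiliar with the built-in uncertainty the abstract refers to, here is a minimal sketch of plain Gaussian process regression, which returns a predictive mean and variance at every input. It is not the thesis's GP-LVM or mixture-of-experts models; the kernel, hyperparameters, and toy data are illustrative assumptions.

```python
# Minimal sketch of Bayesian nonparametric regression with a Gaussian process:
# the posterior gives a predictive mean and variance at every input.
import numpy as np

def rbf(a, b, length=0.5, amp=1.0):
    d = a[:, None] - b[None, :]
    return amp * np.exp(-0.5 * (d / length) ** 2)

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=20)
y = np.sin(x) + 0.1 * rng.normal(size=20)          # noisy observations
xs = np.linspace(-4, 4, 200)                       # prediction grid
noise = 0.1 ** 2

K = rbf(x, x) + noise * np.eye(len(x))
Ks = rbf(x, xs)
Kss = rbf(xs, xs)

L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
mean = Ks.T @ alpha                                # predictive mean
v = np.linalg.solve(L, Ks)
var = np.diag(Kss) - np.sum(v * v, axis=0)         # predictive variance

print("max predictive std dev:", np.sqrt(var.max()))
```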
Toward a scalable Bayesian workflow
A scalable Bayesian workflow needs the combination of fast but reliable computing, efficient but targeted model evaluation, and extensive but directed model building and expansion. In this thesis, I develop a sequence of methods to push the scalability frontier of the workflow.
First, I study diagnostics of Bayesian computing. The Pareto smoothed importance sampling stabilizes importance weights using a generalized Pareto distribution fit to the upper tail of the distribution of the simulated importance ratios. The method, which empirically performs better than existing methods for stabilizing importance sampling estimates, includes stabilized effective sample size estimates, Monte Carlo error estimates and convergence diagnostics. For variational inference, I propose two diagnostic algorithms. The Pareto smoothed importance sampling diagnostic gives a goodness of fit measurement for joint distributions, while the variational simulation-based calibration assesses the average performance of point estimates. I further apply this importance sampling strategy to causal inference and develop diagnostics for covariate imbalance in observational studies.
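A stripped-down sketch of the tail-smoothing idea behind Pareto smoothed importance sampling: fit a generalized Pareto distribution to the largest importance ratios and replace them with order statistics of the fitted tail. The tail-size rule, the use of scipy's generic GPD fit, and the toy ratios are simplifying assumptions; the published algorithm uses its own tail-size rule, a dedicated estimator, and diagnostic thresholds on the shape parameter.

```python
# Simplified sketch of the idea behind Pareto smoothed importance sampling:
# fit a generalized Pareto distribution to the largest importance ratios and
# replace them with order statistics of the fitted tail.
import numpy as np
from scipy.stats import genpareto

def psis_like_smoothing(ratios, tail_frac=0.2):
    w = np.sort(np.asarray(ratios, dtype=float))
    n = len(w)
    m = max(5, int(tail_frac * n))        # size of the upper tail to smooth (assumed rule)
    u = w[n - m - 1]                      # threshold just below the tail
    tail = w[n - m:]
    # Fit a GPD to the exceedances over the threshold.
    k, _, sigma = genpareto.fit(tail - u, floc=0.0)
    # Replace the tail by expected-order-statistic quantiles of the fit,
    # truncated at the largest raw ratio.
    q = (np.arange(1, m + 1) - 0.5) / m
    w[n - m:] = np.minimum(u + genpareto.ppf(q, k, loc=0.0, scale=sigma), tail[-1])
    return w, k                           # smoothed weights and tail shape k-hat

rng = np.random.default_rng(0)
raw = np.exp(rng.standard_t(df=3, size=4000))      # heavy-tailed toy ratios
smoothed, khat = psis_like_smoothing(raw)
print("k-hat:", round(khat, 2), "max raw:", round(raw.max(), 1), "max smoothed:", round(smoothed.max(), 1))
```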
Second, I develop a solution to continuous model expansion using adaptive path sampling and tempering. This development is relevant to both model-building and computing in the workflow. For the former, I provide an automated way to connect models via a geometric bridge such that a supermodel encompasses individual models as a special case. For the latter, I use adaptive path sampling as a preferred strategy to estimating the normalizing constant and marginal density, based on which I propose two metastable sampling schemes. The continuous simulated tempering aims at multimodal posterior sampling, and the implicit divide-and-conquer sampler aims for a funnel-shaped entropic barrier. Both schemes are highly automated and empirically perform better than existing methods for sampling from metastable distributions.
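The geometric bridge and normalizing-constant estimation mentioned above can be illustrated with a toy one-dimensional example in which every point on the path can be sampled exactly. The densities, path grid, and sample sizes below are assumptions for illustration; this is the basic path-sampling (thermodynamic integration) identity, not the adaptive scheme developed in the thesis.

```python
# Sketch of a geometric bridge between two densities and the path-sampling
# identity for a log normalizing constant, on a toy 1-D Gaussian example.
import numpy as np

rng = np.random.default_rng(0)

# Endpoint 0: standard normal density (normalized, so log Z0 = 0).
# Endpoint 1: unnormalized q1(x) = exp(-(x - 2)^2 / (2 * 0.5^2)),
# whose true normalizing constant is sqrt(2*pi) * 0.5.
def log_q0(x):
    return -0.5 * x**2 - 0.5 * np.log(2 * np.pi)

def log_q1(x):
    return -0.5 * ((x - 2.0) / 0.5) ** 2

lambdas = np.linspace(0.0, 1.0, 21)
means = []
for lam in lambdas:
    # Geometric bridge p_lam proportional to q0^(1-lam) * q1^lam is Gaussian
    # here, so we can sample it exactly instead of running MCMC at each rung.
    prec = (1 - lam) * 1.0 + lam * 4.0
    mu = lam * 4.0 * 2.0 / prec
    x = rng.normal(mu, 1.0 / np.sqrt(prec), size=5000)
    means.append(np.mean(log_q1(x) - log_q0(x)))

# Trapezoid rule over the path gives the path-sampling estimate of log Z1.
means = np.array(means)
log_Z1 = np.sum(0.5 * (means[1:] + means[:-1]) * np.diff(lambdas))
print("estimate:", round(log_Z1, 3), "truth:", round(np.log(np.sqrt(2 * np.pi) * 0.5), 3))
```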
Last, a complete Bayesian workflow distinguishes itself from a one-shot data analysis by its enthusiasm for multiple model fittings, and open-mindedness to model misspecification. I take the idea of stacking from the point estimation literature and generalize it to the combination of Bayesian predictive distributions. Using an importance-sampling-based leave-one-out approximation, stacking is computationally efficient. I compare stacking, Bayesian model averaging, and several variants in a decision theory framework. I further apply the stacking strategy to multimodal sampling, in which Markov chain Monte Carlo algorithms can have difficulty moving between modes. The result from stacking is not necessarily equivalent, even asymptotically, to fully Bayesian inference, but it serves many of the same goals. Under misspecified models, stacking can give better predictive performance than full Bayesian inference, hence the multimodality can be considered a blessing rather than a curse. Furthermore, I show that stacking is most effective when the model predictive performance is heterogeneous in inputs, such that it can be further improved by hierarchical modeling. To this end, I develop hierarchical stacking, in which the model weights are input-varying yet partially pooled, and further generalize this method to incorporate discrete and continuous inputs, other structured priors, and time-series and longitudinal data. Big data need big models, big models need big model evaluation, and big model evaluation itself needs extra data collection and model building.
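A minimal sketch of the stacking optimization described above: choose simplex weights that maximize the summed log of the pooled leave-one-out predictive densities. Here the density matrix is simulated toy data rather than PSIS-LOO output, and the softmax parameterization is just one convenient way to impose the simplex constraint.

```python
# Minimal sketch of stacking predictive distributions: choose simplex weights
# that maximize the summed log of the pooled leave-one-out predictive densities.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n_points, n_models = 200, 3
# p(y_i | y_-i, M_k) for each point i and model k; toy values standing in for PSIS-LOO output.
loo_dens = rng.gamma(shape=2.0, scale=0.5, size=(n_points, n_models))

def neg_log_score(z):
    w = np.exp(z - z.max())
    w /= w.sum()                                  # softmax -> weights on the simplex
    return -np.sum(np.log(loo_dens @ w))

res = minimize(neg_log_score, x0=np.zeros(n_models), method="BFGS")
w = np.exp(res.x - res.x.max()); w /= w.sum()
print("stacking weights:", np.round(w, 3))
```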
Applications of Approximate Learning and Inference for Probabilistic Models
We develop approximate inference and learning methods that facilitate the use of probabilistic modeling techniques, motivated by applications in two different areas. First, we consider the ill-posed inverse problem of recovering an image from an underdetermined system of linear measurements corrupted by noise. Second, we consider the problem of inferring user preferences for items from counts, pairwise comparisons, and user activity logs, all instances of implicit feedback. Plausible models for images and for the noise incurred when recording them render posterior inference intractable, while the scale of the inference problem makes sampling-based approximations ineffective. We therefore develop deterministic approximate inference algorithms for two different augmentations of a typical sparse linear model: first, for the rectified-linear Poisson likelihood, and second, for tree-structured super-Gaussian mixture models. The rectified-linear Poisson likelihood is an alternative noise model, applicable in astronomical and biomedical imaging applications that operate in intensity regimes where quantum effects lead to observations best described by counts of particles arriving at a sensor, as well as in general Poisson regression problems arising in various fields. In this context we show that the model-specific computations for Expectation Propagation can be robustly solved by a simple dynamic program. Next, we develop a scalable approximate inference algorithm for structured mixture models that uses a discrete graphical model to represent dependencies between the latent mixture components of a collection of mixture models. Specifically, we use tree-structured mixtures of super-Gaussians to model the persistence across scales of large coefficients of the wavelet transform of an image, for improved reconstruction.
In the second part, on models of user preference, we consider two settings: the global static setting and the contextual dynamic setting. In the global static setting, we represent user-item preferences by a latent low-rank matrix. Instead of using numeric ratings, we develop methods to infer this latent representation from two types of implicit feedback: aggregate counts of users interacting with a service, and the binary outcomes of pairwise comparisons. We model count data using a latent Gaussian bilinear model with Poisson likelihoods. For this model, we show that the variational Gaussian approximation can be further relaxed to become available in closed form by adding constraints, leading to an efficient inference algorithm. In the second implicit-feedback scenario, we infer the latent preference matrix from pairwise preference statements. We combine a low-rank bilinear model with non-parametric item-feature regression and develop a novel approximate variational Expectation Maximization algorithm that mitigates the computational challenges caused by the latent couplings induced by the pairwise comparisons. Finally, in the contextual dynamic setting, we model sequences of user activity at the granularity of single interaction events rather than aggregate counts. Routinely gathered in the background at large scale in many applications, such sequences can reveal temporal and contextual aspects of user behavior through recurrent patterns. To describe such data, we propose a generic collaborative sequence model based on recurrent neural networks that combines ideas from collaborative filtering and language modeling.
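As a rough illustration of the global static setting, the following sketch fits a latent low-rank bilinear model with Poisson likelihoods to toy interaction counts by plain gradient ascent on the log-likelihood. It shows only the point-estimate version; the closed-form variational Gaussian approximation developed in the work is not reproduced, and all sizes, rates, and names are assumptions.

```python
# Illustrative sketch of a latent low-rank bilinear model with a Poisson
# likelihood for implicit-feedback counts, fit by gradient ascent (toy data).
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, rank = 50, 40, 3
U_true = rng.normal(scale=0.3, size=(n_users, rank))
V_true = rng.normal(scale=0.3, size=(n_items, rank))
counts = rng.poisson(np.exp(U_true @ V_true.T))    # observed interaction counts

U = rng.normal(scale=0.01, size=(n_users, rank))
V = rng.normal(scale=0.01, size=(n_items, rank))
lr = 0.01
for step in range(500):
    rate = np.exp(U @ V.T)                         # Poisson rates exp(u_u . v_i)
    resid = counts - rate                          # gradient of log-lik w.r.t. u_u . v_i
    U += lr * (resid @ V) / n_items
    V += lr * (resid.T @ U) / n_users

loglik = np.sum(counts * (U @ V.T) - np.exp(U @ V.T))
print("final log-likelihood (up to constants):", round(loglik, 1))
```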
Extreme deconvolution: Inferring complete distribution functions from noisy, heterogeneous and incomplete observations
We generalize the well-known mixture-of-Gaussians approach to density estimation, and the accompanying Expectation-Maximization technique for finding the maximum-likelihood parameters of the mixture, to the case where each data point carries an individual d-dimensional uncertainty covariance and has unique missing-data properties. This algorithm reconstructs the error-deconvolved or "underlying" distribution function common to all samples, even when the individual data points are samples from different distributions, obtained by convolving the underlying distribution with the heteroskedastic uncertainty distribution of the data point and projecting out the missing data directions. We show how this basic algorithm can be extended with conjugate priors on all of the model parameters and a "split-and-merge" procedure designed to avoid local maxima of the likelihood. We demonstrate the full method by applying it to the problem of inferring the three-dimensional velocity distribution of stars near the Sun from noisy two-dimensional, transverse velocity measurements from the Hipparcos satellite.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS439 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
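A simplified sketch of the EM update for a mixture of Gaussians in which every data point carries its own noise covariance, which is the core of the algorithm described above with the per-point projection matrices (missing data directions), conjugate priors, and split-and-merge moves omitted. Variable names and the toy example are assumptions for illustration.

```python
# Simplified sketch of an extreme-deconvolution-style EM step: a Gaussian
# mixture fit to noisy points, each with its own known noise covariance S_i.
import numpy as np
from scipy.stats import multivariate_normal

def xd_em_step(w, S, alpha, mu, V):
    """One EM update. w: (n,d) noisy points, S: (n,d,d) per-point noise covariances,
    alpha: (K,), mu: (K,d), V: (K,d,d) parameters of the underlying mixture."""
    n, d = w.shape
    K = len(alpha)
    q = np.zeros((n, K))
    b = np.zeros((n, K, d))
    B = np.zeros((n, K, d, d))
    for i in range(n):
        for j in range(K):
            T = V[j] + S[i]                               # convolved covariance
            q[i, j] = alpha[j] * multivariate_normal.pdf(w[i], mu[j], T)
            gain = V[j] @ np.linalg.inv(T)
            b[i, j] = mu[j] + gain @ (w[i] - mu[j])       # posterior mean of the true point
            B[i, j] = V[j] - gain @ V[j]                  # posterior covariance
    q /= q.sum(axis=1, keepdims=True)                     # responsibilities
    Nj = q.sum(axis=0)
    alpha_new = Nj / n
    mu_new = np.einsum("ij,ijd->jd", q, b) / Nj[:, None]
    V_new = np.zeros_like(V)
    for j in range(K):
        diff = b[:, j] - mu_new[j]
        V_new[j] = (q[:, j, None, None] * (diff[:, :, None] * diff[:, None, :] + B[:, j])).sum(0) / Nj[j]
    return alpha_new, mu_new, V_new

# Toy usage: two clusters observed through heteroskedastic noise.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 0.5, (150, 2)), rng.normal(2, 0.5, (150, 2))])
S = np.array([np.diag(rng.uniform(0.1, 1.0, 2)) for _ in range(300)])
w = x + np.array([rng.multivariate_normal(np.zeros(2), Si) for Si in S])
alpha, mu, V = np.ones(2) / 2, np.array([[-1.0, 0.0], [1.0, 0.0]]), np.array([np.eye(2)] * 2)
for _ in range(30):
    alpha, mu, V = xd_em_step(w, S, alpha, mu, V)
print("recovered component means:\n", np.round(mu, 2))
```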