15 research outputs found

    PyOED: An Extensible Suite for Data Assimilation and Model-Constrained Optimal Design of Experiments

    Full text link
    This paper describes the first version (v1.0) of PyOED, a highly extensible scientific package that enables developing and testing model-constrained optimal experimental design (OED) for inverse problems. Specifically, PyOED aims to be a comprehensive Python toolkit for model-constrained OED. The package targets scientists and researchers interested in understanding the details of OED formulations and approaches. It is also meant to enable researchers to experiment with standard and innovative OED technologies on a wide range of test problems (e.g., simulation models). Thus, PyOED is continuously being expanded with a plethora of Bayesian inversion, data assimilation (DA), and OED methods, as well as new scientific simulation models, observation error models, and observation operators. These pieces are added such that they can be permuted, enabling OED methods to be tested in settings of varying complexity. The PyOED core is written entirely in Python and utilizes its inherent object-oriented capabilities; however, the current version of PyOED is meant to be extensible rather than scalable. Specifically, PyOED is developed to "enable rapid development and benchmarking of OED methods with minimal coding effort and to maximize code reutilization." This paper provides a brief description of the PyOED layout and philosophy and provides a set of exemplary test cases and tutorials to demonstrate how the package can be utilized.
    Comment: 26 pages, 7 figures, 21 code snippets
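
    As a concrete illustration of the kind of computation a model-constrained OED toolkit supports (this is a generic sketch, not PyOED's actual API; all names such as a_optimality, forward_op, and design are hypothetical), the snippet below evaluates the A-optimal design criterion, i.e., the trace of the posterior covariance of a linear Gaussian inverse problem, for a binary sensor-activation design:

```python
# Minimal sketch of an A-optimal design criterion for a linear Gaussian
# inverse problem; illustrative only, not PyOED's API. All names are
# hypothetical.
import numpy as np

def a_optimality(forward_op, prior_cov, noise_var, design):
    """Trace of the posterior covariance for a 0/1 sensor design.

    forward_op : (m, n) array, linearized parameter-to-observable map
    prior_cov  : (n, n) array, Gaussian prior covariance
    noise_var  : float, variance of iid observation noise
    design     : (m,) array of 0/1 sensor activation weights
    """
    W = np.diag(design / noise_var)              # design-weighted noise precision
    post_prec = forward_op.T @ W @ forward_op + np.linalg.inv(prior_cov)
    return np.trace(np.linalg.inv(post_prec))    # smaller is better

# Example: compare two candidate 3-sensor designs on a toy problem.
rng = np.random.default_rng(0)
G = rng.standard_normal((5, 4))                  # toy forward operator
C0 = np.eye(4)
d1 = np.array([1, 1, 1, 0, 0])
d2 = np.array([1, 0, 1, 0, 1])
print(a_optimality(G, C0, 0.1, d1), a_optimality(G, C0, 0.1, d2))
```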

    Sensor Clusterization in D-optimal Design in Infinite Dimensional Bayesian Inverse Problems

    Full text link
    We investigate the problem of sensor clusterization in optimal experimental design for infinite-dimensional Bayesian inverse problems. We suggest an analytically tractable model for such designs and show how it may lead to sensor clusterization in the case of i.i.d. measurement noise. We also show that clusterization does not occur in the case of spatially correlated measurement error. As part of the analysis we prove an infinite-dimensional analog of the matrix determinant lemma, as well as a lemma for calculating derivatives of the $\log\det$ of operators.
    Comment: 19 pages, two figures
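
    For reference, the finite-dimensional matrix determinant lemma, whose infinite-dimensional analog the paper establishes, states that for an invertible matrix $A$ and vectors $u, v$,

    \[
    \det\left(A + u v^{\top}\right) = \left(1 + v^{\top} A^{-1} u\right)\det(A).
    \]

    D-optimal design selects sensor placements by maximising such log-determinant (information-gain) criteria; the paper's lemmas extend this identity, together with derivative formulas for $\log\det$, to suitable operators on infinite-dimensional spaces.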

    Randomized low-rank approximation of monotone matrix functions

    Full text link
    This work is concerned with computing low-rank approximations of a matrix function $f(A)$ for a large symmetric positive semi-definite matrix $A$, a task that arises in, e.g., statistical learning and inverse problems. The application of popular randomized methods, such as the randomized singular value decomposition or the Nyström approximation, to $f(A)$ requires multiplying $f(A)$ with a few random vectors. A significant disadvantage of such an approach is that matrix-vector products with $f(A)$ are considerably more expensive than matrix-vector products with $A$, even when carried out only approximately via, e.g., the Lanczos method. In this work, we present and analyze funNyström, a simple and inexpensive method that constructs a low-rank approximation of $f(A)$ directly from a Nyström approximation of $A$, completely bypassing the need for matrix-vector products with $f(A)$. It is sensible to use funNyström whenever $f$ is monotone and satisfies $f(0) = 0$. Under the stronger assumption that $f$ is operator monotone, which includes the matrix square root $A^{1/2}$ and the matrix logarithm $\log(I+A)$, we derive probabilistic bounds for the error in the Frobenius, nuclear, and operator norms. These bounds confirm the numerical observation that funNyström tends to return an approximation that compares well with the best low-rank approximation of $f(A)$. Our method is also of interest when estimating quantities associated with $f(A)$, such as the trace or the diagonal entries of $f(A)$. In particular, we propose and analyze funNyström++, a combination of funNyström with the recently developed Hutch++ method for trace estimation.
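
    A minimal sketch of the idea described in the abstract (not the paper's reference implementation, which includes numerical stabilization): form a Nyström approximation of $A$ from a random sketch, eigendecompose it, and apply $f$ to its eigenvalues.

```python
# Minimal sketch of the funNystrom idea: build a Nystrom approximation of A,
# then apply f to its eigenvalues. Stabilization details used in the actual
# algorithm are omitted; illustrative only.
import numpy as np

def fun_nystrom(A, f, rank, rng=np.random.default_rng(0)):
    n = A.shape[0]
    Omega = rng.standard_normal((n, rank))       # Gaussian sketch
    Y = A @ Omega                                # one pass over A
    core = np.linalg.pinv(Omega.T @ Y)           # (Omega^T A Omega)^+
    A_nys = Y @ core @ Y.T                       # Nystrom approximation of A
    w, U = np.linalg.eigh(A_nys)
    w = np.clip(w, 0.0, None)                    # A_nys is PSD up to rounding
    return U @ np.diag(f(w)) @ U.T               # low-rank approximation of f(A)

# Example: approximate the matrix square root of a random PSD matrix.
rng = np.random.default_rng(1)
B = rng.standard_normal((200, 20))
A = B @ B.T                                      # PSD, rank 20
approx = fun_nystrom(A, np.sqrt, rank=30, rng=rng)
```

    Note that the actual algorithm works with the low-rank factors directly rather than forming and eigendecomposing the full n-by-n approximation as done here for clarity.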

    Learning to compress and search visual data in large-scale systems

    Full text link
    The problem of high-dimensional and large-scale representation of visual data is addressed from an unsupervised learning perspective. The emphasis is put on discrete representations, where the description length can be measured in bits and hence the model capacity can be controlled. The algorithmic infrastructure is developed based on the synthesis and analysis prior models, whose rate-distortion properties, as well as capacity vs. sample complexity trade-offs, are carefully optimized. These models are then extended to multi-layer structures, namely the RRQ and ML-STC frameworks, the latter of which is further evolved into a powerful deep neural network architecture with fast and sample-efficient training and discrete representations. Three important applications are built on the developed algorithms. First, the problem of large-scale similarity search in retrieval systems is addressed, where a two-stage solution is proposed, leading to faster query times and more compact database storage. Second, the problem of learned image compression is targeted, where the proposed models can capture more redundancies from the training images than conventional compression codecs. Finally, the proposed algorithms are used to solve ill-posed inverse problems; in particular, the problems of image denoising and compressive sensing are addressed with promising results.
    Comment: PhD thesis dissertation
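
    The RRQ framework mentioned above builds on residual quantization; as a generic, hedged sketch (not the thesis code, with codebook size, layer count, and k-means training as assumptions), a multi-layer residual quantizer can be written as:

```python
# Generic residual-quantization sketch (not the thesis implementation):
# each layer runs k-means on the residuals of the previous layer and
# stores only the chosen codeword indices.
import numpy as np
from scipy.cluster.vq import kmeans2

def train_residual_quantizer(X, n_layers=4, n_codes=256, seed=0):
    codebooks, residual = [], X.astype(float).copy()
    for _ in range(n_layers):
        centroids, labels = kmeans2(residual, n_codes, seed=seed, minit="++")
        codebooks.append(centroids)
        residual = residual - centroids[labels]   # pass residual to next layer
    return codebooks

def encode(X, codebooks):
    residual, codes = X.astype(float).copy(), []
    for C in codebooks:
        # nearest codeword per sample
        idx = np.argmin(((residual[:, None, :] - C[None]) ** 2).sum(-1), axis=1)
        codes.append(idx)
        residual = residual - C[idx]
    return np.stack(codes, axis=1)                # (n_samples, n_layers) indices

def decode(codes, codebooks):
    return sum(C[codes[:, i]] for i, C in enumerate(codebooks))
```

    Storing only the per-layer indices gives a description length of n_layers x log2(n_codes) bits per vector, which is one way the model capacity can be measured and controlled in bits, as the abstract describes.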

    High-dimensional Bayesian methods for interpretable nowcasting and risk estimation

    Get PDF
    This thesis presents new models for nowcasting and macro risk estimation using frontier Bayesian methods that enable incorporating Big Data into policy-relevant prediction problems. We propose variable selection algorithms motivated by Bayesian decision theory to make model outcomes interpretable to the policy maker.

    In Chapter 2, we propose a Bayesian Structural Time Series (BSTS) model for nowcasting GDP growth. This model jointly estimates latent time trends, capturing slow-moving changes in economic conditions, alongside a high-dimensional mixed-frequency component extracted from higher-frequency (monthly) cyclical information. We extend previous implementations of the BSTS with priors and variable selection methods that facilitate selection over latent time trends as well as mixed-frequency information while remaining tractable to the policy maker. Empirically, we provide a novel nowcast application in which we use a large-dimensional set of Internet search terms to gain advance information about supply and demand sentiment for the US economy before more commonly considered macro information is available to the nowcaster. We find that our proposed BSTS model offers large improvements over competing models and that Internet search terms matter for nowcasts before hard information about the macro economy has been published. A simulation exercise confirms the good performance of the proposed model.

    Chapter 3 presents the T-SV-t-BMIDAS (Bayesian Mixed Data Sampling) model for nowcasting quarterly GDP growth. The model incorporates a long-run time-varying trend (T) and t-distributed stochastic volatility accounting for outliers (SV-t) into a Bayesian multivariate MIDAS. To address the high dimensionality of the model, to account for group correlation in mixed-frequency data, and to make the model interpretable to the policy maker, we propose a new combination of a group-shrinkage prior with a sparsification algorithm for variable selection. The prior flexibly accommodates between-group sparsity and within-group correlation and allows us to communicate the joint importance of predictors over the data release cycle. We evaluate the model on UK GDP growth nowcasts covering also the time span of the Covid-19 recession. The model is competitive with various benchmark models prior to the pandemic, while yielding substantial nowcast improvements during the pandemic. Contrary to many previous nowcasting approaches, the model extracts sparse group signals from the data. Simulations show competitive performance of the variable selection methodology, with particularly good performance to be expected for highly correlated data as well as dense data-generating processes.

    Chapter 4 presents a new Bayesian Quantile Regression (BQR) model for high-dimensional risk estimation. It extends the horseshoe prior to the BQR framework and provides a fast sampling algorithm that makes it efficient for high-dimensional problems. A large-scale simulation exercise reveals that, compared to alternative shrinkage priors, the proposed methods yield better performance in coefficient bias and forecast error, especially in sparse data-generating processes and in estimating extreme quantiles. In a high-dimensional Growth-at-Risk forecasting application, we forecast tail risks as well as complete forecast densities using a database covering over 200 variables related to the U.S. economy. Quantile-specific and density calibration score functions show that the horseshoe prior provides the best performance compared to competing Bayesian quantile regression priors, especially at short and medium run horizons.

    Bayesian quantile regression models with continuous shrinkage priors are known to predict well but are hard to interpret due to the lack of exact posterior sparsity. Chapter 5 bridges this gap by extending the idea of decoupling shrinkage and sparsity. The proposed procedure follows two steps: first, the quantile regression posterior is shrunk via state-of-the-art continuous shrinkage priors; then, the posterior is sparsified by taking the Bayes-optimal solution to maximising a policy maker’s utility function with a joint preference for predictive accuracy as well as sparsity. For the sparsification component, we propose a new variant of the signal adaptive variable selection algorithm that automates the choice of penalization in the integrated utility through a quantile-specific loss function and works well in high dimensions. Large-scale simulations show that, compared to the un-sparsified regression posterior, the selection procedure decreases coefficient bias irrespective of the true underlying degree of sparsity in the data, and the goodness of variable selection is competitive with traditional variable selection priors. A high-dimensional Growth-at-Risk forecasting application to the US shows that the method detects varying degrees of sparsity across the conditional GDP distribution and that the sources of downside risk vary substantially over time.

    Inspired by the work of Giannone et al. (2021) on the “illusion of sparsity” from sparse modelling techniques, Chapter 6 investigates whether the recently popularised global-local priors, firstly, are implicitly informative about sparsity and, secondly, are able to communicate the true degree of sparsity in the data. We consider two methods of analysis, implicit model size distributions and sparsification techniques, which are tested on a host of economic data sets and simulations. The findings motivate a new horseshoe-type model to which we add a prior that makes it a priori agnostic about the degree of sparsity and which is shown to be competitive with the spike-and-slab of Giannone et al. (2021) for forecasting as well as sparsity detection.

    Chapter 7 concludes with summaries, limitations of the thesis, as well as directions for future research.
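
    Growth-at-Risk applications such as those in Chapters 4 and 5 are typically scored with quantile-specific loss functions; the check (pinball) loss below is a generic sketch of such a score, not code from the thesis.

```python
# Generic sketch of the check (pinball) loss used to score quantile
# forecasts; not code from the thesis.
import numpy as np

def pinball_loss(y, q_pred, tau):
    """Average check loss of predicted tau-quantiles q_pred for outcomes y."""
    u = y - q_pred
    return np.mean(np.maximum(tau * u, (tau - 1.0) * u))

# Example: score two candidate 5%-quantile (downside risk) forecasts.
y = np.array([0.4, -1.2, 0.3, -3.5, 0.8])        # realized GDP growth
print(pinball_loss(y, np.full(5, -2.0), tau=0.05))
print(pinball_loss(y, np.full(5, -0.5), tau=0.05))
```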

    Scalable Bayesian sparse learning in high-dimensional models

    Full text link
    Nowadays, high-dimensional models, where the number of parameters or features can even be larger than the number of observations, are encountered on a fairly regular basis due to advancements in modern computation. For example, in gene expression studies we often encounter datasets with at most a few hundred observations and predictors corresponding to thousands of genes. One of the goals is to identify the genes that are relevant to the expression. Another example is model compression, which aims to alleviate the costs of large model sizes. The former example is the variable or feature selection problem, while the latter is the model selection problem. In the Bayesian framework, we often specify shrinkage priors that induce sparsity in the model. A sparsity-inducing prior has a high concentration around zero to identify the zero coefficients and heavy tails to capture the non-zero elements. In this thesis, we first provide an overview of the most well-known sparsity-inducing priors. We then propose to use the $L_{1/2}$ prior with a partially collapsed Gibbs (PCG) sampler to explore the high-dimensional parameter space in linear regression models, where variable selection is achieved through credible intervals. We also develop a coordinate-wise optimization method for posterior mode search with theoretical guarantees. We then extend the PCG sampler to develop a scalable ordinal regression model with a real application to student evaluation surveys. Next, we move to modern deep learning. We introduce a constrained variational Adam (CVA) algorithm to optimize Bayesian neural networks and discuss its connection to stochastic gradient Hamiltonian Monte Carlo. We then generalize the algorithm to constrained variational Adam with expectation maximization (CVA-EM), which incorporates the spike-and-slab prior to capture the sparsity of the neural network. Both nonlinear high-dimensional variable selection and network pruning can be achieved by this algorithm. We further show that the CVA-EM algorithm extends to graph neural networks to produce both sparse graphs and sparse weights. Finally, we discuss the sparse VAE with the $L_{1/2}$ prior as potential future work.
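
    Variable selection through posterior credible intervals, as mentioned in the abstract, amounts to keeping a coefficient only if its marginal credible interval excludes zero; the sketch below illustrates the rule on posterior draws and is a generic illustration, not the thesis code.

```python
# Generic sketch of variable selection via marginal posterior credible
# intervals: keep coefficient j only if its interval excludes zero.
# Illustrative only, not the thesis implementation.
import numpy as np

def select_by_credible_interval(beta_draws, level=0.95):
    """beta_draws: (n_draws, p) array of posterior samples of the coefficients."""
    alpha = 1.0 - level
    lower = np.quantile(beta_draws, alpha / 2, axis=0)
    upper = np.quantile(beta_draws, 1.0 - alpha / 2, axis=0)
    return (lower > 0) | (upper < 0)              # True where interval excludes 0

# Example with synthetic posterior draws: only the first coefficient is "signal".
rng = np.random.default_rng(0)
draws = rng.normal(loc=[2.0, 0.0, 0.1], scale=0.3, size=(4000, 3))
print(select_by_credible_interval(draws))         # expected: [ True False False]
```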

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 251, ITCS 2023, Complete Volume