Copula-like Variational Inference
This paper considers a new family of variational distributions motivated by
Sklar's theorem. The family is based on new copula-like densities on the
hypercube with non-uniform marginals that can be sampled efficiently, i.e.
with a complexity linear in the dimension of the state space. The proposed
variational densities can then be seen as arising from these copula-like
densities used as base distributions on the hypercube, with Gaussian
quantile functions and sparse rotation matrices as normalizing flows. The
latter correspond to a rotation of the marginals. We provide some empirical evidence that such a variational family can
also approximate non-Gaussian posteriors and can be beneficial compared to
Gaussian approximations. Our method performs largely comparably to
state-of-the-art variational approximations on standard regression and
classification benchmarks for Bayesian Neural Networks.
Comment: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada
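The sampling pipeline described in the abstract (samples on the hypercube, Gaussian quantile functions applied marginal-by-marginal, then a sparse rotation) can be sketched as follows. This is only an illustration of the transformation chain: the paper's copula-like base density is replaced by independent uniforms, and the sparse rotation matrix by a single Givens rotation; both are stand-in assumptions.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
d = 4     # dimension of the state space
n = 1000  # number of samples

# Stand-in base distribution on the hypercube (the paper uses a
# copula-like density with non-uniform marginals; independent
# uniforms are used here only to illustrate the flow).
u = rng.uniform(size=(n, d))

# Gaussian quantile function applied to each marginal: O(d) per sample.
z = norm.ppf(u)

# Sparse rotation sketched as one Givens rotation on coordinates (0, 1).
theta = np.pi / 6
G = np.eye(d)
G[0, 0] = G[1, 1] = np.cos(theta)
G[0, 1] = -np.sin(theta)
G[1, 0] = np.sin(theta)
x = z @ G.T  # samples with rotated Gaussian marginals

print(x.shape)  # (1000, 4)
```

A Givens rotation touches only two coordinates, which is what keeps the rotation step cheap relative to a dense rotation matrix.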
Approximating multivariate posterior distribution functions from Monte Carlo samples for sequential Bayesian inference
An important feature of Bayesian statistics is the opportunity to do
sequential inference: the posterior distribution obtained after seeing a
dataset can be used as prior for a second inference. However, when Monte Carlo
sampling methods are used for inference, we only have a set of samples from the
posterior distribution. To do sequential inference, we then either have to
evaluate the second posterior at only these locations and reweight the samples
accordingly, or we can estimate a functional description of the posterior
probability distribution from the samples and use that as prior for the second
inference. Here, we investigated to what extent we can obtain an accurate joint
posterior from two datasets if the inference is done sequentially rather than
jointly, under the condition that each inference step is done using Monte Carlo
sampling. To test this, we evaluated the accuracy of kernel density estimates,
Gaussian mixtures, vine copulas and Gaussian processes in approximating
posterior distributions, and then tested whether these approximations can be
used in sequential inference. In low dimensionality, Gaussian processes are
more accurate, whereas in higher dimensionality Gaussian mixtures or vine
copulas perform better. In our test cases, posterior approximations are
preferable over direct sample reweighting, although joint inference is still
preferable over sequential inference. Since the performance is case-specific,
we provide an R package mvdens with a unified interface for the density
approximation methods.
A Computational Framework for Efficient Reliability Analysis of Complex Networks
With the growing scale and complexity of modern infrastructure networks comes the challenge of developing efficient and dependable methods for analysing their reliability. Special attention must be given to potential network interdependencies, as disregarding these can lead to catastrophic failures. Furthermore, it is of paramount importance to properly treat all uncertainties. The survival signature is a recent development for the effective analysis of complex networks, and it surpasses standard techniques in several important areas. Its most distinguishing feature is the complete separation of the system structure from the probabilistic information. Because of this, it is possible to take into account a variety of component failure phenomena, such as dependencies, common causes of failure, and imprecise probabilities, without reevaluating the network structure.
This cumulative dissertation presents several key improvements to the survival signature ecosystem focused on the structural evaluation of the system as well as the modelling of component failures.
A new method is presented in which (inter)-dependencies between components and networks are modelled using vine copulas. Furthermore, aleatory and epistemic uncertainties are included by applying probability boxes and imprecise copulas. By leveraging the large number of available copula families it is possible to account for varying dependent effects. The graph-based design of vine copulas synergizes well with the typical descriptions of network topologies. The proposed method is tested on a challenging scenario using the IEEE reliability test system, demonstrating its usefulness and emphasizing the ability to represent complicated scenarios with a range of dependent failure modes.
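The copula mechanism underlying this construction can be illustrated with a single Gaussian pair copula, the building block from which vine copulas are assembled. The exponential marginals, failure rates, and correlation below are assumed values chosen purely for illustration.

```python
import numpy as np
from scipy.stats import norm, expon

rng = np.random.default_rng(2)

# Two components with exponential lifetimes whose failures are dependent;
# the dependence is injected through a Gaussian copula (vine copulas
# generalize this pairwise construction to many components).
rho = 0.8
cov = np.array([[1.0, rho], [rho, 1.0]])
z = rng.multivariate_normal([0.0, 0.0], cov, size=10000)

# Probability-integral transform: correlated normals -> correlated uniforms.
u = norm.cdf(z)

# Marginal lifetimes: exponential with assumed rates 0.1 and 0.2.
t1 = expon.ppf(u[:, 0], scale=1 / 0.1)
t2 = expon.ppf(u[:, 1], scale=1 / 0.2)

# Positive dependence makes joint early failures more likely than under
# independence, which matters for, e.g., a series system failing at min(t1, t2).
print(np.corrcoef(t1, t2)[0, 1])
```

Replacing the Gaussian pair copula with another family (Clayton, Gumbel, ...) changes the tail behaviour of the joint failures while leaving the marginals untouched, which is the flexibility the abstract refers to.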
The numerical effort required to analytically compute the survival signature is prohibitive for large complex systems. This work presents two methods for the approximation of the survival signature. In the first approach system configurations of low interest are excluded using percolation theory, while the remaining parts of the signature are estimated by Monte Carlo simulation. The method is able to accurately approximate the survival signature with very small errors while drastically reducing computational demand. Several simple test systems, as well as two real-world situations, are used to show the accuracy and performance.
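The Monte Carlo part of this first approach can be sketched on a small example. The five-component bridge network below is a hypothetical stand-in, and the percolation-based exclusion of low-interest configurations is omitted; only the subset-sampling estimate of the survival signature is shown.

```python
import random

# Five-component bridge network between source s and sink t:
# the edges are the components; the system works if s and t stay connected.
edges = {1: ("s", "a"), 2: ("s", "b"), 3: ("a", "b"), 4: ("a", "t"), 5: ("b", "t")}

def works(up):
    # Depth-first search over edges whose component is functioning.
    adj = {}
    for c in up:
        x, y = edges[c]
        adj.setdefault(x, []).append(y)
        adj.setdefault(y, []).append(x)
    seen, stack = {"s"}, ["s"]
    while stack:
        for y in adj.get(stack.pop(), []):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return "t" in seen

def survival_signature_mc(l, n_sim=20000, seed=3):
    # Phi(l): probability the system works when exactly l of the 5
    # exchangeable components work, estimated over random subsets
    # (the analytical value would enumerate all C(5, l) subsets).
    rng = random.Random(seed)
    hits = sum(works(rng.sample(list(edges), l)) for _ in range(n_sim))
    return hits / n_sim

print([round(survival_signature_mc(l), 3) for l in range(6)])
```

For this bridge the exact values are Phi = (0, 0, 0.2, 0.8, 1, 1); the Monte Carlo estimates land close to these while sampling only a fraction of the subsets a large system would require.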
However, with increasing network size and complexity this technique also reaches its limits. A second method is presented where the numerical demand is further reduced. Here, instead of approximating the whole survival signature only a few strategically selected values are computed using Monte Carlo simulation and used to build a surrogate model based on normalized radial basis functions. The uncertainty resulting from the approximation of the data points is then propagated through an interval predictor model which estimates bounds for the remaining survival signature values. This imprecise model provides bounds on the survival signature and therefore the network reliability. Because a few data points are sufficient to build the interval predictor model it allows for even larger systems to be analysed.
With the rising complexity of not just the system but also the individual components themselves comes the need for the components to be modelled as subsystems in a system-of-systems approach. A study is presented, where a previously developed framework for resilience decision-making is adapted to multidimensional scenarios in which the subsystems are represented as survival signatures. The survival signature of the subsystems can be computed ahead of the resilience analysis due to the inherent separation of structural information. This enables efficient analysis in which the failure rates of subsystems for various resilience-enhancing endowments are calculated directly from the survival function without reevaluating the system structure.
In addition to the advancements in the field of survival signature, this work also presents a new framework for uncertainty quantification developed as a package in the Julia programming language called UncertaintyQuantification.jl. Julia is a modern high-level dynamic programming language that is ideal for applications such as data analysis and scientific computing. UncertaintyQuantification.jl was built from the ground up to be generalised and versatile while remaining simple to use. The framework is in constant development, and its goal is to become a toolbox encompassing state-of-the-art algorithms from all fields of uncertainty quantification and to serve as a valuable tool for both research and industry. UncertaintyQuantification.jl currently includes simulation-based reliability analysis utilising a wide range of sampling schemes, local and global sensitivity analysis, and surrogate modelling methodologies.
BOtied: Multi-objective Bayesian optimization with tied multivariate ranks
Many scientific and industrial applications require joint optimization of
multiple, potentially competing objectives. Multi-objective Bayesian
optimization (MOBO) is a sample-efficient framework for identifying
Pareto-optimal solutions. We show a natural connection between non-dominated
solutions and the highest multivariate rank, which coincides with the outermost
level line of the joint cumulative distribution function (CDF). We propose the
CDF indicator, a Pareto-compliant metric for evaluating the quality of
approximate Pareto sets that complements the popular hypervolume indicator. At
the heart of MOBO is the acquisition function, which determines the next
candidate to evaluate by navigating the best compromises among the objectives.
Multi-objective acquisition functions that rely on box decomposition of the
objective space, such as the expected hypervolume improvement (EHVI) and
entropy search, scale poorly to a large number of objectives. We propose an
acquisition function, called BOtied, based on the CDF indicator. BOtied can be
implemented efficiently with copulas, a statistical tool for modeling complex,
high-dimensional distributions. We benchmark BOtied against common acquisition
functions, including EHVI and random scalarization (ParEGO), in a series of
synthetic and real-data experiments. BOtied performs on par with the baselines
across datasets and metrics while being computationally efficient.
Comment: 10 pages (+5 appendix), 9 figures. Submitted to NeurIPS
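The link between non-dominated solutions and multivariate rank can be illustrated with a raw empirical joint CDF in a minimization convention (the paper's estimator is copula-based and works with the mirrored convention; this sketch shows only the underlying idea):

```python
import numpy as np

rng = np.random.default_rng(4)

# Objective vectors for a 2-objective minimization problem.
Y = rng.uniform(size=(200, 2))

# Empirical joint CDF: F(y) = fraction of points <= y componentwise.
# Under minimization, all non-dominated points attain the minimal CDF
# value 1/n, i.e. they sit together on the outermost level line of F,
# which is the tie the CDF indicator exploits.
F = np.array([np.mean(np.all(Y <= y, axis=1)) for y in Y])

# Non-dominated (Pareto) points: no other point is <= componentwise
# and strictly better in some coordinate.
pareto = np.array([not np.any(np.all(Y <= y, axis=1) & np.any(Y < y, axis=1))
                   for y in Y])

print(F[pareto].max(), F[~pareto].min())
```

Every Pareto point is counted only by itself (F = 1/200), while every dominated point is counted by at least one dominator as well (F >= 2/200), so the CDF level line cleanly separates the two sets without any box decomposition of the objective space.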
Bayesian Network Approach to Assessing System Reliability for Improving System Design and Optimizing System Maintenance
A quantitative analysis of a system that has a complex reliability structure always involves considerable challenges. This dissertation mainly addresses uncertainty inherent in complicated reliability structures that may cause unexpected and undesired results.
The reliability structure uncertainty cannot be handled by traditional reliability analysis tools such as Fault Tree and Reliability Block Diagram due to their deterministic Boolean logic. Therefore, I employ a Bayesian network, which provides a flexible modeling method for building a multivariate distribution. By representing a system reliability structure as a joint distribution, the uncertainty and correlations existing between the system's elements can effectively be modeled in a probabilistic manner. This dissertation focuses on analyzing system reliability for the entire system life cycle, particularly the production stage and early design stages.
In the production stage, the research investigates a system that is continuously monitored by on-board sensors. By modeling the complex reliability structure with a Bayesian network integrated with various stochastic processes, I propose several methodologies that evaluate system reliability on a real-time basis and optimize maintenance schedules.
In early design stages, the research aims to predict system reliability based on the current system design and to improve the design if necessary. The three main challenges in this research are: 1) the lack of field failure data, 2) the complex reliability structure and 3) how to effectively improve the design. To tackle these difficulties, I present several modeling approaches using Bayesian inference and nonparametric Bayesian networks, where the system is explicitly analyzed through sensitivity analysis. In addition, this modeling approach is enhanced by incorporating a temporal dimension. However, the nonparametric Bayesian network approach generally involves high computational effort, especially when a complex and large system is modeled. To alleviate this computational burden, I also suggest building a surrogate model with quantile regression.
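The quantile-regression idea behind the surrogate can be illustrated through the pinball loss, whose minimizer over constants is exactly the target quantile; a surrogate trained with this loss therefore tracks a chosen quantile of the reliability response rather than its mean. The exponential toy response below is an assumption chosen for illustration, not data from the dissertation.

```python
import numpy as np

rng = np.random.default_rng(5)

# Pinball (quantile) loss underlying quantile regression.
def pinball(y, q, tau):
    r = y - q
    return np.mean(np.maximum(tau * r, (tau - 1) * r))

y = rng.exponential(scale=2.0, size=20000)  # toy reliability response
tau = 0.9

# The minimizer of the pinball loss over constants is the tau-quantile,
# verified here by a grid search.
grid = np.linspace(0.0, 10.0, 2001)
losses = [pinball(y, q, tau) for q in grid]
q_hat = grid[int(np.argmin(losses))]

print(q_hat, np.quantile(y, tau))  # both near 2 * ln(10) ≈ 4.61
```

In a full surrogate the constant `q` is replaced by a regression model of the system inputs trained under the same loss, so predictions target the 90th percentile of the response.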
In summary, this dissertation studies and explores the use of Bayesian networks in analyzing complex systems. All proposed methodologies are demonstrated by case studies.
Doctoral Dissertation, Industrial Engineering, 201
A CFD-informed model for subchannel resolution crud prediction
A physics-directed, statistically based surrogate model of the small-scale flow features that impact Chalk River unidentified deposit (crud) growth is presented in this work. The objective of the surrogate is to provide additional details of the rod surface temperature, heat flux, and near-wall turbulent kinetic energy fields which cannot be explicitly captured by a subchannel code.
Operating as a mapping from the high fidelity computational fluid dynamics (CFD) data to the low fidelity subchannel grid (hi2lo), the model provides CFD-informed boundary conditions to the crud model executed on the subchannel pin surface mesh. The surface temperature, heat flux, and turbulent kinetic energy, henceforth referred to as the fields of interest (FOI), govern the growth rate of crud on the surface of the rod and the precipitation of boron in the porous crud layer. Therefore, the model predicts the behavior of the FOIs as a function of position in the core and local thermal-hydraulic (TH) conditions.
The subchannel code produces an estimate for all crud-relevant TH quantities at a
coarse spatial resolution everywhere in the core and executes substantially faster than
CFD. In the hi2lo approach, the solution provided by the subchannel code is augmented
by a predicted stochastic component of the FOI informed by CFD results to provide a
more detailed description of the target FOIs than subchannel can provide alone. To this
end, a novel method based on the marriage of copula and gradient boosting techniques is proposed. This methodology forgoes a spatial interpolation procedure for a statistically
driven approach, which predicts the fractional area of a rod’s surface in excess of some
critical temperature but not precisely where such maxima occur on the rod surface. The
resultant model retains the ability to account for the presence of hot and cold spots on the
rod surface induced by turbulent flow downstream of spacer grids when producing crud
estimates. Sklar’s theorem is leveraged to decompose multivariate probability densities
of the FOI into independent copula and marginal models. The free parameters within the
copula model are predicted using a combination of supervised regression and classification
machine learning techniques with training data sets supplied by a suite of precomputed
CFD results spanning a typical pressurized water reactor TH envelope.
Results show that, compared to the standalone subchannel case, the hi2lo method more accurately preserves the influence of spacer grids on the crud growth rate; more precisely, the hi2lo method recovers key statistical properties of the FOI which impact
crud growth. Compared to gold standard high fidelity CFD/crud coupled results in a
single assembly test case, the hi2lo model produced a relative total crud mass difference
of -8.9%, compared to the standalone subchannel relative crud mass difference of 192.1%.
Mechanical Engineering
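The Sklar-theorem decomposition step can be sketched on synthetic stand-ins for two FOIs: transform each margin to pseudo-uniforms via ranks, then estimate a Gaussian-copula correlation on normal scores. The actual model predicts the copula parameters with supervised machine learning from CFD training data; the variable names, marginal scales, and correlation below are invented for illustration.

```python
import numpy as np
from scipy.stats import norm, rankdata

rng = np.random.default_rng(6)

# Synthetic stand-ins for two dependent fields of interest, e.g. wall
# temperature and heat flux, with different marginal shapes.
z = rng.multivariate_normal([0, 0], [[1, 0.7], [0.7, 1]], size=5000)
temp = 600.0 + 5.0 * z[:, 0]           # assumed marginal scale
flux = 1.0e6 * np.exp(0.1 * z[:, 1])   # assumed log-normal-like margin

# Sklar decomposition: margins -> pseudo-uniforms by ranks, then the
# Gaussian-copula correlation estimated on normal scores.
def to_uniform(x):
    return rankdata(x) / (len(x) + 1)

u = np.column_stack([to_uniform(temp), to_uniform(flux)])
rho_hat = np.corrcoef(norm.ppf(u[:, 0]), norm.ppf(u[:, 1]))[0, 1]

print(round(rho_hat, 2))  # close to the 0.7 used to generate the data
```

Because the copula is estimated on ranks, it is unchanged by the choice of marginal models, which is what lets the marginals and the dependence be fitted independently.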
Parametric Copula-GP model for analyzing multidimensional neuronal and behavioral relationships
One of the main goals of current systems neuroscience is to understand how neuronal populations integrate sensory information to inform behavior. However, estimating stimulus or behavioral information that is encoded in high-dimensional neuronal populations is challenging. We propose a method based on parametric copulas which allows modeling joint distributions of neuronal and behavioral variables characterized by different statistics and timescales. To account for temporal or spatial changes in dependencies between variables, we model varying copula parameters by means of Gaussian Processes (GP). We validate the resulting Copula-GP framework on synthetic data and on neuronal and behavioral recordings obtained in awake mice. We show that the use of a parametric description of the high-dimensional dependence structure in our method provides better accuracy in mutual information estimation in higher dimensions compared to other non-parametric methods. Moreover, by quantifying the redundancy between neuronal and behavioral variables, our model exposed the location of the reward zone in an unsupervised manner (i.e., without using any explicit cues about the task structure). These results demonstrate that the Copula-GP framework is particularly useful for the analysis of complex multidimensional relationships between neuronal, sensory and behavioral variables.
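A minimal static version of the copula idea (without the GP-varying parameters of the full framework) shows why copulas help with mutual information estimation when marginals are mismatched: for a Gaussian copula, MI depends only on the copula correlation and not on the margins at all. The synthetic "neuronal" and "behavioral" variables below are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm, rankdata

rng = np.random.default_rng(7)

# Dependent variables with very different marginal statistics; the
# dependence structure is what carries the shared information.
z = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=20000)
neuron = np.exp(z[:, 0])     # log-normal, firing-rate-like variable
behavior = z[:, 1] ** 3      # heavy-tailed behavioral readout

# Rank-transform each margin, then estimate the Gaussian-copula
# correlation on normal scores (invariant to the monotone marginals).
u1 = rankdata(neuron) / (len(neuron) + 1)
u2 = rankdata(behavior) / (len(behavior) + 1)
rho = np.corrcoef(norm.ppf(u1), norm.ppf(u2))[0, 1]

# Closed-form mutual information of a Gaussian copula, in nats.
mi = -0.5 * np.log(1 - rho**2)
print(round(mi, 3))  # true value: -0.5 * ln(1 - 0.36) ≈ 0.223 nats
```

A direct Pearson correlation on `neuron` and `behavior` would be distorted by the nonlinear marginals, whereas the copula-based estimate recovers the underlying dependence and hence the MI.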