212 research outputs found
Approximating non-Gaussian Bayesian networks using minimum information vine model with applications in financial modelling
Many financial modeling applications require jointly modeling multiple uncertain quantities to present more accurate, near-future probabilistic predictions. Informed decision making would certainly benefit from such predictions. Bayesian networks (BNs) and copulas are widely used for modeling numerous uncertain scenarios. Copulas, in particular, have attracted more interest due to their ability to approximate the probability distribution of heavy-tailed data, which is frequently observed in financial applications. Standard multivariate copulas suffer from serious limitations that make them unsuitable for modeling financial data. An alternative copula model, the pair-copula construction (PCC), is more flexible and efficient for modeling the complex dependence of financial data. The only restriction of the PCC model is the challenge of selecting the best model structure. This issue can be tackled by capturing conditional independence using the Bayesian network PCC (BN-PCC). The flexible structure of this model can be derived from conditional independence statements learned from data. Additionally, the difficulty of computing conditional distributions in graphical models for non-Gaussian distributions can be eased using pair-copulas. In this paper, we extend this approach further using the minimum information vine model, which results in a more flexible and efficient approach to understanding the complex dependence between multiple variables with heavy tail dependence and asymmetric features, both of which appear widely in financial applications.
Approximating multivariate distributions with vines
In a series of papers, Bedford and Cooke used vines (or pair-copulae) as a graphical tool for representing complex high dimensional distributions in terms of bivariate and conditional bivariate distributions or copulae. In this paper, we show how vines can be used to approximate any given multivariate distribution to any required degree of approximation. This paper is more about approximation than about optimal estimation methods. To maintain uniform approximation in the class of copulae used to build the corresponding vine, we use minimum information approaches. We generalise to all regular vines the result of Bedford and Cooke that if a minimal information copula satisfies each of the (local) constraints (on moments, rank correlation, etc.), then the resulting joint distribution will also be minimally informative given those constraints. We then apply our results to modelling a dataset of Norwegian financial data that was previously analysed in Aas et al. (2009).
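As a minimal illustration of what representing a joint distribution through bivariate and conditional bivariate copulae means, the three-dimensional case can be written out explicitly. The variable ordering 1, 2, 3 and the symbols $f_i$, $F_i$, $c_{ij}$ are notation assumed here for illustration and are not taken from the abstract.

```latex
f(x_1, x_2, x_3)
  = f_1(x_1)\, f_2(x_2)\, f_3(x_3)
    \cdot c_{12}\bigl(F_1(x_1), F_2(x_2)\bigr)
    \cdot c_{23}\bigl(F_2(x_2), F_3(x_3)\bigr)
    \cdot c_{13|2}\bigl(F_{1|2}(x_1 \mid x_2),\, F_{3|2}(x_3 \mid x_2);\, x_2\bigr)
```

Here $f_i$ and $F_i$ are the marginal densities and distribution functions, $c_{12}$ and $c_{23}$ are unconditional pair-copula densities, and $c_{13|2}$ is the copula density of variables 1 and 3 conditional on variable 2.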
Probabilistic modeling of flood characterizations with parametric and minimum information pair-copula model
This paper highlights the usefulness of the minimum information and parametric pair-copula constructions (PCC) for modeling the joint distribution of flood event properties. Both models outperform other standard multivariate copulas in modeling multivariate flood data that exhibit complex patterns of dependence, particularly in the tails. In particular, the minimum information pair-copula model shows greater flexibility and produces a better approximation of the joint probability density, and the corresponding measures are capable of supporting effective hazard assessments. The study demonstrates that any multivariate density can be approximated to any desired degree of precision using the minimum information pair-copula model, which can be practically used for probabilistic flood hazard assessment.
Constructing gene regulatory networks from microarray data using non-Gaussian pair-copula Bayesian networks
Many biological and biomedical research areas, such as drug design, require analyzing Gene Regulatory Networks (GRNs) to provide clear insight into and understanding of the cellular processes in live cells. Under a normality assumption for the genes, GRNs can be constructed by assessing the nonzero elements of the inverse covariance matrix. Nevertheless, such techniques are unable to deal with the non-normality, multi-modality and heavy-tailedness that are commonly seen in current massive genetic data. To relax this limiting constraint, one can apply a copula function, which is a multivariate cumulative distribution function with uniform marginal distributions. However, since the dependency structures of different pairs of genes in a multivariate problem are very different, a regular multivariate copula will not allow for the construction of an appropriate model. The solution to this problem is to use Pair-Copula Constructions (PCCs), which decompose a multivariate density into a cascade of bivariate copulas and therefore assign a different bivariate copula function to each local term. In this paper, we construct the inverse covariance matrix based on PCCs for cases where the normality assumption is moderately or severely violated, in order to capture a wide range of distributional features and complex dependency structures. To learn the non-Gaussian model for the considered GRN with non-Gaussian genomic data, we apply a modified version of the copula-based PC algorithm in which the normality assumption on the marginal densities is dropped. This paper also considers the Dynamic Time Warping (DTW) algorithm to determine the existence of a time delay relation between two genes. Breast cancer is one of the most common diseases in the world, and GRN analysis of its subtypes is considerably important, since revealing the differences in the GRNs of these subtypes can lead to new therapies and drugs. The findings of our research are used to construct high-performance GRNs for various subtypes of breast cancer, rather than simply using previous models.
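The DTW step mentioned in this abstract can be illustrated with a textbook dynamic-programming sketch. This is a generic DTW distance with an absolute-difference local cost, not the procedure used in the paper; the function name `dtw_distance` and the toy expression profiles are hypothetical.

```python
import numpy as np


def dtw_distance(a, b):
    """Classic O(n*m) dynamic time warping distance between two 1-D series."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # cost[i, j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i, j] = local + min(
                cost[i - 1, j],      # a advances, b repeats
                cost[i, j - 1],      # b advances, a repeats
                cost[i - 1, j - 1],  # both advance
            )
    return cost[n, m]


# Hypothetical expression profiles: gene_b follows gene_a with a one-step delay.
gene_a = [0, 1, 2, 1, 0]
gene_b = [0, 0, 1, 2, 1, 0]
delayed_cost = dtw_distance(gene_a, gene_b)
```

A small warped distance between two profiles that differ mainly by a shift is the kind of signal that could suggest a time-delay relation; how the paper turns DTW output into a delay decision is not stated in the abstract.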
Probabilistic models for data efficient reinforcement learning
Trial-and-error based reinforcement learning (RL) has seen rapid advancements in recent times, especially with the advent of deep neural networks. However, the standard deep learning methods often overlook the progress made in control theory by treating systems as black boxes. We propose a model-based RL framework based on probabilistic Model Predictive Control (MPC). In particular, we propose to learn a probabilistic transition model using Gaussian Processes (GPs) to incorporate model uncertainty into long-term predictions, thereby reducing the impact of model errors. We provide theoretical guarantees for first-order optimality in the GP-based transition models with deterministic approximate inference for long-term planning. We demonstrate that our approach not only achieves state-of-the-art data efficiency, but is also a principled way for RL in constrained environments.
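To make the idea of a probabilistic transition model concrete, here is a minimal sketch of GP regression on a one-dimensional toy system. It assumes a squared-exponential kernel with fixed hyperparameters and a hypothetical transition function `sin`; it does not reproduce the thesis's model, its inference scheme, or the MPC loop.

```python
import numpy as np


def rbf(a, b, ell=1.0, sf2=1.0):
    """Squared-exponential kernel between two 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return sf2 * np.exp(-0.5 * (d / ell) ** 2)


def gp_predict(x_train, y_train, x_test, noise=1e-2, ell=1.0, sf2=1.0):
    """GP posterior mean and latent variance at x_test (zero prior mean)."""
    K = rbf(x_train, x_train, ell, sf2) + noise * np.eye(len(x_train))
    Ks = rbf(x_train, x_test, ell, sf2)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = sf2 - np.sum(v ** 2, axis=0)
    return mean, var


# Toy transition data: next state = sin(current state) (hypothetical system).
states = np.linspace(-3.0, 3.0, 20)
next_states = np.sin(states)

# Query one state inside the data range and one far outside it.
query = np.array([0.5, 10.0])
pred_mean, pred_var = gp_predict(states, next_states, query)
```

The predictive variance is small where transitions have been observed and returns to the prior variance far from the data, which is the model uncertainty that a planner can propagate into long-term predictions.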
When the true state of the dynamical system cannot be fully observed, the standard model-based methods cannot be directly applied. For these systems an additional step of state estimation is needed. We propose distributed message passing for state estimation in non-linear dynamical systems. In particular, we propose to use expectation propagation (EP) to iteratively refine the state estimate, i.e., the Gaussian posterior distribution on the latent state. We show two things: (a) classical Rauch-Tung-Striebel (RTS) smoothers, such as the extended Kalman smoother (EKS) or the unscented Kalman smoother (UKS), are special cases of our message passing scheme; (b) running the message passing scheme more than once can lead to significant improvements over the classical RTS smoothers. We show the explicit connection between message passing with EP and well-known RTS smoothers and provide a practical implementation of the suggested algorithm. Furthermore, we address convergence issues of EP by generalising this framework to damped updates and the consideration of general α-divergences.
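For reference, the classical baseline named above can be sketched for a scalar linear-Gaussian model. This is the standard Kalman filter followed by an RTS backward pass, not the EP message passing scheme proposed in the thesis; the model parameters and simulated data are hypothetical.

```python
import numpy as np


def kalman_filter(y, a, q, c, r, m0, p0):
    """Filter for x_t = a*x_{t-1} + N(0, q), y_t = c*x_t + N(0, r)."""
    n = len(y)
    m_pred, p_pred = np.zeros(n), np.zeros(n)
    m_filt, p_filt = np.zeros(n), np.zeros(n)
    m_prev, p_prev = m0, p0
    for t in range(n):
        # Predict step
        m_pred[t] = a * m_prev
        p_pred[t] = a * p_prev * a + q
        # Update step with Kalman gain k
        k = p_pred[t] * c / (c * p_pred[t] * c + r)
        m_filt[t] = m_pred[t] + k * (y[t] - c * m_pred[t])
        p_filt[t] = (1.0 - k * c) * p_pred[t]
        m_prev, p_prev = m_filt[t], p_filt[t]
    return m_pred, p_pred, m_filt, p_filt


def rts_smoother(a, m_pred, p_pred, m_filt, p_filt):
    """Backward RTS pass that refines the filtered means and variances."""
    m_s, p_s = m_filt.copy(), p_filt.copy()
    for t in range(len(m_filt) - 2, -1, -1):
        g = p_filt[t] * a / p_pred[t + 1]  # smoother gain
        m_s[t] = m_filt[t] + g * (m_s[t + 1] - m_pred[t + 1])
        p_s[t] = p_filt[t] + g * (p_s[t + 1] - p_pred[t + 1]) * g
    return m_s, p_s


# Simulate a short trajectory from the assumed model.
rng = np.random.default_rng(0)
a, q, c, r = 0.9, 0.1, 1.0, 0.5
x, ys = 0.0, []
for _ in range(50):
    x = a * x + rng.normal(scale=np.sqrt(q))
    ys.append(c * x + rng.normal(scale=np.sqrt(r)))
ys = np.array(ys)

m_pred, p_pred, m_filt, p_filt = kalman_filter(ys, a, q, c, r, m0=0.0, p0=1.0)
m_smooth, p_smooth = rts_smoother(a, m_pred, p_pred, m_filt, p_filt)
```

In the linear case a single forward-backward sweep is exact, so the potential gains from repeated sweeps described in the abstract concern non-linear models, where each sweep relies on an approximation.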
Probabilistic models can also be used to generate synthetic data. In model-based RL we use ’synthetic’ data as a proxy for real environments in order to achieve high data efficiency. The ability to generate high-fidelity synthetic data is crucial when available (real) data is limited, as in RL, or where privacy and data protection standards allow only for limited use of the given data, e.g., in medical and financial data-sets. Current state-of-the-art methods for synthetic data generation are based on generative models, such as Generative Adversarial Networks (GANs). Even though GANs have achieved remarkable results in synthetic data generation, they are often challenging to interpret. Furthermore, GAN-based methods can suffer when used with mixed real and categorical variables. Moreover, the loss function (discriminator loss) design itself is problem specific, i.e., the generative model may not be useful for tasks it was not explicitly trained for. In this paper, we propose to use a probabilistic model as a synthetic data generator. Learning the probabilistic model for the data is equivalent to estimating the density of the data. Based on copula theory, we divide the density estimation task into two parts, i.e., estimating univariate marginals and estimating the multivariate copula density over the univariate marginals. We use normalising flows to learn both the copula density and univariate marginals. We benchmark our method on both simulated and real data-sets in terms of density estimation as well as the ability to generate high-fidelity synthetic data.
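The two-part split into marginals and a copula can be sketched as a small synthetic data generator. Note the substitution: this sketch uses empirical marginals and a Gaussian copula in place of the normalising flows used in the thesis, and the function names and toy data are hypothetical.

```python
import numpy as np
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def fit_gaussian_copula(data):
    """Estimate the Gaussian copula correlation matrix from rank-transformed data."""
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    # Part 1 (marginals): map each column to pseudo-observations in (0, 1).
    ranks = data.argsort(axis=0).argsort(axis=0) + 1
    u = ranks / (n + 1.0)
    # Part 2 (copula): correlation of the normal scores.
    z = np.vectorize(_STD_NORMAL.inv_cdf)(u)
    return np.corrcoef(z, rowvar=False)


def sample_synthetic(data, corr, size, rng):
    """Draw synthetic rows: sample the copula, then invert the empirical marginals."""
    data = np.asarray(data, dtype=float)
    d = data.shape[1]
    z = rng.multivariate_normal(np.zeros(d), corr, size=size)
    u = np.vectorize(_STD_NORMAL.cdf)(z)
    cols = [np.quantile(data[:, j], u[:, j]) for j in range(d)]
    return np.column_stack(cols)


# Toy data with a skewed first column and strong dependence between columns.
rng = np.random.default_rng(0)
base = rng.standard_normal(500)
real = np.column_stack([np.exp(base), base + 0.3 * rng.standard_normal(500)])

corr = fit_gaussian_copula(real)
synthetic = sample_synthetic(real, corr, size=1000, rng=rng)
```

A Gaussian copula cannot capture tail dependence or asymmetric dependence, which is one reason to prefer a more flexible copula density estimator such as the flows described in the abstract.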
Bayesian Network Approach to Assessing System Reliability for Improving System Design and Optimizing System Maintenance
A quantitative analysis of a system that has a complex reliability structure always involves considerable challenges. This dissertation mainly addresses uncertainty inherent in complicated reliability structures that may cause unexpected and undesired results.
The reliability structure uncertainty cannot be handled by the traditional reliability analysis tools such as Fault Tree and Reliability Block Diagram due to their deterministic Boolean logic. Therefore, I employ a Bayesian network, which provides a flexible modeling method for building a multivariate distribution. By representing a system reliability structure as a joint distribution, the uncertainty and correlations existing between the system’s elements can effectively be modeled in a probabilistic manner. This dissertation focuses on analyzing system reliability for the entire system life cycle, particularly the production stage and early design stages.
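A minimal sketch of the contrast with deterministic Boolean logic: a tiny discrete Bayesian network in which two components share an environmental stress parent and the system fails probabilistically given how many components have failed. The structure and every probability below are hypothetical, chosen only for illustration, and inference is done by brute-force enumeration.

```python
from itertools import product

# Hypothetical conditional probability tables (illustrative values only).
P_STRESS = 0.1                                    # P(high environmental stress)
P_FAIL_GIVEN_STRESS = {True: 0.30, False: 0.02}   # P(component fails | stress)
# P(system fails | number of failed components): soft, non-Boolean logic.
P_SYS_GIVEN_FAILED = {0: 0.001, 1: 0.20, 2: 0.95}


def bernoulli(p, outcome):
    """Probability of a binary outcome when the event has probability p."""
    return p if outcome else 1.0 - p


def joint(stress, a_fail, b_fail, sys_fail):
    """Joint probability as the product of the network's local factors."""
    p = bernoulli(P_STRESS, stress)
    p *= bernoulli(P_FAIL_GIVEN_STRESS[stress], a_fail)
    p *= bernoulli(P_FAIL_GIVEN_STRESS[stress], b_fail)
    p *= bernoulli(P_SYS_GIVEN_FAILED[a_fail + b_fail], sys_fail)
    return p


def prob_system_failure():
    """Marginal P(system fails), summing out stress and both components."""
    states = product([True, False], repeat=3)
    return sum(joint(s, a, b, True) for s, a, b in states)


def prob_stress_given_failure():
    """Diagnostic query P(stress | system fails) via Bayes' rule."""
    num = sum(joint(True, a, b, True) for a, b in product([True, False], repeat=2))
    return num / prob_system_failure()
```

The shared parent induces correlation between component failures, and the same joint distribution answers both predictive and diagnostic queries, which a fixed Boolean gate structure does not provide directly.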
In the production stage, the research investigates a system that is continuously monitored by on-board sensors. By modeling the complex reliability structure with a Bayesian network integrated with various stochastic processes, I propose several methodologies that evaluate system reliability on a real-time basis and optimize maintenance schedules.
In early design stages, the research aims to predict system reliability based on the current system design and to improve the design if necessary. The three main challenges in this research are: 1) the lack of field failure data, 2) the complex reliability structure and 3) how to effectively improve the design. To tackle these difficulties, I present several modeling approaches using Bayesian inference and a nonparametric Bayesian network, where the system is explicitly analyzed through sensitivity analysis. In addition, this modeling approach is enhanced by incorporating a temporal dimension. However, the nonparametric Bayesian network approach generally entails high computational effort, especially when a complex and large system is modeled. To alleviate this computational burden, I also suggest building a surrogate model with quantile regression.
In summary, this dissertation studies and explores the use of Bayesian networks in analyzing complex systems. All proposed methodologies are demonstrated by case studies.
Dissertation/Thesis. Doctoral Dissertation, Industrial Engineering 201
Approximate uncertainty modeling in risk analysis with vine copulas
Many applications of risk analysis require us to jointly model multiple uncertain quantities. Bayesian networks and copulas are two common approaches to modelling joint uncertainties with probability distributions. This paper focuses on new methodologies for copulas by developing work of Cooke, Bedford, Kurowicka and others on vines as a way of constructing higher dimensional distributions which do not suffer from some of the restrictions of alternatives such as the multivariate Gaussian copula. The paper provides a fundamental approximation result, demonstrating that we can approximate any density as closely as we like using vines. It further operationalizes this result by showing how minimum information copulas can be used to provide parametric classes of copulas which have such good levels of approximation. We extend previous approaches using vines by considering non-constant conditional dependencies, which are particularly relevant in financial risk modelling. We discuss how such models may be quantified, in terms of expert judgement or by fitting data, and illustrate the approach by modelling two financial datasets.
Copula-based probabilistic assessment of intensity and duration of cold episodes: A case study of Malayer vineyard region
Frost, particularly during the spring, is one of the most damaging weather phenomena for vineyards, causing significant economic losses around the world each year. The risk of tardive frost damage in vineyards due to a changing climate is considered an important threat to the sustainable production of grapes. Therefore, cold monitoring strategies are among the criteria with significant impacts on the yields and prosperity of horticulture and raisin factories. Frost events can be characterized by duration and severity. This paper investigates the risk and impacts of frost in vineyards by modeling the joint distribution of the duration and severity factors and analyzing the dependency structure of these influential parameters using copula functions. A novel mathematical framework is developed within this study to understand the risk and uncertainties associated with frost events and their impacts on vineyard yields by analyzing the non-linear dependency structure using copula functions as an efficient tool. The developed model was successfully validated for the case study of vineyards in Malayer city, Iran. The copula model developed in this study was shown to be a robust tool for predicting the return period of frost events.
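For orientation, joint return periods in copula-based frequency analysis are commonly written as below. The abstract does not state which definition the study adopts, so this is a sketch under assumed notation: $D$ and $S$ are duration and severity with marginal distribution functions $F_D$ and $F_S$, $C$ is their copula, and $\mu$ is the mean interarrival time between events.

```latex
T_{\text{and}}(d, s)
  = \frac{\mu}{P(D > d,\; S > s)}
  = \frac{\mu}{1 - F_D(d) - F_S(s) + C\bigl(F_D(d), F_S(s)\bigr)},
\qquad
T_{\text{or}}(d, s)
  = \frac{\mu}{P(D > d \;\text{or}\; S > s)}
  = \frac{\mu}{1 - C\bigl(F_D(d), F_S(s)\bigr)}
```

Since the "or" event contains the "and" event, $T_{\text{or}} \le T_{\text{and}}$ for any thresholds $(d, s)$.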