147 research outputs found
Monte Carlo Estimation of the Density of the Sum of Dependent Random Variables
We study an unbiased estimator for the density of a sum of random variables
that are simulated from a computer model. In a numerical study on examples with
copula dependence, the proposed estimator performs favourably in terms of
variance compared with other unbiased estimators. We provide applications and
extensions to the estimation of marginal densities in Bayesian statistics and
to the estimation of the density of sums of random variables under Gaussian
copula dependence.
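To make the estimator concrete, here is a minimal sketch of conditional Monte Carlo density estimation for a sum, under the illustrative assumption of standard normal marginals coupled by an equicorrelated Gaussian copula; the dimension, correlation, and sample size below are our choices, not the paper's:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Illustrative setup (ours, not the paper's): d standard normal marginals
# coupled by an equicorrelated Gaussian copula, so X is multivariate normal
# and the density of S = X_1 + ... + X_d is known exactly for checking.
d, rho, n = 5, 0.3, 100_000
Sigma = rho * np.ones((d, d)) + (1 - rho) * np.eye(d)
X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)

# Conditional Monte Carlo: f_S(s) = E[ f_{X_1 | X_2..X_d}(s - X_2 - ... - X_d) ],
# unbiased because the conditional density is evaluated exactly, not smoothed.
S11, S12, S22 = Sigma[0, 0], Sigma[0, 1:], Sigma[1:, 1:]
beta = np.linalg.solve(S22, S12)       # coefficients of E[X_1 | X_2..X_d]
cond_sd = np.sqrt(S11 - S12 @ beta)    # conditional standard deviation of X_1

rest_sum = X[:, 1:].sum(axis=1)
cond_mean = X[:, 1:] @ beta

def density_estimate(s):
    """Average the conditional density of X_1 at the point making the sum equal s."""
    return stats.norm.pdf(s - rest_sum, loc=cond_mean, scale=cond_sd).mean()

# Sanity check against the exact N(0, Sigma.sum()) density of S.
s = 1.0
print(density_estimate(s), stats.norm.pdf(s, scale=np.sqrt(Sigma.sum())))
```

Unlike a kernel density estimate built from draws of S, no bandwidth is involved, which is the source of the unbiasedness.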
Federated Variational Inference Methods for Structured Latent Variable Models
Federated learning methods enable model training across distributed data
sources without data leaving their original locations and have gained
increasing interest in various fields. However, existing approaches are
limited, excluding many structured probabilistic models. We present a general
and elegant solution based on structured variational inference, widely used in
Bayesian machine learning, adapted for the federated setting. Additionally, we
provide a communication-efficient variant analogous to the canonical FedAvg
algorithm. We demonstrate the effectiveness of the proposed algorithms and
compare their performance on hierarchical Bayesian neural networks and topic
models.
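As a rough illustration of the FedAvg-style variant, here is a toy sketch of our own devising (a conjugate Gaussian-mean model, not one of the paper's structured models): each client fits a local Gaussian variational factor, and the server averages natural parameters weighted by shard size.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy sketch (ours, not the paper's algorithm): infer the mean mu of Gaussian
# data with known unit variance and a N(0, 1) prior. Each client fits
# q(mu) = N(m, v) to its shard; the server averages the natural parameters
# (m / v, 1 / v) across clients, weighted by shard size, as in FedAvg.

def local_fit(data, prior_prec=1.0):
    """Local variational fit; exact in this conjugate toy model."""
    prec = prior_prec + len(data)   # posterior precision of mu
    mean = data.sum() / prec        # posterior mean of mu
    return mean * prec, prec        # natural parameters (eta1, eta2)

def server_aggregate(naturals, sizes):
    """FedAvg-style weighted average of the clients' natural parameters."""
    w = np.asarray(sizes, dtype=float) / np.sum(sizes)
    eta1 = sum(wi * n1 for wi, (n1, _) in zip(w, naturals))
    eta2 = sum(wi * n2 for wi, (_, n2) in zip(w, naturals))
    return eta1 / eta2, 1.0 / eta2  # back to (mean, variance)

shards = [rng.normal(2.0, 1.0, size=n) for n in (50, 200, 80)]
m, v = server_aggregate([local_fit(s) for s in shards],
                        [len(s) for s in shards])
print(f"aggregated q(mu) = N({m:.3f}, {v:.4f})")
```

Averaging in natural-parameter space keeps the aggregate within the same exponential family, which is what makes the FedAvg analogy natural for variational posteriors.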
Graph Neural Network-Based Anomaly Detection for River Network Systems
Water is the lifeblood of river networks, and its quality plays a crucial
role in sustaining both aquatic ecosystems and human societies. Real-time
monitoring of water quality is increasingly reliant on in-situ sensor
technology. Anomaly detection is crucial for identifying erroneous patterns in
sensor data, but is challenging due to the complexity and variability of the
data, even under normal conditions. This paper presents a solution for anomaly
detection in river network sensor data, which is essential for accurate and
continuous monitoring. We use a graph
neural network model, the recently proposed Graph Deviation Network (GDN),
which employs graph attention-based forecasting to capture the complex
spatio-temporal relationships between sensors. We propose an alternative
anomaly scoring method, GDN+, based on the learned graph. To evaluate the
model's efficacy, we introduce new benchmarking simulation experiments with
highly sophisticated dependency structures and subsequence anomalies of various
types. We further examine the strengths and weaknesses of the baseline
approach, GDN, compared with other benchmark methods on complex real-world
river network data. Findings suggest that GDN+ outperforms the baseline
approach on high-dimensional data, while also providing improved
interpretability. We also introduce software called gnnad.
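For intuition about deviation-based scoring, here is a sketch in the style of the original GDN score (robust normalization of per-sensor forecast errors, then a maximum across sensors); it is not the paper's GDN+ variant, which additionally uses the learned graph:

```python
import numpy as np

def anomaly_scores(actual, forecast, eps=1e-6):
    """GDN-style scores; actual and forecast have shape (time, n_sensors)."""
    err = np.abs(actual - forecast)                    # per-sensor deviations
    med = np.median(err, axis=0)                       # robust location
    iqr = np.subtract(*np.percentile(err, [75, 25], axis=0))  # robust scale
    norm = (err - med) / (iqr + eps)                   # normalized deviations
    return norm.max(axis=1)                            # max over sensors

# Usage with stand-in forecasts and an injected spike anomaly.
rng = np.random.default_rng(0)
actual = rng.normal(size=(500, 8))
actual[250, 3] += 6.0                                  # injected spike anomaly
forecast = np.zeros_like(actual)                       # placeholder predictions
scores = anomaly_scores(actual, forecast)
print(scores[250] > np.quantile(scores, 0.99))         # spike should be flagged
```

Thresholds are typically chosen on held-out data, and the arg-max sensor at a flagged time localizes which series drives the anomaly.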
A PAC-Bayesian Perspective on the Interpolating Information Criterion
Deep learning is notorious for its theory-practice gap, whereby principled
theory typically fails to provide much beneficial guidance for implementation
in practice. This has been highlighted recently by the benign overfitting
phenomenon: when neural networks become sufficiently large to interpolate the
dataset perfectly, model performance appears to improve with increasing model
size, in apparent contradiction with the well-known bias-variance tradeoff.
While such phenomena have proven challenging to theoretically study for general
models, the recently proposed Interpolating Information Criterion (IIC)
provides a valuable theoretical framework to examine performance for
overparameterized models. Using the IIC, a PAC-Bayes bound is obtained for a
general class of models, characterizing factors which influence generalization
performance in the interpolating regime. From the provided bound, we quantify
how the test error for overparameterized models achieving effectively zero
training error depends on the quality of the implicit regularization imposed by
e.g., the combination of model, optimizer, and parameter-initialization scheme;
the spectrum of the empirical neural tangent kernel; curvature of the loss
landscape; and noise present in the data.
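For background, PAC-Bayes bounds of the kind invoked here have the generic McAllester/Maurer form below; this is standard material, not the paper's IIC-specific statement:

```latex
% Generic PAC-Bayes bound (background; not the paper's IIC-specific result).
% For a prior P and posterior Q over parameters, a [0,1]-valued loss, and n
% i.i.d. samples, with probability at least 1 - \delta over the sample:
\mathbb{E}_{\theta \sim Q}\big[L(\theta)\big]
  \;\le\; \mathbb{E}_{\theta \sim Q}\big[\hat{L}_n(\theta)\big]
  \;+\; \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln(2\sqrt{n}/\delta)}{2n}}
```

In the interpolating regime the empirical term is effectively zero, so generalization is governed by the KL term, consistent with the abstract's emphasis on implicit regularization and the geometry of the loss landscape.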
Continuously-Tempered PDMP samplers
New sampling algorithms based on simulating continuous-time stochastic processes called piecewise deterministic Markov processes (PDMPs) have shown considerable promise. However, these methods can struggle to sample from multi-modal or heavy-tailed distributions. We show how tempering ideas can improve the mixing of PDMPs in such cases. We introduce an extended distribution defined over the state space of the posterior and an inverse temperature, which interpolates between a tractable distribution when the inverse temperature is 0 and the posterior when it is 1. The marginal distribution of the inverse temperature is a mixture of a continuous distribution on [0,1) and a point mass at 1, so samples obtained at inverse temperature 1 are draws from the posterior, while the sampler also explores distributions at lower temperatures, which improves mixing. We show how PDMPs, and in particular the Zig-Zag sampler, can be implemented to sample from such an extended distribution. The resulting algorithm is easy to implement, and we show empirically that it can outperform existing PDMP-based samplers on challenging multimodal posteriors.
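As a sketch of the construction, taking the geometric path as an assumed concrete choice of interpolation (the abstract does not pin one down), the extended target can be written as:

```latex
% Extended target over (x, beta): pi_0 is the tractable distribution, pi the
% posterior; the geometric path is an assumed concrete interpolation. The
% beta-marginal p(beta) mixes a continuous density on [0,1) with an atom at 1,
% so states visited with beta = 1 are exact posterior draws.
\tilde{\pi}(x, \beta) \;\propto\; \pi_0(x)^{1-\beta}\,\pi(x)^{\beta}\,p(\beta),
\qquad \beta \in [0, 1]
```

The PDMP (e.g. the Zig-Zag sampler) then moves jointly in (x, beta), so excursions to low beta take the sampler through the more tractable distribution, helping it travel between modes of the posterior.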