Learning the hub graphical Lasso model with the structured sparsity via an efficient algorithm
Graphical models have exhibited their performance in numerous tasks ranging
from biological analysis to recommender systems. However, graphical models with
hub nodes are computationally difficult to fit, particularly when the dimension
of the data is large. To efficiently estimate the hub graphical models, we
introduce a two-phase algorithm. The proposed algorithm first generates a good
initial point via a dual alternating direction method of multipliers (ADMM),
and then warm starts a semismooth Newton (SSN) based augmented Lagrangian
method (ALM) to compute a solution that is accurate enough for practical tasks.
The sparsity structure of the generalized Jacobian ensures that the algorithm
can obtain a good solution very efficiently. Comprehensive experiments on both
synthetic data and real data show that it clearly outperforms the existing
state-of-the-art algorithms. In particular, in some high-dimensional tasks, it
can save more than 70% of the execution time while still achieving a
high-quality estimation. Comment: 28 pages, 3 figures
Nonlinearity, Feedback and Uniform Consistency in Causal Structural Learning
The goal of Causal Discovery is to find automated search methods for learning
causal structures from observational data. In some cases all variables of the
causal mechanism of interest are measured, and the task is to predict the
effects one measured variable has on another. In contrast, sometimes the
variables of primary interest are not directly observable but instead inferred
from their manifestations in the data. These are referred to as latent
variables. One commonly known example is the psychological construct of
intelligence, which cannot be directly measured, so researchers try to assess it
through various indicators such as IQ tests. In this case, causal discovery
algorithms can uncover underlying patterns and structures to reveal the causal
connections between the latent variables and between the latent and observed
variables. This thesis focuses on two questions in causal discovery. The first
is providing an alternative definition of k-Triangle Faithfulness that (i) is
weaker than strong faithfulness when applied to the Gaussian family of
distributions, (ii) can be applied to non-Gaussian families of distributions,
and (iii) under the assumption that the modified version of Strong Faithfulness
holds, can be used to show the uniform consistency of a modified causal
discovery algorithm. The second is relaxing the sufficiency assumption to learn
causal structures with latent variables. Given the importance of inferring
cause-and-effect relationships for understanding and forecasting complex
systems, relaxing these simplifying assumptions is expected to make causal
discovery methods applicable to a wider range of causal mechanisms and
statistical phenomena.
Discriminative calibration: Check Bayesian computation from simulations and flexible classifier
To check the accuracy of Bayesian computations, it is common to use
rank-based simulation-based calibration (SBC). However, SBC has drawbacks: The
test statistic is somewhat ad-hoc, interactions are difficult to examine,
multiple testing is a challenge, and the resulting p-value is not a divergence
metric. We propose to replace the marginal rank test with a flexible
classification approach that learns test statistics from data. This measure
typically has a higher statistical power than the SBC rank test and returns an
interpretable divergence measure of miscalibration, computed from
classification accuracy. This approach can be used with different data
generating processes to address likelihood-free inference or traditional
inference methods like Markov chain Monte Carlo or variational inference. We
illustrate an automated implementation using neural networks and
statistically-inspired features, and validate the method with numerical and
real data experiments. Comment: Published at Neural Information Processing Systems (NeurIPS 2023)
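For orientation, the rank-based SBC baseline that the abstract refers to can be sketched as follows. This is a minimal illustration, not the paper's method or code: the conjugate normal model, the function name `sbc_ranks`, and all parameter values are assumptions chosen so that the exact posterior is available in closed form.

```python
import numpy as np

def sbc_ranks(n_sims=2000, n_draws=19, seed=0):
    """Rank-based SBC on a toy conjugate model (an assumption for illustration).

    Prior:      theta ~ N(0, 1)
    Likelihood: y | theta ~ N(theta, 1)
    Posterior:  theta | y ~ N(y / 2, 1 / 2)   (exact, so ranks should be uniform)
    """
    rng = np.random.default_rng(seed)
    theta = rng.normal(0.0, 1.0, size=n_sims)   # ground-truth draws from the prior
    y = rng.normal(theta, 1.0)                  # one simulated observation per draw
    # n_draws posterior samples for each simulated data set
    post = rng.normal(y[:, None] / 2.0, np.sqrt(0.5), size=(n_sims, n_draws))
    # Rank statistic: how many posterior draws fall below the true theta
    return (post < theta[:, None]).sum(axis=1)

ranks = sbc_ranks()
# Histogram over the n_draws + 1 possible rank values; roughly flat if calibrated
hist = np.bincount(ranks, minlength=20)
```

If the posterior sampler were miscalibrated (for example, too narrow), the histogram would pile up at the extreme ranks. The abstract's proposal replaces this marginal rank check with a learned classifier.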
Causal Discovery in Linear Structural Causal Models with Deterministic Relations
Linear structural causal models (SCMs) -- in which each observed variable is
generated by a subset of the other observed variables as well as a subset of
the exogenous sources -- are pervasive in causal inference and causal
discovery. However, for the task of causal discovery, existing work almost
exclusively focuses on the submodel where each observed variable is associated
with a distinct source with non-zero variance. This results in the restriction
that no observed variable can deterministically depend on other observed
variables or latent confounders. In this paper, we extend the results on
structure learning by focusing on a subclass of linear SCMs which do not have
this property, i.e., models in which observed variables can be causally
affected by any subset of the sources, and are allowed to be a deterministic
function of other observed variables or latent confounders. This allows for a
more realistic modeling of influence or information propagation in systems. We
focus on the task of causal discovery from observational data generated from a
member of this subclass. We derive a set of necessary and sufficient conditions
for unique identifiability of the causal structure. To the best of our
knowledge, this is the first work that gives identifiability results for causal
discovery under both latent confounding and deterministic relationships.
Further, we propose an algorithm for recovering the underlying causal structure
when the aforementioned conditions are satisfied. We validate our theoretical
results both on synthetic and real datasets. Comment: Accepted at 1st Conference on Causal Learning and Reasoning (CLeaR
2022)
Regularised inference for changepoint and dependency analysis in non-stationary processes
Multivariate correlated time series are found in many modern socio-scientific domains such as neurology, cyber-security, genetics and economics. The focus of this thesis is on efficiently modelling and inferring dependency structure both between data-streams and across points in time. In particular, it is assumed that the generating processes may vary over time and are thus non-stationary. For example, patterns of brain activity are expected to change when performing different tasks or thought processes. Models that can describe such behaviour must be adaptable over time. However, such adaptability creates challenges for model identification. In order to perform learning or estimation one must control how model complexity grows in relation to the volume of data. To this end, one of the main themes of this work is to investigate both the implementation and effect of assumptions on sparsity, which relates to model parsimony at an individual time point, and smoothness, which governs how quickly a model may change over time. Throughout this thesis two basic classes of non-stationary model are studied. Firstly, a class of piecewise constant Gaussian Graphical models (GGM) is introduced that can encode graphical dependencies between data-streams. In particular, a group-fused regulariser is examined that allows for the estimation of changepoints across graphical models. The second part of the thesis focuses on extending a class of locally-stationary wavelet (LSW) models. Unlike the raw GGM, this enables one to encode dependencies not only between data-streams, but also across time. A set of sparsity-aware estimators are developed for estimation of the spectral parameters of such models, which are then compared to previous works in the domain.
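To make the idea of a group-fused regulariser concrete, the sketch below evaluates one common form of such a penalty, the sum of Frobenius norms of successive differences between precision matrices, and reads changepoints off a piecewise constant sequence. The abstract does not state the exact penalty used in the thesis, so this form, the function names, and the toy matrices are all assumptions.

```python
import numpy as np

def group_fused_penalty(thetas):
    """Sum over t of ||Theta_t - Theta_{t-1}||_F (assumed form of the penalty)."""
    diffs = np.diff(thetas, axis=0)                  # shape (T - 1, p, p)
    return np.linalg.norm(diffs, axis=(1, 2)).sum()  # Frobenius norm per step

def changepoints(thetas, tol=1e-8):
    """Time indices t at which the matrix differs from its predecessor."""
    jumps = np.linalg.norm(np.diff(thetas, axis=0), axis=(1, 2))
    return (np.flatnonzero(jumps > tol) + 1).tolist()

# Toy piecewise constant sequence: four copies of A followed by three of B
A = np.eye(3)
B = np.eye(3)
B[0, 1] = B[1, 0] = 0.5          # one new edge appears in the second segment
thetas = np.stack([A] * 4 + [B] * 3)

penalty = group_fused_penalty(thetas)
cps = changepoints(thetas)
```

Because the whole matrix difference is penalised as one group, the penalty is zero wherever the graph is unchanged, which is why such a regulariser favours estimates that jump at only a few time points.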
L-C2ST: Local Diagnostics for Posterior Approximations in Simulation-Based Inference
Many recent works in simulation-based inference (SBI) rely on deep generative
models to approximate complex, high-dimensional posterior distributions.
However, evaluating whether or not these approximations can be trusted remains
a challenge. Most approaches evaluate the posterior estimator only in
expectation over the observation space. This limits their interpretability and
is not sufficient to identify for which observations the approximation can be
trusted or should be improved. Building upon the well-known classifier
two-sample test (C2ST), we introduce L-C2ST, a new method that allows for a
local evaluation of the posterior estimator at any given observation. It offers
theoretically grounded and easy to interpret - e.g. graphical - diagnostics,
and unlike C2ST, does not require access to samples from the true posterior. In
the case of normalizing flow-based posterior estimators, L-C2ST can be
specialized to offer better statistical power, while being computationally more
efficient. On standard SBI benchmarks, L-C2ST provides comparable results to
C2ST and outperforms alternative local approaches such as coverage tests based
on highest predictive density (HPD). We further highlight the importance of
local evaluation and the benefit of interpretability of L-C2ST on a challenging
application from computational neuroscience. Comment: 20 pages, 4 figures, 7 appendices, in proceeding
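The classifier two-sample test (C2ST) that L-C2ST builds upon can be sketched in a few lines: train a classifier to tell two sample sets apart and read held-out accuracy near 0.5 as "indistinguishable". The sketch below is the plain, global C2ST idea only, not L-C2ST itself; the logistic-regression classifier, the function name `c2st_accuracy`, and the toy Gaussian samples are assumptions for illustration.

```python
import numpy as np

def c2st_accuracy(x_p, x_q, seed=0, n_steps=500, lr=0.1):
    """Held-out accuracy of a logistic-regression classifier separating P from Q."""
    rng = np.random.default_rng(seed)
    x = np.vstack([x_p, x_q])
    labels = np.concatenate([np.zeros(len(x_p)), np.ones(len(x_q))])
    perm = rng.permutation(len(x))                 # shuffle before splitting
    x, labels = x[perm], labels[perm]
    half = len(x) // 2
    x_tr, y_tr, x_te, y_te = x[:half], labels[:half], x[half:], labels[half:]
    mu = x_tr.mean(axis=0)                         # centre using training data only
    x_tr, x_te = x_tr - mu, x_te - mu
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(n_steps):                       # plain gradient descent
        p = 1.0 / (1.0 + np.exp(-(x_tr @ w + b)))
        grad = p - y_tr
        w -= lr * (x_tr.T @ grad) / half
        b -= lr * grad.mean()
    pred = (x_te @ w + b) > 0
    return float((pred == (y_te == 1)).mean())

rng = np.random.default_rng(1)
p_samples = rng.normal(size=(1000, 2))
acc_same = c2st_accuracy(p_samples, rng.normal(size=(1000, 2)))
acc_shift = c2st_accuracy(p_samples, rng.normal(loc=2.0, size=(1000, 2)))
```

Here `acc_same` should sit close to chance and `acc_shift` well above it. In the posterior-checking setting, the two sample sets would come from the true and the approximate posterior, which is exactly the requirement the abstract says L-C2ST removes.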
Debiased-CAM for bias-agnostic faithful visual explanations of deep convolutional networks
Class activation maps (CAMs) explain convolutional neural network predictions
by identifying salient pixels, but they become misaligned and misleading when
explaining predictions on images under bias, such as images blurred
accidentally or deliberately for privacy protection, or images with improper
white balance. Despite model fine-tuning to improve prediction performance on
these biased images, we demonstrate that CAM explanations deviate further and
become less faithful as image bias increases. We present Debiased-CAM to recover
explanation faithfulness across various bias types and levels by training a
multi-input, multi-task model with auxiliary tasks for CAM and bias level
predictions. With CAM as a prediction task, explanations are made tunable by
retraining the main model layers and made faithful by self-supervised learning
from CAMs of unbiased images. The model provides representative, bias-agnostic
CAM explanations about the predictions on biased images as if generated from
their unbiased form. In four simulation studies with different biases and
prediction tasks, Debiased-CAM improved both CAM faithfulness and task
performance. We further conducted two controlled user studies to validate its
truthfulness and helpfulness, respectively. Quantitative and qualitative
analyses of participant responses confirmed Debiased-CAM as more truthful and
helpful. Debiased-CAM thus provides a basis to generate more faithful and
relevant explanations for a wide range of real-world applications with various
sources of bias.