Adversarial Bayesian Simulation
In the absence of explicit or tractable likelihoods, Bayesians often resort
to approximate Bayesian computation (ABC) for inference. Our work bridges ABC
with deep neural implicit samplers based on generative adversarial networks
(GANs) and adversarial variational Bayes. Both ABC and GANs compare aspects of
observed and fake data to simulate from posteriors and likelihoods,
respectively. We develop a Bayesian GAN (B-GAN) sampler that directly targets
the posterior by solving an adversarial optimization problem. B-GAN is driven
by a deterministic mapping learned on the ABC reference by conditional GANs.
Once the mapping has been trained, iid posterior samples are obtained by
filtering noise at a negligible additional cost. We propose two post-processing
local refinements using (1) data-driven proposals with importance reweighting,
and (2) variational Bayes. We support our findings with frequentist-Bayesian
results, showing that the typical total variation distance between the true and
approximate posteriors converges to zero for certain neural network generators
and discriminators. Our findings on simulated data show highly competitive
performance relative to some of the most recent likelihood-free posterior
simulators.
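For orientation, the ABC baseline that the abstract builds on can be sketched as a plain rejection sampler. This is not the B-GAN sampler itself: the Normal(theta, 1) model, the Normal(0, 5^2) prior, the sample-mean summary statistic, and the tolerance are all assumptions chosen only to keep the illustration small.

```python
import random
import statistics

def abc_rejection(observed, n_draws=10000, tolerance=0.1, seed=0):
    """Plain ABC rejection for the mean of an assumed Normal(theta, 1) model."""
    rng = random.Random(seed)
    obs_summary = statistics.fmean(observed)  # summary of the observed data
    n = len(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.gauss(0.0, 5.0)  # draw from the assumed prior
        fake = [rng.gauss(theta, 1.0) for _ in range(n)]  # simulate fake data
        # Keep theta only if fake and observed summaries are close.
        if abs(statistics.fmean(fake) - obs_summary) < tolerance:
            accepted.append(theta)
    return accepted

# Hypothetical observed data generated with true mean 2.0.
data_rng = random.Random(1)
observed = [data_rng.gauss(2.0, 1.0) for _ in range(50)]
posterior_draws = abc_rejection(observed)
```

Every accepted draw costs many fresh simulations here, which is the overhead the abstract says a trained mapping avoids.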
Maximum Moment Restriction for Instrumental Variable Regression
We propose a simple framework for nonlinear instrumental variable (IV)
regression based on a kernelized conditional moment restriction (CMR) known as
a maximum moment restriction (MMR). The MMR is formulated by maximizing the
interaction between the residual and the instruments belonging to a unit ball
in a reproducing kernel Hilbert space (RKHS). The MMR allows us to reformulate
the IV regression as a single-step empirical risk minimization problem, where
the risk depends on the reproducing kernel on the instrument and can be
estimated by a U-statistic or V-statistic. This simplification not only eases
the proofs of consistency and asymptotic normality in both parametric and
non-parametric settings, but also results in easy-to-use algorithms with an
efficient hyper-parameter selection procedure. We demonstrate the advantages of
our framework over existing ones using experiments on both synthetic and
real-world data.
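As a rough sketch of the single-step risk described above, the V-statistic sums residual_i * k(z_i, z_j) * residual_j over all pairs, and the U-statistic drops the i = j terms. The Gaussian kernel, its bandwidth, the linear model f(x) = theta * x, and the simulated confounded data are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def gaussian_kernel(z, bandwidth=1.0):
    """Gram matrix k(z_i, z_j) for a one-dimensional instrument."""
    diff = z[:, None] - z[None, :]
    return np.exp(-diff**2 / (2.0 * bandwidth**2))

def mmr_risk(residual, K, kind="V"):
    """Empirical MMR risk; any kind other than "V" uses the U-statistic."""
    n = residual.shape[0]
    if kind == "V":
        return float(residual @ K @ residual) / n**2
    K_off = K - np.diag(np.diag(K))  # remove the i = j terms
    return float(residual @ K_off @ residual) / (n * (n - 1))

def fit_linear_mmr(x, y, K):
    """Minimiser of the V-statistic risk for f(x) = theta * x."""
    return float(x @ K @ y) / float(x @ K @ x)

# Hypothetical data: u confounds x and y, z is a valid instrument.
rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=n)
u = rng.normal(size=n)
x = z + u + 0.1 * rng.normal(size=n)
y = 2.0 * x + 2.0 * u + 0.1 * rng.normal(size=n)

K = gaussian_kernel(z)
theta_mmr = fit_linear_mmr(x, y, K)      # should land near the true slope 2
theta_ols = float(x @ y) / float(x @ x)  # biased upwards by the confounder
```

For a linear model the risk is quadratic in theta, so the minimiser has the closed form used in `fit_linear_mmr`; a nonlinear f would need an iterative optimiser instead.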
The m-connecting imset and factorization for ADMG models
Directed acyclic graph (DAG) models have become widely studied and applied in
statistics and machine learning -- indeed, their simplicity facilitates
efficient procedures for learning and inference. Unfortunately, these models
are not closed under marginalization, making them poorly equipped to handle
systems with latent confounding. Acyclic directed mixed graph (ADMG) models
characterize margins of DAG models, making them far better suited to handle
such systems. However, ADMG models have not seen widespread use due to their
complexity and a shortage of statistical tools for their analysis. In this
paper, we introduce the m-connecting imset, which provides an alternative
representation for the independence models induced by ADMGs. Furthermore, we
define the m-connecting factorization criterion for ADMG models, characterized
by a single equation, and prove its equivalence to the global Markov property.
The m-connecting imset and factorization criterion provide two new statistical
tools for learning and inference with ADMG models. We demonstrate the
usefulness of these tools by formulating and evaluating a consistent scoring
criterion with a closed-form solution.
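The statement that ADMGs characterize margins of DAG models can be illustrated with a small sketch of marginalizing latent variables out of a DAG. It covers only the simplest case, where every latent variable has no parents, and it does not implement the m-connecting imset or the factorization criterion; the function name and the edge-set representation are hypothetical.

```python
from itertools import combinations

def project_exogenous_latents(directed, latents):
    """Marginalize parentless latent nodes out of a DAG.

    `directed` is a set of (parent, child) pairs. Returns the directed
    edges among observed nodes and the bidirected edges (as frozensets)
    that the latent common causes induce.
    """
    latents = set(latents)
    if any(child in latents for _, child in directed):
        raise ValueError("this sketch only handles latents without parents")
    # Directed edges between observed nodes are unchanged.
    kept = {(a, b) for a, b in directed if a not in latents}
    bidirected = set()
    for u in latents:
        children = sorted(b for a, b in directed if a == u)
        # Each pair of children of a latent becomes a bidirected edge.
        bidirected |= {frozenset(pair) for pair in combinations(children, 2)}
    return kept, bidirected

# Hypothetical DAG: U is a latent confounder of A and B.
dag = {("U", "A"), ("U", "B"), ("A", "B"), ("B", "C")}
directed_edges, bidirected_edges = project_exogenous_latents(dag, {"U"})
```

In this example the margin over A, B, C keeps A -> B -> C and gains A <-> B, a structure that a DAG over the observed variables alone cannot express.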
Interactive Causal Structure Discovery
Multiple algorithms exist for detecting causal relations from observational data, but they are limited by the assumptions they require about the data or by the available computational resources. Only a limited amount of information can be extracted from finite data, but domain experts often have some knowledge of the underlying processes. We propose combining an expert’s prior knowledge with the data likelihood to find models with high posterior probability. Our high-level procedure for interactive causal structure discovery contains three modules: discovery of initial models, navigation in the space of causal structures, and validation for model selection and evaluation. We present one way of formulating the problem and implementing the approach, assuming a rational, Bayesian expert; we use this assumption to model the user in simulated experiments. The expert navigates greedily in the structure space, using their prior information and the structures’ fit to data to find a local maximum a posteriori structure. Existing algorithms provide initial models for the navigation. Through simulated user experiments with synthetic data and use cases with real-world data, we find that the results of causal analysis can be improved by adding prior knowledge. Additionally, different initial models can lead the expert to different causal models, and model validation helps detect overfitting and concept drift.
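The greedy navigation step can be sketched as hill climbing over DAGs with single-edge moves, scored by log prior plus log likelihood. This is a minimal illustration, not the authors' implementation: the move set, the toy skeleton-matching stand-in for the data fit, and the expert prior bonus are all assumptions.

```python
from itertools import permutations

def is_acyclic(nodes, edges):
    """Repeatedly strip nodes that have no incoming edge from remaining nodes."""
    remaining = set(nodes)
    while remaining:
        sources = [v for v in remaining
                   if not any(b == v and a in remaining for a, b in edges)]
        if not sources:
            return False  # every remaining node has a parent: a cycle exists
        remaining -= set(sources)
    return True

def neighbours(nodes, edges):
    """All DAGs reachable by adding, removing, or reversing one edge."""
    for a, b in permutations(nodes, 2):
        if (a, b) in edges:
            yield edges - {(a, b)}                     # removal
            candidate = (edges - {(a, b)}) | {(b, a)}  # reversal
        else:
            candidate = edges | {(a, b)}               # addition
        if is_acyclic(nodes, candidate):
            yield candidate

def greedy_map(nodes, start, log_posterior):
    """Move to the best-scoring neighbour until no neighbour improves."""
    current = frozenset(start)
    current_score = log_posterior(current)
    while True:
        best, best_score = None, current_score
        for cand in neighbours(nodes, current):
            score = log_posterior(cand)
            if score > best_score:
                best, best_score = frozenset(cand), score
        if best is None:
            return current, current_score  # local maximum a posteriori
        current, current_score = best, best_score

# Toy stand-ins (assumptions): the data favour the skeleton A-B, B-C but
# cannot orient edges; the expert believes A causes B.
nodes = ["A", "B", "C"]
target_skeleton = {frozenset(("A", "B")), frozenset(("B", "C"))}
expert_edges = {("A", "B")}

def log_posterior(edges):
    skeleton = {frozenset(e) for e in edges}
    log_likelihood = -len(skeleton ^ target_skeleton)
    log_prior = 0.5 * len(set(edges) & expert_edges)
    return log_likelihood + log_prior

final_edges, final_score = greedy_map(nodes, set(), log_posterior)
```

In this toy setup the data fit alone cannot distinguish A -> B from B -> A, so the prior term decides the orientation, which mirrors the abstract's point that prior knowledge can improve the result of the analysis.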