Causal Inference by Stochastic Complexity
The algorithmic Markov condition states that the most likely causal direction
between two random variables X and Y can be identified as that direction with
the lowest Kolmogorov complexity. Due to the halting problem, however, this
notion is not computable.
We hence propose to do causal inference by stochastic complexity. That is, we
propose to approximate Kolmogorov complexity via the Minimum Description Length
(MDL) principle, using a score that is mini-max optimal with regard to the
model class under consideration. This means that even in an adversarial
setting, such as when the true distribution is not in this class, we still
obtain the optimal encoding for the data relative to the class.
We instantiate this framework, which we call CISC, for pairs of univariate
discrete variables, using the class of multinomial distributions. Experiments
show that CISC is highly accurate on synthetic, benchmark, as well as
real-world data, outperforming the state of the art by a margin, and scales
extremely well with regard to sample and domain sizes.
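The scoring idea in this abstract can be sketched in a few lines. The code below is an illustrative reconstruction, not the authors' implementation: it computes the stochastic complexity of a discrete sample under the multinomial class (maximum-likelihood code length plus the log of the NML normalising sum, obtained with the standard linear-time recurrence) and compares the two factorisations X to Y and Y to X. The function names, the use of observed distinct values as domain sizes, and the tie handling are all assumptions made for this example.

```python
import math
from collections import Counter


def _xlogy(a, b):
    """Return a * ln(b), with the convention 0 * ln(0) = 0."""
    return 0.0 if a == 0 else a * math.log(b)


def multinomial_regret(n, k):
    """Normalising sum C(n, k) of the NML distribution for a k-valued multinomial.

    Uses C(n, 1) = 1, a direct sum for C(n, 2), and the recurrence
    C(n, j) = C(n, j - 1) + n / (j - 2) * C(n, j - 2) for j >= 3.
    """
    if n == 0 or k <= 1:
        return 1.0
    c_prev = 1.0  # C(n, 1)
    c_curr = 0.0  # becomes C(n, 2)
    for h in range(n + 1):
        log_binom = math.lgamma(n + 1) - math.lgamma(h + 1) - math.lgamma(n - h + 1)
        c_curr += math.exp(log_binom + _xlogy(h, h / n) + _xlogy(n - h, (n - h) / n))
    for j in range(3, k + 1):
        c_prev, c_curr = c_curr, c_curr + n / (j - 2) * c_prev
    return c_curr


def stochastic_complexity(values, domain_size):
    """Code length in bits: negative log maximum likelihood plus log regret."""
    n = len(values)
    counts = Counter(values)
    neg_log_ml = -sum(c * math.log2(c / n) for c in counts.values())
    return neg_log_ml + math.log2(multinomial_regret(n, domain_size))


def conditional_complexity(targets, conditions, target_domain_size):
    """Sum the stochastic complexity of the target within each condition value."""
    groups = {}
    for cond, tgt in zip(conditions, targets):
        groups.setdefault(cond, []).append(tgt)
    return sum(stochastic_complexity(g, target_domain_size) for g in groups.values())


def infer_direction(xs, ys):
    """Pick the direction with the shorter total description (hypothetical helper)."""
    # Assumption: domain sizes are taken from the observed distinct values.
    kx, ky = len(set(xs)), len(set(ys))
    x_to_y = stochastic_complexity(xs, kx) + conditional_complexity(ys, xs, ky)
    y_to_x = stochastic_complexity(ys, ky) + conditional_complexity(xs, ys, kx)
    if x_to_y < y_to_x:
        return "X->Y"
    if y_to_x < x_to_y:
        return "Y->X"
    return "undecided"
```

Calling `infer_direction(xs, ys)` on two equal-length lists of discrete observations returns one of the three labels. The normalising sum is accumulated in log space so that large sample sizes do not overflow.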
Mixed Cumulative Distribution Networks
Directed acyclic graphs (DAGs) are a popular framework to express
multivariate probability distributions. Acyclic directed mixed graphs (ADMGs)
are generalizations of DAGs that can succinctly capture much richer sets of
conditional independencies, and are especially useful in modeling the effects
of latent variables implicitly. Unfortunately there are currently no good
parameterizations of general ADMGs. In this paper, we apply recent work on
cumulative distribution networks and copulas to propose one general
construction for ADMG models. We consider a simple parameter estimation
approach, and report some encouraging experimental results.
Comment: 11 pages, 4 figures
Discrete chain graph models
The statistical literature discusses different types of Markov properties for
chain graphs that lead to four possible classes of chain graph Markov models.
The different models are rather well understood when the observations are
continuous and multivariate normal, and it is also known that one model class,
referred to as models of LWF (Lauritzen--Wermuth--Frydenberg) or block
concentration type, yields discrete models for categorical data that are
smooth. This paper considers the structural properties of the discrete models
based on the three alternative Markov properties. It is shown by example that
two of the alternative Markov properties can lead to non-smooth models. The
remaining model class, which can be viewed as a discrete version of
multivariate regressions, is proven to comprise only smooth models. The proof
employs a simple change of coordinates that also reveals that the model's
likelihood function is unimodal if the chain components of the graph are
complete sets.
Comment: Published at http://dx.doi.org/10.3150/08-BEJ172 in Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
The Kernel Interaction Trick: Fast Bayesian Discovery of Pairwise Interactions in High Dimensions
Discovering interaction effects on a response of interest is a fundamental
problem faced in biology, medicine, economics, and many other scientific
disciplines. In theory, Bayesian methods for discovering pairwise interactions
enjoy many benefits such as coherent uncertainty quantification, the ability to
incorporate background knowledge, and desirable shrinkage properties. In
practice, however, Bayesian methods are often computationally intractable for
even moderate-dimensional problems. Our key insight is that many hierarchical
models of practical interest admit a particular Gaussian process (GP)
representation; the GP allows us to capture the posterior with a vector of O(p)
kernel hyper-parameters rather than O(p^2) interactions and main effects. With
the implicit representation, we can run Markov chain Monte Carlo (MCMC) over
model hyper-parameters in time and memory linear in p per iteration. We focus
on sparsity-inducing models and show on datasets with a variety of covariate
behaviors that our method: (1) reduces runtime by orders of magnitude over
naive applications of MCMC, (2) provides lower Type I and Type II error
relative to state-of-the-art LASSO-based approaches, and (3) offers improved
computational scaling in high dimensions relative to existing Bayesian and
LASSO-based approaches.
Comment: Accepted at ICML 2019. 20 pages, 4 figures, 3 tables
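The general idea of representing O(p^2) pairwise interactions implicitly through a kernel can be illustrated with a textbook degree-2 polynomial kernel. This is not the kernel or the hierarchical model from the paper; it is a minimal sketch, with hypothetical function names, showing that a kernel evaluated in O(p) time equals an inner product over an explicit feature vector containing every main effect and pairwise product.

```python
import math
from itertools import combinations


def quadratic_kernel(x, z):
    """k(x, z) = (1 + <x, z>)^2, evaluated in time linear in p."""
    return (1.0 + sum(a * b for a, b in zip(x, z))) ** 2


def explicit_features(x):
    """Feature map whose inner product reproduces quadratic_kernel.

    Contains a constant, scaled main effects, squares, and scaled pairwise
    products, so its length grows quadratically in p.
    """
    root2 = math.sqrt(2.0)
    feats = [1.0]
    feats += [root2 * a for a in x]
    feats += [a * a for a in x]
    feats += [root2 * a * b for a, b in combinations(x, 2)]
    return feats


def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))
```

For any two covariate vectors `x` and `z` of equal length, `quadratic_kernel(x, z)` and `dot(explicit_features(x), explicit_features(z))` agree up to floating-point error, while only the latter ever materialises the pairwise terms.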
QCD Amplitudes: new perspectives on Feynman integral calculus
I analyze the algebraic patterns underlying the structure of scattering
amplitudes in quantum field theory. I focus on the decomposition of amplitudes
in terms of independent functions and the systems of differential equations the
latter obey. In particular, I discuss the key role played by unitarity for the
decomposition in terms of master integrals, by means of generalized cuts and
integrand reduction, as well as for solving the corresponding differential
equations, by means of Magnus exponential series.
Comment: Presented at Rencontres de Moriond 201