Causal inference via algebraic geometry: feasibility tests for functional causal structures with two binary observed variables
We provide a scheme for inferring causal relations from uncontrolled
statistical data based on tools from computational algebraic geometry, in
particular, the computation of Groebner bases. We focus on causal structures
containing just two observed variables, each of which is binary. We consider
the consequences of imposing different restrictions on the number and
cardinality of latent variables and of assuming different functional
dependences of the observed variables on the latent ones (in particular, the
noise need not be additive). We provide an inductive scheme for classifying
functional causal structures into distinct observational equivalence classes.
For each observational equivalence class, we provide a procedure for deriving
constraints on the joint distribution that are necessary and sufficient
conditions for it to arise from a model in that class. We also demonstrate how
this sort of approach provides a means of determining which causal parameters
are identifiable and how to solve for these. Prospects for expanding the scope
of our scheme, in particular to the problem of quantum causal inference, are
also discussed. Comment: Accepted for publication in Journal of Causal Inference. Revised and
updated in response to referee feedback. 16+5 pages, 26+2 figures. Comments
welcome.
Compressive Network Analysis
Modern data acquisition routinely produces massive amounts of network data.
Though many methods and models have been proposed to analyze such data, the
research on network data is largely disconnected from the classical theory of
statistical learning and signal processing. In this paper, we present a new
framework for modeling network data, which connects two seemingly different
areas: network data analysis and compressed sensing. From a nonparametric
perspective, we model an observed network using a large dictionary. In
particular, we consider the network clique detection problem and show
connections between our formulation and a new algebraic tool, namely Radon
basis pursuit in homogeneous spaces. Such a connection allows us to identify
rigorous recovery conditions for clique detection problems. Though this paper
is mainly conceptual, we also develop practical approximation algorithms for
solving empirical problems and demonstrate their usefulness on real-world
datasets.
We Are Not Your Real Parents: Telling Causal from Confounded using MDL
Given data over variables $X_1, \ldots, X_m, Y$, we consider the problem of finding out whether $X$ jointly causes $Y$ or whether they are all confounded by an unobserved latent variable $Z$. To do so, we take an information-theoretic approach based on Kolmogorov complexity. In a nutshell, we follow the postulate that first encoding the true cause, and then the effects given that cause, results in a shorter description than any other encoding of the observed variables. The ideal score is not computable, and hence we have to approximate it. We propose to do so using the Minimum Description Length (MDL) principle. We compare the MDL scores under the models where $X$ causes $Y$ and where there exists a latent variable $Z$ confounding both $X$ and $Y$, and show that our scores are consistent. To find potential confounders we propose using latent factor modeling, in particular, probabilistic PCA (PPCA). Empirical evaluation on both synthetic and real-world data shows that our method, CoCa, performs very well -- even when the true generating process of the data is far from the assumptions made by the models we use. Moreover, it is robust as its accuracy goes hand in hand with its confidence.
The lesson of causal discovery algorithms for quantum correlations: Causal explanations of Bell-inequality violations require fine-tuning
An active area of research in the fields of machine learning and statistics
is the development of causal discovery algorithms, the purpose of which is to
infer the causal relations that hold among a set of variables from the
correlations that these exhibit. We apply some of these algorithms to the
correlations that arise for entangled quantum systems. We show that they cannot
distinguish correlations that satisfy Bell inequalities from correlations that
violate Bell inequalities, and consequently that they cannot do justice to the
challenges of explaining certain quantum correlations causally. Nonetheless, by
adapting the conceptual tools of causal inference, we can show that any attempt
to provide a causal explanation of nonsignalling correlations that violate a
Bell inequality must contradict a core principle of these algorithms, namely,
that an observed statistical independence between variables should not be
explained by fine-tuning of the causal parameters. In particular, we
demonstrate the need for such fine-tuning for most of the causal mechanisms
that have been proposed to underlie Bell correlations, including superluminal
causal influences, superdeterminism (that is, a denial of freedom of choice of
settings), and retrocausal influences which do not introduce causal cycles. Comment: 29 pages, 28 figs. New in v2: a section presenting in detail our
characterization of Bell's theorem as a contradiction arising from (i) the
framework of causal models, (ii) the principle of no fine-tuning, and (iii)
certain operational features of quantum theory; a section explaining why a
denial of hidden variables affords even fewer opportunities for causal
explanations of quantum correlations.
Towards a Learning Theory of Cause-Effect Inference
We pose causal inference as the problem of learning to classify probability
distributions. In particular, we assume access to a collection
$\{(S_i, l_i)\}_{i=1}^n$, where each $S_i$ is a sample drawn from the
probability distribution of $X_i \times Y_i$, and $l_i$ is a binary label
indicating whether "$X_i \to Y_i$" or "$X_i \leftarrow Y_i$". Given these data,
we build a causal inference rule in two steps. First, we featurize each $S_i$
using the kernel mean embedding associated with some characteristic kernel.
Second, we train a binary classifier on such embeddings to distinguish between
causal directions. We present generalization bounds showing the statistical
consistency and learning rates of the proposed approach, and provide a simple
implementation that achieves state-of-the-art cause-effect inference.
Furthermore, we extend our ideas to infer causal relationships between more
than two variables.