20 research outputs found
A Kernel-Based Causal Learning Algorithm
We describe a causal learning method that measures the strength of statistical dependence in terms of the Hilbert-Schmidt norm of kernel-based cross-covariance operators. In line with the faithfulness assumption common to constraint-based causal learning, our approach assumes that a variable Z is likely to be a common effect of X and Y if conditioning on Z increases the dependence between X and Y. Based on this assumption, we collect "votes" for hypothetical causal directions and orient the edges by the majority principle. In most experiments with known causal structures, our method produced plausible results and outperformed the conventional constraint-based PC algorithm.
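The Hilbert-Schmidt norm of the cross-covariance operator has a well-known empirical estimator (HSIC). A minimal sketch for univariate data using the biased estimator tr(KHLH)/n²; the function names and RBF bandwidth below are illustrative choices, not taken from the paper:

```python
import numpy as np

def rbf_gram(x, sigma=1.0):
    # Pairwise squared distances -> RBF Gram matrix.
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / n**2."""
    n = len(x)
    K, L = rbf_gram(x, sigma), rbf_gram(y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / n ** 2

rng = np.random.default_rng(0)
x = rng.normal(size=200)
hsic_dep = hsic(x, x + 0.1 * rng.normal(size=200))  # strongly dependent pair
hsic_ind = hsic(x, rng.normal(size=200))            # independent pair
```

With dependent inputs the statistic is markedly larger than with independent ones, which is what the "strength of dependence" votes build on.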
A Kernel Test for Three-Variable Interactions
We introduce kernel nonparametric tests for Lancaster three-variable
interaction and for total independence, using embeddings of signed measures
into a reproducing kernel Hilbert space. The resulting test statistics are
straightforward to compute, and are used in powerful interaction tests, which
are consistent against all alternatives for a large family of reproducing
kernels. We show the Lancaster test to be sensitive to cases where two
independent causes individually have weak influence on a third dependent
variable, but their combined effect has a strong influence. This makes the
Lancaster test especially suited to finding structure in directed graphical
models, where it outperforms competing nonparametric tests in detecting such
V-structures.
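The effect the Lancaster test is sensitive to can be reproduced numerically: in an XOR-like V-structure, each cause alone is nearly independent of the effect, while the pair is strongly dependent. A rough illustration using plain HSIC as a generic dependence measure, not the actual Lancaster statistic:

```python
import numpy as np

def rbf_gram(a, sigma=1.0):
    # RBF Gram matrix for (possibly multivariate) row-wise samples.
    d2 = np.sum((a[:, None, :] - a[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def hsic(a, b, sigma=1.0):
    n = len(a)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(rbf_gram(a, sigma) @ H @ rbf_gram(b, sigma) @ H) / n ** 2

rng = np.random.default_rng(1)
n = 300
x = rng.choice([-1.0, 1.0], size=n)    # cause 1
y = rng.choice([-1.0, 1.0], size=n)    # cause 2, independent of x
z = x * y + 0.1 * rng.normal(size=n)   # XOR-like common effect

X, Y, Z = (v.reshape(-1, 1) for v in (x, y, z))
weak = hsic(X, Z)                    # each cause alone: weak dependence
strong = hsic(np.hstack([X, Y]), Z)  # both causes jointly: strong dependence
```

A pairwise test sees almost nothing here, while a three-variable interaction test does, which is exactly the V-structure scenario described above.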
Kernel-based Conditional Independence Test and Application in Causal Discovery
Conditional independence testing is an important problem, especially in
Bayesian network learning and causal discovery. Due to the curse of
dimensionality, testing for conditional independence of continuous variables is
particularly challenging. We propose a Kernel-based Conditional Independence
test (KCI-test), by constructing an appropriate test statistic and deriving its
asymptotic distribution under the null hypothesis of conditional independence.
The proposed method is computationally efficient and easy to implement.
Experimental results show that it outperforms other methods, especially when
the conditioning set is large or the sample size is not very large, in which
case other methods encounter difficulties.
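A common way to build a kernel conditional independence test, loosely in the spirit of KCI, is to remove the influence of the conditioning set by kernel regression and then test the residuals for dependence. The sketch below is a simplified residual-based proxy, not the actual KCI-test statistic or its asymptotic null distribution:

```python
import numpy as np

def rbf(a, b, sigma=1.0):
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

def kernel_ridge_residuals(t, z, lam=1e-2):
    # Residuals of a kernel ridge regression of t on the conditioning variable z.
    K = rbf(z, z)
    alpha = np.linalg.solve(K + lam * np.eye(len(z)), t)
    return t - K @ alpha

def hsic(x, y):
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(rbf(x, x) @ H @ rbf(y, y) @ H) / n ** 2

rng = np.random.default_rng(2)
z = rng.normal(size=300)
x = z + 0.3 * rng.normal(size=300)  # X depends on Z
y = z + 0.3 * rng.normal(size=300)  # Y depends on Z, but X is indep. of Y given Z
rx = kernel_ridge_residuals(x, z)
ry = kernel_ridge_residuals(y, z)
stat_ci = hsic(rx, ry)   # residual dependence: small under X ⟂ Y | Z
stat_raw = hsic(x, y)    # marginal dependence induced by Z: large
```

Comparing the two statistics shows why conditioning matters: X and Y are strongly dependent marginally but nearly independent once Z is regressed out.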
Reinforcement Causal Structure Learning on Order Graph
Learning the directed acyclic graph (DAG) that describes the causality of observed data is a very challenging but important task. Due to the limited quantity and quality of observed data, and the non-identifiability of the causal graph, it is almost impossible to infer a single precise DAG. Some methods approximate the posterior distribution of DAGs to explore the DAG space via Markov chain Monte Carlo (MCMC), but the DAG space grows super-exponentially, so accurately characterizing the whole distribution over DAGs is intractable. In this paper, we propose {Reinforcement Causal Structure Learning
on Order Graph} (RCL-OG) that uses order graph instead of MCMC to model
different DAG topological orderings and to reduce the problem size. RCL-OG
first defines reinforcement learning with a new reward mechanism to approximate
the posterior distribution of orderings in an efficient way, and uses deep
Q-learning to update and transfer rewards between nodes. Next, it obtains the
probability transition model of nodes on order graph, and computes the
posterior probability of different orderings. In this way, we can sample on
this model to obtain the ordering with high probability. Experiments on
synthetic and benchmark datasets show that RCL-OG provides accurate posterior
probability approximation and achieves better results than competitive causal
discovery algorithms. Comment: Accepted by the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI 2023).
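The appeal of working on an order graph rather than on orderings directly can be seen by counting: the order graph over n variables has 2^n states, while there are n! topological orderings, each corresponding to a path through the graph. A small self-contained sketch; the construction here is a generic subset lattice, assumed to match the paper's order graph only in spirit:

```python
from itertools import combinations
from math import factorial
from functools import lru_cache

def order_graph(n):
    # Nodes of the order graph: all subsets of {0, ..., n-1}.
    # An edge S -> S ∪ {v} appends one variable to a partial ordering, so every
    # topological ordering is a path from the empty set to the full set.
    nodes = [frozenset(c) for k in range(n + 1) for c in combinations(range(n), k)]
    edges = [(s, s | {v}) for s in nodes for v in range(n) if v not in s]
    return nodes, edges

def count_orderings(n):
    # Count paths from the empty set to the full set by dynamic programming.
    @lru_cache(maxsize=None)
    def paths(s):
        if len(s) == n:
            return 1
        return sum(paths(s | frozenset([v])) for v in range(n) if v not in s)
    return paths(frozenset())

n = 5
nodes, edges = order_graph(n)
num_states = len(nodes)             # 2**5 = 32 states in the order graph
num_orderings = count_orderings(n)  # 5! = 120 orderings
```

The dynamic program over 2^n subsets recovers all n! orderings without enumerating them, which is the size reduction the abstract refers to.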
Causal Discovery by Kernel Deviance Measures with Heterogeneous Transforms
The discovery of causal relationships in a set of random variables is a
fundamental objective of science, and has recently been argued to be an essential component of real machine intelligence. One class of causal discovery techniques is founded on the argument that there are inherent structural asymmetries between the causal and anti-causal direction that can be leveraged to determine the direction of causation. Capturing these discrepancies between cause and effect remains a challenge, and many current state-of-the-art algorithms compare the norms of the kernel
mean embeddings of the conditional distributions. In this work, we argue that
such approaches based on RKHS embeddings are insufficient to capture
principal markers of cause-effect asymmetry involving higher-order structural
variabilities of the conditional distributions. We propose Kernel Intrinsic
Invariance Measure with Heterogeneous Transform (KIIM-HT) which introduces a
novel score measure based on heterogeneous transformation of RKHS embeddings to
extract relevant higher-order moments of the conditional densities for causal
discovery. Inference is made via comparing the score of each hypothetical
cause-effect direction. Tests and comparisons on a synthetic dataset, a
two-dimensional synthetic dataset and the real-world benchmark dataset
T\"ubingen Cause-Effect Pairs verify our approach. In addition, we conduct a
sensitivity analysis to the regularization parameter to faithfully compare
previous work to our method and an experiment with trials on varied
hyperparameter values to showcase the robustness of our algorithm.
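The claimed insufficiency of plain embedding norms is easy to illustrate: for a translation-invariant kernel such as the RBF, the norm of the empirical mean embedding is unchanged by a pure location shift of the sample, so conditionals that differ only in location are indistinguishable by norm alone, while a scale change does register. A minimal numerical check (not the paper's KIIM-HT score):

```python
import numpy as np

def emb_norm_sq(s, sigma=1.0):
    """Squared RKHS norm of the empirical mean embedding under an RBF kernel:
    the mean of the Gram matrix, which depends only on pairwise differences."""
    d2 = (s[:, None] - s[None, :]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2)).mean()

rng = np.random.default_rng(5)
s = rng.normal(size=400)
n_base = emb_norm_sq(s)
n_shift = emb_norm_sq(s + 5.0)  # pure location shift: pairwise diffs unchanged
n_scale = emb_norm_sq(2.0 * s)  # scale change: spread grows, norm shrinks
```

Since the norm is blind to location (and more generally to higher-order shape up to what pairwise differences encode), richer transforms of the embeddings are needed to expose those markers, which is the motivation stated above.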
Meta Learning for Causal Direction
The inaccessibility of controlled randomized trials due to inherent
constraints in many fields of science has been a fundamental issue in causal
inference. In this paper, we focus on distinguishing the cause from effect in
the bivariate setting under limited observational data. Based on recent
developments in meta learning as well as in causal inference, we introduce a
novel generative model that allows distinguishing cause and effect in the small
data setting. Using a learnt task variable that contains distributional
information of each dataset, we propose an end-to-end algorithm that makes use
of similar training datasets at test time. We demonstrate our method on various
synthetic as well as real-world data and show that it is able to maintain high
accuracy in detecting directions across varying dataset sizes.
Efficient Conditionally Invariant Representation Learning
We introduce the Conditional Independence Regression CovariancE (CIRCE),
a measure of conditional independence for multivariate continuous-valued variables. CIRCE applies as a regularizer in settings where we wish to learn neural
features φ(X) of data X to estimate a target Y, while being conditionally independent of a distractor Z given Y. Both Z and Y are assumed to be continuous-valued
but relatively low dimensional, whereas X and its features may be complex and
high dimensional. Relevant settings include domain-invariant learning, fairness,
and causal learning. The procedure requires just a single ridge regression from Y
to kernelized features of Z, which can be done in advance. It is then only necessary to enforce independence of φ(X) from residuals of this regression, which
is possible with attractive estimation properties and consistency guarantees. By
contrast, earlier measures of conditional feature dependence require multiple regressions for each step of feature learning, resulting in more severe bias and variance, and greater computational cost. When sufficiently rich features are used,
we establish that CIRCE is zero if and only if φ(X) ⊥⊥ Z | Y. In experiments,
we show superior performance to previous methods on challenging benchmarks,
including learning conditionally invariant image features.
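The two-step procedure described above, a single ridge regression from Y to kernelized features of Z done in advance, followed by a penalty on the dependence between φ(X) and the regression residuals, can be sketched as follows. The landmark features, the cross-covariance penalty, and all variable constructions are illustrative simplifications, not the CIRCE estimator itself:

```python
import numpy as np

def rbf_feats(z, centers, sigma=1.0):
    # Finite-dimensional kernelized features of Z via landmark points.
    return np.exp(-(z[:, None] - centers[None, :]) ** 2 / (2 * sigma ** 2))

rng = np.random.default_rng(4)
n = 2000
y = rng.normal(size=n)
z = y + 0.2 * rng.normal(size=n)          # distractor correlated with target
phi_good = y + 0.2 * rng.normal(size=n)   # feature tied to Z only through Y
phi_bad = z + 0.05 * rng.normal(size=n)   # feature leaking Z directly

# Step 1 (done once, in advance): ridge regression from Y to features of Z.
centers = np.quantile(z, np.linspace(0.05, 0.95, 10))
Psi = rbf_feats(z, centers)
Y1 = np.column_stack([np.ones(n), y])
W = np.linalg.solve(Y1.T @ Y1 + 1e-3 * np.eye(2), Y1.T @ Psi)
R = Psi - Y1 @ W  # residuals of the Z-features given Y

# Step 2: penalize the cross-covariance between the feature and the residuals.
def circe_penalty(phi, R):
    pc = phi - phi.mean()
    C = pc @ (R - R.mean(axis=0)) / len(phi)  # cross-covariance vector
    return float(np.sum(C ** 2))

pen_good = circe_penalty(phi_good, R)
pen_bad = circe_penalty(phi_bad, R)
```

A feature that reaches Z only through Y incurs a near-zero penalty, while one that leaks Z directly is penalized, which is the regularization behaviour the method relies on during feature learning.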
Discovering Dynamic Causal Space for DAG Structure Learning
Discovering causal structure from purely observational data (i.e., causal
discovery), aiming to identify causal relationships among variables, is a
fundamental task in machine learning. The recent invention of differentiable
score-based DAG learners is a crucial enabler, which reframes the combinatorial
optimization problem into a differentiable optimization with a DAG constraint
over directed graph space. Despite their great success, these cutting-edge DAG learners evaluate directed graph candidates with score functions that are independent of DAG-ness, without taking graph structure into account. As a result, measuring data fitness alone, regardless of DAG-ness, inevitably leads to suboptimal DAGs and model vulnerabilities. To address this, we
propose a dynamic causal space for DAG structure learning, coined CASPER, that
integrates the graph structure into the score function as a new measure in the
causal space to faithfully reflect the causal distance between estimated and
ground truth DAG. CASPER revises the learning process and enhances DAG structure learning through adaptive attention to DAG-ness. Grounded in
empirical visualization, CASPER, as a space, satisfies a series of desired
properties, such as structure awareness and noise robustness. Extensive
experiments on both synthetic and real-world datasets clearly validate the
superiority of our CASPER over the state-of-the-art causal discovery methods in
terms of accuracy and robustness. Comment: Accepted by KDD 2023. Our code is available at https://github.com/liuff19/CASPE
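The differentiable DAG constraint these score-based learners rely on is typically a smooth acyclicity function. A common polynomial variant in the NOTEARS family is h(W) = tr((I + W∘W/d)^d) − d, which is zero exactly when the weighted graph W is acyclic; a minimal sketch:

```python
import numpy as np

def acyclicity(W):
    """Smooth acyclicity measure h(W) = tr((I + W*W/d)^d) - d.
    h(W) = 0 iff the weighted adjacency matrix W has no directed cycles."""
    d = W.shape[0]
    M = np.eye(d) + (W * W) / d  # elementwise square keeps entries nonnegative
    return np.trace(np.linalg.matrix_power(M, d)) - d

dag = np.array([[0., 1., 1.],
                [0., 0., 1.],
                [0., 0., 0.]])  # 0 -> 1 -> 2 and 0 -> 2: acyclic
cyc = np.array([[0., 1., 0.],
                [0., 0., 1.],
                [1., 0., 0.]])  # 0 -> 1 -> 2 -> 0: a directed cycle

h_dag = acyclicity(dag)  # ≈ 0
h_cyc = acyclicity(cyc)  # > 0
```

Minimizing a data-fitness score subject to h(W) = 0 is the "differentiable optimization with a DAG constraint" that CASPER builds on, with its contribution being a structure-aware score rather than the constraint itself.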