Learning LWF Chain Graphs: A Markov Blanket Discovery Approach
This paper provides a graphical characterization of Markov blankets in chain
graphs (CGs) under the Lauritzen-Wermuth-Frydenberg (LWF) interpretation. The
characterization is different from the well-known one for Bayesian networks and
generalizes it. We provide a novel scalable and sound algorithm for Markov
blanket discovery in LWF CGs and prove that the Grow-Shrink algorithm, the IAMB
algorithm, and its variants are still correct for Markov blanket discovery in
LWF CGs under the same assumptions as for Bayesian networks. We provide a sound
and scalable constraint-based framework for learning the structure of LWF CGs
from faithful causally sufficient data and prove its correctness when the
Markov blanket discovery algorithms in this paper are used. Our proposed
algorithms compare favorably or competitively with the state-of-the-art LCD
(Learn Chain graphs via Decomposition) algorithm, depending on the algorithm
that is used for Markov blanket discovery. Our proposed algorithms make a broad
range of inference/learning problems computationally tractable and more
reliable because they exploit locality.
Comment: This is an extended version of the accepted paper for UAI 202
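As a rough illustration of the Grow-Shrink procedure named in the abstract above, the following minimal sketch recovers a Markov blanket from conditional-independence queries. It is not the paper's implementation. It makes several assumptions: the independence test is a perfect oracle rather than a statistical test on data, the oracle is d-separation in a small hypothetical DAG (a Bayesian network, which is a special case of a chain graph), and all names (`PARENTS`, `d_separated`, `grow_shrink`, `oracle`) are invented for this example.

```python
# Hypothetical toy DAG, given as child -> list of parents:
#   D -> A -> T -> C <- B,  C -> E
PARENTS = {"A": ["D"], "T": ["A"], "C": ["T", "B"], "E": ["C"]}


def d_separated(parents, x, y, z):
    """Return True if x and y are d-separated given the set z in the DAG."""
    # 1. Collect the ancestral set of {x, y} and z.
    relevant, stack = set(), [x, y, *z]
    while stack:
        node = stack.pop()
        if node not in relevant:
            relevant.add(node)
            stack.extend(parents.get(node, ()))
    # 2. Moralize: link each child to its parents and co-parents to each other.
    adj = {node: set() for node in relevant}
    for child in relevant:
        ps = list(parents.get(child, ()))
        for i, p in enumerate(ps):
            adj[child].add(p)
            adj[p].add(child)
            for q in ps[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)
    # 3. x and y are separated if no path connects them once z is removed.
    seen, stack = {x}, [x]
    while stack:
        node = stack.pop()
        for nxt in adj[node]:
            if nxt in z or nxt in seen:
                continue
            if nxt == y:
                return False
            seen.add(nxt)
            stack.append(nxt)
    return True


def grow_shrink(target, variables, indep):
    """Grow-Shrink sketch: indep(x, y, z) answers 'is x independent of y given z?'."""
    blanket = []
    # Grow phase: add any variable still dependent on the target given the blanket.
    changed = True
    while changed:
        changed = False
        for v in variables:
            if v != target and v not in blanket and not indep(target, v, set(blanket)):
                blanket.append(v)
                changed = True
    # Shrink phase: drop variables made independent by the rest of the blanket.
    changed = True
    while changed:
        changed = False
        for v in list(blanket):
            if indep(target, v, set(blanket) - {v}):
                blanket.remove(v)
                changed = True
    return set(blanket)


def oracle(x, y, z):
    # Assumption: a perfect independence oracle backed by the toy DAG.
    return d_separated(PARENTS, x, y, z)


print(sorted(grow_shrink("T", ["E", "D", "A", "B", "C"], oracle)))  # ['A', 'B', 'C']
```

With this variable ordering, the grow phase first admits the false positives E and D, and the shrink phase removes them, leaving the parent A, the child C, and the co-parent B of T. On real data the oracle would be replaced by a statistical conditional-independence test, whose errors this sketch does not model.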
Causality-based Feature Selection: Methods and Evaluations
Feature selection is a crucial preprocessing step in data analytics and
machine learning. Classical feature selection algorithms select features based
on the correlations between predictive features and the class variable and do
not attempt to capture causal relationships between them. It has been shown
that knowledge of the causal relationships between features and the
class variable has potential benefits for building interpretable and robust
prediction models, since causal relationships imply the underlying mechanism of
a system. Consequently, causality-based feature selection has gradually
attracted greater attention, and many algorithms have been proposed. In this
paper, we present a comprehensive review of recent advances in causality-based
feature selection. To facilitate the development of new algorithms in the
research area and make it easy to compare new methods with
existing ones, we develop the first open-source package, called CausalFS, which
consists of most of the representative causality-based feature selection
algorithms (available at https://github.com/kuiy/CausalFS). Using CausalFS, we
conduct extensive experiments to compare the representative algorithms with
both synthetic and real-world data sets. Finally, we discuss some challenging
problems to be tackled in future causality-based feature selection research.