Request-and-Reverify: Hierarchical Hypothesis Testing for Concept Drift Detection with Expensive Labels
One important assumption underlying common classification models is the
stationarity of the data. However, in real-world streaming applications, the
data concept indicated by the joint distribution of feature and label is not
stationary but drifting over time. Concept drift detection aims to detect such
drifts and adapt the model so as to mitigate any deterioration in the model's
predictive performance. Unfortunately, most existing concept drift detection
methods rely on a strong and over-optimistic condition that the true labels are
available immediately for all already classified instances. In this paper, a
novel Hierarchical Hypothesis Testing framework with Request-and-Reverify
strategy is developed to detect concept drifts by requesting labels only when
necessary. Two methods, namely Hierarchical Hypothesis Testing with
Classification Uncertainty (HHT-CU) and Hierarchical Hypothesis Testing with
Attribute-wise "Goodness-of-fit" (HHT-AG), are proposed respectively under the
novel framework. In experiments with benchmark datasets, our methods
demonstrate overwhelming advantages over state-of-the-art unsupervised drift
detectors. More importantly, our methods even outperform DDM (the widely used
supervised drift detector) when we use significantly fewer labels.
Comment: Published as a conference paper at IJCAI 201
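For context, the supervised baseline named in this abstract (DDM) can be sketched as follows. This is a minimal illustration of the DDM rule as it is commonly described, not the HHT-CU or HHT-AG method, whose details the abstract does not give. The class name `DDMSketch`, the parameter names, and the default values (30 warm-up samples, 2 and 3 standard deviations) are assumptions chosen for illustration. Note that `update` needs the true label of every instance to compute `error`, which is exactly the condition the abstract calls over-optimistic.

```python
import math


class DDMSketch:
    """Minimal DDM-style detector; names and defaults are illustrative."""

    def __init__(self, min_samples=30, warn_k=2.0, drift_k=3.0):
        self.min_samples = min_samples  # warm-up length before testing
        self.warn_k = warn_k            # warning threshold, in std deviations
        self.drift_k = drift_k          # drift threshold, in std deviations
        self.reset()

    def reset(self):
        self.n = 0              # instances seen since the last reset
        self.p = 0.0            # running error rate
        self.p_min = math.inf   # error rate at the lowest p + s seen so far
        self.s_min = math.inf   # std deviation at the lowest p + s seen so far

    def update(self, error):
        """error is 1 if the classifier was wrong on this instance, else 0."""
        self.n += 1
        self.p += (error - self.p) / self.n
        s = math.sqrt(max(0.0, self.p * (1.0 - self.p)) / self.n)
        if self.n < self.min_samples:
            return "stable"
        if self.p + s <= self.p_min + self.s_min:
            self.p_min, self.s_min = self.p, s
        level = self.p + s
        if level > self.p_min + self.drift_k * self.s_min:
            self.reset()
            return "drift"
        if level > self.p_min + self.warn_k * self.s_min:
            return "warning"
        return "stable"


# Synthetic stream: 10% errors for 200 instances, then every prediction wrong.
detector = DDMSketch()
stream = [1 if i % 10 == 0 else 0 for i in range(200)] + [1] * 100
states = [detector.update(e) for e in stream]
```

On this synthetic stream the detector stays below the drift threshold during the stable phase and signals drift shortly after the error rate jumps.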
Simple stopping criteria for information theoretic feature selection
Feature selection aims to select the smallest feature subset that yields the
minimum generalization error. In the rich literature in feature selection,
information theory-based approaches seek a subset of features such that the
mutual information between the selected features and the class labels is
maximized. Despite the simplicity of this objective, there still remain several
open problems in optimization. These include, for example, the automatic
determination of the optimal subset size (i.e., the number of features) or a
stopping criterion if the greedy searching strategy is adopted. In this paper,
we suggest two stopping criteria by just monitoring the conditional mutual
information (CMI) among groups of variables. Using the recently developed
multivariate matrix-based Rényi's \alpha-entropy functional, which can be
directly estimated from data samples, we show that the CMI among groups of
variables can be easily computed without any decomposition or approximation,
hence making our criteria easy to implement and seamlessly integrated into any
existing information theoretic feature selection methods with a greedy search
strategy.
Comment: Paper published in the journal of Entrop
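A rough sketch of how CMI might be computed from samples with the matrix-based Rényi's \alpha-entropy functional is given below. It assumes the usual construction: a Gaussian Gram matrix per variable, normalization to unit trace, entropy from the eigenvalues, and joint entropy via the Hadamard product of Gram matrices. The function names, the kernel width `sigma`, and the default `alpha=1.01` are illustrative assumptions, and the two stopping criteria themselves are not reproduced here because the abstract does not specify them.

```python
import numpy as np


def gram(x, sigma=1.0):
    """Gaussian Gram matrix of samples x with shape (n,) or (n, d)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    sq_dist = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def renyi_entropy(*grams, alpha=1.01):
    """Matrix-based Renyi alpha-entropy (in bits) of one or more variables.

    Passing several Gram matrices gives the joint entropy, computed from
    their element-wise (Hadamard) product.
    """
    k = np.ones_like(grams[0], dtype=float)
    for g in grams:
        k = k * g
    a = k / np.trace(k)  # normalize to unit trace
    eig = np.clip(np.linalg.eigvalsh(a), 0.0, None)  # drop tiny negatives
    return float(np.log2(np.sum(eig ** alpha)) / (1.0 - alpha))


def cmi(kx, ky, kz, alpha=1.01):
    """I(X; Y | Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return (renyi_entropy(kx, kz, alpha=alpha)
            + renyi_entropy(ky, kz, alpha=alpha)
            - renyi_entropy(kx, ky, kz, alpha=alpha)
            - renyi_entropy(kz, alpha=alpha))
```

In a greedy search, one plausible use (an assumption, not the paper's stated rule) is to stop adding features once the estimated CMI between a candidate feature and the labels, given the already selected features, falls below a chosen threshold.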
CI-GNN: A Granger Causality-Inspired Graph Neural Network for Interpretable Brain Network-Based Psychiatric Diagnosis
There is a recent trend to leverage the power of graph neural networks (GNNs)
for brain-network based psychiatric diagnosis, which, in turn, also motivates an
urgent need for psychiatrists to fully understand the decision behavior of the
used GNNs. However, most of the existing GNN explainers are either post-hoc in
which another interpretive model needs to be created to explain a well-trained
GNN, or do not consider the causal relationship between the extracted
explanation and the decision, such that the explanation itself contains
spurious correlations and suffers from weak faithfulness. In this work, we
propose a Granger causality-inspired graph neural network (CI-GNN), a built-in
interpretable model that is able to identify the most influential subgraph
(i.e., functional connectivity within brain regions) that is causally related
to the decision (e.g., major depressive disorder patients or healthy controls),
without the training of an auxiliary interpretive network. CI-GNN learns
disentangled subgraph-level representations \alpha and \beta that encode,
respectively, the causal and noncausal aspects of the original graph under a graph
variational autoencoder framework, regularized by a conditional mutual
information (CMI) constraint. We theoretically justify the validity of the CMI
regularization in capturing the causal relationship. We also empirically evaluate
the performance of CI-GNN against three baseline GNNs and four state-of-the-art
GNN explainers on synthetic data and three large-scale brain disease datasets.
We observe that CI-GNN achieves the best performance in a wide range of metrics
and provides more reliable and concise explanations which have clinical
evidence.
Comment: 45 pages, 13 figure