Modeling the Data-Generating Process is Necessary for Out-of-Distribution Generalization
Recent empirical studies on domain generalization (DG) have shown that DG
algorithms that perform well on some distribution shifts fail on others, and no
state-of-the-art DG algorithm performs consistently well on all shifts.
Moreover, real-world data often has multiple distribution shifts over different
attributes; hence we introduce multi-attribute distribution shift datasets and
find that the accuracy of existing DG algorithms falls even further. To explain
these results, we provide a formal characterization of generalization under
multi-attribute shifts using a canonical causal graph. Based on the
relationship between spurious attributes and the classification label, we
obtain realizations of the canonical causal graph that characterize common
distribution shifts and show that each shift entails different independence
constraints over observed variables. As a result, we prove that any algorithm
based on a single, fixed constraint cannot work well across all shifts,
providing theoretical evidence for mixed empirical results on DG algorithms.
Based on this insight, we develop Causally Adaptive Constraint Minimization
(CACM), an algorithm that uses knowledge about the data-generating process to
adaptively identify and apply the correct independence constraints for
regularization. Results on fully synthetic, MNIST, small NORB, and Waterbirds
datasets, covering binary and multi-valued attributes and labels, show that
adaptive dataset-dependent constraints lead to the highest accuracy on unseen
domains whereas incorrect constraints fail to do so. Our results demonstrate
the importance of modeling the causal relationships inherent in the
data-generating process.
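
The claim that different shifts call for different independence constraints can be illustrated with a toy calculation. The sketch below is not the CACM implementation. It uses a crude, hypothetical mean-matching penalty (the gap between group means of a scalar feature), which is only a necessary condition for independence. All function names and data are assumptions made for illustration. It compares an unconditional constraint (feature independent of attribute) with a conditional one (feature independent of attribute given the label) on data where the attribute is correlated with the label.

```python
from collections import defaultdict


def mean_gap(values, groups):
    """Largest difference between per-group means of a scalar feature."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    means = [sums[g] / counts[g] for g in sums]
    return max(means) - min(means)


def penalty(features, attrs, labels=None):
    """Hypothetical regularizer.

    labels is None -> unconditional constraint: feature vs. attribute.
    labels given   -> conditional constraint: the same gap, computed
                      within each label group and then averaged.
    """
    if labels is None:
        return mean_gap(features, attrs)
    by_label = defaultdict(list)
    for f, a, y in zip(features, attrs, labels):
        by_label[y].append((f, a))
    gaps = [
        mean_gap([f for f, _ in rows], [a for _, a in rows])
        for rows in by_label.values()
    ]
    return sum(gaps) / len(gaps)


# Toy data (made up): the feature depends only on the label,
# while the attribute is correlated with the label.
labels = [0, 0, 0, 1, 1, 1]
attrs = [0, 0, 1, 0, 1, 1]
features = [float(y) for y in labels]

unconditional = penalty(features, attrs)
conditional = penalty(features, attrs, labels)
print(unconditional, conditional)
```

In this toy setting the unconditional penalty is nonzero even though the feature is a perfect, attribute-free predictor of the label, while the conditional penalty vanishes. A regularizer fixed to the unconditional constraint would therefore push against a useful feature here, which is the kind of mismatch the abstract attributes to single-constraint methods.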
Probabilistic Label Relation Graphs with Ising Models
We consider classification problems in which the label space has structure. A
common example is hierarchical label spaces, corresponding to the case where
one label subsumes another (e.g., animal subsumes dog). But labels can also be
mutually exclusive (e.g., dog vs cat) or unrelated (e.g., furry, carnivore). To
jointly model hierarchy and exclusion relations, the notion of a HEX (hierarchy
and exclusion) graph was introduced in [7]. This combined a conditional random
field (CRF) with a deep neural network (DNN), resulting in state-of-the-art
results when applied to visual object classification problems where the
training labels were drawn from different levels of the ImageNet hierarchy
(e.g., an image might be labeled with the basic level category "dog", rather
than the more specific label "husky"). In this paper, we extend the HEX model
to allow for soft or probabilistic relations between labels, which is useful
when there is uncertainty about the relationship between two labels (e.g., an
antelope is "sort of" furry, but not to the same degree as a grizzly bear). We
call our new model pHEX, for probabilistic HEX. We show that the pHEX graph can
be converted to an Ising model, which allows us to use existing off-the-shelf
inference methods (in contrast to the HEX method, which needed specialized
inference algorithms). Experimental results show significant improvements in a
number of large-scale visual object classification tasks, outperforming the
previous HEX model. Comment: International Conference on Computer Vision (2015)
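
As a rough illustration of soft label relations in a pairwise binary model, the sketch below scores joint label assignments with unary terms plus soft penalties for violating a hierarchy or exclusion relation, then computes label marginals by brute-force enumeration. It is not the pHEX construction from the paper. The labels, scores, and weights are hypothetical, and enumeration is only feasible for a handful of labels. Each penalty is at most quadratic in the binary variables, which is the form of energy an Ising-type model uses.

```python
from itertools import product
from math import exp

LABELS = ["animal", "dog", "cat"]

# Hypothetical unary scores, e.g. logits from a classifier.
UNARY = {"animal": 0.5, "dog": 1.0, "cat": 0.2}

# Hypothetical soft relations as (label_a, label_b, weight).
# A larger weight makes the relation closer to a hard constraint.
HIERARCHY = [("animal", "dog", 3.0)]  # penalize dog=1 while animal=0
EXCLUSION = [("dog", "cat", 3.0)]     # penalize dog=1 together with cat=1


def score(state):
    """Unnormalized log-score of one joint assignment of 0/1 labels."""
    s = sum(UNARY[name] * state[name] for name in LABELS)
    for parent, child, w in HIERARCHY:
        s -= w * state[child] * (1 - state[parent])
    for a, b, w in EXCLUSION:
        s -= w * state[a] * state[b]
    return s


def marginals():
    """Exact marginal P(label = 1) for each label, by enumeration."""
    states = [
        dict(zip(LABELS, bits))
        for bits in product([0, 1], repeat=len(LABELS))
    ]
    weights = [exp(score(s)) for s in states]
    z = sum(weights)
    return {
        name: sum(w for s, w in zip(states, weights) if s[name]) / z
        for name in LABELS
    }


if __name__ == "__main__":
    for name, p in marginals().items():
        print(f"P({name}=1) = {p:.3f}")
```

With finite weights, assignments that violate a relation keep a small but nonzero probability, which is the sense in which the relations here are soft rather than hard.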
Mining Brain Networks using Multiple Side Views for Neurological Disorder Identification
Mining discriminative subgraph patterns from graph data has attracted great
interest in recent years. It has a wide variety of applications in disease
diagnosis, neuroimaging, etc. Most research on subgraph mining focuses on the
graph representation alone. However, in many real-world applications, side
information is available along with the graph data. For example, for
neurological disorder identification, in addition to the brain networks derived
from neuroimaging data, hundreds of clinical, immunologic, serologic and
cognitive measures may also be documented for each subject. These measures
compose multiple side views encoding a tremendous amount of supplemental
information for diagnostic purposes, yet are often ignored. In this paper, we
study the problem of discriminative subgraph selection using multiple side
views and propose a novel solution to find an optimal set of subgraph features
for graph classification by exploring a plurality of side views. We derive a
feature evaluation criterion, named gSide, to estimate the usefulness of
subgraph patterns based upon side views. Then we develop a branch-and-bound
algorithm, called gMSV, to efficiently search for optimal subgraph features by
integrating the subgraph mining process and the procedure of discriminative
feature selection. Empirical studies on graph classification tasks for
neurological disorders using brain networks demonstrate that subgraph patterns
selected by the multi-side-view guided subgraph selection approach can
effectively boost graph classification performance and are relevant to disease
diagnosis. Comment: in Proceedings of IEEE International Conference on Data Mining (ICDM)
201