Greedy PIG: Adaptive Integrated Gradients
Deep learning has become the standard approach for most machine learning
tasks. While its impact is undeniable, interpreting the predictions of deep
learning models from a human perspective remains a challenge. In contrast to
model training, model interpretability is harder to quantify and to pose as an
explicit optimization problem. Inspired by the AUC softmax information curve
(AUC SIC) metric for evaluating feature attribution methods, we propose a
unified discrete optimization framework for feature attribution and feature
selection based on subset selection. This leads to a natural adaptive
generalization of the path integrated gradients (PIG) method for feature
attribution, which we call Greedy PIG. We demonstrate the success of Greedy PIG
on a wide variety of tasks, including image feature attribution, graph
compression/explanation, and post-hoc feature selection on tabular data. Our
results show that introducing adaptivity is a powerful and versatile way to
improve feature attribution methods.
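The adaptive idea behind the abstract can be illustrated with a toy sketch. The names and the greedy loop below are hypothetical (a plain Riemann-sum integrated-gradients approximation plus a naive "attribute, commit the top feature, re-attribute" loop), not the paper's exact Greedy PIG algorithm:

```python
import numpy as np

def integrated_gradients(f_grad, x, baseline, steps=50):
    """Riemann-sum approximation of path integrated gradients along
    the straight line from `baseline` to `x`."""
    alphas = np.linspace(0.0, 1.0, steps)
    grads = np.stack([f_grad(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

def greedy_attribution(f_grad, x, baseline, k):
    """Toy adaptive variant: repeatedly attribute, commit the top-scoring
    unselected feature to its input value, and re-run attribution."""
    current = baseline.copy()
    selected = []
    for _ in range(k):
        attr = integrated_gradients(f_grad, x, current)
        attr[selected] = -np.inf      # mask already-selected features
        i = int(np.argmax(attr))
        selected.append(i)
        current[i] = x[i]             # adaptively update the baseline
    return selected
```

Because the baseline is updated after each selection, later attribution rounds are conditioned on the features already chosen, which is the "adaptivity" the abstract refers to.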
Sampling, Intervention, Prediction, Aggregation: A Generalized Framework for Model-Agnostic Interpretations
Model-agnostic interpretation techniques allow us to explain the behavior of
any predictive model. Due to different notations and terminology, it is
difficult to see how they are related. A unified view on these methods has been
missing. We present the generalized SIPA (sampling, intervention, prediction,
aggregation) framework of work stages for model-agnostic interpretations and
demonstrate how several prominent methods for feature effects can be embedded
into the proposed framework. Furthermore, we extend the framework to feature
importance computations by pointing out how variance-based and
performance-based importance measures are based on the same work stages. The
SIPA framework reduces the diverse set of model-agnostic techniques to a single
methodology and establishes a common terminology to discuss them in future
work.
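The four SIPA work stages map naturally onto a classic feature-effect method such as partial dependence. The sketch below (hypothetical function names; `model` is any callable returning predictions) labels each stage explicitly:

```python
import numpy as np

def partial_dependence(model, X, feature, grid):
    """Feature-effect estimate expressed in the four SIPA stages."""
    effects = []
    for v in grid:
        Xs = X.copy()                   # 1) sampling: reuse the observed data
        Xs[:, feature] = v              # 2) intervention: set the feature to v
        preds = model(Xs)               # 3) prediction: query the model
        effects.append(preds.mean())    # 4) aggregation: average predictions
    return np.array(effects)
```

Other methods in the framework differ mainly in which stage they vary, e.g. aggregating squared deviations instead of means yields a variance-based importance measure.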
Occupancy, spatial variance, and the abundance of species
A notable and consistent ecological observation, known for a long time, is
that spatial variance in the abundance of a species increases with its mean
abundance, and that this relationship typically conforms well to a simple
power law (Taylor 1961). Indeed, such models can be used at a spectrum of
spatial scales to describe spatial variance in the abundance of a single
species at different times or in different regions, and of different species
across the same set of areas (Taylor et al. 1978; Taylor and Woiwod 1982)
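Taylor's power law states that spatial variance scales with mean abundance as s² = a·mᵇ, so taking logarithms turns fitting it into simple linear regression. A minimal sketch with made-up synthetic data:

```python
import numpy as np

# Taylor's power law: variance = a * mean**b, hence
# log(variance) = log(a) + b * log(mean), a straight line in log-log space.
mean_abundance = np.array([1.0, 5.0, 20.0, 80.0])      # synthetic values
variance = 2.0 * mean_abundance ** 1.7                 # data obeying the law

slope, intercept = np.polyfit(np.log(mean_abundance), np.log(variance), 1)
b, a = slope, np.exp(intercept)    # recovered exponent and prefactor
```

For real survey data the fit would of course be noisy, and the exponent b (often between 1 and 2) is the quantity of ecological interest.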
The Grammar of Interactive Explanatory Model Analysis
The growing need for in-depth analysis of predictive models leads to a series
of new methods for explaining their local and global properties. Which of these
methods is the best? It turns out that this is an ill-posed question. One
cannot sufficiently explain a black-box machine learning model using a single
method that gives only one perspective. Isolated explanations are prone to
misunderstanding, which inevitably leads to wrong or simplistic reasoning. This
problem is known as the Rashomon effect and refers to diverse, even
contradictory interpretations of the same phenomenon. Surprisingly, the
majority of methods developed for explainable machine learning focus on a
single aspect of the model behavior. In contrast, we showcase the problem of
explainability as an interactive and sequential analysis of a model. This paper
presents how different Explanatory Model Analysis (EMA) methods complement each
other and why it is essential to juxtapose them together. The introduced
process of Interactive EMA (IEMA) derives from the algorithmic side of
explainable machine learning and aims to embrace ideas developed in cognitive
sciences. We formalize the grammar of IEMA to describe potential human-model
dialogues. IEMA is implemented in the human-centered framework that adopts
interactivity, customizability and automation as its main traits. Combined,
these methods enhance the responsible approach to predictive modeling. Comment: 17 pages, 10 figures, 3 tables
Coupling active and sterile neutrinos in the cosmon plus seesaw framework
The cosmological evolution of neutrino energy densities driven by cosmon-type
field equations is introduced assuming that active and sterile neutrinos are
intrinsically connected by cosmon fields through the {\em seesaw} mechanism.
Interpreting sterile neutrinos as dark matter adiabatically coupled with dark
energy results in a natural decoupling of (active) mass varying neutrino
(MaVaN) equations. Identifying the dimensionless scale of the {\em seesaw}
mechanism with a power of the cosmological scale factor allows for embedding
the resulting masses into the generalized Chaplygin gas (GCG) scenario for
the dark sector. Without additional assumptions, our findings
establish a precise connection among three distinct frameworks: the cosmon
field dynamics for MaVaN's, the {\em seesaw} mechanism for dynamical mass
generation and the GCG scenario. Our results also corroborate previous
assertions that mass varying particles may be responsible for both the
stability issue and the cosmic acceleration of the universe. Comment: 12 pages, 2 figures
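For reference, the textbook type-I seesaw relation that the abstract builds on connects a Dirac mass $m_D$ and a large Majorana scale $M$ (this is the standard form, not the paper's cosmon-dependent version):

```latex
% Light active and heavy sterile masses from the type-I seesaw:
% the active mass is suppressed by the small dimensionless ratio m_D/M.
m_\nu \simeq \frac{m_D^2}{M}, \qquad m_s \simeq M
```

Promoting the scale $M$ to a function of the cosmological scale factor, as the abstract describes, is what makes the resulting neutrino masses dynamical.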
An Interpretable Deep Hierarchical Semantic Convolutional Neural Network for Lung Nodule Malignancy Classification
While deep learning methods are increasingly being applied to tasks such as
computer-aided diagnosis, these models are difficult to interpret, do not
incorporate prior domain knowledge, and are often considered as a "black-box."
The lack of model interpretability hinders them from being fully understood by
target users such as radiologists. In this paper, we present a novel
interpretable deep hierarchical semantic convolutional neural network (HSCNN)
to predict whether a given pulmonary nodule observed on a computed tomography
(CT) scan is malignant. Our network provides two levels of output: 1) low-level
radiologist semantic features, and 2) a high-level malignancy prediction score.
The low-level semantic outputs quantify the diagnostic features used by
radiologists and serve to explain how the model interprets the images in an
expert-driven manner. The information from these low-level tasks, along with
the representations learned by the convolutional layers, is then combined and
used to infer the high-level task of predicting nodule malignancy. This unified
architecture is trained by optimizing a global loss function including both
low- and high-level tasks, thereby learning all the parameters within a joint
framework. Our experimental results using the Lung Image Database Consortium
(LIDC) show that the proposed method not only produces interpretable lung
cancer predictions but also achieves significantly better results compared to
common 3D CNN approaches.
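The global loss the abstract describes combines a high-level malignancy term with several low-level semantic-feature terms. A minimal sketch, assuming binary cross-entropy for every task and a hypothetical weight `lam` (the paper's exact loss weighting is not reproduced here):

```python
import numpy as np

def joint_loss(p_malig, y_malig, p_sem, y_sem, lam=1.0):
    """Global multi-task loss: high-level malignancy BCE plus the sum of
    low-level semantic-task BCEs, weighted by `lam` (hypothetical sketch)."""
    eps = 1e-12  # numerical safety for log(0)
    bce = lambda p, y: -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    high = bce(p_malig, y_malig).mean()
    low = sum(bce(p, y).mean() for p, y in zip(p_sem, y_sem))
    return high + lam * low
```

Optimizing a single objective of this shape is what lets the network learn the radiologist-style semantic outputs and the malignancy score within one joint framework.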