77 research outputs found
Unfooling Perturbation-Based Post Hoc Explainers
Monumental advancements in artificial intelligence (AI) have attracted the
interest of doctors, lenders, judges, and other professionals. While these
high-stakes decision-makers are optimistic about the technology, those familiar
with AI systems are wary about the lack of transparency of their decision-making
processes. Perturbation-based post hoc explainers offer a model agnostic means
of interpreting these systems while only requiring query-level access. However,
recent work demonstrates that these explainers can be fooled adversarially.
This discovery has adverse implications for auditors, regulators, and other
sentinels. With this in mind, several natural questions arise - how can we
audit these black box systems? And how can we ascertain that the auditee is
complying with the audit in good faith? In this work, we rigorously formalize
this problem and devise a defense against adversarial attacks on
perturbation-based explainers. We propose algorithms for the detection
(CAD-Detect) and defense (CAD-Defend) of these attacks, which are aided by our
novel conditional anomaly detection approach, KNN-CAD. We demonstrate that our
approach successfully detects whether a black box system adversarially conceals
its decision-making process and mitigates the adversarial attack on real-world
data for the prevalent explainers, LIME and SHAP.
Comment: Accepted to AAAI-23. 9 pages (not including references and
supplemental
How Well Do Feature-Additive Explainers Explain Feature-Additive Predictors?
Surging interest in deep learning from high-stakes domains has precipitated
concern over the inscrutable nature of black box neural networks. Explainable
AI (XAI) research has led to an abundance of explanation algorithms for these
black boxes. Such post hoc explainers produce human-comprehensible
explanations; however, their fidelity with respect to the model is not well
understood - explanation evaluation remains one of the most challenging issues
in XAI. In this paper, we ask a targeted but important question: can popular
feature-additive explainers (e.g., LIME, SHAP, SHAPR, MAPLE, and PDP) explain
feature-additive predictors? Herein, we evaluate such explainers on ground
truth that is analytically derived from the additive structure of a model. We
demonstrate the efficacy of our approach in understanding these explainers
applied to symbolic expressions, neural networks, and generalized additive
models on thousands of synthetic and several real-world tasks. Our results
suggest that all explainers eventually fail to correctly attribute the
importance of features, especially when a decision-making process involves
feature interactions.
Comment: Accepted to NeurIPS Workshop XAI in Action: Past, Present, and Future
Applications. arXiv admin note: text overlap with arXiv:2106.0837
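The idea of ground truth "analytically derived from the additive structure of a model" can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three component functions, the zero baseline, and the toy occlusion explainer are hypothetical and are not the paper's framework, nor are they LIME or SHAP. For a purely additive model the per-feature contributions can be read directly off the components, so an explainer's output can be checked exactly; once an interaction term is added, the toy explainer's per-feature scores no longer sum to the change in the model output.

```python
import math

# Hypothetical additive model: f(x) = f_0(x_0) + f_1(x_1) + f_2(x_2).
COMPONENTS = [lambda v: v ** 2, math.sin, lambda v: 3.0 * v]


def additive_model(x):
    return sum(f_j(x_j) for f_j, x_j in zip(COMPONENTS, x))


def interacting_model(x):
    # Same model plus an x_0 * x_1 interaction that no single feature owns.
    return additive_model(x) + x[0] * x[1]


def ground_truth(x, baseline):
    """Per-feature contribution relative to a baseline, read off the additive structure."""
    return [f_j(x_j) - f_j(b_j) for f_j, x_j, b_j in zip(COMPONENTS, x, baseline)]


def occlusion_attribution(model, x, baseline):
    """Toy explainer (an assumption): output change when one feature is reset to its baseline."""
    full = model(x)
    scores = []
    for j in range(len(x)):
        occluded = list(x)
        occluded[j] = baseline[j]
        scores.append(full - model(occluded))
    return scores


x = [2.0, 0.5, -1.0]
baseline = [0.0, 0.0, 0.0]

# Additive case: the toy explainer should recover the ground truth.
truth = ground_truth(x, baseline)
scores = occlusion_attribution(additive_model, x, baseline)
additive_error = max(abs(s - t) for s, t in zip(scores, truth))

# Interaction case: per-feature scores count the x_0 * x_1 term twice,
# so their sum overshoots the actual change in the model output.
scores_int = occlusion_attribution(interacting_model, x, baseline)
output_change = interacting_model(x) - interacting_model(baseline)
interaction_gap = sum(scores_int) - output_change

print(f"additive error:  {additive_error:.2e}")
print(f"interaction gap: {interaction_gap:.3f}")
```

In this sketch the gap equals the value of the interaction term at `x`, because each of the two interacting features is credited with it once.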
On the Objective Evaluation of Post Hoc Explainers
Many applications of data-driven models demand transparency of decisions,
especially in health care, criminal justice, and other high-stakes
environments. Modern trends in machine learning research have led to algorithms
that are increasingly intricate to the degree that they are considered to be
black boxes. In an effort to reduce the opacity of decisions, methods have been
proposed to construe the inner workings of such models in a
human-comprehensible manner. These post hoc techniques are described as being
universal explainers - capable of faithfully augmenting decisions with
algorithmic insight. Unfortunately, there is little agreement about what
constitutes a "good" explanation. Moreover, current methods of explanation
evaluation are derived from either subjective or proxy means. In this work, we
propose a framework for the evaluation of post hoc explainers on ground truth
that is directly derived from the additive structure of a model. We demonstrate
the efficacy of the framework in understanding explainers by evaluating popular
explainers on thousands of synthetic and several real-world tasks. The
framework unveils that explanations may be accurate but misattribute the
importance of individual features.
Comment: 14 pages, 4 figures. Under review
Neuron Segmentation Using Deep Complete Bipartite Networks
In this paper, we consider the problem of automatically segmenting neuronal
cells in dual-color confocal microscopy images. This problem is a key task in
various quantitative analysis applications in neuroscience, such as tracing
cell genesis in Danio rerio (zebrafish) brains. Deep learning, especially using
fully convolutional networks (FCN), has profoundly changed segmentation
research in biomedical imaging. We face two major challenges in this problem.
First, neuronal cells may form dense clusters, making it difficult to correctly
identify all individual cells (even to human experts). Consequently,
segmentation results of the known FCN-type models are not accurate enough.
Second, pixel-wise ground truth is difficult to obtain. Only a limited amount
of approximate instance-wise annotation can be collected, which makes the
training of FCN models quite cumbersome. We propose a new FCN-type deep
learning model, called deep complete bipartite networks (CB-Net), and a new
scheme for leveraging approximate instance-wise annotation to train our
pixel-wise prediction model. Evaluated using seven real datasets, our proposed
CB-Net model outperforms the state-of-the-art FCN models and produces
neuron segmentation results of remarkable quality.
Comment: miccai 201
UG^2: a Video Benchmark for Assessing the Impact of Image Restoration and Enhancement on Automatic Visual Recognition
Advances in image restoration and enhancement techniques have led to
discussion about how such algorithms can be applied as a pre-processing step to
improve automatic visual recognition. In principle, techniques like deblurring
and super-resolution should yield improvements by de-emphasizing noise and
increasing signal in an input image. But the historically divergent goals of
the computational photography and visual recognition communities have created a
significant need for more work in this direction. To facilitate new research,
we introduce a new benchmark dataset called UG^2, which contains three
difficult real-world scenarios: uncontrolled videos taken by UAVs and manned
gliders, as well as controlled videos taken on the ground. Over 160,000
annotated frames for hundreds of ImageNet classes are available, which are used
for baseline experiments that assess the impact of known and unknown image
artifacts and other conditions on common deep learning-based object
classification approaches. Further, current image restoration and enhancement
techniques are evaluated by determining whether or not they improve baseline
classification performance. Results show that there is plenty of room for
algorithmic innovation, making this dataset a useful tool going forward.
Comment: Supplemental material: https://goo.gl/vVM1xe, Dataset:
https://goo.gl/AjA6En, CVPR 2018 Prize Challenge: ug2challenge.or