Gradient Hedging for Intensively Exploring Salient Interpretation beyond Neuron Activation

Abstract

Hedging is a strategy for reducing the potential risk of an investment by taking an opposite position in a related asset. Motivated by this technique from equity markets, we introduce a method that decomposes output predictions into intensive salient attributions by hedging the evidence for a decision. We analyze the conventional approach to the evidence for a decision and discuss the paradox of the conservation rule. We then define evidence as the gap between positive and negative influence among the gradient-derived initial contribution maps, and we propagate the antagonistic elements to the evidence as suppressors, according to a degree of positive attribution set by user preference. In addition, we account for the severed or sparse contributions of inactivated neurons, which are mostly irrelevant to a decision, and this makes the interpretation more robust. We evaluate the method in a verified experimental environment using the pointing game, most-relevant-first region insertion, outside-inside relevance ratio, and mean average precision on the PASCAL VOC 2007, MS COCO 2014, and ImageNet datasets. The results demonstrate that our method outperforms existing attribution methods in distinctive, intensive, and intuitive visualization, with robustness and applicability to general models.
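For readers unfamiliar with the pointing game named among the assessments, the following is a minimal sketch of how such a score is commonly computed: an attribution map counts as a hit when its maximum falls inside an annotated object region, and accuracy is the fraction of hits. It is not the authors' evaluation code. The function names, the bounding-box representation, and the `tolerance` parameter are assumptions made for illustration; published protocols may use segmentation masks, a pixel tolerance, or per-class averaging.

```python
from typing import List, Sequence, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max), inclusive pixel coordinates.
Box = Tuple[int, int, int, int]
SaliencyMap = Sequence[Sequence[float]]


def argmax_location(saliency: SaliencyMap) -> Tuple[int, int]:
    """Return (x, y) of the highest-scoring pixel; the first one wins on ties."""
    best_x, best_y, best_val = 0, 0, float("-inf")
    for y, row in enumerate(saliency):
        for x, value in enumerate(row):
            if value > best_val:
                best_x, best_y, best_val = x, y, value
    return best_x, best_y


def is_hit(saliency: SaliencyMap, boxes: List[Box], tolerance: int = 0) -> bool:
    """True if the saliency maximum lies in any box, widened by `tolerance` pixels."""
    px, py = argmax_location(saliency)
    return any(
        x0 - tolerance <= px <= x1 + tolerance
        and y0 - tolerance <= py <= y1 + tolerance
        for x0, y0, x1, y1 in boxes
    )


def pointing_game_accuracy(
    samples: List[Tuple[SaliencyMap, List[Box]]], tolerance: int = 0
) -> float:
    """Fraction of samples whose saliency maximum hits an annotated box."""
    if not samples:
        return 0.0
    hits = sum(1 for saliency, boxes in samples if is_hit(saliency, boxes, tolerance))
    return hits / len(samples)


if __name__ == "__main__":
    # Toy 3x3 attribution map whose maximum sits at pixel (x=1, y=1).
    toy_map = [
        [0.1, 0.2, 0.0],
        [0.0, 0.9, 0.1],
        [0.0, 0.0, 0.3],
    ]
    samples = [
        (toy_map, [(1, 1, 2, 2)]),  # maximum inside the box
        (toy_map, [(2, 2, 2, 2)]),  # maximum outside the box
    ]
    print(pointing_game_accuracy(samples))
```

Because only the location of the maximum matters, this score rewards attributions that concentrate on the target object, which is the property the abstract describes as distinctive and intensive visualization.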
