The importance of neighborhood construction in local explanation methods has
already been highlighted in the literature, and several attempts have been made
to improve neighborhood quality for high-dimensional data, such as text,
by adopting generative models. Although the generators produce more realistic
samples, the intuitive sampling approaches in the existing solutions leave the
latent space underexplored. To overcome this problem, our work, focusing on
local model-agnostic explanations for text classifiers, proposes a progressive
approximation approach that refines the neighborhood of a to-be-explained
decision with a careful two-stage interpolation using counterfactuals as
landmarks. We explicitly specify two properties that generative models should
satisfy, namely reconstruction ability and locality preservation, to guide the
selection of generators for local explanation methods.
Moreover, having observed the opacity of generative models during this study, we
propose a second method that implements progressive neighborhood approximation
with probability-based edits as an alternative to the generator-based
solution. Both methods produce word-level and instance-level explanations
that benefit from the realistic neighborhood. Through
extensive experiments, we qualitatively and quantitatively demonstrate the
effectiveness of the two proposed methods.