Generating Realistic Natural Language Counterfactuals

Authors: Floris Bex, Ad Feelders, Xuanjing Huang, Marie-Francine Moens, Marcel Robeer, Lucia Specia, Scott Wen-tau Yih
Organizational units: Intelligent Systems, Sub Algorithmic Data Analysis, Sub Intelligent Systems
Publication date: 1 January 2021

Abstract

Counterfactuals are a valuable means for understanding decisions made by ML systems. However, the counterfactuals generated by the methods currently available for natural language text either are unrealistic or introduce imperceptible changes. We propose CounterfactualGAN: a method that combines a conditional GAN and the embeddings of a pretrained BERT encoder to model-agnostically generate realistic natural language text counterfactuals for explaining regression and classification tasks. Experimental results show that our method produces perceptibly distinguishable counterfactuals, while outperforming four baseline methods on fidelity and human judgments of naturalness, across multiple datasets and multiple predictive models.

Available versions: NARCIS (last updated on 12/10/2022)