1,925 research outputs found
Pathologies of Neural Models Make Interpretations Difficult
One way to interpret neural model predictions is to highlight the most
important input features---for example, a heatmap visualization over the words
in an input sentence. In existing interpretation methods for NLP, a word's
importance is determined by either input perturbation---measuring the decrease
in model confidence when that word is removed---or by the gradient with respect
to that word. To understand the limitations of these methods, we use input
reduction, which iteratively removes the least important word from the input.
This exposes pathological behaviors of neural models: the remaining words
appear nonsensical to humans and are not the ones determined as important by
interpretation methods. As we confirm with human experiments, the reduced
examples lack information to support the prediction of any label, but models
still make the same predictions with high confidence. To explain these
counterintuitive results, we draw connections to adversarial examples and
confidence calibration: pathological behaviors reveal difficulties in
interpreting neural models trained with maximum likelihood. To mitigate their
deficiencies, we fine-tune the models by encouraging high entropy outputs on
reduced examples. Fine-tuned models become more interpretable under input
reduction without accuracy loss on regular examples.Comment: EMNLP 2018 camera read
Improving Zero-shot Reader by Reducing Distractions from Irrelevant Documents in Open-Domain Question Answering
Large language models (LLMs) enable zero-shot approaches in open-domain
question answering (ODQA), yet with limited advancements as the reader is
compared to the retriever. This study aims at the feasibility of a zero-shot
reader that addresses the challenges of computational cost and the need for
labeled data. We find that LLMs are distracted due to irrelevant documents in
the retrieved set and the overconfidence of the generated answers when they are
exploited as zero-shot readers. To tackle these problems, we mitigate the
impact of such documents via Distraction-aware Answer Selection (DAS) with a
negation-based instruction and score adjustment for proper answer selection.
Experimental results show that our approach successfully handles distraction
across diverse scenarios, enhancing the performance of zero-shot readers.
Furthermore, unlike supervised readers struggling with unseen data, zero-shot
readers demonstrate outstanding transferability without any training.Comment: Findings of EMNLP 2023 Camera Read
- …