4,634 research outputs found
Look at the First Sentence: Position Bias in Question Answering
Many extractive question answering models are trained to predict start and
end positions of answers. The choice of predicting answers as positions is
mainly due to its simplicity and effectiveness. In this study, we hypothesize
that when the distribution of the answer positions is highly skewed in the
training set (e.g., answers lie only in the k-th sentence of each passage), QA
models predicting answers as positions can learn spurious positional cues and
fail to give answers in different positions. We first illustrate this position
bias in popular extractive QA models such as BiDAF and BERT and thoroughly
examine how position bias propagates through each layer of BERT. To safely
deliver position information without position bias, we train models with
various de-biasing methods including entropy regularization and bias
ensembling. Among them, we found that using the prior distribution of answer
positions as a bias model is very effective at reducing position bias,
recovering the performance of BERT from 37.48% to 81.64% when trained on a
biased SQuAD dataset.Comment: 13 pages, EMNLP 202
FiLM: Visual Reasoning with a General Conditioning Layer
We introduce a general-purpose conditioning method for neural networks called
FiLM: Feature-wise Linear Modulation. FiLM layers influence neural network
computation via a simple, feature-wise affine transformation based on
conditioning information. We show that FiLM layers are highly effective for
visual reasoning - answering image-related questions which require a
multi-step, high-level process - a task which has proven difficult for standard
deep learning methods that do not explicitly model reasoning. Specifically, we
show on visual reasoning tasks that FiLM layers 1) halve state-of-the-art error
for the CLEVR benchmark, 2) modulate features in a coherent manner, 3) are
robust to ablations and architectural modifications, and 4) generalize well to
challenging, new data from few examples or even zero-shot.Comment: AAAI 2018. Code available at http://github.com/ethanjperez/film .
Extends arXiv:1707.0301
- …