State-of-the-art generalisation research in NLP: a taxonomy and review
The ability to generalise well is one of the primary desiderata of natural
language processing (NLP). Yet, what 'good generalisation' entails and how it
should be evaluated is not well understood, nor are there any common standards
to evaluate it. In this paper, we aim to lay the groundwork to improve both of
these issues. We present a taxonomy for characterising and understanding
generalisation research in NLP, we use that taxonomy to present a comprehensive
map of published generalisation studies, and we make recommendations for which
areas might deserve attention in the future. Our taxonomy is based on an
extensive literature review of generalisation research, and contains five axes
along which studies can differ: their main motivation, the type of
generalisation they aim to solve, the type of data shift they consider, the
source by which this data shift is obtained, and the locus of the shift within
the modelling pipeline. We use our taxonomy to classify over 400 previous
papers that test generalisation, for a total of more than 600 individual
experiments. Considering the results of this review, we present an in-depth
analysis of the current state of generalisation research in NLP, and make
recommendations for the future. Along with this paper, we release a webpage
where the results of our review can be dynamically explored, and which we
intend to update as new NLP generalisation studies are published. With this
work, we aim to make steps towards making state-of-the-art generalisation
testing the new status quo in NLP.
Comment: 35 pages of content + 53 pages of references
Vision Transformer Adapters for Generalizable Multitask Learning
We introduce the first multitasking vision transformer adapters that learn
generalizable task affinities which can be applied to novel tasks and domains.
Integrated into an off-the-shelf vision transformer backbone, our adapters can
simultaneously solve multiple dense vision tasks in a parameter-efficient
manner, unlike existing multitasking transformers that are parametrically
expensive. In contrast to concurrent methods, we do not require retraining or
fine-tuning whenever a new task or domain is added. We introduce a task-adapted
attention mechanism within our adapter framework that combines gradient-based
task similarities with attention-based ones. The learned task affinities
generalize to the following settings: zero-shot task transfer, unsupervised
domain adaptation, and generalization without fine-tuning to novel domains. We
demonstrate that our approach outperforms not only the existing convolutional
neural network-based multitasking methods but also the vision transformer-based
ones. Our project page is at https://ivrl.github.io/VTAGML.
Comment: Accepted to ICCV 202