Unsupervised Cross-Task Generalization via Retrieval Augmentation
Humans can perform unseen tasks by recalling relevant skills that are
acquired previously and then generalizing them to the target tasks, even if
there is no supervision at all. In this paper, we aim to improve such
cross-task generalization ability of massive multi-task language models such as
T0 (Sanh et al., 2021) in an unsupervised setting. We propose a
retrieval-augmentation method named ReCross that takes a few unlabelled
examples as queries to retrieve a small subset of upstream data and uses them
to update the multi-task model for better generalization. Our empirical results
show that the proposed ReCross consistently outperforms non-retrieval baselines
by a significant margin.
Comment: Project website: https://inklab.usc.edu/ReCross
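The retrieval step described in this abstract can be illustrated with a minimal sketch. It assumes a simple bag-of-words cosine similarity as a stand-in for whatever retriever the paper actually uses, and it omits the model update entirely. The function names (`bow`, `cosine`, `retrieve_upstream`) and the toy data are hypothetical and are not taken from ReCross.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector for a text (lowercased, whitespace tokens)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_upstream(queries, upstream, k):
    """Return the k upstream examples most similar to any unlabelled query.

    Assumption: each upstream example is scored by its best match over
    the queries; other aggregation choices are equally plausible.
    """
    query_vecs = [bow(q) for q in queries]
    scored = []
    for idx, example in enumerate(upstream):
        vec = bow(example)
        score = max((cosine(q, vec) for q in query_vecs), default=0.0)
        scored.append((score, idx))
    # Highest score first; ties broken by original position.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [upstream[i] for _, i in scored[:k]]


queries = ["is this movie review positive or negative"]
upstream = [
    "translate this sentence to french",
    "is this product review positive",
    "summarize the article",
]
subset = retrieve_upstream(queries, upstream, k=1)
```

In a full pipeline, the retrieved subset would then be used to update the multi-task model; that step depends on the training setup and is left out here.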
Pretrained Transformers for Text Ranking: BERT and Beyond
The goal of text ranking is to generate an ordered list of texts retrieved
from a corpus in response to a query. Although the most common formulation of
text ranking is search, instances of the task can also be found in many natural
language processing applications. This survey provides an overview of text
ranking with neural network architectures known as transformers, of which BERT
is the best-known example. The combination of transformers and self-supervised
pretraining has been responsible for a paradigm shift in natural language
processing (NLP), information retrieval (IR), and beyond. In this survey, we
provide a synthesis of existing work as a single point of entry for
practitioners who wish to gain a better understanding of how to apply
transformers to text ranking problems and researchers who wish to pursue work
in this area. We cover a wide range of modern techniques, grouped into two
high-level categories: transformer models that perform reranking in multi-stage
architectures and dense retrieval techniques that perform ranking directly.
There are two themes that pervade our survey: techniques for handling long
documents, beyond typical sentence-by-sentence processing in NLP, and
techniques for addressing the tradeoff between effectiveness (i.e., result
quality) and efficiency (e.g., query latency, model and index size). Although
transformer architectures and pretraining techniques are recent innovations,
many aspects of how they are applied to text ranking are relatively well
understood and represent mature techniques. However, there remain many open
research questions, and thus in addition to laying out the foundations of
pretrained transformers for text ranking, this survey also attempts to
prognosticate where the field is heading.
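The multi-stage reranking architecture mentioned in this abstract can be sketched as a cheap first-stage retriever followed by a more expensive scorer applied only to the retrieved candidates. This is an illustrative sketch under stated assumptions: term overlap stands in for the first stage, and `toy_scorer` is a hypothetical placeholder for a transformer reranker, not a real model.

```python
def first_stage(query, corpus, k):
    """Cheap candidate generation: rank documents by query-term overlap."""
    q_terms = set(query.lower().split())
    scored = [
        (len(q_terms & set(doc.lower().split())), i)
        for i, doc in enumerate(corpus)
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Keep at most k candidates that share at least one term with the query.
    return [i for score, i in scored[:k] if score > 0]


def rerank(query, corpus, candidates, scorer):
    """Second stage: reorder only the candidates with a costlier scorer."""
    return sorted(candidates, key=lambda i: -scorer(query, corpus[i]))


def toy_scorer(query, doc):
    """Hypothetical stand-in for a transformer reranker.

    Scores a document by the fraction of its tokens that are query terms.
    """
    q_terms = set(query.lower().split())
    tokens = doc.lower().split()
    return sum(t in q_terms for t in tokens) / len(tokens) if tokens else 0.0


corpus = [
    "bert ranking models for search and many other unrelated things",
    "bert ranking",
    "cooking pasta at home",
]
candidates = first_stage("bert ranking", corpus, k=2)
ranking = rerank("bert ranking", corpus, candidates, toy_scorer)
```

The design point is the one the abstract raises as an effectiveness and efficiency tradeoff: the expensive scorer runs on `k` candidates rather than the whole corpus.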
The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision
We propose the Neuro-Symbolic Concept Learner (NS-CL), a model that learns
visual concepts, words, and semantic parsing of sentences without explicit
supervision on any of them; instead, our model learns by simply looking at
images and reading paired questions and answers. Our model builds an
object-based scene representation and translates sentences into executable,
symbolic programs. To bridge the learning of two modules, we use a
neuro-symbolic reasoning module that executes these programs on the latent
scene representation. Analogous to human concept learning, the perception
module learns visual concepts based on the language description of the object
being referred to. Meanwhile, the learned visual concepts facilitate learning
new words and parsing new sentences. We use curriculum learning to guide the
search over the large compositional space of images and language. Extensive
experiments demonstrate the accuracy and efficiency of our model on learning
visual concepts, word representations, and semantic parsing of sentences.
Further, our method allows easy generalization to new object attributes,
compositions, language concepts, scenes and questions, and even new program
domains. It also empowers applications including visual question answering and
bidirectional image-text retrieval.
Comment: ICLR 2019 (Oral). Project page: http://nscl.csail.mit.edu
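The idea of executing a symbolic program on an object-based scene representation can be illustrated with a toy interpreter. This is only a sketch: in the abstract the scene representation is latent and the concepts are learned, whereas here the scene is a hand-labelled list of objects. The operation names (`filter`, `count`, `exist`) are assumptions made for illustration and are not taken from the NS-CL program language.

```python
def execute(program, scene):
    """Run a list of (operation, argument) steps over a list of objects.

    Each object is a dict of attribute name to attribute value.
    """
    state = list(scene)
    for op, arg in program:
        if op == "filter":
            # Keep objects that have the concept as any attribute value.
            state = [obj for obj in state if arg in obj.values()]
        elif op == "count":
            return len(state)
        elif op == "exist":
            return len(state) > 0
        else:
            raise ValueError(f"unknown op: {op}")
    return state


scene = [
    {"color": "red", "shape": "cube"},
    {"color": "blue", "shape": "cube"},
    {"color": "red", "shape": "sphere"},
]

# "How many red cubes are there?" as a hypothetical program.
how_many_red_cubes = [("filter", "red"), ("filter", "cube"), ("count", None)]
answer = execute(how_many_red_cubes, scene)
```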
Self-Adaptive Named Entity Recognition by Retrieving Unstructured Knowledge
Although named entity recognition (NER) helps us to extract domain-specific
entities from text (e.g., artists in the music domain), it is costly to create
a large amount of training data or a structured knowledge base to perform
accurate NER in the target domain. Here, we propose self-adaptive NER, which
retrieves external knowledge from unstructured text to learn the usages of
entities that have not been learned well. To retrieve useful knowledge for NER,
we design an effective two-stage model that retrieves unstructured knowledge
using uncertain entities as queries. Our model predicts the entities in the
input and then identifies those predicted with low confidence. Then, it
retrieves knowledge by using these uncertain entities as queries and
concatenates the retrieved text to the original input to revise the prediction.
Experiments on CrossNER datasets demonstrated that our model outperforms strong
baselines by 2.35 F1 points.
Comment: EACL2023 (long
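The two-stage flow in this abstract (predict, find low-confidence entities, retrieve text for them, concatenate, then revise) can be sketched as follows. The sketch covers only the query-and-concatenate part and makes several assumptions: predictions arrive as `(entity, confidence)` pairs, the confidence threshold is 0.5, retrieval is a substring match, and the separator is `" [SEP] "`. None of these choices come from the paper, and all function names are hypothetical.

```python
def uncertain_entities(predictions, threshold):
    """Entities whose predicted confidence falls below the threshold."""
    return [entity for entity, conf in predictions if conf < threshold]


def retrieve(entity, knowledge, limit=1):
    """Assumed retriever: passages that mention the entity (substring match)."""
    hits = [p for p in knowledge if entity.lower() in p.lower()]
    return hits[:limit]


def augment_input(text, predictions, knowledge, threshold=0.5, sep=" [SEP] "):
    """Concatenate passages retrieved for uncertain entities to the input."""
    retrieved = []
    for entity in uncertain_entities(predictions, threshold):
        for passage in retrieve(entity, knowledge):
            if passage not in retrieved:  # avoid duplicate passages
                retrieved.append(passage)
    return sep.join([text] + retrieved)


text = "I love Radiohead and Kid A."
# Hypothetical first-pass output of an NER model.
predictions = [("Radiohead", 0.95), ("Kid A", 0.3)]
knowledge = ["Kid A is an album by Radiohead.", "Paris is in France."]
augmented = augment_input(text, predictions, knowledge)
```

The augmented string would then be passed back to the NER model for the second, revised prediction.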