1-PAGER: One Pass Answer Generation and Evidence Retrieval
We present 1-Pager, the first system that answers a question and retrieves
evidence using a single Transformer-based model and decoding process. 1-Pager
incrementally partitions the retrieval corpus using constrained decoding to
select a document and answer string, and we show that this is competitive with
comparable retrieve-and-read alternatives according to both retrieval and
answer accuracy metrics. 1-Pager also outperforms the equivalent closed-book
question answering model by grounding predictions in an evidence corpus. While
1-Pager is not yet on par with more expensive systems that read many more
documents before generating an answer, we argue that it provides an important
step toward attributed generation by folding retrieval into the
sequence-to-sequence paradigm that is currently dominant in NLP. We also show
that the search paths used to partition the corpus are easy to read and
understand, paving a way forward for interpretable neural retrieval.
Comment: Accepted at EMNLP 2023 (Findings)
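The abstract's central mechanism, incrementally partitioning the retrieval corpus with constrained decoding, can be made concrete with a small sketch. Everything below is a hypothetical illustration rather than the paper's implementation: the toy inverted index and the `score_fn` callback are assumptions standing in for a sequence-to-sequence model whose decoding is constrained to keyword continuations that keep at least one document alive.

```python
# Hypothetical sketch: greedy constrained decoding that narrows a corpus
# one keyword at a time. Names and interfaces are illustrative assumptions.
from collections import defaultdict

class ToyIndex:
    """Inverted index from keyword -> set of document ids."""
    def __init__(self, docs):
        self.postings = defaultdict(set)
        self.doc_ids = set(docs)
        for doc_id, keywords in docs.items():
            for kw in keywords:
                self.postings[kw].add(doc_id)

    def keywords_for(self, candidates):
        # Only keywords that keep at least one candidate document are allowed.
        return {kw for kw, docs in self.postings.items() if docs & candidates}

def partition_search(index, score_fn, max_steps=4):
    """Emit keywords greedily, shrinking the candidate document set each step."""
    candidates, path = set(index.doc_ids), []
    for _ in range(max_steps):
        allowed = index.keywords_for(candidates)
        if not allowed:
            break
        kw = max(allowed, key=lambda k: score_fn(path, k))  # constrained greedy choice
        path.append(kw)
        candidates &= index.postings[kw]                     # partition the corpus
        if len(candidates) == 1:                             # narrowed to one document
            break
    return path, candidates

# Toy usage with a trivial scorer that prefers longer keywords.
index = ToyIndex({"d1": {"paris", "capital"}, "d2": {"paris", "museum"}})
print(partition_search(index, lambda path, kw: len(kw)))
```

In the real system the scorer would be the Transformer's next-token distribution, so the emitted search path doubles as a human-readable trace of how the corpus was narrowed down.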
NAIL: Lexical Retrieval Indices with Efficient Non-Autoregressive Decoders
Neural document rerankers are extremely effective in terms of accuracy.
However, the best models require dedicated hardware for serving, which is
costly and often not feasible. To avoid this serving-time requirement, we
present a method of capturing up to 86% of the gains of a Transformer
cross-attention model with a lexicalized scoring function that only requires
10⁻⁶% of the Transformer's FLOPs per document and can be served using commodity
CPUs. When combined with a BM25 retriever, this approach matches the quality of
a state-of-the-art dual encoder retriever, which still requires an accelerator
for query encoding. We introduce NAIL (Non-Autoregressive Indexing with
Language models) as a model architecture that is compatible with recent
encoder-decoder and decoder-only large language models, such as T5, GPT-3 and
PaLM. This model architecture can leverage existing pre-trained checkpoints and
can be fine-tuned for efficiently constructing document representations that do
not require neural processing of queries.
Comment: To appear at EMNLP 2023
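The serving-time property the abstract claims is that neural work happens once per document at indexing time, while query processing needs only tokenization and table lookups. The sketch below illustrates that idea with entirely hypothetical data and names; it is not the paper's scoring function or weight model.

```python
# Hypothetical sketch: lexicalized scoring served on CPU.
# Indexing time (expensive, run once per document): a model assigns a weight to
# each vocabulary term for the document. Here the model output is hard-coded.
index = {
    "doc1": {"paris": 2.1, "capital": 1.4, "france": 1.8},
    "doc2": {"louvre": 2.5, "museum": 1.9, "paris": 0.9},
}

def score(doc_term_weights: dict[str, float], query: str) -> float:
    """Sum the document's precomputed weights for the query's tokens (no neural net)."""
    return sum(doc_term_weights.get(tok, 0.0) for tok in query.lower().split())

# Query time (cheap): tokenize the query and rank documents by lookup sums.
query = "capital of France"
ranked = sorted(index, key=lambda d: score(index[d], query), reverse=True)
print(ranked)  # doc1 first: it carries weight for both "capital" and "france"
```

In practice such term weights would be combined with a first-stage retriever such as BM25, which is what lets the overall pipeline run on commodity CPUs.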
- …