Topically Driven Neural Language Model
Language models are typically applied at the sentence level, without access
to the broader document context. We present a neural language model that
incorporates document context in the form of a topic model-like architecture,
thus providing a succinct representation of the broader document context
outside of the current sentence. Experiments over a range of datasets
demonstrate that our model outperforms a pure sentence-based model in terms of
language model perplexity, and leads to topics that are potentially more
coherent than those produced by a standard LDA topic model. Our model also has
the ability to generate related sentences for a topic, providing another way to
interpret topics.
Comment: 11 pages, Proceedings of the 55th Annual Meeting of the Association
for Computational Linguistics (ACL 2017) (to appear).
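A minimal sketch of the core idea above: a document-level topic vector is injected into the sentence-level language model's hidden state before predicting the next word. The layer sizes, the additive combination, and all weight matrices here are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, HIDDEN, N_TOPICS = 50, 16, 4

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Document-level topic proportions, as a topic-model-like component
# would infer them for the whole document.
doc_topics = softmax(rng.normal(size=N_TOPICS))

# Sentence-level recurrent hidden state at the current position.
h = rng.normal(size=HIDDEN)

# Project the topic vector into the hidden space and add it to the
# sentence state, so the next-word prediction sees document context.
W_topic = rng.normal(size=(HIDDEN, N_TOPICS))
W_out = rng.normal(size=(VOCAB, HIDDEN))

context = h + W_topic @ doc_topics           # inject document context
next_word_probs = softmax(W_out @ context)   # distribution over vocab

assert np.isclose(next_word_probs.sum(), 1.0)
```

With trained (rather than random) weights, the topic term shifts probability mass toward words consistent with the document's themes, which is what lets the model beat a pure sentence-based baseline on perplexity.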
Syntactic Topic Models
The syntactic topic model (STM) is a Bayesian nonparametric model of language
that discovers latent distributions of words (topics) that are both
semantically and syntactically coherent. The STM models dependency parsed
corpora where sentences are grouped into documents. It assumes that each word
is drawn from a latent topic chosen by combining document-level features and
the local syntactic context. Each document has a distribution over latent
topics, as in topic models, which provides the semantic consistency. Each
element in the dependency parse tree also has a distribution over the topics of
its children, as in latent-state syntax models, which provides the syntactic
consistency. These distributions are convolved so that the topic of each word
is likely under both its document and syntactic context. We derive a fast
posterior inference algorithm based on variational methods. We report
qualitative and quantitative studies on both synthetic data and hand-parsed
documents. We show that the STM is a more predictive model of language than
current models based only on syntax or only on topics.
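The "convolved" combination described above can be sketched concretely: a word's topic must be likely under both its document's topic distribution and the distribution its syntactic parent places over children's topics. A simple (assumed) way to realize that is a renormalized elementwise product; the numbers below are made up for illustration.

```python
import numpy as np

def convolve_topics(doc_dist, parent_dist):
    """Combine a document's topic distribution with the topic
    distribution implied by a word's syntactic parent: take the
    elementwise product and renormalize, so a topic is probable
    only if BOTH contexts support it."""
    joint = np.asarray(doc_dist) * np.asarray(parent_dist)
    return joint / joint.sum()

doc = [0.7, 0.2, 0.1]      # semantic context: document favors topic 0
parent = [0.1, 0.8, 0.1]   # syntactic context: parent favors topic 1
word_topics = convolve_topics(doc, parent)
```

Here topic 1 wins: it is only moderately likely under the document but strongly supported by the syntactic context, and the product rewards joint agreement over either signal alone.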
CEDR: Contextualized Embeddings for Document Ranking
Although considerable attention has been given to neural ranking
architectures recently, far less attention has been paid to the term
representations that are used as input to these models. In this work, we
investigate how two pretrained contextualized language models (ELMo and BERT)
can be utilized for ad-hoc document ranking. Through experiments on TREC
benchmarks, we find that several existing neural ranking architectures can
benefit from the additional context provided by contextualized language models.
Furthermore, we propose a joint approach that incorporates BERT's
classification vector into existing neural models and show that it outperforms
state-of-the-art ad-hoc ranking baselines. We call this joint approach CEDR
(Contextualized Embeddings for Document Ranking). We also address practical
challenges in using these models for ranking, including the maximum input
length imposed by BERT and runtime performance impacts of contextualized
language models.
Comment: Appeared in SIGIR 2019, 4 pages.
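One of the practical challenges mentioned above, BERT's maximum input length, is commonly handled by splitting a long document into overlapping windows and scoring each chunk separately. The sliding-window approach and the specific `max_len`/`stride` values below are assumptions for illustration, not CEDR's exact procedure.

```python
def chunk_tokens(tokens, max_len=512, stride=256):
    """Split a long token sequence into overlapping windows so each
    fits within a fixed maximum input length; per-chunk relevance
    scores can later be aggregated into a document score."""
    if len(tokens) <= max_len:
        return [tokens]
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break
    return chunks

doc = list(range(1000))             # stand-in for 1000 token ids
chunks = chunk_tokens(doc)
print([len(c) for c in chunks])     # three overlapping windows
```

The overlap (stride smaller than the window) keeps every token visible in at least one chunk with some surrounding context, at the cost of scoring some tokens twice.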
Explainable and Discourse Topic-aware Neural Language Understanding
Marrying topic models and language models exposes language understanding to a
broader source of document-level context beyond sentences via topics. While
introducing topical semantics into language models, existing approaches
incorporate latent document-topic proportions but ignore topical discourse
within the sentences of a document. This work extends that line of research by
additionally introducing an explainable topic representation into language
understanding, obtained from a set of key terms corresponding to each latent
topic in the proportion. Moreover, we retain sentence-topic associations
alongside document-topic associations by modeling topical discourse for every
sentence in the document. We present a novel neural composite language model
that exploits both the latent and explainable topics along with topical
discourse at sentence-level in a joint learning framework of topic and language
models. Experiments over a range of tasks such as language modeling, word sense
disambiguation, document classification, retrieval and text generation
demonstrate the ability of the proposed model to improve language
understanding.
Comment: Accepted at ICML 2020 (13 pages, 2 figures), acknowledgements added.
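The "explainable topic representation" described above grounds each latent topic in a set of key terms. A minimal sketch of one plausible way to extract such terms, assuming access to a topic's word distribution (the helper and data below are hypothetical, not the paper's code):

```python
def topic_key_terms(topic_word_probs, vocab, k=3):
    """Return the k most probable words under a topic's word
    distribution: an explainable, human-readable stand-in for the
    latent topic."""
    ranked = sorted(zip(topic_word_probs, vocab), reverse=True)
    return [word for _, word in ranked[:k]]

vocab = ["goal", "match", "vote", "election", "league", "party"]
sports = [0.30, 0.25, 0.02, 0.03, 0.35, 0.05]   # a topic-word dist.
print(topic_key_terms(sports, vocab))           # top words of the topic
```

Conditioning the language model on these key terms, rather than only on an opaque topic-proportion vector, is what makes the topical context interpretable.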
Lost in the Middle: How Language Models Use Long Contexts
While recent language models have the ability to take long contexts as input,
relatively little is known about how well the language models use longer
context. We analyze language model performance on two tasks that require
identifying relevant information within their input contexts: multi-document
question answering and key-value retrieval. We find that performance is often
highest when relevant information occurs at the beginning or end of the input
context, and significantly degrades when models must access relevant
information in the middle of long contexts. Furthermore, performance
substantially decreases as the input context grows longer, even for explicitly
long-context models. Our analysis provides a better understanding of how
language models use their input context and provides new evaluation protocols
for future long-context models.
Comment: 15 pages, 17 figures.
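The key-value retrieval probe described above can be sketched as a prompt builder that places the gold pair at a chosen position among distractors; sweeping the position then reveals the beginning/middle/end effect. The prompt template below is an assumption for illustration, not the paper's exact format.

```python
def build_kv_prompt(pairs, gold_key, gold_position):
    """Build a key-value retrieval probe: insert the gold pair at a
    chosen position among the distractor pairs, then ask for its
    value. Varying gold_position tests positional sensitivity."""
    gold = next(p for p in pairs if p[0] == gold_key)
    distractors = [p for p in pairs if p[0] != gold_key]
    ordered = (distractors[:gold_position] + [gold]
               + distractors[gold_position:])
    lines = [f"{k}: {v}" for k, v in ordered]
    return "\n".join(lines) + f"\nWhat is the value of {gold_key}?"

pairs = [("k0", "v0"), ("k1", "v1"), ("k2", "v2"), ("k3", "v3")]
print(build_kv_prompt(pairs, "k2", 0).splitlines()[0])  # gold comes first
```

Evaluating a model on such prompts while sliding `gold_position` from the start to the end of the context reproduces the U-shaped accuracy curve the abstract reports.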