2 research outputs found

    CEDR: Contextualized Embeddings for Document Ranking

    Get PDF
    Although considerable attention has been given to neural ranking architectures recently, far less attention has been paid to the term representations that are used as input to these models. In this work, we investigate how two pretrained contextualized language models (ELMo and BERT) can be utilized for ad-hoc document ranking. Through experiments on TREC benchmarks, we find that several existing neural ranking architectures can benefit from the additional context provided by contextualized language models. Furthermore, we propose a joint approach that incorporates BERT's classification vector into existing neural models and show that it outperforms state-of-the-art ad-hoc ranking baselines. We call this joint approach CEDR (Contextualized Embeddings for Document Ranking). We also address practical challenges in using these models for ranking, including the maximum input length imposed by BERT and runtime performance impacts of contextualized language models. (Comment: Appeared in SIGIR 2019, 4 pages.)
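
    The core of the joint approach, feeding BERT's classification vector into an existing neural ranking model alongside that model's own features, can be pictured with a minimal PyTorch sketch. This is an illustration, not CEDR's actual code: the stand-in base_ranker module, the 128-dimensional interaction features, and all layer sizes are assumptions made for the example.

    import torch
    import torch.nn as nn

    class JointRanker(nn.Module):
        """Sketch: concatenate BERT's [CLS] vector with the feature
        vector of an existing neural ranker and score them jointly.
        All dimensions are illustrative, not CEDR's configuration."""

        def __init__(self, base_feat_dim=32, bert_hidden=768):
            super().__init__()
            # Stand-in for an existing neural ranking architecture that
            # maps query/document interaction signals to a feature vector.
            self.base_ranker = nn.Sequential(
                nn.Linear(128, base_feat_dim), nn.ReLU())
            self.score = nn.Linear(bert_hidden + base_feat_dim, 1)

        def forward(self, cls_vec, interaction_feats):
            # cls_vec:           [batch, 768] BERT classification vector
            # interaction_feats: [batch, 128] hypothetical ranker inputs
            base = self.base_ranker(interaction_feats)
            joint = torch.cat([cls_vec, base], dim=-1)
            return self.score(joint)  # one relevance score per pair

    # Usage with random stand-in inputs:
    # scores = JointRanker()(torch.randn(2, 768), torch.randn(2, 128))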

    EsdRank: Connecting Query and Documents through External Semi-Structured Data

    Get PDF
    This paper presents EsdRank, a new technique for improving ranking using external semi-structured data such as controlled vocabularies and knowledge bases. EsdRank treats vocabularies, terms, and entities from external data as objects connecting the query and documents. Evidence used to link the query to objects and to rank documents is incorporated as query-object and object-document features, respectively. A latent listwise learning-to-rank algorithm, Latent-ListMLE, models the objects as a latent space between query and documents, and learns how to handle all evidence in a unified procedure from document relevance judgments. EsdRank is tested in two scenarios: using a knowledge base for web search, and using a controlled vocabulary for medical search. Experiments on TREC Web Track and OHSUMED data show significant improvements over state-of-the-art baselines.
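
    Latent-ListMLE builds on the standard ListMLE listwise loss, which scores a ranking by its Plackett-Luce likelihood. Below is a minimal PyTorch sketch of that base loss only; the latent-object marginalization the paper adds is omitted, and the function and variable names are illustrative assumptions.

    import torch

    def listmle_loss(scores, relevance):
        """Standard ListMLE (Plackett-Luce) negative log-likelihood,
        the base loss that Latent-ListMLE extends. `scores` and
        `relevance` are [n_docs] tensors for one query's documents."""
        # Order documents by ground-truth relevance (descending).
        order = torch.argsort(relevance, descending=True)
        s = scores[order]
        # NLL of this permutation under Plackett-Luce:
        #   sum_i [ logsumexp(s_i, ..., s_n) - s_i ]
        # computed via a reversed cumulative log-sum-exp.
        rev_cumlse = torch.flip(
            torch.logcumsumexp(torch.flip(s, [0]), dim=0), [0])
        return (rev_cumlse - s).sum()

    # Usage with illustrative scores and graded relevance labels:
    # loss = listmle_loss(torch.randn(5),
    #                     torch.tensor([3., 2., 2., 1., 0.]))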