A Living Lab Architecture for Reproducible Shared Task Experimentation
No existing evaluation infrastructure for shared tasks currently supports both reproducible online and offline experiments. In this work, we present an architecture that ties both types of experiments together, with a focus on reproducibility. Readers are given a technical description of the infrastructure and details on how to contribute their own experiments to upcoming evaluation tasks.
Pretrained Transformers for Text Ranking: BERT and Beyond
The goal of text ranking is to generate an ordered list of texts retrieved
from a corpus in response to a query. Although the most common formulation of
text ranking is search, instances of the task can also be found in many natural
language processing applications. This survey provides an overview of text
ranking with neural network architectures known as transformers, of which BERT
is the best-known example. The combination of transformers and self-supervised
pretraining has been responsible for a paradigm shift in natural language
processing (NLP), information retrieval (IR), and beyond. In this survey, we
provide a synthesis of existing work as a single point of entry for
practitioners who wish to gain a better understanding of how to apply
transformers to text ranking problems and researchers who wish to pursue work
in this area. We cover a wide range of modern techniques, grouped into two
high-level categories: transformer models that perform reranking in multi-stage
architectures and dense retrieval techniques that perform ranking directly.
There are two themes that pervade our survey: techniques for handling long
documents, beyond typical sentence-by-sentence processing in NLP, and
techniques for addressing the tradeoff between effectiveness (i.e., result
quality) and efficiency (e.g., query latency, model and index size). Although
transformer architectures and pretraining techniques are recent innovations,
many aspects of how they are applied to text ranking are relatively well
understood and represent mature techniques. However, there remain many open
research questions, and thus in addition to laying out the foundations of
pretrained transformers for text ranking, this survey also attempts to
prognosticate where the field is heading.
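The two categories named in the abstract can be sketched in a few lines. The following toy example (all names, embeddings, and the rerank scorer are illustrative stand-ins, not part of the survey) shows the basic shape: a dense first stage that ranks every document by a dot product between precomputed embeddings, followed by a second stage that re-scores only the top candidates with a more expensive function, standing in for a cross-encoder such as BERT.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def dense_retrieve(query_emb, doc_embs, k):
    """First stage: rank all documents by embedding similarity (bi-encoder style)."""
    scored = [(dot(query_emb, d), i) for i, d in enumerate(doc_embs)]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]

def rerank(query_terms, docs, candidates):
    """Second stage: re-score only the candidates. Term overlap is a cheap
    stand-in for a cross-encoder that reads query and document jointly."""
    def overlap(i):
        return len(set(query_terms) & set(docs[i].split()))
    return sorted(candidates, key=overlap, reverse=True)

# Toy corpus with hypothetical 2-d embeddings.
docs = ["neural text ranking with transformers",
        "cooking pasta at home",
        "bert for passage retrieval"]
doc_embs = [[0.9, 0.1], [0.0, 1.0], [0.8, 0.3]]
query_emb = [1.0, 0.0]

top = dense_retrieve(query_emb, doc_embs, k=2)          # cheap pass over everything
final = rerank("text ranking transformers".split(), docs, top)  # costly pass over few
print(final)  # → [0, 2]
```

The efficiency/effectiveness tradeoff the survey discusses lives in `k`: a larger candidate set gives the expensive reranker more chances to recover relevant documents, at higher query latency.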
Information between Data and Knowledge: Information Science and its Neighbors from Data Science to Digital Humanities
Digital humanities and data science, as neighboring fields, pose new challenges and opportunities for information science. The recent focus on data in the context of big data and deep learning brings new tasks for information scientists, for example in research data management. At the same time, information behavior is changing in light of the increasing digital availability of information, in academia as well as in everyday life.
In this volume, contributions from various fields like information behavior and information literacy, information retrieval, digital humanities, knowledge representation, emerging technologies, and information infrastructure showcase the development of information science research in recent years.
Topics as diverse as social media analytics, fake news on Facebook, collaborative search practices, open educational resources, and recent developments in research data management are among the highlights of this volume.
For more than 30 years, the International Symposium of Information Science has been the venue for bringing together information scientists from the German-speaking countries. In addition to the regular scientific contributions, six of the finalists for the best information science master's thesis prize present their work.