12 research outputs found

Analyzing Adversarial Attacks on Sequence-to-Sequence Relevance Models

Author: Fröbe Maik
Hagen Matthias
MacAvaney Sean
Parry Andrew
Potthast Martin
Publication date: 12/03/2024

Modern sequence-to-sequence relevance models like monoT5 can effectively capture complex textual interactions between queries and documents through cross-encoding. However, the use of natural language tokens in prompts, such as Query, Document, and Relevant for monoT5, opens an attack vector for malicious documents to manipulate their relevance score through prompt injection, e.g., by adding target words such as true. Since such possibilities have not yet been considered in retrieval evaluation, we analyze the impact of query-independent prompt injection via manually constructed templates and LLM-based rewriting of documents on several existing relevance models. Our experiments on the TREC Deep Learning track show that adversarial documents can easily manipulate different sequence-to-sequence relevance models, while BM25 (as a typical lexical model) is not affected. Remarkably, the attacks also affect encoder-only relevance models (which do not rely on natural language prompt tokens), albeit to a lesser extent.
Comment: 13 pages, 3 figures, accepted at ECIR 2024 as a full paper

arXiv.org e-Print Archive

The Infinite Index: Information Retrieval on Generative Text-To-Image Models

Author: Deckers Niklas
Fröbe Maik
Kiesel Johannes
Pandolfo Gianluca
Potthast Martin
Schröder Christopher
Stein Benno
Publication venue: Association for Computing Machinery (ACM)
Publication date: 21/01/2023

Conditional generative models such as DALL-E and Stable Diffusion generate images based on a user-defined text, the prompt. Finding and refining prompts that produce a desired image has become the art of prompt engineering. Generative models do not provide a built-in retrieval model for a user's information need expressed through prompts. In light of an extensive literature review, we reframe prompt engineering for generative models as interactive text-based retrieval on a novel kind of "infinite index". We apply these insights for the first time in a case study on image generation for game design with an expert. Finally, we envision how active learning may help to guide the retrieval of generated images.
Comment: Final version for CHIIR 202

arXiv.org e-Print Archive

The Archive Query Log: Mining Millions of Search Result Pages of Hundreds of Search Engines from 25 Years of Web Archives

Author: Fröbe Maik
Gienapp Lukas
Hagen Matthias
Potthast Martin
Reimer Jan Heinrich
Scells Harrisen
Schmidt Sebastian
Stein Benno
Publication date: 31/07/2023

The Archive Query Log (AQL) is a previously unused, comprehensive query log collected at the Internet Archive over the last 25 years. Its first version includes 356 million queries, 166 million search result pages, and 1.7 billion search results across 550 search providers. Although many query logs have been studied in the literature, the search providers that own them generally do not publish their logs to protect user privacy and vital business data. Of the few query logs publicly available, none combines size, scope, and diversity. The AQL is the first to do so, enabling research on new retrieval models and (diachronic) search engine analyses. Provided in a privacy-preserving manner, it promotes open research as well as more transparency and accountability in the search industry.
Comment: SIGIR 2023 resource paper, 13 pages

arXiv.org e-Print Archive

Evaluating Generative Ad Hoc Information Retrieval

Author: Bevendorff Janek
Deckers Niklas
Fröbe Maik
Gienapp Lukas
Hagen Matthias
Kiesel Johannes
Potthast Martin
Scells Harrisen
Stein Benno
Syed Shahbaz
Wang Shuai
Zuccon Guido
Publication date: 08/11/2023

Recent advances in large language models have enabled the development of viable generative information retrieval systems. A generative retrieval system returns a grounded generated text in response to an information need instead of the traditional document ranking. Quantifying the utility of these types of responses is essential for evaluating generative retrieval systems. As the established evaluation methodology for ranking-based ad hoc retrieval may seem unsuitable for generative retrieval, new approaches for reliable, repeatable, and reproducible experimentation are required. In this paper, we survey the relevant information retrieval and natural language processing literature, identify search tasks and system architectures in generative retrieval, develop a corresponding user model, and study its operationalization. This theoretical analysis provides a foundation and new insights for the evaluation of generative ad hoc retrieval systems.
Comment: 14 pages, 5 figures, 1 table

arXiv.org e-Print Archive

The Eighth Workshop on Search-Oriented Conversational Artificial Intelligence (SCAI'24)

Author: Frummet Alexander
Fröbe Maik
Kiesel Johannes
Papenmeier Andrea
Publication venue: Association for Computing Machinery
Publication date: 10/09/2024

With the emergence of voice assistants and large language models, conversational interaction with information has become part of everyday life. The eighth edition of the search-oriented conversational AI (SCAI) workshop brings together practitioners and researchers from various disciplines to discuss challenges and advances in conversational search systems. This year's edition focuses on evaluations beyond relevance and accuracy and looks at conversational search from the user's perspective. The workshop features a shared task on user-centered evaluation datasets and metrics, challenging participants to develop new and innovative ways to evaluate conversational search systems while accounting for the needs and preferences of users.

University of Twente Research Information

Touché23-Image-Retrieval-for-Arguments

Author: Fröbe Maik
Handke Nicolas
Kiesel Johannes
Potthast Martin
Stein Benno
Publication date: 13/09/2023

Data for the Image Retrieval for Arguments task at Touché 2023. Because of space restrictions, this version lacks touche23-image-search-archives.zip and touche23-image-search-screenshots.zip. Both files are available from https://files.webis.de/corpora/corpora-webis/corpus-touche-image-search-23

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Report on the 1st Workshop on Query Performance Prediction and Its Evaluation in New Tasks (QPP++ 2023) at ECIR 2023

Author: Faggioli Guglielmo
Ferro Nicola
Fröbe Maik
Mothe Josiane
Raiber Fiana
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2023

SIGIR is the Association for Computing Machinery’s Special Interest Group on Information Retrieval. ECIR 2023: 45th European Conference on Information Retrieval.
Query Performance Prediction (QPP) is currently primarily applied to ad-hoc retrieval tasks. The Information Retrieval (IR) field is reaching new heights thanks to recent advances in large language models and neural networks, as well as emerging new ways of searching, such as conversational search. Such advancements are quickly spreading to adjacent research areas, including QPP, necessitating a reconsideration of how we perform and evaluate QPP. This workshop sought to elicit discussion on three topics related to the future of QPP: exploiting advances in IR to improve QPP, instantiating QPP on new search paradigms, and evaluating QPP on new tasks.

Scientific Publications of the University of Toulouse II Le Mirail

Archivio istituzionale della ricerca - Università di Padova

The Information Retrieval Experiment Platform

Author: Bevendorff Janek
Deckers Niklas
Fröbe Maik
Hagen Matthias
MacAvaney Sean
Potthast Martin
Reich Simon
Reimer Jan Heinrich
Stein Benno

Enlighten

Resources for Combining Teaching and Research in Information Retrieval Coursework

Author: Akiki Christopher
Elstner Theresa
Fröbe Maik
Gienapp Lukas
Hagen Matthias
MacAvaney Sean
Potthast Martin
Reimer Jan Heinrich
Scells Harrisen
Stein Benno

Enlighten