Report from Dagstuhl Seminar 23031: Frontiers of Information Access Experimentation for Research and Education
This report documents the program and the outcomes of Dagstuhl Seminar 23031
``Frontiers of Information Access Experimentation for Research and Education'',
which brought together 37 participants from 12 countries.
The seminar addressed technology-enhanced information access (information
retrieval, recommender systems, natural language processing) and specifically
focused on developing more responsible experimental practices leading to more
valid results, both for research and for scientific education.
The seminar brought together experts from various sub-fields of information
access, namely IR, RS, NLP, information science, and human-computer
interaction, to create a joint understanding of the problems and challenges
presented by next-generation information access systems, from both the
research and the experimentation points of view, to discuss existing solutions
and impediments, and to propose next steps to be pursued in the area in order
to improve not only our research methods and findings but also the education
of the new generation of researchers and developers.
The seminar featured a series of long and short talks delivered by
participants, which helped establish common ground and surface topics of
interest to be explored as the main output of the seminar. This led to the
definition of five groups which investigated challenges, opportunities, and
next steps in the following areas: reality check, i.e. conducting real-world
studies; human-machine-collaborative relevance judgment frameworks; overcoming
methodological challenges in information retrieval and recommender systems
through awareness and education; results-blind reviewing; and guidance for
authors.
Comment: Dagstuhl Seminar 23031, report
How Sensitivity Classification Effectiveness Impacts Reviewers in Technology-Assisted Sensitivity Review
All government documents that are released to the public must first be manually reviewed to identify and protect any sensitive information, e.g. confidential information. However, the unassisted manual sensitivity review of born-digital documents is not practical due to, for example, the volume of documents that are created. Previous work has shown that sensitivity classification can be effective for predicting if a document contains sensitive information. However, since all of the released documents must be manually reviewed, it is important to know if sensitivity classification can assist sensitivity reviewers in making their sensitivity judgements. Hence, in this paper, we conduct a digital sensitivity review user study, to investigate if the accuracy of sensitivity classification affects the number of documents that a reviewer correctly judges to be sensitive or not (reviewer accuracy) and the time that it takes to sensitivity review a document (reviewing speed). Our results show that providing reviewers with sensitivity classification predictions, from a classifier that achieves 0.7 Balanced Accuracy, results in a 38% increase in mean reviewer accuracy and an increase of 72% in mean reviewing speed, compared to when reviewers are not provided with predictions. Overall, our findings demonstrate that sensitivity classification is a viable technology for assisting with the sensitivity review of born-digital government documents.
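Balanced Accuracy, the metric the study uses to characterise its classifier, is simply the mean of per-class recall, which makes it robust to the heavy class imbalance typical of sensitivity review. A minimal illustrative sketch (the toy labels are invented, not from the paper):

```python
# Illustrative sketch: Balanced Accuracy is the mean of recall over the
# classes (here sensitive = 1, not sensitive = 0). The example data
# below is hypothetical, chosen only to reproduce a score of 0.7.

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for c in set(y_true):
        # predictions made on the documents whose true class is c
        preds_for_c = [p for t, p in zip(y_true, y_pred) if t == c]
        recalls.append(sum(1 for p in preds_for_c if p == c) / len(preds_for_c))
    return sum(recalls) / len(recalls)

# A classifier that finds 4/5 sensitive and 3/5 non-sensitive documents:
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
print(round(balanced_accuracy(y_true, y_pred), 3))  # 0.7 = (0.8 + 0.6) / 2
```

Plain accuracy on the same data would also be 0.7 here only because the classes are balanced; with 95% non-sensitive documents, a trivial "never sensitive" classifier scores high plain accuracy but only 0.5 Balanced Accuracy.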
Unbiased Learning to Rank: Counterfactual and Online Approaches
This tutorial covers and contrasts the two main methodologies in unbiased
Learning to Rank (LTR): Counterfactual LTR and Online LTR. There has long been
an interest in LTR from user interactions; however, this form of implicit
feedback is very biased. In recent years, unbiased LTR methods have been
introduced to remove the effect of different types of bias caused by user
behavior in search. For instance, a well-addressed type of bias is position
bias: the rank at which a document is displayed heavily affects the
interactions it receives. Counterfactual LTR methods deal with such types of
bias by learning from historical interactions while correcting for the effect
of the explicitly modelled biases. Online LTR, in contrast, does not use an
explicit user model; it learns through an interactive process where randomized
results are displayed to the user. Through randomization, the effect of
different types of bias can be removed from the learning process. Though both
methodologies lead to unbiased LTR, their approaches differ considerably, as
do their theoretical guarantees, empirical results, effects on the user
experience during learning, and applicability. Consequently, for practitioners
the choice between the two is highly consequential. By providing an overview
of both approaches and contrasting them, we aim to provide an essential guide
to unbiased LTR so as to aid in understanding and choosing between
methodologies.
Comment: Abstract for tutorial appearing at SIGIR 201
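The position-bias correction the abstract describes is commonly realised with inverse propensity scoring (IPS): a click observed at rank r is up-weighted by 1/P(examined at r). A minimal sketch under assumed, known propensities (the `PROPENSITY` table and the toy click log are illustrative, not from the tutorial):

```python
# Minimal sketch of counterfactual debiasing via inverse propensity
# scoring (IPS). Assumption: the probability that a user examines each
# rank is known. Weighting each click by the inverse of that
# probability makes the estimate unbiased w.r.t. position, in
# expectation, even though low-ranked documents get far fewer clicks.

# Hypothetical examination propensities per rank (rank 0 is the top).
PROPENSITY = [1.0, 0.5, 0.25, 0.125]

def ips_relevance_estimate(click_log):
    """Estimate per-document relevance from (doc_id, rank, clicked) logs."""
    weighted_clicks = {}
    impressions = {}
    for doc_id, rank, clicked in click_log:
        impressions[doc_id] = impressions.get(doc_id, 0) + 1
        if clicked:
            weighted_clicks[doc_id] = (
                weighted_clicks.get(doc_id, 0.0) + 1.0 / PROPENSITY[rank]
            )
    return {d: weighted_clicks.get(d, 0.0) / impressions[d] for d in impressions}

# Doc "b" is always shown at rank 3, so its single click is rare but
# heavily up-weighted (1 / 0.125 = 8) relative to doc "a" at rank 0.
log = [("a", 0, True), ("a", 0, False), ("b", 3, True), ("b", 3, False)]
print(ips_relevance_estimate(log))  # {'a': 0.5, 'b': 4.0}
```

Online LTR sidesteps the need for the `PROPENSITY` table entirely by randomizing the displayed ranking, trading historical data reuse for control over the interactive process.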
Tutorial: Are You My Neighbor?: Bringing Order to Neighbor Computing Problems
Finding nearest neighbors is an important topic that has attracted much attention over the years and has applications in many fields, such as market basket analysis, plagiarism and anomaly detection, community detection, ligand-based virtual screening, etc. As data are easier and easier to collect, finding neighbors has become a potential bottleneck in analysis pipelines. Performing pairwise comparisons given the massive datasets of today is no longer feasible. The high computational complexity of the task has led researchers to develop approximate methods, which find many but not all of the nearest neighbors. Yet, for some types of data, efficient exact solutions have been found by carefully partitioning or filtering the search space in a way that avoids most unnecessary comparisons. In recent years, there have been several fundamental advances in our ability to efficiently identify appropriate neighbors, especially in non-traditional data, such as graphs or document collections. In this tutorial, we provide an in-depth overview of recent methods for finding (nearest) neighbors, focusing on the intuition behind choices made in the design of those algorithms and on the utility of the methods in real-world applications. Our tutorial aims to provide a unifying view of neighbor computing problems, spanning from numerical data to graph data, from categorical data to sequential data, and related application scenarios. For each type of data, we will review the current state-of-the-art approaches used to identify neighbors and discuss how neighbor search methods are used to solve important problems.
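One of the simplest instances of the "filtering to avoid unnecessary comparisons" idea mentioned above is partial-distance pruning: an exact scan that abandons a distance computation the moment its running sum already exceeds the best distance found so far. A minimal sketch (this particular example is illustrative, not taken from the tutorial):

```python
# Illustrative sketch of exact nearest-neighbor search with
# partial-distance pruning: squared Euclidean distance is accumulated
# dimension by dimension, and a candidate is discarded as soon as the
# partial sum can no longer beat the current best. The answer is still
# exact; only provably useless arithmetic is skipped.

def nearest_neighbor(query, data):
    """Return (index, squared distance) of the exact nearest neighbor."""
    best_idx, best_dist = -1, float("inf")
    for i, point in enumerate(data):
        dist = 0.0
        for q, p in zip(query, point):
            dist += (q - p) ** 2
            if dist >= best_dist:  # prune: cannot beat the best so far
                break
        else:  # only runs if the loop completed without pruning
            best_idx, best_dist = i, dist
    return best_idx, best_dist

data = [[0.0, 0.0], [5.0, 5.0], [1.0, 1.0]]
print(nearest_neighbor([1.0, 0.0], data))  # (0, 1.0)
```

On the toy data, the candidate [5.0, 5.0] is rejected after a single dimension (partial sum 16 already exceeds the incumbent 1.0); in high-dimensional data such early exits are where the savings come from, and index structures like trees or inverted lists push the same pruning idea much further.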