Report from Dagstuhl Seminar 23031: Frontiers of Information Access Experimentation for Research and Education
This report documents the program and the outcomes of Dagstuhl Seminar 23031
"Frontiers of Information Access Experimentation for Research and Education",
which brought together 37 participants from 12 countries.
The seminar addressed technology-enhanced information access (information
retrieval, recommender systems, natural language processing) and specifically
focused on developing more responsible experimental practices leading to more
valid results, both for research as well as for scientific education.
The seminar brought together experts from various sub-fields of information
access, namely IR, RS, NLP, information science, and human-computer
interaction, to create a joint understanding of the problems and challenges
posed by next-generation information access systems from both the research and
the experimentation points of view, to discuss existing solutions and
impediments, and to propose next steps for the area in order to improve not
only our research methods and findings but also the education of the next
generation of researchers and developers.
The seminar featured a series of long and short talks delivered by
participants, which helped establish common ground and surface the topics of
interest that were explored as the main output of the seminar. This led to the
formation of five groups, which investigated challenges, opportunities, and
next steps in the following areas: reality check, i.e. conducting real-world
studies; human-machine-collaborative relevance judgment frameworks; overcoming
methodological challenges in information retrieval and recommender systems
through awareness and education; results-blind reviewing; and guidance for
authors.
Workflows and Provenance: Toward Information Science Solutions for the Natural Sciences
The era of big data and ubiquitous computation has brought with it concerns about ensuring reproducibility in this new research environment. It is easy to assume that computational methods self-document by their very nature of being exact, deterministic processes. However, as with laboratory experiments, ensuring reproducibility in the computational realm requires documentation of both the protocols used (workflows) and a detailed description of the computational environment: algorithms, implementations, software environments, the data ingested, and execution logs of the computation. These two aspects of computational reproducibility (workflows and execution details) are discussed within the context of biomolecular Nuclear Magnetic Resonance spectroscopy (bioNMR), as well as the PRIMAD model for computational reproducibility.
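The execution details the abstract enumerates (algorithm, implementation, software environment, input data, and an execution log) can be pictured as fields of a single provenance record. The following is a minimal, hypothetical sketch; the field names, the `ExecutionRecord` type, and the example values are illustrative assumptions, not part of the PRIMAD model or any bioNMR tool.

```python
# Hypothetical provenance record for one computational run.
# Fields mirror the execution details listed in the text; all names
# and example values below are illustrative, not a real schema.
from dataclasses import dataclass, field

@dataclass
class ExecutionRecord:
    algorithm: str        # protocol/method used (the workflow level)
    implementation: str   # concrete code version, e.g. a git commit
    environment: str      # software environment, e.g. a container image
    inputs: list          # identifiers of the data ingested
    log: list = field(default_factory=list)  # execution log lines

    def record(self, message: str) -> None:
        """Append one line to the execution log."""
        self.log.append(message)

# Example: documenting a single (hypothetical) processing run.
rec = ExecutionRecord(
    algorithm="peak picking",
    implementation="mytool v1.2 (commit abc123)",  # hypothetical tool
    environment="container image mytool:1.2",       # hypothetical image
    inputs=["spectrum-001.ft2"],
)
rec.record("started processing")
```

Capturing all five fields together is what distinguishes this from a bare workflow description: the workflow alone says *what* was done, while the record also fixes *with which code, in which environment, and on which data*.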
ir_metadata: An Extensible Metadata Schema for IR Experiments
The information retrieval (IR) community has a strong tradition of making the
computational artifacts and resources available for future reuse, allowing the
validation of experimental results. Besides the actual test collections, the
underlying run files are often hosted in data archives as part of conferences
like TREC, CLEF, or NTCIR. Unfortunately, the run data itself does not provide
much information about the underlying experiment. For instance, a single run
file is of little use without the context of the shared task's website or the
run data archive. In other domains, such as the social sciences, it is good
practice to annotate research data with metadata. In this work, we introduce
ir_metadata - an extensible metadata schema for TREC run files based on the
PRIMAD model. We propose to align the metadata annotations to PRIMAD, which
considers components of computational experiments that can affect
reproducibility. Furthermore, we outline important components and information
that should be reported in the metadata and give evidence from the literature.
To demonstrate the usefulness of these metadata annotations, we implement new
features in repro_eval that support the outlined metadata schema for the use
case of reproducibility studies. Additionally, we curate a dataset with run
files derived from experiments with different instantiations of PRIMAD
components and annotate these with the corresponding metadata. In the
experiments, we cover reproducibility experiments that are identified by the
metadata and classified by PRIMAD. With this work, we enable IR researchers to
annotate TREC run files and improve the reuse value of experimental artifacts
even further.
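As a rough illustration of annotating a run file with PRIMAD-aligned metadata, the sketch below prepends commented key-value lines to TREC-style run lines. This is a hypothetical format: the keys (loosely following the PRIMAD components) and the `annotate_run` helper are assumptions for illustration, not the actual ir_metadata syntax or the repro_eval API.

```python
# Hypothetical sketch: prepend PRIMAD-style metadata to TREC-style run
# lines as '#' comments. Key names loosely follow the PRIMAD components
# (Platform, Implementation, Data, Actor, ...); the real ir_metadata
# schema defines its own concrete syntax.

def annotate_run(run_lines, metadata):
    """Return run lines with a commented metadata header prepended."""
    header = ["# metadata.start"]
    for key, value in metadata.items():
        header.append(f"# {key}: {value}")
    header.append("# metadata.end")
    return header + list(run_lines)

# One TREC-format run line: topic, iteration, docid, rank, score, tag.
run = ["q1 Q0 doc42 1 12.5 bm25-run"]
meta = {
    "platform": "python 3.11 / linux",          # hypothetical values
    "implementation": "BM25 (k1=0.9, b=0.4)",
    "data": "trec-robust04",
    "actor": "example-team",
}
annotated = annotate_run(run, meta)
```

Because the header lines are comments, evaluation tools that ignore `#` lines can still consume the run, while reproducibility tooling can parse the annotations to identify which PRIMAD components differ between two runs.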
From Evaluating to Forecasting Performance: How to Turn Information Retrieval, Natural Language Processing and Recommender Systems into Predictive Sciences
We describe the state-of-the-art in performance modeling and prediction for Information Retrieval
(IR), Natural Language Processing (NLP) and Recommender Systems (RecSys) along with its
shortcomings and strengths. We present a framework for further research, identifying five major
problem areas: understanding measures, performance analysis, making underlying assumptions
explicit, identifying application features determining performance, and the development of prediction
models describing the relationship between assumptions, features, and resulting performance.
Computing environments for reproducibility: Capturing the 'Whole Tale'
The act of sharing scientific knowledge is rapidly evolving away from traditional articles and presentations to the delivery of executable objects that integrate the data and computational details (e.g., scripts and workflows) upon which the findings rely. This envisioned coupling of data and process is essential to advancing science but faces technical and institutional barriers. The Whole Tale project aims to address these barriers by connecting computational, data-intensive research efforts with the larger research process—transforming the knowledge discovery and dissemination process into one where data products are united with research articles to create “living publications” or tales. The Whole Tale focuses on the full spectrum of science, empowering users in the long tail of science as well as power users with demands for access to big data and compute resources. We report here on the design, architecture, and implementation of the Whole Tale environment.