19,136 research outputs found

A reproducible approach with R markdown to automatic classification of medical certificates in French

Author: Beghini Federica
Di Nunzio Giorgio Maria
Henrot Geneviève
Vezzani Federica
Publication venue: CEUR-WS
Publication date: 01/01/2017

In this paper, we report the ongoing developments of our first participation in the Cross-Language Evaluation Forum (CLEF) eHealth Task 1: “Multilingual Information Extraction - ICD10 coding” (Névéol et al., 2017). The task consists in labelling death certificates in French with international standard codes. In particular, we wanted to accomplish the goal of the ‘Replication track’ of this Task, which promotes the sharing of tools and the dissemination of solid, reproducible results.

Archivio istituzionale della ricerca - Università di Padova

Induced Magnetic Ordering by Proton Irradiation in Graphite

Author: A. Setzer
A. A. Ovchinnikov
D. Spemann
K. Murata
K.-H. Han
M. Fujita
M. Tamura
P. Esquinazi
P. Turek
P.-M. Allemand
R. Höhne
R. Siegele
R. A. Wood
T. Butz
T. Makarova
Publication venue: American Physical Society (APS)
Publication date: 05/09/2003

We provide evidence that proton irradiation of energy 2.25 MeV on highly-oriented pyrolytic graphite samples triggers ferro- or ferrimagnetism. Measurements performed with a superconducting quantum interferometer device (SQUID) and magnetic force microscopy (MFM) reveal that the magnetic ordering is stable at room temperature. Comment: 3 Figures

arXiv.org e-Print Archive

Crossref

Measuring reproducibility of high-throughput experiments

Author: Bickel Peter J.
Brown James B.
Huang Haiyan
Li Qunhua
Publication venue: Institute of Mathematical Statistics
Publication date: 21/10/2011

Reproducibility is essential to reliable scientific discovery in high-throughput experiments. In this work we propose a unified approach to measure the reproducibility of findings identified from replicate experiments and identify putative discoveries using reproducibility. Unlike the usual scalar measures of reproducibility, our approach creates a curve, which quantitatively assesses when the findings are no longer consistent across replicates. Our curve is fitted by a copula mixture model, from which we derive a quantitative reproducibility score, which we call the "irreproducible discovery rate" (IDR), analogous to the FDR. This score can be computed at each set of paired replicate ranks and permits the principled setting of thresholds both for assessing reproducibility and combining replicates. Since our approach permits an arbitrary scale for each replicate, it provides useful descriptive measures in a wide variety of situations to be explored. We study the performance of the algorithm using simulations and give a heuristic analysis of its theoretical properties. We demonstrate the effectiveness of our method in a ChIP-seq experiment. Comment: Published at http://dx.doi.org/10.1214/11-AOAS466 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

Crossref

ir_metadata: An Extensible Metadata Schema for IR Experiments

Author: Breuer Timo
Keller Jüri
Schaer Philipp
Publication venue: Association for Computing Machinery (ACM)
Publication date: 18/07/2022

The information retrieval (IR) community has a strong tradition of making the computational artifacts and resources available for future reuse, allowing the validation of experimental results. Besides the actual test collections, the underlying run files are often hosted in data archives as part of conferences like TREC, CLEF, or NTCIR. Unfortunately, the run data itself does not provide much information about the underlying experiment. For instance, the single run file is not of much use without the context of the shared task's website or the run data archive. In other domains, like the social sciences, it is good practice to annotate research data with metadata. In this work, we introduce ir_metadata - an extensible metadata schema for TREC run files based on the PRIMAD model. We propose to align the metadata annotations to PRIMAD, which considers components of computational experiments that can affect reproducibility. Furthermore, we outline important components and information that should be reported in the metadata and give evidence from the literature. To demonstrate the usefulness of these metadata annotations, we implement new features in repro_eval that support the outlined metadata schema for the use case of reproducibility studies. Additionally, we curate a dataset with run files derived from experiments with different instantiations of PRIMAD components and annotate these with the corresponding metadata. In the experiments, we cover reproducibility experiments that are identified by the metadata and classified by PRIMAD. With this work, we enable IR researchers to annotate TREC run files and improve the reuse value of experimental artifacts even further. Comment: Resource paper

arXiv.org e-Print Archive

From Evaluating to Forecasting Performance: How to Turn Information Retrieval, Natural Language Processing and Recommender Systems into Predictive Sciences

Publication venue: Dagstuhl
Publication date: 01/01/2018

We describe the state of the art in performance modeling and prediction for Information Retrieval (IR), Natural Language Processing (NLP) and Recommender Systems (RecSys), along with its shortcomings and strengths. We present a framework for further research, identifying five major problem areas: understanding measures, performance analysis, making underlying assumptions explicit, identifying application features determining performance, and the development of prediction models describing the relationship between assumptions, features and resulting performance.

Biblos-e Archivo

From Evaluating to Forecasting Performance: How to Turn Information Retrieval, Natural Language Processing and Recommender Systems into Predictive Sciences

Author: Castells Pablo
Daly Elizabeth M.
Declerck Thierry
Ekstrand Michael D.
Ferro Nicola
Fuhr Norbert
Geyer Werner
Gonzalo Julio
Grefenstette Gregory
Konstan Joseph A.
Kuflik Tsvi
Lindén Krister
Magnini Bernardo
Nie Jian-Yun
Perego Raffaele
Shapira Bracha
Soboroff Ian
Tintarev Nava
Verspoor Karin
Willemsen Martijn C.
Zobel Justin
Publication date: 01/01/2018

Non-peer-reviewed

Dagstuhl Research Online Publication Server

Helsingin yliopiston digitaalinen arkisto

Archivio istituzionale della ricerca - Università di Padova

From evaluating to forecasting performance: how to turn information retrieval, natural language processing and recommender systems into predictive sciences: Manifesto from Dagstuhl Perspectives Workshop 17442

Publication date: 01/01/2018

Pure OAI Repository