
    Lucene4IR: Developing information retrieval evaluation resources using Lucene

    The workshop and hackathon on developing Information Retrieval Evaluation Resources using Lucene (L4IR) was held on the 8th and 9th of September, 2016 at the University of Strathclyde in Glasgow, UK, and was funded by the ESF Elias Network. The event featured three main elements: (i) a series of keynote and invited talks on industry, teaching and evaluation; (ii) planning, coding and hacking, where a number of groups created modules and infrastructure for using Lucene to undertake TREC-based evaluations; and (iii) a number of breakout groups discussing challenges, opportunities and problems in bridging the divide between academia and industry, and how we can use Lucene for teaching and learning Information Retrieval (IR). The event brought together academics, experts and students wanting to learn, share and create evaluation resources for the community. The hacking was intense and the discussions lively, laying the groundwork for many useful tools while also raising numerous issues. It was clear that by adopting and contributing to the most widely used and supported open-source IR toolkit, there were many benefits for academics, students, researchers, developers and practitioners: a basis for stronger evaluation practices, increased reproducibility, more efficient knowledge transfer, greater collaboration between academia and industry, and shared teaching and training resources.
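    To make the kind of infrastructure the hackathon groups built concrete, the following is a minimal, self-contained Java sketch of the core pattern: index a small collection with Lucene and write a query's results in the six-column TREC run format used for evaluation. The class name, toy documents and topic id are illustrative, and the snippet assumes a recent Lucene release (8.x or later); it is not code from the workshop itself.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class TrecRunDemo {
    public static void main(String[] args) throws Exception {
        StandardAnalyzer analyzer = new StandardAnalyzer();
        Directory dir = new ByteBuffersDirectory(); // in-memory index, enough for a sketch
        IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(analyzer));

        // Toy collection standing in for a TREC corpus.
        String[][] docs = {
            {"d1", "lucene is an open source search library"},
            {"d2", "trec evaluations use topics and relevance judgements"},
        };
        for (String[] d : docs) {
            Document doc = new Document();
            doc.add(new StringField("docno", d[0], Field.Store.YES));
            doc.add(new TextField("text", d[1], Field.Store.NO));
            writer.addDocument(doc);
        }
        writer.close();

        IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(dir));
        QueryParser parser = new QueryParser("text", analyzer);

        // One toy topic; a real run would loop over all topics in a TREC topic file.
        String topicId = "301";
        TopDocs hits = searcher.search(parser.parse("open source search"), 1000);

        // Emit the six-column TREC run format: qid Q0 docno rank score tag
        int rank = 1;
        for (ScoreDoc sd : hits.scoreDocs) {
            String docno = searcher.doc(sd.doc).get("docno");
            System.out.printf("%s Q0 %s %d %.4f myrun%n", topicId, docno, rank++, sd.score);
        }
    }
}
```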

    A Comparative Study and Analysis of Query Performance Prediction Algorithms to Improve their Reproducibility

    One of the primary challenges in Information Retrieval evaluation is the cost of carrying out the evaluation itself, whether online or offline. Therefore, in recent years considerable effort has been devoted to the Query Performance Prediction (QPP) task. QPP aims to estimate the quality of a system when it is used to retrieve documents in response to a given query, relying on different sources of information such as the query, the documents, or the similarity scores provided by the Information Retrieval system. Several pre- and post-retrieval QPP models have been designed in recent years, but they have rarely been tested under the same experimental conditions. The objective of our work is twofold: we develop a unifying framework that includes several state-of-the-art QPP approaches, and we use this framework to assess the reproducibility of those approaches. Our findings show that we are able to achieve a high degree of reproducibility, with fourteen different methods correctly reproduced and performance results comparable to the original ones.
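    As an illustration of what such a unifying framework has to implement, the sketch below shows one well-known post-retrieval predictor, NQC (Normalized Query Commitment, Shtok et al.), which scores a query by the standard deviation of its top-k retrieval scores, normalised by the score the collection as a whole receives for the query. This is a minimal Java sketch with illustrative numbers, not code from the paper's framework.

```java
import java.util.Arrays;

/** Minimal sketch of the NQC post-retrieval predictor: the standard deviation
 *  of the top-k retrieval scores, normalised by the collection score. */
public class Nqc {
    /** @param topScores       retrieval scores of the top-k documents
     *  @param collectionScore score of the whole collection for the query
     *                         (e.g. treating the corpus as one large document) */
    static double nqc(double[] topScores, double collectionScore) {
        double mean = Arrays.stream(topScores).average().orElse(0.0);
        double var = Arrays.stream(topScores)
                .map(s -> (s - mean) * (s - mean))
                .average().orElse(0.0);
        return Math.sqrt(var) / collectionScore;
    }

    public static void main(String[] args) {
        // Toy scores: a "peaked" score distribution usually signals an easier query.
        double[] easy = {12.0, 9.5, 4.1, 3.8, 3.5};
        double[] hard = {5.1, 5.0, 4.9, 4.9, 4.8};
        double collectionScore = 4.0; // illustrative value
        System.out.println("easy query NQC = " + nqc(easy, collectionScore));
        System.out.println("hard query NQC = " + nqc(hard, collectionScore));
    }
}
```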

    A reproducible approach with R markdown to automatic classification of medical certificates in French

    In this paper, we report the ongoing developments of our first participation in the Cross-Language Evaluation Forum (CLEF) eHealth Task 1: “Multilingual Information Extraction - ICD10 coding” (Névéol et al., 2017). The task consists of labelling death certificates, written in French, with international standard codes. In particular, we aim to accomplish the goal of the ‘Replication track’ of this Task, which promotes the sharing of tools and the dissemination of solid, reproducible results.
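    As a rough illustration of the coding task itself (and not of the authors' R Markdown pipeline), the sketch below shows the simplest dictionary-based approach to it: match each certificate line against a term-to-code dictionary. The two dictionary entries and the input line are illustrative examples, not the official ICD-10 terminology.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal sketch of dictionary-based ICD-10 coding: each line of a death
 *  certificate is matched against a term-to-code dictionary. */
public class Icd10Lookup {
    public static void main(String[] args) {
        // Illustrative dictionary entries, not the official ICD-10 dictionary.
        Map<String, String> dictionary = new LinkedHashMap<>();
        dictionary.put("infarctus du myocarde", "I21");  // myocardial infarction
        dictionary.put("insuffisance cardiaque", "I50"); // heart failure

        String certificateLine = "Infarctus du myocarde aigu";
        String normalised = certificateLine.toLowerCase();

        // Assign every code whose dictionary term occurs in the line.
        dictionary.forEach((term, code) -> {
            if (normalised.contains(term)) {
                System.out.println(certificateLine + " -> " + code);
            }
        });
    }
}
```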

    The Lucene for Information Access and Retrieval Research (LIARR) Workshop at SIGIR 2017

    As an empirical discipline, information access and retrieval research requires substantial software infrastructure to index and search large collections. This workshop is motivated by the desire to better align information retrieval research with the practice of building search applications, approached from the perspective of open-source information retrieval systems. Our goal is to promote the use of Lucene for information access and retrieval research.

    Benchmarking news recommendations: the CLEF NewsREEL use case

    The CLEF NewsREEL challenge is a campaign-style evaluation lab that allows participants to evaluate and optimize news recommender algorithms. The goal is to create an algorithm able to suggest news items that users will click, while respecting a strict time constraint. The lab challenges participants to compete either in a "living lab" (Task 1) or in an evaluation that replays recorded streams (Task 2). In this report, we discuss the objectives and challenges of the NewsREEL lab, summarize last year's campaign, and outline the main research challenges that can be addressed by participating in NewsREEL 2016.
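    To illustrate how the replay evaluation of Task 2 works in principle, here is a minimal Java sketch (not NewsREEL's actual harness): recorded impressions are fed to a recommender, and the fraction of impressions where the recommendation matches the recorded click is reported. The types, toy log and first-candidate baseline are all illustrative, and the snippet assumes Java 16+ for records.

```java
import java.util.List;

/** Minimal sketch of offline replay evaluation: feed recorded impressions to a
 *  recommender and count how often the recorded click matches its suggestion. */
public class ReplayEval {
    record Impression(long userId, List<Long> candidates, long clickedItem) {}

    interface Recommender {
        long recommend(long userId, List<Long> candidates);
    }

    public static void main(String[] args) {
        // Toy recorded stream; a real replay would parse the recorded log files.
        List<Impression> log = List.of(
            new Impression(1, List.of(10L, 11L, 12L), 11L),
            new Impression(2, List.of(10L, 13L), 10L),
            new Impression(1, List.of(11L, 14L), 14L));

        // Baseline recommender: always pick the first candidate item.
        Recommender baseline = (user, candidates) -> candidates.get(0);

        long hits = log.stream()
            .filter(imp -> baseline.recommend(imp.userId(), imp.candidates()) == imp.clickedItem())
            .count();
        System.out.printf("replayed CTR = %.2f%n", (double) hits / log.size());
    }
}
```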
