437 research outputs found

    Workshop on evaluating personal search

    The first ECIR workshop on Evaluating Personal Search was held on 18th April 2011 in Dublin, Ireland. The workshop consisted of 6 oral paper presentations and several discussion sessions. This report presents an overview of the scope and contents of the workshop and outlines the major outcomes.

    Information access for personal media archives

    It is now possible to archive much of our life experiences in digital form using a variety of sources, e.g. blogs written, tweets made, photographs taken, etc. Information can be captured from a myriad of personal information devices. In this workshop, researchers from diverse disciplines discussed how we can advance towards the goal of effective capture, retrieval and exploration of e-memories. Proposed solutions included advanced textile sensors to capture new data, P2P methods to store this data, and personal reflection applications to review this data. Much discussion centered around search and navigation strategies, interactive interfaces, and the cognitive basis for using digitally captured information as memorabilia.

    A strategy for evaluating search of “Real” personal information archives

    Personal information archives (PIAs) can include materials from many sources, e.g. desktop and laptop computers, mobile phones, etc. Evaluation of personal search over these collections is problematic for reasons relating to the personal and private nature of the data and associated information needs, and to measuring system response effectiveness. Conventional information retrieval (IR) evaluation, involving use of Cranfield-type test collections to establish retrieval effectiveness, and laboratory testing of interactive search behaviour have to be re-thought in this situation. One key issue is that personal data and information needs are very different from those in searches of the more public third-party datasets used in most existing evaluations. Related to this, understanding how users interact with a search system for their personal data is important in developing search in this area on a well-grounded basis. In this proposal we suggest an alternative IR evaluation strategy which preserves privacy of user data and enables evaluation of both the accuracy of search and exploration of interactive search behaviour. The general strategy is that, instead of distributing a common search dataset to participants, we suggest distributing standard, expandable personal data collection, indexing and search tools to non-intrusively collect data from participants conducting search tasks over their own data collections on their own machines, and then performing local evaluation of individual results before central aggregation.
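The "local evaluation, then central aggregation" idea in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name a metric, so mean reciprocal rank is used as a stand-in, and the function names `reciprocal_rank`, `local_report` and `aggregate` are hypothetical. The point is that only summary scores, never documents or queries, leave a participant's machine.

```python
from statistics import mean


def reciprocal_rank(ranked_ids, relevant_ids):
    """Return 1/rank of the first relevant item, or 0.0 if none is found."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def local_report(participant_id, runs):
    """Computed on the participant's own machine.

    `runs` is a list of (ranked_ids, relevant_ids) pairs, one per search task.
    Only the aggregate score and the task count are reported.
    """
    scores = [reciprocal_rank(ranked, relevant) for ranked, relevant in runs]
    return {
        "participant": participant_id,
        "n_queries": len(scores),
        "mrr": mean(scores) if scores else 0.0,
    }


def aggregate(reports):
    """Central step: query-weighted mean of the locally computed scores."""
    total = sum(r["n_queries"] for r in reports)
    if total == 0:
        return 0.0
    return sum(r["mrr"] * r["n_queries"] for r in reports) / total
```

Weighting by the number of queries is one design choice among several; an unweighted mean over participants would treat each person equally regardless of how many tasks they completed.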

    An investigation of term weighting approaches for microblog retrieval

    The use of effective term frequency weighting and document length normalisation strategies has been shown over a number of decades to have a significant positive effect on document retrieval. When dealing with much shorter documents, such as those obtained from microblogs, it would seem intuitive that these would have less benefit. In this paper we investigate their effect on microblog retrieval performance using the Tweets2011 collection from the TREC 2011 Microblog Track.
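To make "term frequency weighting" and "document length normalisation" concrete, here is a minimal sketch using the well-known BM25 formula. BM25 is chosen purely as an illustration; the abstract does not say which weighting schemes the paper compares, and the function name `bm25_term_weight` is hypothetical. The parameter `k1` controls how quickly the term-frequency contribution saturates, and `b` controls how strongly document length is normalised.

```python
import math


def bm25_term_weight(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """BM25 weight of one query term in one document.

    tf          : occurrences of the term in the document
    doc_len     : document length in tokens
    avg_doc_len : average document length in the collection
    n_docs      : number of documents in the collection
    doc_freq    : number of documents containing the term
    """
    # Inverse document frequency (the non-negative "+1" variant).
    idf = math.log(1.0 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalisation factor: 1.0 for an average-length document.
    norm = 1.0 - b + b * (doc_len / avg_doc_len)
    # Saturating term-frequency component.
    return idf * (tf * (k1 + 1.0)) / (tf + k1 * norm)
```

Setting `b=0` switches length normalisation off and setting `k1=0` reduces the weight to the IDF alone, which gives a simple way to probe how much each component matters on very short documents such as tweets.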

    A reproducible approach with R markdown to automatic classification of medical certificates in French

    In this paper, we report the ongoing developments of our first participation in the Cross-Language Evaluation Forum (CLEF) eHealth Task 1: “Multilingual Information Extraction - ICD10 coding” (Névéol et al., 2017). The task consists of labelling death certificates in French with international standard codes. In particular, we wanted to accomplish the goal of the ‘Replication track’ of this Task, which promotes the sharing of tools and the dissemination of solid, reproducible results.

    DCU search runs at MediaEval 2012: search and hyperlinking task

    We describe the runs for our participation in the Search sub-task of the Search and Hyperlinking Task at MediaEval 2012. Our runs are designed to form a retrieval baseline by using time-based segmentation of audio transcripts, incorporating pause information and a sliding window to define the retrieval segment boundaries, with a standard language modelling information retrieval strategy. Using this baseline system, runs based on transcripts provided by LIUM were better on all evaluation metrics than those using transcripts provided by LIMSI.
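The sliding-window segmentation mentioned in the abstract above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the window length and step are hypothetical values, the function name `sliding_window_segments` is made up for this example, and the pause information the abstract mentions is not modelled here.

```python
def sliding_window_segments(words, window=90.0, step=30.0):
    """Cut a time-stamped transcript into overlapping retrieval segments.

    words  : list of (start_time_seconds, token) pairs, sorted by time
    window : segment length in seconds (hypothetical default)
    step   : offset between consecutive segment starts (hypothetical default)

    Returns a list of (segment_start, segment_end, text) tuples.
    """
    if not words:
        return []
    segments = []
    start = words[0][0]
    last = words[-1][0]
    while start <= last:
        end = start + window
        # Collect every token whose start time falls inside [start, end).
        tokens = [tok for t, tok in words if start <= t < end]
        if tokens:
            segments.append((start, end, " ".join(tokens)))
        start += step
    return segments
```

Each resulting segment text can then be indexed as its own "document" by whatever retrieval model is used; overlapping windows reduce the chance that a relevant passage is split across a boundary.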

    Considering subjects and scenarios in large-scale user-centered evaluation of a multilingual multimodal medical search system

    Medical search applications can be required to service the differing information needs of multiple classes of users with varying medical knowledge levels and language skills, as well as varying querying behaviours. The precise nature of these users' needs has to be understood to develop effective applications. Evaluation of developed search applications requires creation of holistic user-centred evaluation approaches which allow for comprehensive evaluation while being mindful of the diversity of users.

    Segmenting and summarizing general events in a long-term lifelog

    Lifelogging aims to capture a person’s life experiences using digital devices. When captured over an extended period of time, a lifelog can potentially contain millions of files from various sources in a range of formats. For lifelogs containing such massive numbers of items, we believe it is important to group them into meaningful sets and summarize them, so that users can search and browse their lifelog data efficiently. Existing studies have explored the segmentation of continuously captured images over short periods of at most a few days into small groups of “events” (episodes). Yet, for long-term lifelogs, higher levels of abstraction are desirable due to the very large number of “events” which will occur over an extended period. We aim to segment a long-term lifelog at the level of general events which typically extend beyond a daily boundary, and to select summary information to represent these events. We describe our current work on higher-level segmentation and summary information extraction for long-term lifelogs and report a preliminary pilot study on a real long-term lifelog collection.

    Mr. DLib: Recommendations-as-a-Service (RaaS) for Academia

    Only a few digital libraries and reference managers offer recommender systems, although such systems could assist users facing information overload. In this paper, we introduce Mr. DLib's recommendations-as-a-service, which allows third parties to easily integrate a recommender system into their products. We explain the recommender approaches implemented in Mr. DLib (content-based filtering among others), and present details on 57 million recommendations, which Mr. DLib delivered to its partner GESIS Sowiport. Finally, we outline our plans for future development, including integration into JabRef, establishing a living lab, and providing personalized recommendations. Comment: Accepted for publication at the JCDL conference 201