    Test collections for medical information retrieval evaluation

    The web has rapidly become one of the main resources for medical information for many people: patients, clinicians, medical doctors, etc. Measuring the effectiveness with which information can be retrieved from web resources for these users is crucial: it brings better information to professionals for better diagnosis, treatment, and patient care, and helps patients and relatives get informed on their condition. Several existing information retrieval (IR) evaluation campaigns have been developed to assess and improve medical IR methods, for example the TREC Medical Record Track [11] and TREC Genomics Track [10]. These campaigns only target certain types of users, mainly clinicians and some medical professionals: queries are mainly centered on cohorts of records describing specific patient cases or on biomedical reports. Evaluating search effectiveness over the many heterogeneous online medical information sources now available, which are increasingly used by a diverse range of medical professionals and, very importantly, the general public, is vital to the understanding and development of medical IR. We describe the development of two benchmarks for medical IR evaluation from the Khresmoi project. The first of these has been developed using existing medical query logs for internal research within the Khresmoi project and targets both medical professionals and the general public; the second has been created in the framework of a new CLEFeHealth evaluation campaign and is designed to evaluate patient search in context.

    Building realistic potential patient queries for medical information retrieval evaluation

    To evaluate and improve medical information retrieval, benchmarking data sets need to be created. Few benchmarks have focused on patients’ information needs. There is a need for additional benchmarks to enable research into effective retrieval methods. In this paper we describe the manual creation of patient queries and investigate their automatic generation. This work is conducted in the framework of a medical evaluation campaign, which aims to evaluate and improve technologies to help patients and laypeople access eHealth data. To this end, the campaign is composed of different tasks, including a medical information retrieval (IR) task. Within this IR task, a web crawl of medically related documents and a set of patient queries are provided to participants. The queries are built to represent the potential information needs patients may have while reading their medical report. We start by describing typical types of patients’ information needs. We then describe how these queries have been manually generated from medical reports for the first two years of the eHealth campaign. We then explore techniques that would enable us to automate the query generation process. This process is particularly challenging, as it requires an understanding of the patients’ information needs, and of the electronic health records. We describe various approaches to automatically generate potential patient queries from medical reports and describe our future development and evaluation phase.

    ShARe/CLEF eHealth evaluation lab 2014, task 3: user-centred health information retrieval

    This paper presents the results of task 3 of the ShARe/CLEF eHealth Evaluation Lab 2014. This evaluation lab focuses on improving access to medical information on the web. The task objective was to investigate the effect of using additional information such as a related discharge summary and external resources such as medical ontologies on the IR effectiveness, in a monolingual and in a multilingual context. The participants were allowed to submit up to seven runs for each language: one mandatory run using no additional information or external resources, and three each using or not using discharge summaries.

    Health consumers' knowledge learning in online health information seeking

    With the increasing awareness of health consumers as active information seekers, the past decade has witnessed a shifting research interest from a physician-centered paradigm to a consumer-centered paradigm. Online health information seeking (OHIS) has become pervasive, with critical impacts on consumers' health. However, the inherent complexity and the uniqueness of health tasks pose new challenges to consumers in OHIS, such as a lack of adequate knowledge to formulate queries and evaluate online resources of varying quality. OHIS is, by nature, a learning-oriented behavior, and knowledge learning is a critical component and outcome of consumers' OHIS. On the other hand, studies in the area of search as learning (SAL) have demonstrated that learning is a common phenomenon in the information-seeking process. However, the existing studies in OHIS mainly concentrated on viewing consumers' domain knowledge as a fixed value, even though consumers are involved in knowledge learning during OHIS. Therefore, this dissertation proposes a conceptual framework of health information search as learning (HearSAL) by linking the related models and prior studies from the two areas (OHIS and SAL) and conducts a systematic study to understand what, how, and how well health consumers can search and learn in online health information seeking, particularly for three increasing levels of learning objectives: Understand, Analyze and Evaluate. Two representative health consumer groups, laypeople and cancer patients, are targeted in this dissertation study because they share the common issue of facing barriers in searching and learning in OHIS, yet they differ in prior topic knowledge, learning duration, and learning expectation.
    Following the conceptual framework HearSAL, four sub-studies are conducted with emphasis on different dimensions of health consumers' search as learning in OHIS, including the following: Study 1: a user study with laypeople that examines the method dimension (e.g., search behaviors and source selections); Study 2: an analysis of an ovarian cancer online health community that reveals the information dimension (e.g., types and amount of information); Study 3: interviews with laypeople; and Study 4: interviews with ovarian cancer patients and caregivers. The two complementary interviews highlight the outcomes of OHIS. Major results demonstrate that (1) health consumers’ SAL behaviors and sources vary by different levels of learning objectives, and the variation is affected by the severity of health conditions; (2) Analyze is the most prevalent learning objective in the online health community, while the amount of informational support is the highest in the Evaluate level; (3) though consumers’ prior knowledge of the Understand level is the highest, compared to higher levels, consumers still tend to achieve the most knowledge increase in the Understand level of learning; and (4) receiving more informational support drives consumers to increase the level of learning objectives. This dissertation makes empirical, practical, theoretical and methodological contributions. The empirical studies of laypeople and ovarian cancer patients provide a deeper insight into health consumers' SAL behavior and performance in today's web environment. Based on the empirical results, practical implications are proposed for designing consumer-centered health information systems, which facilitate seeking and enhance learning. Finally, the HearSAL framework and its application in this study can serve as a theoretical and methodological basis for future explorations.

    Examining and Supporting Laypeople's Learning in Online Health Information Seeking

    It has long been understood that knowledge acquisition is an important component in the information seeking process [2,18]. Further, empirical studies have demonstrated that learning is a common phenomenon in information seeking [8,10,20]. However, for users, especially laypeople, who must gain knowledge through their interactions with a search engine, the current general-purpose search engine does not sufficiently support learning through search. Health information seeking (HIS, hereafter) is a domain-specific search [14], where users who possess higher knowledge tend to have better strategies and performances in solving their search tasks [3,21]. While learning clearly plays an important role in the HIS process, there has been little research in this area. Little is known about the factors that might enhance or impede such learning during online HIS. Therefore, this project aims at examining the search-as-learning behaviors and performances of health consumers, especially laypeople. A mixed method design will be adopted, consisting of experimental-based studies and interviews. So far, we have conducted 24 user studies and semi-structured interviews, investigating the source selection behaviors in the HIS tasks with increasing levels of learning goals. The results of this phase of the study will be used to guide the following analysis and predict laypeople’s knowledge levels in the HIS process and provide corresponding support.

    The Archive Query Log: Mining Millions of Search Result Pages of Hundreds of Search Engines from 25 Years of Web Archives

    The Archive Query Log (AQL) is a previously unused, comprehensive query log collected at the Internet Archive over the last 25 years. Its first version includes 356 million queries, 166 million search result pages, and 1.7 billion search results across 550 search providers. Although many query logs have been studied in the literature, the search providers that own them generally do not publish their logs to protect user privacy and vital business data. Of the few query logs publicly available, none combines size, scope, and diversity. The AQL is the first to do so, enabling research on new retrieval models and (diachronic) search engine analyses. Provided in a privacy-preserving manner, it promotes open research as well as more transparency and accountability in the search industry. Comment: SIGIR 2023 resource paper, 13 pages.

    Consumer Health Search at CLEF eHealth 2021

    This paper details materials, methods, results, and analyses of the Consumer Health Search Task of the CLEF eHealth 2021 Evaluation Lab. This task investigates the effectiveness of information retrieval (IR) approaches in providing access to medical information to laypeople. For this, a TREC-style evaluation methodology was applied: a shared collection of documents and queries is distributed, participants’ runs received, relevance assessments generated, and participants’ submissions evaluated. The task generated a new representative web corpus including web pages acquired from a 2021 CommonCrawl and social media content from Twitter and Reddit, along with a new collection of 55 manually generated layperson medical queries and their respective credibility, understandability, and topicality assessments for returned documents. This year’s task focused on three subtasks: (i) ad-hoc IR, (ii) weakly supervised IR, and (iii) document credibility prediction. In total, 15 runs were submitted to the three subtasks: eight addressed the ad-hoc IR task, three the weakly supervised IR challenge, and four the document credibility prediction challenge. As in previous years, the organizers have made data and tools associated with the task available for future research and development.

    Effects of language and terminology of query suggestions on medical accuracy considering different user characteristics

    Searching for health information is one of the most popular activities on the web. In this domain, users often misspell or lack knowledge of the proper medical terms to use in queries. To overcome these difficulties and attempt to retrieve higher-quality content, we developed a query suggestion system that provides alternative queries combining the Portuguese or English language with lay or medico-scientific terminology. Here we evaluate this system’s impact on the medical accuracy of the knowledge acquired during the search. Evaluation shows that simply providing these suggestions contributes to reducing the quantity of incorrect content. This indicates that even when suggestions are not clicked, they are useful either for subsequent queries’ formulation or for interpreting search results. Clicking on suggestions, regardless of type, leads to answers with more correct content. An analysis by type of suggestion and user characteristics showed that the benefits of certain languages and terminologies are more perceptible in users with certain levels of English proficiency and health literacy. This suggests a personalization of this suggestion system toward these characteristics. Overall, the effect of language is more preponderant than the effect of terminology. Clicks on English suggestions are clearly preferable to clicks on Portuguese ones. Thanks to Fundação para a Ciência e a Tecnologia for partially funding this work under the grant SFRH/BD/40982/2007 to Carla Teixeira Lopes, UID/DTP/04750/2013 to the Epidemiology Research Unit (EPIUnit) and the project UID/EEA/50014/2013 to the INESC TEC. The authors also thank Andreia Ribeirinho Soares, MD, for contributions on the definition of the criteria used to assess medical accuracy.