3,379 research outputs found

    Overview of the CLEF 2018 Consumer Health Search Task

    This paper details the collection, systems and evaluation methods used in the CLEF 2018 eHealth Evaluation Lab, Consumer Health Search (CHS) task (Task 3). This task investigates the effectiveness of search engines in providing access to medical information present on the Web for people who have little or no medical knowledge. The task aims to foster advances in the development of search technologies for Consumer Health Search by providing resources and evaluation methods to test and validate search systems. Built upon the 2013-17 series of CLEF eHealth Information Retrieval tasks, the 2018 task considers both mono- and multilingual retrieval, embracing the Text REtrieval Conference (TREC)-style evaluation process with a shared collection of documents and queries, the contribution of runs from participants, and the subsequent formation of relevance assessments and evaluation of the participants' submissions. For this year, the CHS task uses a new Web corpus and a new set of queries compared to the previous years. The new corpus consists of Web pages acquired from the CommonCrawl, and the new set of queries consists of 50 queries issued by the general public to the Health on the Net (HON) search services. We then manually translated the 50 queries to French, German, and Czech, and obtained English query variations of the 50 original queries. A total of 7 teams from 7 different countries participated in the 2018 CHS task: CUNI (Czech Republic), IMS Unipd (Italy), MIRACL (Tunisia), QUT (Australia), SINAI (Spain), UB-Botswana (Botswana), and UEvora (Portugal).

    Ariel - Volume 4 Number 1

    Editors David A. Jacoby Eugenia Miller Tom Williams Associate Editors Paul Bialas Terry Burt Michael Leo Gail Tenikat Editor Emeritus and Business Manager Richard J. Bonnano Movie Editor Robert Breckenridg

    Improving approximation of domain-focused, corpus-based, lexical semantic relatedness

    Semantic relatedness is a measure that quantifies the strength of a semantic link between two concepts. Often, it can be efficiently approximated with methods that operate on the words that represent these concepts. Approximating semantic relatedness between texts, and between the concepts these texts represent, is an important part of many text and knowledge processing tasks and is of crucial importance in many domain-specific scenarios. The problem with most state-of-the-art methods for calculating domain-specific semantic relatedness is their dependence on highly specialized, structured knowledge resources, which makes these methods poorly adaptable to many usage scenarios. On the other hand, domain knowledge in fields such as the Life Sciences has become more and more accessible, but mostly in unstructured form, as texts in large document collections, which makes it more challenging to process automatically. In this dissertation, three new corpus-based methods for approximating domain-specific textual semantic relatedness are presented and evaluated with a set of standard benchmarks focused on the field of biomedicine. Nonetheless, the proposed measures are general enough to be adapted to other domain-focused scenarios. The evaluation involves comparisons with other relevant state-of-the-art measures for calculating semantic relatedness, and the results suggest that the methods presented here perform comparably to or better than other approaches. Additionally, the dissertation presents an experiment in which one of the proposed methods is applied within an ontology matching system, DisMatch. The performance of the system was evaluated externally on the biomedically themed 'Phenotype' track of the Ontology Alignment Evaluation Initiative 2016 campaign. The results of the track indicate that the use of distributional semantic relatedness for ontology matching is promising, as the system presented in this thesis stood out in detecting correct mappings that were not detected by any other system participating in the track. The work presented in the dissertation indicates an improvement over the state of the art through the domain-adapted use of the distributional principle (i.e. the presented methods are corpus-based and do not require additional resources). The ontology matching experiment showcases practical implications of the presented theoretical body of work.

    Application of Semantics to Solve Problems in Life Sciences

    Thesis defense date: 10 December 2018. The amount of information generated on the Web has increased in recent years. Most of this information is accessible as text, with humans being the main users of the Web. However, despite all the advances made in natural language processing, computers still have problems processing this textual information. In this context, there are application domains, such as the Life Sciences, in which large amounts of information are being published as structured data. The analysis of these data is of vital importance not only for the advancement of science but also for progress in health care. However, these data are located in different repositories and stored in different formats, which makes their integration difficult. In this context, the Linked Data paradigm emerges as a technology that includes the application of some standards proposed by the W3C community, such as HTTP URIs, RDF and OWL. Making use of this technology, this doctoral thesis was developed to cover the following main objectives: 1) promote the use of Linked Data by the community of users in the Life Sciences; 2) facilitate the design of SPARQL queries by discovering the underlying model in RDF repositories; 3) create a collaborative environment that facilitates the consumption of Linked Data by end users; 4) develop an algorithm that automatically discovers the OWL semantic model of an RDF repository; 5) develop an OWL representation of ICD-10-CM, called Dione, that offers an automatic methodology for classifying patients' diseases and subsequently validating the classification using an OWL reasoner.

    QUT IElab at CLEF 2018 Consumer Health Search Task: Knowledge Base Retrieval for Consumer Health Search

    In this paper we describe our participation in the CLEF 2018 Consumer Health Search Task, subtask IRTask1. This track aims to evaluate and advance search technologies that support consumers in finding health advice online. Our solution addressed this challenge by extending the Entity Query Feature Expansion model (EQFE), a knowledge base (KB) query expansion method. In previous work we showed that Wikipedia, UMLS and CHV can be effective as a basis for CHS query expansions within the EQFE model. To obtain the query expansion terms, we first mapped entity mentions to KB entities by performing exact matching. After mapping, we used the Title of the mapped KB entities as the source for expansion terms. For our first three expanded query sets, we expanded the original queries by sourcing expansion terms from each of Wikipedia, the UMLS, and the CHV. For our fourth expanded query set, we combined expansion terms from Wikipedia and CHV.
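The mapping and expansion steps described in this abstract can be illustrated with a minimal sketch. It is not the EQFE model itself and not the authors' code: the `TOY_KB` dictionary, the `expand_query` function and the `max_ngram` parameter are hypothetical names introduced here, and the sketch assumes that a KB can be reduced to a lookup from lowercase surface forms to entity titles.

```python
from typing import Dict, List

# Hypothetical toy KB: lowercase surface form -> entity title.
# A real system would draw these from Wikipedia, the UMLS or the CHV.
TOY_KB: Dict[str, str] = {
    "heart attack": "Myocardial infarction",
    "flu": "Influenza",
}


def expand_query(query: str, kb: Dict[str, str], max_ngram: int = 3) -> List[str]:
    """Return the query tokens followed by expansion terms from matched KB titles."""
    tokens = query.lower().split()
    expansion: List[str] = []
    # Try longer mentions first so multi-word entities are matched as a unit.
    for n in range(max_ngram, 0, -1):
        for i in range(len(tokens) - n + 1):
            mention = " ".join(tokens[i:i + n])
            title = kb.get(mention)  # exact matching only
            if title is None:
                continue
            # Use the words of the entity title as expansion terms,
            # skipping any that are already in the query or already added.
            for term in title.lower().split():
                if term not in tokens and term not in expansion:
                    expansion.append(term)
    return tokens + expansion


if __name__ == "__main__":
    print(expand_query("heart attack symptoms", TOY_KB))
```

Combining two sources, as in the fourth query set, would under these assumptions amount to merging the dictionaries (or the resulting expansion lists) before retrieval.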

    TMT: a tool to guide users in searching for information on clinical texts

    The large amount of medical information available on the Internet, in both structured and text formats, means that different types of users encounter different problems when trying to search effectively. On the one hand, medical students, health staff and researchers in biomedicine have access to a wide variety of sources and tools with disparate characteristics, which require a learning period that is sometimes insurmountable. On the other hand, patients, their family members and people outside the medical profession face the added problem of not being sufficiently familiar with medical terminology. In this paper we present a tool that extracts relevant medical concepts from a clinical text, using named entity recognition techniques applied to lists of concepts, and ontology-based annotation techniques. To propose these concepts, the tool makes use of a non-formal knowledge resource, Freebase, and formal resources such as MedlinePlus and PubMed. We argue that combining these resources, which offer less formal information in plain language (Freebase), formal information in plain language (MedlinePlus) and formal information in more specialized language (the scientific publications in PubMed), optimizes the process of locating medical information about a complex clinical case for users with different profiles and needs, such as patients, doctors or researchers. Our ultimate goal is to build a platform that can host different techniques to facilitate the practice of translational medicine.

    Integrating understandability in the evaluation of consumer health search engines

    In this paper we propose a method that integrates the notion of understandability, as a factor of document relevance, into the evaluation of information retrieval systems for consumer health search. We consider the gain-discount evaluation framework (RBP, nDCG, ERR) and propose two understandability-based variants (uRBP) of rank biased precision, characterised by an estimation of understandability based on document readability and by different models of how readability influences user understanding of document content. The proposed uRBP measures are empirically contrasted to RBP by comparing system rankings obtained with each measure. The findings suggest that considering understandability along with topicality in the evaluation of information retrieval systems leads to different claims about system effectiveness than considering topicality alone.
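To make the gain-discount idea concrete, here is a minimal sketch of rank biased precision and one possible understandability-biased variant. It assumes, as an illustration only, that the gain at each rank is the product of a topical relevance score and an understandability score in [0, 1]; the abstract does not spell out the two uRBP formulations, so the `urbp` function below is a hypothetical stand-in, not the paper's definition.

```python
from typing import Sequence


def rbp(relevance: Sequence[float], p: float = 0.8) -> float:
    """Rank biased precision: (1 - p) * sum over ranks of r_k * p^(k-1)."""
    return (1 - p) * sum(r * p ** k for k, r in enumerate(relevance))


def urbp(
    relevance: Sequence[float],
    understandability: Sequence[float],
    p: float = 0.8,
) -> float:
    """Assumed variant: the gain at each rank is relevance * understandability."""
    if len(relevance) != len(understandability):
        raise ValueError("relevance and understandability must have equal length")
    return (1 - p) * sum(
        r * u * p ** k
        for k, (r, u) in enumerate(zip(relevance, understandability))
    )


if __name__ == "__main__":
    rel = [1, 0, 1]   # topical relevance per rank
    und = [1, 1, 0]   # 1 = readable for a lay user, 0 = not readable
    print(rbp(rel, p=0.5), urbp(rel, und, p=0.5))
```

In this toy ranking, the relevant document at rank 3 contributes to RBP but not to the understandability-biased score, which is how two measures can order the same systems differently.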