851 research outputs found

    A computational ecosystem to support eHealth Knowledge Discovery technologies in Spanish

    The massive amount of biomedical information published online requires the development of automatic knowledge discovery technologies to make effective use of this content. To foster and support this, the research community creates linguistic resources, such as annotated corpora, and designs shared evaluation campaigns and academic competitive challenges. This work describes an ecosystem that facilitates research and development in knowledge discovery in the biomedical domain, specifically in the Spanish language. To this end, several resources are developed and shared with the research community, including a novel semantic annotation model, an annotated corpus of 1045 sentences, and computational resources to build and evaluate automatic knowledge discovery techniques. Furthermore, a research task is defined with objective evaluation criteria, and an online evaluation environment is set up and maintained, enabling researchers interested in this task to obtain immediate feedback and compare their results with the state of the art. As a case study, we analyze the results of a competitive challenge based on these resources and provide guidelines for future research. The constructed ecosystem provides an effective learning and evaluation environment to encourage research in knowledge discovery in Spanish biomedical documents. This research has been partially supported by the University of Alicante and University of Havana, the Generalitat Valenciana (Conselleria d’Educació, Investigació, Cultura i Esport) and the Spanish Government through the projects SIIA (PROMETEO/2018/089, PROMETEU/2018/089) and LIVING-LANG (RTI2018-094653-B-C22).

    Overview of the eHealth Knowledge Discovery Challenge at IberLEF 2020

    This paper summarises the results of the third edition of the eHealth Knowledge Discovery (KD) challenge, hosted at the Iberian Language Evaluation Forum 2020. The eHealth-KD challenge proposes two computational tasks involving the identification of semantic entities and relations in natural language text, focusing on Spanish-language health documents. In this edition, besides text extracted from medical sources, Wikipedia content was introduced into the corpus, and a novel transfer-learning evaluation scenario was designed that challenges participants to create systems that provide cross-domain generalisation. A total of eight teams participated with a variety of approaches, including deep learning end-to-end systems as well as rule-based and knowledge-driven techniques. This paper analyses the most successful approaches and highlights the most interesting challenges for future research in this field. This research has been partially supported by the University of Alicante and University of Havana, the Generalitat Valenciana (Conselleria d’Educació, Investigació, Cultura i Esport) and the Spanish Government through the projects SIIA (PROMETEO/2018/089, PROMETEU/2018/089) and LIVING-LANG (RTI2018-094653-B-C22).

    Automatic extension of corpora from the intelligent ensembling of eHealth knowledge discovery systems outputs

    Corpora are among the most valuable resources currently available for building machine learning systems. However, building new corpora is expensive, which makes the automatic extension of corpora highly attractive. Hence, finding new strategies that reduce the cost and effort involved in this task, while at the same time guaranteeing quality, remains an open and important challenge for the research community. In this paper, we present a set of ensembling strategies oriented toward entity and relation extraction tasks. The main goal is to combine several automatically annotated versions of corpora to produce a single version with improved quality. An ensembler is built by exploring a configuration space in search of the combination that maximizes the fitness of the ensembled collection according to a reference collection. The eHealth-KD 2019 challenge was chosen for the case study. The submitted systems’ outputs were ensembled, resulting in the construction of an automatically annotated collection of 8000 sentences. We show that using this collection as additional training input for a baseline algorithm has a positive impact on its performance. Additionally, the ensembling pipeline was used as a participant system in the 2020 edition of the challenge. The ensembled run achieved a slightly better performance than the individual runs. This research has been partially funded by the University of Alicante and the University of Havana, the Generalitat Valenciana (Conselleria d’Educació, Investigació, Cultura i Esport) and the Spanish Government through the projects LIVING-LANG (RTI2018-094653-B-C22) and SIIA (PROMETEO/2018/089, PROMETEU/2018/089). Moreover, it has been backed by the work of both COST Actions: CA19134 - “Distributed Knowledge Graphs” and CA19142 - “Leading Platform for European Citizens, Industries, Academia and Policymakers in Media Accessibility”.
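    The ensembling idea in this abstract (search a configuration space for the combination of system outputs that best matches a reference collection) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: annotations are modelled as hashable tuples, a configuration is a subset of runs plus a vote threshold, fitness is plain F1 against the reference, and the search is exhaustive. The function names `ensemble`, `f1`, and `search` are hypothetical.

```python
from collections import Counter
from itertools import product


def ensemble(runs, mask, threshold):
    """Keep each annotation proposed by at least `threshold` of the selected runs."""
    votes = Counter()
    for run, selected in zip(runs, mask):
        if selected:
            votes.update(run)  # one vote per annotation in this run
    return {ann for ann, count in votes.items() if count >= threshold}


def f1(predicted, reference):
    """F1 of a predicted annotation set against a reference set (assumed fitness)."""
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


def search(runs, reference):
    """Exhaustively try every (run subset, vote threshold) configuration."""
    best_score, best_config, best_output = 0.0, None, set()
    for mask in product((False, True), repeat=len(runs)):
        for threshold in range(1, sum(mask) + 1):
            candidate = ensemble(runs, mask, threshold)
            score = f1(candidate, reference)
            if score > best_score:
                best_score, best_config, best_output = score, (mask, threshold), candidate
    return best_score, best_config, best_output


# Toy data: annotations are (sentence_id, start, end, label) tuples (hypothetical format).
runs = [
    {("s1", 0, 5, "Concept"), ("s1", 6, 10, "Action")},
    {("s1", 0, 5, "Concept"), ("s1", 11, 15, "Concept")},
    {("s1", 0, 5, "Concept"), ("s1", 6, 10, "Action"), ("s1", 20, 25, "Predicate")},
]
reference = {("s1", 0, 5, "Concept"), ("s1", 6, 10, "Action")}

best_score, best_config, best_output = search(runs, reference)
print(best_score, best_config)
```

    Exhaustive search is only practical for a handful of runs, since the number of subsets doubles with each added run; a larger configuration space would call for a heuristic search instead.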

    Advancements in eHealth Data Analytics through Natural Language Processing and Deep Learning

    The healthcare environment is commonly referred to as "information-rich" but also "knowledge-poor". Healthcare systems collect huge amounts of data from various sources: lab reports, medical letters, logs of medical tools or programs, medical prescriptions, etc. These massive data sets can yield knowledge that improves medical services and the healthcare domain overall, for example by predicting diseases from a patient's symptoms or by supporting disease prevention through the discovery of behavioral risk factors. Unfortunately, only a relatively small volume of textual eHealth data is processed and interpreted, an important factor being the difficulty of performing Big Data operations efficiently. In the medical field, detecting domain-specific multi-word terms is a crucial task, as they can define an entire concept with a few words. A term can be defined as a linguistic structure or a concept, and it is composed of one or more words with a specific meaning in a domain. All the terms of a domain form its terminology. This chapter offers a critical study of the current best-performing solutions for analyzing unstructured (image and textual) eHealth data. This study also provides a comparison of current Natural Language Processing and Deep Learning techniques in the eHealth context. Finally, we examine and discuss some of the current issues, and we define a set of research directions in this area.

    A two-stage deep learning approach for extracting entities and relationships from medical texts

    This work presents a two-stage deep learning system for named entity recognition (NER) and relation extraction (RE) from medical texts. These tasks are a crucial step in many natural language understanding applications in the biomedical domain. Automatic medical coding of electronic medical records, automated summarizing of patient records, automatic cohort identification for clinical studies, text simplification of health documents for patients, early detection of adverse drug reactions, and automatic identification of risk factors are only a few examples of the many opportunities that text analysis can offer in the clinical domain. In this work, our efforts are primarily directed towards improving the pharmacovigilance process through the automatic detection of drug-drug interactions (DDI) from texts. Moreover, we deal with the semantic analysis of texts containing health information for patients. Our two-stage approach is based on deep learning architectures. Concretely, NER is performed by combining a bidirectional long short-term memory (Bi-LSTM) network and a conditional random field (CRF), while RE applies a convolutional neural network (CNN). Since our approach uses very few language resources, only pre-trained word embeddings, and does not exploit any domain resources (such as dictionaries or ontologies), it can be easily extended to support other languages and clinical applications that require the exploitation of semantic information (concepts and relationships) from texts... This work was supported by the Research Program of the Ministry of Economy and Competitiveness - Government of Spain (DeepEMR project TIN2017-87548-C2-1-R).

    Behavior Change Apps for Gestational Diabetes Management: Exploring Desirable Features

    Publisher Copyright: © 2021 The Author(s). Published with license by Taylor & Francis Group, LLC. Gestational diabetes mellitus (GDM) has considerable and increasing health effects, as it raises both the mother’s and the offspring’s risk of short- and long-term health problems. GDM can usually be treated with a healthier lifestyle, such as appropriate dietary modifications and engaging in sufficient physical activity. While telemedicine interventions requiring weekly or more frequent feedback from health care professionals have shown the potential to improve glycemic control amongst women with GDM, apps without extensive input from health care professionals are limited and have not been shown to be effective. We aimed to improve the efficacy of GDM self-management apps by exploring desirable features in a review. We derived six desirable features from the multidisciplinary literature and evaluated the state of implementation of these features in existing GDM apps. The results showed that features for increasing competence to manage GDM and for providing social support were largely lacking. Peer reviewed

    Ontology Enrichment from Texts: A Biomedical Dataset for Concept Discovery and Placement

    Mentions of new concepts appear regularly in texts and require automated approaches to harvest and place them into Knowledge Bases (KB), e.g., ontologies and taxonomies. Existing datasets suffer from three issues: (i) mostly assuming that a new concept is pre-discovered, so that they cannot support out-of-KB mention discovery; (ii) only using the concept label as the input along with the KB, and thus lacking the contexts of a concept label; and (iii) mostly focusing on concept placement w.r.t. a taxonomy of atomic concepts, instead of complex concepts, i.e., with logical operators. To address these issues, we propose a new benchmark, adapting the MedMentions dataset (PubMed abstracts) with SNOMED CT versions in 2014 and 2017 under the Diseases sub-category and the broader categories of Clinical finding, Procedure, and Pharmaceutical / biologic product. We provide usage on the evaluation with the dataset for out-of-KB mention discovery and concept placement, adapting recent Large Language Model based methods. Comment: 5 pages, 1 figure, accepted for CIKM 2023. The dataset, data construction scripts, and baseline implementation are available at https://zenodo.org/record/8228005 (Zenodo) and https://github.com/KRR-Oxford/OET (GitHub)

    Front-Line Physicians' Satisfaction with Information Systems in Hospitals

    Day-to-day operations management in hospital units is difficult due to continuously varying situations, the several actors involved, and the vast number of information systems in use. The aim of this study was to describe front-line physicians' satisfaction with the existing information systems needed to support day-to-day operations management in hospitals. A cross-sectional survey was used, and data chosen with stratified random sampling were collected in nine hospitals. Data were analyzed with descriptive and inferential statistical methods. The response rate was 65% (n = 111). The physicians reported that information systems support their decision making to some extent, but that they neither improve access to information nor are tailored for physicians. The respondents also reported that they need to use several information systems to support decision making and that they would prefer a single information system for accessing important information. Improved information access would better support physicians' decision making and has the potential to improve the quality of decisions and speed up the decision-making process. Peer reviewed