160 research outputs found

    Extracción de información en informes médicos

    Access to the information contained in a medical report is vital both for research and for patient care. However, the relevant information is usually written in natural language, so processing it automatically is not a trivial task. With this goal in mind, we have developed a system that produces a file representing the most relevant content of a clinical report. As part of this representation, the system detects medical concepts belonging to UMLS, one of the most widely used ontologies in this field. Beforehand, it automatically performs spelling correction, acronym expansion, and detection of affirmed, negated, and speculated sentences. All of this is done in two of the most widely spoken languages in the world: Spanish and English. The representation in turn makes it possible to develop applications that use it, so a search engine for medical reports has also been implemented as an example. Finally, this work describes our participation in the Conference and Labs of the Evaluation Forum (CLEF) 2013, one of the best-known international initiatives in the field of information retrieval, together with the scientific paper written for it.
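The abstract above does not say how affirmed, negated, and speculated sentences are told apart. As a rough illustration only, the sketch below uses a plain cue-word lookup. The cue lists and the function name `classify_sentence` are assumptions made for this example, not part of the described system, and a real detector would also need to resolve the scope of each cue.

```python
import re

# Hypothetical, deliberately tiny cue lists (English and Spanish).
NEGATION_CUES = ("no", "without", "denies", "sin", "niega")
SPECULATION_CUES = ("possible", "probable", "suspected", "posible", "sospecha")


def classify_sentence(sentence):
    """Label a sentence as 'speculated', 'negated', or 'affirmed'.

    Speculation cues take precedence over negation cues; a sentence with
    no cue at all is treated as affirmed.
    """
    tokens = re.findall(r"\w+", sentence.lower())
    if any(cue in tokens for cue in SPECULATION_CUES):
        return "speculated"
    if any(cue in tokens for cue in NEGATION_CUES):
        return "negated"
    return "affirmed"


LABELS = [
    classify_sentence("The patient denies chest pain."),
    classify_sentence("Possible pneumonia."),
    classify_sentence("Paciente sin fiebre."),
    classify_sentence("Fever and cough."),
]
```

Whole-token matching keeps a cue such as "no" from firing inside a longer word like "normal", at the cost of missing multi-word cues.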

    Boosting terminology extraction through crosslingual resources

    Terminology extraction is an important natural language processing task with applications in many areas. The task has been approached from different points of view using different techniques, and language- and domain-independent systems have also been proposed. Our contribution in this paper focuses on improving terminology extraction by using crosslingual resources, specifically Wikipedia, and by using a variant of PageRank to score the candidate terms. Peer reviewed. Postprint (published version).
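The abstract does not specify which PageRank variant is used or how the graph of candidate terms is built. As a minimal sketch under those caveats, the code below runs plain power-iteration PageRank over a small hypothetical graph in which an edge means that one candidate term links to another (for instance through a Wikipedia link). The graph, the function name `pagerank`, and the parameter values are all assumptions for illustration.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a dict {node: [out-neighbours]}."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Score held by dangling nodes (no outgoing links) is spread uniformly.
        dangling = sum(scores[node] for node in nodes if not graph.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_scores = {node: base for node in nodes}
        for source, targets in graph.items():
            if targets:
                share = damping * scores[source] / len(targets)
                for target in targets:
                    new_scores[target] += share
        scores = new_scores
    return scores


# Hypothetical candidate-term graph; edges are invented for the example.
CANDIDATE_GRAPH = {
    "myocardial infarction": ["heart", "infarction"],
    "infarction": ["heart"],
    "heart": ["myocardial infarction"],
    "table": [],
}
SCORES = pagerank(CANDIDATE_GRAPH)
RANKED = sorted(SCORES, key=SCORES.get, reverse=True)
```

In this toy graph the isolated candidate "table" receives only the uniform teleport share, so it ends up at the bottom of the ranking, which is the behaviour one would want from a graph-based termhood score.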

    Mapping of electronic health records in Spanish to the unified medical language system metathesaurus

    This work presents a preliminary approach to annotating Spanish electronic health records with concepts from the Unified Medical Language System Metathesaurus. The prototype uses Apache Lucene to index the Metathesaurus and generate mapping candidates from input text. In addition, it relies on UKB to resolve ambiguities. The tool has been evaluated by measuring its agreement with MetaMap on two English-Spanish parallel corpora, one consisting of titles and abstracts of papers in the clinical domain, and the other of excerpts from real electronic health records.
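To make the idea of generating mapping candidates from an index more concrete, here is a small sketch that replaces Apache Lucene with an in-memory inverted index. It is not the prototype's implementation: the concept identifiers are placeholders rather than real Metathesaurus CUIs, the scoring (fraction of a label's tokens found in the text) is an assumption, and no disambiguation step such as UKB is included.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip accents, and split into alphanumeric tokens."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.findall(r"[a-z0-9]+", stripped)


def build_index(terms):
    """Map each token to the (concept_id, label) pairs whose label contains it."""
    index = defaultdict(set)
    for concept_id, label in terms:
        for token in normalize(label):
            index[token].add((concept_id, label))
    return index


def candidates(index, text):
    """Return (overlap, concept_id, label) tuples, best match first."""
    text_tokens = set(normalize(text))
    seen = set()
    for token in text_tokens:
        seen |= index.get(token, set())
    scored = []
    for concept_id, label in seen:
        label_tokens = set(normalize(label))
        overlap = len(label_tokens & text_tokens) / len(label_tokens)
        scored.append((overlap, concept_id, label))
    return sorted(scored, key=lambda item: (-item[0], item[1]))


# Placeholder identifiers; these are NOT real Metathesaurus CUIs.
TERMS = [
    ("ID1", "infarto de miocardio"),
    ("ID2", "infarto cerebral"),
    ("ID3", "hipertensión arterial"),
]
INDEX = build_index(TERMS)
RESULT = candidates(INDEX, "Paciente con hipertensión e infarto agudo de miocardio.")
```

The partially matching labels that remain in the candidate list are exactly the kind of ambiguity a later disambiguation step would have to resolve.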

    Clinical narrative analytics challenges

    Precision medicine, or evidence-based medicine, relies on extracting knowledge from medical records to provide individuals with the appropriate treatment at the appropriate moment according to the patient's features. Despite efforts to use clinical narratives for clinical decision support, many challenges remain today, such as multilinguality, the diversity of terms and formats across services, acronyms, and negation, to name but a few. The same problems arise when analyzing narratives in the literature, whose analysis would provide physicians and researchers with highlights. In this talk we will analyze challenges, solutions, and open problems, and will review several frameworks and tools that perform NLP over free text to extract medical entities by means of Named Entity Recognition. We will also discuss a framework we have developed to extract and validate medical terms. In particular, we present two use cases: (i) extraction of medical entities from a set of infectious disease descriptions provided by MedlinePlus and (ii) identification of stroke scales in clinical narratives written in Spanish.

    Automatic Identification of Biomedical Concepts in Spanish Language Unstructured Clinical Texts

    [Poster]. IHI'10 ACM International Health Informatics Symposium, Arlington, VA, USA, November 11-12, 2010. The processing of health information from medical records and, especially, clinical notes is a complex task due to the nature of the texts themselves (i.e., hand-written and containing semi-structured or unstructured data) and the diversity of the terminology used. While certain technologies exist to process these types of texts and data in the English language, only a few such initiatives exist for similar texts and data in the Spanish language. This paper presents a new proposal for the semantic annotation of Spanish-language clinical notes, implementing an automated tool similar to the UMLS MetaMap Transfer (MMTx) for the identification of biomedical concepts in the Spanish-language SNOMED CT ontology. Moreover, an assessment of the tool using 100 Spanish-language clinical notes is presented. Using the clinical notes manually annotated by specialists of a Spanish hospital as the gold standard, it is concluded that precision scores are sufficiently good for the several types of matching achieved by the proposed tool. The research presented in this contribution offers a launching point for the establishment of semantic relationships between concepts and the application of mining techniques to Spanish-language clinical notes. This study has been partially supported by the MAVIR Consortium (S2009/TIC-1542) and by the TIN2007-67407-C03-01 project BRAVO.

    Contributions to information extraction for spanish written biomedical text

    285 p. Healthcare practice and clinical research produce vast amounts of digitised, unstructured data in multiple languages that are currently underexploited, despite their potential applications in improving healthcare experiences, supporting trainee education, or enabling biomedical research, for example. To automatically transform those contents into relevant, structured information, advanced Natural Language Processing (NLP) mechanisms are required. In NLP, this task is known as Information Extraction. Our work takes place within this growing field of clinical NLP for the Spanish language, as we tackle three distinct problems. First, we compare several supervised machine learning approaches to the problem of sensitive data detection and classification. Specifically, we study the different approaches and their transferability in two corpora, one synthetic and the other authentic. Second, we present and evaluate UMLSmapper, a knowledge-intensive system for biomedical term identification based on the UMLS Metathesaurus. This system recognises and codifies terms without relying on annotated data or external Named Entity Recognition tools. Although technically naive, it performs on par with more evolved systems, and does not exhibit a considerable deviation from other approaches that rely on oracle terms. Finally, we present and exploit a new corpus of real health records manually annotated with negation and uncertainty information: NUBes. This corpus is the basis for two sets of experiments, one on cue and scope detection, and the other on assertion classification. Throughout the thesis, we apply and compare techniques of varying levels of sophistication and novelty, which reflects the rapid advancement of the field.

    Proyecto de transferencia tecnológica Deteami: tecnologías de procesamiento del lenguaje natural para la ayuda en farmacia y en farmacovigilancia

    The goal of the Deteami project is to develop tools that make clinicians aware of adverse drug reactions stated in the electronic health records of the digital clinical history. The records produced in hospitals are a valuable but still largely unexploited source of information, mainly because they are hard to obtain owing to privacy and confidentiality restrictions. To lighten the work of clinicians who read and analyze health records looking for information about patients' health, in this project we explore the records automatically, identify entities such as disorders and drugs, and infer medical information, in this case adverse drug reactions. A research framework was established with the Galdakao-Usansolo and Basurto hospitals of Osakidetza (the Basque Health System). Osakidetza provided both the texts and end-user feedback, as well as specialists who annotated the corpora, and in this way we obtained a gold standard. This work was partially supported by the Spanish Ministry of Science and Innovation (EXTRECM: TIN2013-46616-C2-1-R, TADEEP: TIN2015-70214-P) and the Basque Government (DETEAMI: Ministry of Health 2014111003, IXA Research Group of type A (2010-2015), ELKAROLA: KK-2015/00098).