130 research outputs found

    Deep learning for information extraction in the biomedical domain

    International Mention in the doctoral degree.
    The main hypothesis of this PhD dissertation is that novel Deep Learning algorithms can outperform classical Machine Learning methods for the task of Information Extraction in the Biomedical Domain. Unlike classical systems, Deep Learning models can learn representations of the data automatically, without expert domain knowledge, and so avoid the tedious and time-consuming task of defining relevant features. A Drug-Drug Interaction (DDI), an important subset of Adverse Drug Reactions (ADRs), is an alteration in the effects of drugs that are taken simultaneously. Early recognition of interacting drugs is vital for preventing serious health problems that, in the worst cases, can cause death. Health-care professionals and researchers in this domain find it very challenging to discover information about these incidents because of the vast number of pharmacovigilance documents. For this reason, several shared tasks and datasets have been developed to address the problem with automated annotation systems capable of extracting this information. In the present document, the DDI corpus, an annotated dataset of DDIs, is used with Deep Learning architectures, without any external information, for the tasks of Named Entity Recognition and Relation Extraction in order to validate the hypothesis. Some other datasets are also tested to provide further evidence of the performance of these systems. In summary, the results suggest that the most common Deep Learning methods, such as Convolutional Neural Networks and Recurrent Neural Networks, outperform the traditional algorithms, indicating that Deep Learning is a real alternative for a specific and complex scenario such as Information Extraction in the Biomedical domain. As a final goal, a complete architecture covering both tasks is developed to structure the named entities and their relationships from raw pharmacological texts.
    This thesis has been supported by: a pre-doctoral research training scholarship of the Carlos III University of Madrid (PIF UC3M 02-1415) for four years; the Research Program of the Ministry of Economy and Competitiveness - Government of Spain (DeepEMR project TIN2017-87548-C2-1-R); the Research Program of the Ministry of Economy and Competitiveness - Government of Spain (eGovernAbility-Access project TIN2014-52665-C2-2-R); and a seven-month doctoral stay under the TEAM - Technologies for information and communication, Europe - East Asia Mobilities project (Erasmus Mundus Action 2-Strand 2 Programme), funded by the European Commission and carried out at the University of Tokyo, Japan, for the Aizawa Laboratory at the National Institute of Informatics (NII).
    Published. Doctoral Programme in Computer Science and Technology, Universidad Carlos III de Madrid. Chair: Ricardo Aler Mur. Secretary: Alberto Díaz Esteban. Member: María Herrero Zaz

    A multi-BERT hybrid system for named entity recognition in Spanish radiology reports

    This work describes the methods proposed by the EdIE-KnowLab team for the Information Extraction Task of CLEF eHealth 2021, SpRadIE Task 1. The task focuses on detecting and classifying relevant mentions in ultrasonography reports. The architecture is an ensemble of multiple BERT (multi-BERT) systems, one per entity type, combined with a generated dictionary and off-the-shelf tools (Google Healthcare Natural Language API and GATECloud's Measurement Expression Annotator). The off-the-shelf tools are applied to the documents after translation into English, with word alignment, using the neural machine translation tool Microsoft Translator API. Our best system configuration (multi-BERT with a dictionary) achieves 85.51% and 80.04% F1 on the Lenient and Exact metrics, respectively, and ranked first out of 17 submissions from the 7 teams that participated in this shared task. Our system also achieved the best Recall (83.87% Lenient match and 78.71% Exact match) by merging the previous predictions with the results obtained from the English-translated texts and cross-lingual word alignment. The overall results demonstrate the potential of pre-trained language models and cross-lingual word alignment for limited-corpus and low-resource NER in the clinical domain.
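The Exact and Lenient F1 figures in this abstract differ only in how a predicted mention is matched against a gold mention. The following is a minimal sketch, assuming that an exact match requires identical offsets and label and that a lenient match requires any character overlap with the same label. The official SpRadIE scorer may define these differently; the `Span` representation, the function names and the example spans are hypothetical.

```python
from typing import List, Tuple

# Hypothetical span representation: (start, end_exclusive, label).
Span = Tuple[int, int, str]


def _exact(a: Span, b: Span) -> bool:
    """Assumed exact match: same offsets and same label."""
    return a == b


def _overlaps(a: Span, b: Span) -> bool:
    """Assumed lenient match: same label and at least one shared character."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def f1_score(gold: List[Span], pred: List[Span], lenient: bool = False) -> float:
    """Span-level F1 under the exact or lenient matching rule."""
    match = _overlaps if lenient else _exact
    # Predictions that match some gold span count towards precision.
    matched_pred = sum(any(match(p, g) for g in gold) for p in pred)
    # Gold spans that match some prediction count towards recall.
    matched_gold = sum(any(match(g, p) for p in pred) for g in gold)
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: the second prediction has the right label but wrong start offset.
gold = [(0, 5, "Finding"), (10, 20, "Location")]
pred = [(0, 5, "Finding"), (12, 20, "Location")]
exact_f1 = f1_score(gold, pred)
lenient_f1 = f1_score(gold, pred, lenient=True)
```

In this toy case the boundary error is penalised only under the exact rule, which is why lenient scores are typically at least as high as exact ones.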

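    Asociaciones Público-Privadas: Nuevos esquemas de gobernanza para el desarrollo inclusivo y sostenible de las regiones en México [Public-Private Partnerships: New governance schemes for the inclusive and sustainable development of the regions in Mexico]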

    This paper examines public-private partnerships as new governance schemes deployed in Mexico as a consequence of the diminished role of the nation-state in resolving economic and social problems. It considers whether they have been an adequate response to the need for new forms of collaboration among diverse agents to solve or reduce these problems, particularly in the case of the regions. The paper presents a conceptual review of the new governance schemes and of the implementation of public-private partnership schemes for infrastructure or economic promotion projects, promoted especially by the federal government and by international organizations, chiefly the World Bank. It finds that, although these schemes help develop infrastructure works and services and thereby contribute to economic and social development, they are not sufficient to undertake development with the participation of all local actors or to drive an integral local and regional development process with social participation. It therefore emphasizes the need to incorporate civil society organizations into the new governance schemes in order to promote inclusive regional development processes.

    Clasificación de escenas en contenido audiovisual [Scene classification in audiovisual content]

    In this project, algorithms were designed for scene classification using generic categories such as coast, street, mountain, sky, office, home, etc. The developed algorithms are intended to be integrated into a system for the automatic annotation of visual content. To this end, a comparative and systematic study is carried out of two state-of-the-art alternatives for feature extraction: the method that extracts the gist of an image, called the gist descriptor, and the bag of words method. A support vector machine is then used for classification. The project aims, first, to develop a system able to classify a large database of images according to the type of visual scene and, second, for the student to learn and know how to use different methods for classifying audiovisual scenes, evaluate them, and draw conclusions.
    Technical Engineering in Sound and Image

    Forced migration and food crises

    There is growing concern about the increase in food insecurity across the world, but little is known of its economic implications. This paper quantifies the effect of food crises on forced international migration (FIM) flows using a structural gravity model. To this end, we use a database that measures the severity, intensity and causes of food crises. The results suggest that even less severe food crises tend to increase FIM flows. More severe food crises tend to skew FIM flows towards developing countries. The results appear to indicate that food crises tighten liquidity constraints on migration and that these constraints worsen as the food crisis intensifies.

    A two-stage deep learning approach for extracting entities and relationships from medical texts

    This work presents a two-stage deep learning system for Named Entity Recognition (NER) and Relation Extraction (RE) from medical texts. These tasks are a crucial step in many natural language understanding applications in the biomedical domain. Automatic medical coding of electronic medical records, automated summarization of patient records, automatic cohort identification for clinical studies, text simplification of health documents for patients, early detection of adverse drug reactions, and automatic identification of risk factors are only a few examples of the many opportunities that text analysis can offer in the clinical domain. In this work, our efforts are primarily directed towards improving the pharmacovigilance process through the automatic detection of Drug-Drug Interactions (DDI) from texts. Moreover, we deal with the semantic analysis of texts containing health information for patients. Our two-stage approach is based on deep learning architectures. Concretely, NER is performed by combining a bidirectional Long Short-Term Memory (Bi-LSTM) network and a Conditional Random Field (CRF), while RE applies a Convolutional Neural Network (CNN). Since our approach uses very few language resources (only pre-trained word embeddings) and does not exploit any domain resources (such as dictionaries or ontologies), it can be easily extended to support other languages and clinical applications that require the exploitation of semantic information (concepts and relationships) from texts...
    This work was supported by the Research Program of the Ministry of Economy and Competitiveness - Government of Spain (DeepEMR project TIN2017-87548-C2-1-R).

    Rare Disease Identification from Clinical Notes with Ontologies and Weak Supervision

    The identification of rare diseases from clinical notes with Natural Language Processing (NLP) is challenging because few cases are available for machine learning and data annotation by clinical experts is needed. We propose a method using ontologies and weak supervision. The approach includes two steps: (i) Text-to-UMLS, linking text mentions to concepts in the Unified Medical Language System (UMLS) with a named entity linking tool (e.g. SemEHR) and weak supervision based on customised rules and contextual representations from Bidirectional Encoder Representations from Transformers (BERT), and (ii) UMLS-to-ORDO, matching UMLS concepts to rare diseases in the Orphanet Rare Disease Ontology (ORDO). Using MIMIC-III US intensive care discharge summaries as a case study, we show that the Text-to-UMLS process can be greatly improved with weak supervision, without any annotated data from domain experts. Our analysis shows that the overall pipeline for processing discharge summaries can surface rare disease cases that are mostly uncaptured in the manual ICD codes of the hospital admissions.
    Comment: 5 pages, 3 figures, accepted for IEEE EMBC 202
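The two-step flow in this abstract (Text-to-UMLS, then UMLS-to-ORDO) can be illustrated with a toy sketch. Everything below is an assumption for illustration only: the lookup tables, the placeholder identifiers (`CUI_A`, `ORDO_1`) and the function names are hypothetical, plain substring lookup stands in for an entity linker such as SemEHR, and a simple mention-length rule is applied directly as a filter instead of being used to weakly supervise a BERT-based model as the abstract describes.

```python
from typing import Dict, List, Tuple

# Hypothetical toy tables with placeholder identifiers (not real UMLS or ORDO codes).
MENTION_TO_UMLS: Dict[str, str] = {
    "disease alpha": "CUI_A",
    "da": "CUI_A",            # short, ambiguous abbreviation
    "disease beta": "CUI_B",
}
UMLS_TO_ORDO: Dict[str, str] = {
    "CUI_A": "ORDO_1",        # CUI_B has no rare-disease entry in this toy table
}


def text_to_umls(text: str, min_len: int = 4) -> List[Tuple[str, str]]:
    """Step (i): link mentions to concepts, then drop links that fail a
    simple rule (assumed here: mentions shorter than min_len are unreliable)."""
    lowered = text.lower()
    links = []
    for mention, cui in MENTION_TO_UMLS.items():
        if mention in lowered and len(mention) >= min_len:
            links.append((mention, cui))
    return links


def umls_to_ordo(links: List[Tuple[str, str]]) -> List[str]:
    """Step (ii): keep only concepts that map to a rare disease entry."""
    return sorted({UMLS_TO_ORDO[cui] for _, cui in links if cui in UMLS_TO_ORDO})


def find_rare_diseases(text: str) -> List[str]:
    """Run both steps on one note and return the matched rare disease IDs."""
    return umls_to_ordo(text_to_umls(text))


note = "Patient with Disease Alpha and disease beta."
rare = find_rare_diseases(note)
```

The point of the sketch is the separation of concerns: the first step decides which mention-to-concept links to trust, and the second is a plain ontology lookup over the surviving concepts.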

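    La competitividad de la región centro del estado de Guanajuato y valoración de su capital territorial [The competitiveness of the central region of the state of Guanajuato and an assessment of its territorial capital]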

    The book addresses competitiveness from a territorial perspective. It first surveys the state of the art and systematizes the approaches and criteria used in the analysis of competitiveness, mainly from the standpoint of territorial development. The text also examines in depth the territorial capital approach and the role it plays in achieving competitiveness, an element that largely explains the positioning and performance of territories in terms of their physical and intangible assets. Taking the importance of territorial capital for competitiveness as its starting point, it addresses the different components of that capital, which express the varied dimensions of territories: economic, social, cultural, environmental, and political-institutional. The book thus presents the territorial capital of the central region of the state of Guanajuato, breaking it down into its components with indicators disaggregated at the municipal scale. It then analyzes this capital, revealing the differential capacities of those components and how they influence the territorial capital conditions of the municipalities studied. The work helps readers understand the factors that affect the performance of territories and their competitive capacities in a globalized world.