2,642 research outputs found

    Knowledge-based best of breed approach for automated detection of clinical events based on German free text digital hospital discharge letters

    Get PDF
    OBJECTIVES: The secondary use of medical data contained in electronic medical records, such as hospital discharge letters, is a valuable resource for the improvement of clinical care (e.g. in terms of medication safety) or for research purposes. However, the automated processing and analysis of medical free text still poses a huge challenge to available natural language processing (NLP) systems. The aim of this study was to implement a knowledge-based best of breed approach, combining a terminology server with integrated ontology, a NLP pipeline and a rules engine. METHODS: We tested the performance of this approach in a use case. The clinical event of interest was the particular drug-disease interaction "proton-pump inhibitor [PPI] use and osteoporosis". Cases were to be identified based on free text digital discharge letters as source of information. Automated detection was validated against a gold standard. RESULTS: Precision of recognition of osteoporosis was 94.19%, and recall was 97.45%. PPIs were detected with 100% precision and 97.97% recall. The F-score for the detection of the given drug-disease-interaction was 96,13%. CONCLUSION: We could show that our approach of combining a NLP pipeline, a terminology server, and a rules engine for the purpose of automated detection of clinical events such as drug-disease interactions from free text digital hospital discharge letters was effective. There is huge potential for the implementation in clinical and research contexts, as this approach enables analyses of very high numbers of medical free text documents within a short time period

    Extraction of chemical-induced diseases using prior knowledge and textual information

    Get PDF
    We describe our approach to the chemical-disease relation (CDR) task in the BioCreative V challenge. The CDR task consists of two subtasks: Automatic disease-named entity recognition and normalization (DNER), and extraction of chemical-induced diseases (CIDs) from Medline abstracts. For the DNER subtask, we used our concept recognition tool Peregrine, in combination with several optimization steps. For the CID subtask, our system, which we named RELigator, was trained on a rich feature set, comprising features derived from a graph database containing prior knowledge about chemicals and diseases, and linguistic and statistical features derived from the abstracts in the CDR training corpus. We describe the systems that were developed and present evaluation results for both subtasks on the CDR test set. For DNER, our Peregrine system reached an F-score of 0.757. For CID, the system achieved an F-score of 0.526, which ranked second among 18 participating teams. Several post-challenge modifications of the systems resulted in substantially improved F-scores (0.828 for DNER and 0.602 for CID)

    Knowledge-based extraction of adverse drug events from biomedical text

    Get PDF
    Background: Many biomedical relation extraction systems are machine-learning based and have to be trained on large annotated corpora that are expensive and cumbersome to construct. We developed a knowledge-based relation extraction system that requires minimal training data, and applied the system for the extraction of adverse drug events from biomedical text. The system consists of a concept recognition module that identifies drugs and adverse effects in sentences, and a knowledg

    A two-stage deep learning approach for extracting entities and relationships from medical texts

    Get PDF
    This Work Presents A Two-Stage Deep Learning System For Named Entity Recognition (Ner) And Relation Extraction (Re) From Medical Texts. These Tasks Are A Crucial Step To Many Natural Language Understanding Applications In The Biomedical Domain. Automatic Medical Coding Of Electronic Medical Records, Automated Summarizing Of Patient Records, Automatic Cohort Identification For Clinical Studies, Text Simplification Of Health Documents For Patients, Early Detection Of Adverse Drug Reactions Or Automatic Identification Of Risk Factors Are Only A Few Examples Of The Many Possible Opportunities That The Text Analysis Can Offer In The Clinical Domain. In This Work, Our Efforts Are Primarily Directed Towards The Improvement Of The Pharmacovigilance Process By The Automatic Detection Of Drug-Drug Interactions (Ddi) From Texts. Moreover, We Deal With The Semantic Analysis Of Texts Containing Health Information For Patients. Our Two-Stage Approach Is Based On Deep Learning Architectures. Concretely, Ner Is Performed Combining A Bidirectional Long Short-Term Memory (Bi-Lstm) And A Conditional Random Field (Crf), While Re Applies A Convolutional Neural Network (Cnn). Since Our Approach Uses Very Few Language Resources, Only The Pre-Trained Word Embeddings, And Does Not Exploit Any Domain Resources (Such As Dictionaries Or Ontologies), This Can Be Easily Expandable To Support Other Languages And Clinical Applications That Require The Exploitation Of Semantic Information (Concepts And Relationships) From Texts...This work was supported by the Research Program of the Ministry of Economy and Competitiveness - Government of Spain, (DeepEMR project TIN2017-87548-C2-1-R)

    Extraction and Classification of Drug-Drug Interaction from Biomedical Text Using a Two-Stage Classifier

    Get PDF
    One of the critical causes of medical errors is Drug-Drug interaction (DDI), which occurs when one drug increases or decreases the effect of another drug. We propose a machine learning system to extract and classify drug-drug interactions from the biomedical literature, using the annotated corpus from the DDIExtraction-2013 shared task challenge. Our approach applies a two-stage classifier to handle the highly unbalanced class distribution in the corpus. The first stage is designed for binary classification of drug pairs as interacting or non-interacting, and the second stage for further classification of interacting pairs into one of four interacting types: advise, effect, mechanism, and int. To find the set of best features for classification, we explored many features, including stemmed words, bigrams, part of speech tags, verb lists, parse tree information, mutual information, and similarity measures, among others. As the system faced two different classification tasks, binary and multi-class, we also explored various classifiers in each stage. Our results show that the best performing classifier in both stages was Support Vector Machines, and the best performing features were 1000 top informative words and part of speech tags between two main drugs. We obtained an F-Measure of 0.64, showing a 12% improvement over our submitted system to the DDIExtraction 2013 competition

    Cómo adaptar un modelo de aprendizaje profundo a un nuevo dominio: el caso de la extracción de relaciones biomédicas

    Get PDF
    In this article, we study the relation extraction problem from Natural Language Processing (NLP) implementing a domain adaptation setting without external resources. We trained a Deep Learning (DL) model for Relation Extraction (RE), which extracts semantic relations in the biomedical domain. However, can the model be applied to different domains? The model should be adaptable to automatically extract relationships across different domains using the DL network. Completely training DL models in a short time is impractical because the models should quickly adapt to different datasets in several domains without delay. Therefore, adaptation is crucial for intelligent systems, where changing factors and unanticipated perturbations are common. In this study, we present a detailed analysis of the problem, as well as preliminary experimentation, results, and their evaluation.En este trabajo estudiamos el problema de extracción de relaciones del Procesamiento de Lenguaje Natural (PLN). Realizamos una configuración para la adaptación de dominio sin recursos externos. De esta forma, entrenamos un modelo con aprendizaje profundo (DL) para la extracción de relaciones (RE). El modelo permite extraer relaciones semánticas para el dominio biomédico. Sin embargo, ¿El modelo puede ser aplicado a diferentes dominios? El modelo debería adaptarse automáticamente para la extracción de relaciones entre diferentes dominios usando la red de DL. Entrenar completamente modelos DL en una escala de tiempo corta no es práctico, deseamos que los modelos se adapten rápidamente de diferentes conjuntos de datos con varios dominios y sin demora. Así, la adaptación es crucial para los sistemas inteligentes que operan en el mundo real, donde los factores cambiantes y las perturbaciones imprevistas son habituales. En este artículo, presentamos un análisis detallado del problema, una experimentación preliminar, resultados y la discusión acerca de los resultados

    A study on developing novel methods for relation extraction

    Get PDF
    Relation Extraction (RE) is a task of Natural Language Processing (NLP) to detect and classify the relations between two entities. Relation extraction in the biomedical and scientific literature domain is challenging as text can contain multiple pairs of entities in the same instance. During the course of this research, we developed an RE framework (RelEx), which consists of five main RE paradigms: rule-based, machine learning-based, Convolutional Neural Network (CNN)-based, Bidirectional Encoder Representations from Transformers (BERT)-based, and Graph Convolutional Networks (GCNs)-based approaches. RelEx\u27s rule-based approach uses co-location information of the entities to determine whether a relation exists between a selected entity and the other entities. RelEx\u27s machine learning-based approach consists of traditional feature representations into traditional machine learning algorithms. RelEx\u27s CNN-based approach consists of three CNN architectures: Segment-CNN, single-label Sentence-CNN, and multi-label Sentence-CNN. RelEx\u27s BERT-based approach utilizes BERT\u27s contextualized word embeddings into a feed-forward neural network. Finally, RelEx\u27s GCN-based approach consists of two GCN-based architectures: GCN-Vanilla, GCN-BERT. We evaluated variations of these approaches in two different domains across four distinct relation types. Overall our findings showed that the rule-based approach is applicable for data with fewer instances in the training data. In contrast, the CNN-based, BERT-based, and GCN-based approaches perform better with labeled data with many training instances. These approaches automatically identify patterns in the data efficiently, whereas rule-based approaches require expert knowledge to generate rules. The CNN-based, BERT-based approaches capture the local contextual information within a sentence or document by embedding both semantic and syntactic information in a learned representation. However, their ability to capture the long-range dependency global information in a text is limited. GCN-based approaches capture the global association information by performing convolution operations on neighbor nodes in a graph and incorporating information from neighbors. Combining GCN with BERT integrates the local contextual and global association information of the words and generates better representations for the words
    corecore