
    Extracting clinical information from electronic medical records

    As the adoption of Electronic Medical Records (EMRs) rises in healthcare institutions, these resources grow more important every day because of the clinical data they contain about patients. However, the unstructured narrative text in those records makes it hard to extract and structure useful clinical information. This unstructured text limits the potential of EMRs, because the clinical data these records contain can support important operations inside healthcare institutions, such as search, summarization, decision support, and statistical analysis, as well as management decisions and research. These operations can only be performed if the clinical data in the narratives is properly extracted and structured. Usually this extraction is done manually by healthcare practitioners, which is inefficient and error-prone. The present work uses Natural Language Processing (NLP) and Information Extraction (IE) techniques to develop a pipeline system that automatically extracts clinical information directly from the unstructured text of Portuguese EMRs, helping EMRs fulfil their potential.
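
    The abstract does not detail the pipeline's stages, but the core move it describes — turning narrative text into structured fields by matching against a clinical vocabulary — can be sketched in a few lines. The mini-lexicon, labels, and function names below are illustrative assumptions, not the thesis's actual implementation:

```python
import re

# Hypothetical mini-lexicon standing in for a real clinical vocabulary;
# the actual pipeline's resources are not public in this abstract.
LEXICON = {
    "hipertensão": "CONDITION",
    "diabetes": "CONDITION",
    "paracetamol": "MEDICATION",
}

def extract_entities(note: str):
    """Return (term, label, offset) triples found in a free-text note."""
    found = []
    for term, label in LEXICON.items():
        for m in re.finditer(re.escape(term), note, flags=re.IGNORECASE):
            found.append((m.group(0).lower(), label, m.start()))
    # sort by offset so the structured output follows the narrative order
    return sorted(found, key=lambda t: t[2])

note = "Doente com hipertensão e diabetes, medicado com paracetamol."
entities = extract_entities(note)
```

    A real system would add tokenization, negation detection, and section handling on top of this lookup step, but the structured triples it emits are the kind of output that enables the search and statistical operations the abstract mentions.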

    Clinical Assistant Diagnosis for Electronic Medical Record Based on Convolutional Neural Network

    Automatically extracting useful information from electronic medical records and conducting disease diagnosis is a promising task for both clinical decision support (CDS) and natural language processing (NLP). Most existing systems are based on artificially constructed knowledge bases, with auxiliary diagnosis performed by rule matching. In this study, we present a clinical intelligent decision approach based on Convolutional Neural Networks (CNNs), which can automatically extract high-level semantic information from electronic medical records and then perform automatic diagnosis without manually constructed rules or knowledge bases. We use 18,590 collected real-world clinical electronic medical records to train and test the proposed model. Experimental results show that the proposed model achieves 98.67% accuracy and 96.02% recall, which strongly supports the feasibility and effectiveness of using a convolutional neural network to automatically learn high-level semantic features of electronic medical records and then assist diagnosis. (Comment: 9 pages, 4 figures; accepted by Scientific Reports)
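
    The abstract gives no architecture details beyond "CNN", but the standard text-CNN forward pass such models typically use (embedding → 1-D convolution → ReLU → max-over-time pooling → softmax over diagnoses) can be sketched in pure Python. All dimensions and weights below are toy assumptions, not the paper's configuration:

```python
import math
import random

random.seed(0)

# Toy dimensions; the paper's real architecture is not specified here.
VOCAB, EMB, SEQ, FILTERS, KERNEL, CLASSES = 50, 8, 12, 4, 3, 5

def rand_mat(rows, cols):
    return [[random.gauss(0, 0.1) for _ in range(cols)] for _ in range(rows)]

E = rand_mat(VOCAB, EMB)                              # embedding table
W = [rand_mat(KERNEL, EMB) for _ in range(FILTERS)]   # conv filters
V = rand_mat(CLASSES, FILTERS)                        # classifier head

def forward(token_ids):
    x = [E[t] for t in token_ids]          # embed the record's tokens
    feats = []
    for w in W:                            # 1-D convolution over token axis
        acts = []
        for i in range(SEQ - KERNEL + 1):
            s = sum(x[i + k][j] * w[k][j]
                    for k in range(KERNEL) for j in range(EMB))
            acts.append(max(s, 0.0))       # ReLU
        feats.append(max(acts))            # max-over-time pooling
    logits = [sum(v[f] * feats[f] for f in range(FILTERS)) for v in V]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]           # softmax over diagnosis classes

probs = forward([random.randrange(VOCAB) for _ in range(SEQ)])
```

    Training (backpropagation over the 18,590 records) is omitted; the sketch only shows how convolution and pooling turn a variable-length note into a fixed-size feature vector for classification.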

    GR-453 Medical Records Summarization Using Prompt-Based NLP

    In this paper, we present an innovative Natural Language Processing (NLP) algorithm for summarizing medical records extracted from the MIMIC-IV dataset using state-of-the-art (SOTA) text-summarization techniques. The increasing volume of electronic health records (EHRs) demands efficient methods for extracting meaningful insights from these complex and extensive documents. Our algorithm leverages recent advancements in NLP, including transformer-based models, to automate the summarization of medical records while preserving critical information. It is trained and tested on the Medical Information Mart for Intensive Care (MIMIC-IV) database, which provides critical care data for over 40,000 patients admitted to intensive care units at the Beth Israel Deaconess Medical Center (BIDMC) between 2008 and 2019. The algorithm aims to extract the query text from medical records in the MIMIC-IV dataset, which often contain diverse and extensive clinical information.
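
    The prompt-based transformer models the paper uses cannot be reproduced in a few lines, but the summarization task itself can be illustrated with a classic frequency-based extractive baseline: score each sentence by the frequency of its content words and keep the top scorers. Everything here is an illustrative stand-in, not the GR-453 system:

```python
import re
from collections import Counter

STOP = {"the", "a", "an", "of", "and", "to", "was", "is", "with", "on", "in", "for"}

def summarize(text, n_sentences=1):
    """Return the n highest-scoring sentences, kept in original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP]
    freq = Counter(words)

    def score(s):
        toks = [w for w in re.findall(r"[a-z]+", s.lower()) if w not in STOP]
        return sum(freq[t] for t in toks) / (len(toks) or 1)

    top = sorted(sentences, key=score, reverse=True)[:n_sentences]
    return " ".join(s for s in sentences if s in top)

note = ("Patient admitted with chest pain. Chest pain resolved after nitroglycerin. "
        "Diet was discussed.")
summary = summarize(note, n_sentences=1)
```

    An abstractive transformer would instead generate new sentences conditioned on the note and a prompt; the extractive baseline is a common reference point against which such models are evaluated.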

    Multi-domain clinical natural language processing with MedCAT: The Medical Concept Annotation Toolkit

    Electronic health records (EHRs) contain large volumes of unstructured text, requiring the application of information extraction (IE) technologies to enable clinical analysis. We present the open-source Medical Concept Annotation Toolkit (MedCAT), which provides: (a) a novel self-supervised machine learning algorithm for extracting concepts using any concept vocabulary, including UMLS/SNOMED-CT; (b) a feature-rich annotation interface for customizing and training IE models; and (c) integrations with the broader CogStack ecosystem for vendor-agnostic health system deployment. We show improved performance in extracting UMLS concepts from open datasets (F1: 0.448-0.738 vs 0.429-0.650). Further real-world validation demonstrates SNOMED-CT extraction at 3 large London hospitals, with self-supervised training over ∼8.8B words from ∼17M clinical records and further fine-tuning with ∼6K clinician-annotated examples. We show strong transferability (F1 > 0.94) between hospitals, datasets, and concept types, indicating cross-domain, EHR-agnostic utility for accelerated clinical and research use cases.
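
    MedCAT's actual API and models are beyond the scope of this abstract, but the self-supervised idea it describes — learn a context profile for each concept from its unambiguous full-name mentions, then use those profiles to resolve ambiguous surface forms — can be shown with a toy two-concept vocabulary. The concepts, abbreviation, and corpus below are invented for illustration:

```python
from collections import Counter

# Toy vocabulary: the abbreviation "ra" is ambiguous between two concepts.
VOCAB = {
    "rheumatoid arthritis": "C_RHEUM",
    "right atrium": "C_ATRIUM",
    "ra": None,  # ambiguous surface form, to be disambiguated by context
}

def context_profiles(corpus, window=3):
    """Self-supervised step: build word-count profiles for each concept
    from its unambiguous full-name mentions (no manual labels needed)."""
    profiles = {"C_RHEUM": Counter(), "C_ATRIUM": Counter()}
    for doc in corpus:
        toks = doc.lower().split()
        for name, cid in VOCAB.items():
            if cid is None:
                continue
            n = name.split()
            for i in range(len(toks) - len(n) + 1):
                if toks[i:i + len(n)] == n:
                    ctx = (toks[max(0, i - window):i]
                           + toks[i + len(n):i + len(n) + window])
                    profiles[cid].update(ctx)
    return profiles

def disambiguate(profiles, context_words):
    """Pick the concept whose learned profile best overlaps the context."""
    return max(profiles, key=lambda cid: sum(profiles[cid][w] for w in context_words))

corpus = [
    "joint pain suggests rheumatoid arthritis flare",
    "the right atrium shows dilation on echo",
]
profiles = context_profiles(corpus)
concept = disambiguate(profiles, ["joint", "pain", "flare"])
```

    MedCAT learns dense context embeddings rather than raw counts, and does so over billions of words, but the training signal — unambiguous mentions supervising ambiguous ones — is the same.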

    Natural Language Processing for Medical Texts – A Taxonomy to Inform Integration Decisions into Clinical Practice

    Electronic health records (EHRs) have significantly amplified the volume of information accessible in the healthcare sector. Nevertheless, this information load also translates into elevated workloads for clinicians engaged in extracting and generating patient information. Natural Language Processing (NLP) aims to overcome this problem by automatically extracting and structuring relevant information from medical texts. While other methods related to artificial intelligence have been implemented successfully in healthcare (e.g., computer vision in radiology), NLP still lacks commercial success in this domain. The lack of a structured overview of NLP systems exacerbates the problem, especially with the emergence of new technologies like generative pre-trained transformers. Against this background, this paper presents a taxonomy to inform decisions on integrating NLP systems into healthcare IT landscapes. We contribute to a better understanding of how NLP systems can be integrated into daily clinical contexts. In total, we reviewed 29 papers and 36 commercial NLP products.

    Extracting information from the text of electronic medical records to improve case detection: a systematic review

    Background: Electronic medical records (EMRs) are revolutionizing health-related research. One key issue for study quality is the accurate identification of patients with the condition of interest. Information in EMRs can be entered as structured codes or unstructured free text. The majority of research studies have used only the coded parts of EMRs for case detection, which may bias findings, miss cases, and reduce study quality. This review examines whether incorporating information from text into case-detection algorithms can improve research quality. Methods: A systematic search returned 9659 papers, 67 of which reported on the extraction of information from free text of EMRs with the stated purpose of detecting cases of a named clinical condition. Methods for extracting information from text and the technical accuracy of case-detection algorithms were reviewed. Results: Studies mainly used US hospital-based EMRs, and extracted information from text for 41 conditions using keyword searches, rule-based algorithms, and machine learning methods. There was no clear difference in case-detection algorithm accuracy between rule-based and machine learning methods of extraction. Inclusion of information from text resulted in a significant improvement in algorithm sensitivity and area under the receiver operating characteristic in comparison to codes alone (median sensitivity 78% (codes + text) vs 62% (codes), P = .03; median area under the receiver operating characteristic 95% (codes + text) vs 88% (codes), P = .025). Conclusions: Text in EMRs is accessible, especially with open-source information extraction algorithms, and significantly improves case detection when combined with codes. More harmonization of reporting within EMR studies is needed, particularly standardized reporting of algorithm accuracy metrics like positive predictive value (precision) and sensitivity (recall).
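
    The review's central finding — combining coded fields with free text catches cases that codes alone miss, raising sensitivity — can be sketched with a toy keyword-based detector. The patient records, code list, and keywords below are invented for illustration:

```python
# Each toy record has structured codes and a free-text note.
patients = [
    {"codes": {"I10"}, "note": "hypertension, well controlled"},
    {"codes": set(),   "note": "patient reports history of hypertension"},
    {"codes": {"E11"}, "note": "diabetes follow-up, no complaints"},
]

HTN_CODES = {"I10"}  # hypothetical structured-code list for the condition
HTN_KEYWORDS = ("hypertension", "high blood pressure")

def detect(patient, use_text=False):
    """Flag a patient as a case using codes, optionally plus free text."""
    if patient["codes"] & HTN_CODES:
        return True
    if use_text and any(k in patient["note"].lower() for k in HTN_KEYWORDS):
        return True
    return False

codes_only = [detect(p) for p in patients]
codes_plus_text = [detect(p, use_text=True) for p in patients]
# the text step recovers the uncoded second patient, mirroring the
# sensitivity gain the review reports for codes + text
```

    Real case-detection algorithms add negation handling and context rules (a note saying "no hypertension" must not fire), which is where the rule-based and machine learning methods the review compares come in.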

    Named Entity Recognition in Electronic Health Records Using Transfer Learning Bootstrapped Neural Networks

    Neural networks (NNs) have become the state of the art in many machine learning applications, especially in image and sound processing [1]. The same, although to a lesser extent [2,3], can be said of natural language processing (NLP) tasks such as named entity recognition. However, the success of NNs remains dependent on the availability of large labelled datasets, which is a significant hurdle in many important applications. One such case is electronic health records (EHRs), which are arguably the largest source of medical data, most of which lies hidden in natural text [4,5]. Data access is difficult due to privacy concerns, and therefore annotated datasets are scarce. With scarce data, NNs will likely not be able to extract this hidden information with practical accuracy. In our study, we develop an approach that solves these problems for named entity recognition, obtaining a 94.6 F1 score on the I2B2 2009 Medical Extraction Challenge [6], 4.3 above the architecture that won the competition. Beyond the official I2B2 challenge, we further achieve 82.4 F1 on extracting relationships between medical terms. To reach this state-of-the-art accuracy, our approach applies transfer learning to leverage datasets annotated for other I2B2 tasks, and designs and trains embeddings that specially benefit from such transfer. (Comment: 11 pages, 4 figures, 8 tables)
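
    The core transfer-learning move the abstract describes — initialize the target NER model with embeddings trained on related I2B2 tasks instead of from scratch — can be shown mechanically. The words, vector size, and function names here are invented; the paper's actual embeddings and architecture are not reproduced:

```python
import random

random.seed(0)

# Stand-in for embeddings pretrained on a related, better-annotated task
# (the transfer source); in the paper these come from other I2B2 tasks.
pretrained = {w: [random.gauss(0, 1) for _ in range(4)]
              for w in ["aspirin", "fever", "daily", "mg"]}

def init_target_model(vocab, source_embeddings, dim=4):
    """Bootstrap the target NER model: reuse source-task embeddings where
    available and randomly initialise the rest (the transfer step)."""
    emb = {}
    for w in vocab:
        emb[w] = list(source_embeddings.get(
            w, [random.gauss(0, 1) for _ in range(dim)]))
    return emb

target_vocab = ["aspirin", "mg", "headache"]  # "headache" unseen at source
emb = init_target_model(target_vocab, pretrained)
```

    With the scarce target annotations, only the task-specific layers then need to be learned from few examples, which is why the approach tolerates the small labelled datasets typical of EHR work.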