1,193 research outputs found

    Extracting information from the text of electronic medical records to improve case detection: a systematic review

    Background: Electronic medical records (EMRs) are revolutionizing health-related research. One key issue for study quality is the accurate identification of patients with the condition of interest. Information in EMRs can be entered as structured codes or unstructured free text. The majority of research studies have used only coded parts of EMRs for case-detection, which may bias findings, miss cases, and reduce study quality. This review examines whether incorporating information from text into case-detection algorithms can improve research quality. Methods: A systematic search returned 9659 papers, 67 of which reported on the extraction of information from free text of EMRs with the stated purpose of detecting cases of a named clinical condition. Methods for extracting information from text and the technical accuracy of case-detection algorithms were reviewed. Results: Studies mainly used US hospital-based EMRs, and extracted information from text for 41 conditions using keyword searches, rule-based algorithms, and machine learning methods. There was no clear difference in case-detection algorithm accuracy between rule-based and machine learning methods of extraction. Inclusion of information from text resulted in a significant improvement in algorithm sensitivity and area under the receiver operating characteristic curve in comparison to codes alone (median sensitivity 78% (codes + text) vs 62% (codes), P = .03; median area under the receiver operating characteristic curve 95% (codes + text) vs 88% (codes), P = .025). Conclusions: Text in EMRs is accessible, especially with open source information extraction algorithms, and significantly improves case detection when combined with codes. More harmonization of reporting within EMR studies is needed, particularly standardized reporting of algorithm accuracy metrics like positive predictive value (precision) and sensitivity (recall).
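
The two accuracy metrics named in the abstract above can be computed from confusion-matrix counts. The following is a minimal sketch, not code from the review: the function names and the counts are hypothetical and chosen only for illustration.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Recall: share of true cases that the algorithm detects."""
    if tp + fn == 0:
        raise ValueError("no true cases in the reference standard")
    return tp / (tp + fn)


def positive_predictive_value(tp: int, fp: int) -> float:
    """Precision: share of flagged patients who are true cases."""
    if tp + fp == 0:
        raise ValueError("the algorithm flagged no patients")
    return tp / (tp + fp)


# Hypothetical counts, not taken from any study in the review.
tp, fp, fn = 40, 10, 10
print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"PPV         = {positive_predictive_value(tp, fp):.2f}")
```

Reporting both values matters because a case-detection algorithm can raise sensitivity simply by flagging more patients, which tends to lower positive predictive value.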

    Combining Unsupervised, Supervised, and Rule-based Algorithms for Text Mining of Electronic Health Records - A Clinical Decision Support System for Identifying and Classifying Allergies of Concern for Anesthesia During Surgery

    Undisclosed allergic reactions of patients are a major risk when undertaking surgeries in hospitals. We present our early experience and preliminary findings for a Clinical Decision Support System (CDSS) being developed in a Norwegian Hospital Trust. The system incorporates unsupervised and supervised machine learning algorithms in combination with rule-based algorithms to identify and classify allergies of concern for anesthesia during surgery. Our approach is novel in that it utilizes unsupervised machine learning to analyze large corpora of narratives to automatically build a clinical language model containing words and phrases whose meanings and relative meanings are also learnt. It further implements a semi-automatic annotation scheme for efficient and interactive machine learning, which to a large extent eliminates the substantial effort of manually annotating clinical narratives that is necessary for training supervised algorithms. System performance was validated by comparing allergies identified by the CDSS with a manual reference standard.

    Knowledge Author: Facilitating user-driven, Domain content development to support clinical information extraction

    Background: Clinical Natural Language Processing (NLP) systems require a semantic schema comprised of domain-specific concepts, their lexical variants, and associated modifiers to accurately extract information from clinical texts. An NLP system leverages this schema to structure concepts and extract meaning from the free texts. In the clinical domain, creating a semantic schema typically requires input from both a domain expert, such as a clinician, and an NLP expert who will represent clinical concepts created from the clinician's domain expertise in a computable format usable by an NLP system. The goal of this work is to develop a web-based tool, Knowledge Author, that bridges the gap between the clinical domain expert and NLP system development by facilitating the development of domain content represented in a semantic schema for extracting information from clinical free text. Results: Knowledge Author is a web-based recommendation system that supports users in developing domain content necessary for clinical NLP applications. Knowledge Author's schematic model leverages a set of semantic types derived from the Secondary Use Clinical Element Models and the Common Type System to allow the user to quickly create and modify domain-related concepts. Features such as collaborative development and domain content suggestions, provided by mapping concepts to the Unified Medical Language System Metathesaurus database, further support the domain content creation process. Two proof of concept studies were performed to evaluate the system's performance. The first study evaluated Knowledge Author's flexibility to create a broad range of concepts. A dataset of 115 concepts was compiled, of which 87 (76%) could be created using Knowledge Author.
The second study evaluated the effectiveness of Knowledge Author's output in an NLP system by extracting concepts and associated modifiers representing a clinical element, carotid stenosis, from 34 clinical free-text radiology reports using Knowledge Author and an NLP system, pyConText. Knowledge Author's domain content produced high recall for concepts (targeted findings: 86%) and varied recall for modifiers (certainty: 91%, sidedness: 80%, neurovascular anatomy: 46%). Conclusion: Knowledge Author can support clinical domain content development for information extraction by supporting semantic schema creation by domain experts.

    A review of automatic phenotyping approaches using electronic health records

    Electronic Health Records (EHRs) are a rich repository of valuable clinical information that exist in primary and secondary care databases. In order to utilize EHRs for medical observational research, a range of algorithms for automatically identifying individuals with a specific phenotype have been developed. This review summarizes and offers a critical evaluation of the literature relating to studies conducted into the development of EHR phenotyping systems. This review describes phenotyping systems and techniques based on structured and unstructured EHR data. Articles published on PubMed and Google Scholar between 2013 and 2017 have been reviewed, using search terms derived from Medical Subject Headings (MeSH). The popularity of using Natural Language Processing (NLP) techniques in extracting features from narrative text has increased. This increased attention is due to the availability of open source NLP algorithms, combined with improvements in accuracy. In this review, concept extraction is the most popular NLP technique, since it has been used by more than 50% of the reviewed papers to extract features from EHRs. High-throughput phenotyping systems using unsupervised machine learning techniques have gained more popularity due to their ability to efficiently and automatically extract a phenotype with minimal human effort.

    THE USE OF ELECTRONIC MEDICAL RECORDS BASED ON A PHYSICIAN DIAGNOSIS OF ASTHMA FOR COUNTY WIDE ASTHMA SURVEILLANCE

    Allegheny County (AC) has limited information on asthma morbidity. In order to improve the sensitivity of asthma surveillance, a cross-sectional study from January 1, 2002 through December 31, 2005 was conducted to determine whether the data received for emergency room visits from a large regional medical center might be a good predictor for quantifying asthma cases for surveillance. An electronic medical record (EMR) abstract using the Council of State and Territorial Epidemiologists (CSTE) Asthma Surveillance case definition of an ICD-9-coded physician diagnosis for primary and secondary asthma (n = 18,284) and primary asthma (n = 5,100) was used to define asthma. The analysis used data from a subset of six hospitals from a large regional medical center covering approximately 60% of adult ED visits in AC that use electronic data for reporting. A secondary analysis of the physician-diagnosed primary asthma cases (n = 180) was applied against the CSTE Clinical and Laboratory case definition. Statistical software was used to validate these data abstracted from the EMR. Once these data were validated for accuracy, a fourth dataset of any primary asthma emergency room visits (n = 10,183) was used to test the relationship between asthma morbidity and exposure to ozone. Recent studies have linked asthma hospitalizations in several cities to ozone action days. However, data on the effects of ozone as they relate to asthma emergency room (ER) visits have not been well studied. Electronic medical records from the six hospitals representing the large metropolitan medical center in Allegheny County, PA were obtained on individuals with asthma based on the ICD-9 discharge diagnosis of (493.0-493.9) for the respective time period. Data on ozone, PM2.5, and temperature were obtained for the same period.
A case-crossover methodology using conditional logistic regression as the statistical estimator was conducted to assess the relationship between levels of ozone and PM2.5 and increases in asthma ER visits. A time-stratified sampling strategy was employed assuming a 3:1 case-control ratio. A total of 6,979 individuals were included in the study, with a mean age of 39.25 ± 21.0. The mean ozone exposure for this period was 40.6 ppb (range: 0-126). The effect estimate for year-round data was greatest for a 2-day lag adjusted for temperature (OR = 1.02; 95% CI = 1.01-1.04; p < .05). For each 10-ppb increase in 24-hour maximum ozone, a 2% increase was noted in asthma ER visits. These results indicate that asthma ED visits may be an additional source of information for use in environmental public health tracking.
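
The step from the reported odds ratio to the quoted 2% increase can be checked with a short calculation. This is a sketch under an assumption not spelled out in the abstract: that the odds ratio of 1.02 is expressed per 10-ppb increase in ozone. The function name is hypothetical.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Convert an odds ratio into a percentage change in odds."""
    return (odds_ratio - 1.0) * 100.0


# Values as reported in the abstract: point estimate and 95% CI bounds.
for label, value in [("point estimate", 1.02),
                     ("lower 95% CI", 1.01),
                     ("upper 95% CI", 1.04)]:
    change = percent_change_in_odds(value)
    print(f"{label}: OR {value:.2f} -> {change:.0f}% change in odds")
```

Strictly, an odds ratio describes a change in odds rather than in visit counts; for an odds ratio this close to 1 the two readings are nearly the same.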

    Front-Line Physicians' Satisfaction with Information Systems in Hospitals

    Day-to-day operations management in hospital units is difficult due to continuously varying situations, several actors involved and a vast number of information systems in use. The aim of this study was to describe front-line physicians' satisfaction with existing information systems needed to support the day-to-day operations management in hospitals. A cross-sectional survey was used and data chosen with stratified random sampling were collected in nine hospitals. Data were analyzed with descriptive and inferential statistical methods. The response rate was 65% (n = 111). The physicians reported that information systems support their decision making to some extent, but they do not improve access to information nor are they tailored for physicians. The respondents also reported that they need to use several information systems to support decision making and that they would prefer one information system to access important information. Improved information access would better support physicians' decision making and has the potential to improve the quality of decisions and speed up the decision making process.
    Peer reviewed

    Consolidation of CDA-based documents from multiple sources : a modular approach

    Indiana University-Purdue University Indianapolis (IUPUI)
    Physicians receive multiple CCDs for a single patient encompassing various encounters and medical history recorded in different information systems. It is cumbersome for providers to explore different pages of CCDs to find specific data, which can be duplicated or even conflicting. This study describes the steps towards a system that integrates multiple CCDs into one consolidated document for viewing or processing patient-level data. The impact of the system on healthcare providers' perceived workload is also evaluated. A modular system was developed to consolidate and de-duplicate CDA-based documents. The system is engineered to be scalable, extensible and open source. The system's performance and output were evaluated first on synthesized data and later on real-world CCDs obtained from the INPC database. The accuracy of the consolidation system and the gaps in identifying duplicates were assessed. Finally, the impact of the system on healthcare providers' workload was evaluated using the NASA TLX tool. All of the synthesized CCDs were successfully consolidated, and no data were lost. The de-duplication accuracy was 100% on synthesized data, and the processing time for each document was 1.12 seconds. For real-world CCDs, our system de-duplicated 99.1% of the problems, 87.0% of allergies, and 91.7% of medications. The accuracy of the system is very promising, although some inaccuracy remains. Due to system improvements, the average processing time was reduced to 0.38 seconds per CCD. The result of the NASA TLX evaluation shows that the system significantly decreases healthcare providers' perceived workload. It was also observed that information reconciliation reduces medical errors. The time needed to review medical documents is significantly reduced after CCD consolidation.
Given increasing adoption and use of Health Information Exchange (HIE) to share data and information across the care continuum, duplication of information is inevitable. A novel system designed to support automated consolidation and de-duplication of information across clinical documents as they are exchanged shows promise. Future work is needed to expand the capabilities of the system and further test it using heterogeneous vocabularies across multiple HIE scenarios

    Defining and Testing EMR Usability: Principles and Proposed Methods of EMR Usability Evaluation and Rating

    For more information about the Information Experience Laboratory, visit http://ielab.missouri.edu/
    Electronic medical record (EMR) adoption rates have been slower than expected in the United States, especially in comparison to other industry sectors and other developed countries. A key reason, aside from initial costs and lost productivity during EMR implementation, is lack of efficiency and usability of EMRs currently available. Achieving the healthcare reform goals of broad EMR adoption and “meaningful use” will require that efficiency and usability be effectively addressed at a fundamental level. We conducted a literature review of usability principles, especially those applicable to EMRs. The key principles identified were simplicity, naturalness, consistency, minimizing cognitive load, efficient interactions, forgiveness and feedback, effective use of language, effective information presentation, and preservation of context. Usability is often mistakenly equated with user satisfaction, which is an oversimplification. We describe methods of usability evaluation, offering several alternative methods for measuring efficiency and effectiveness, including patient safety. We provide samples of objective, repeatable and cost-efficient test scenarios applicable to evaluating EMR usability as an adjunct to certification, and we discuss rating schema for scoring the results. (42 pages)

    Contextualized clinical decision support to detect and prevent adverse drug events
