    Developing a natural language processing application for measuring the quality of colonoscopy procedures

    The quality of colonoscopy procedures for colorectal cancer screening is often inadequate and varies widely among physicians. Routine measurement of quality is limited by the costs of manual review of free-text patient charts. Our goal was to develop a natural language processing (NLP) application to measure colonoscopy quality.

    Provider-specific quality measurement for ERCP using natural language processing

    Background and Aims: Natural language processing (NLP) is an information retrieval technique that has been shown to accurately identify quality measures for colonoscopy. There are no systematic methods by which to track adherence to quality measures for ERCP, the highest risk endoscopic procedure widely used in practice. Our aim was to demonstrate the feasibility of using NLP to measure adherence to ERCP quality indicators across individual providers. Methods: ERCPs performed by 6 providers at a single institution from 2006 to 2014 were identified. Quality measures were defined using society guidelines and from expert opinion, and then extracted using a combination of NLP and data mining (eg, ICD9-CM codes). Validation for each quality measure was performed by manual record review. Quality measures were grouped into preprocedure (5), intraprocedure (6), and postprocedure (2). NLP was evaluated using measures of precision and accuracy. Results: A total of 23,674 ERCPs were analyzed (average patient age, 52.9 ± 17.8 years; 14,113 were women [59.6%]). Among 13 quality measures, precision of NLP ranged from 84% to 100%, with intraprocedure measures having lower precision (84% for precut sphincterotomy). Accuracy of NLP ranged from 90% to 100%, with intraprocedure measures having lower accuracy (90% for pancreatic stent placement). Conclusions: NLP in conjunction with data mining facilitates individualized tracking of ERCP providers for quality metrics without the need for manual medical record review. Incorporation of these tools across multiple centers may permit tracking of ERCP quality measures through national registries.

    A Frame-Based NLP System for Cancer-Related Information Extraction.

    We propose a frame-based natural language processing (NLP) method that extracts cancer-related information from clinical narratives. We focus on three frames: cancer diagnosis, cancer therapeutic procedure, and tumor description. We utilize a deep learning-based approach, bidirectional Long Short-term Memory (LSTM) Conditional Random Field (CRF), which uses both character and word embeddings. The system consists of two constituent sequence classifiers: a frame identification (lexical unit) classifier and a frame element classifier. The classifier achieves an

    Clinical Data Reuse or Secondary Use: Current Status and Potential Future Progress

    Objective: To perform a review of recent research in clinical data reuse or secondary use, and envision future advances in this field. Methods: The review is based on a large literature search in MEDLINE (through PubMed), conference proceedings, and the ACM Digital Library, focusing only on research published between 2005 and early 2016. Each selected publication was reviewed by the authors, and a structured analysis and summarization of its content was developed. Results: The initial search produced 359 publications, reduced after a manual examination of abstracts and full publications. The following aspects of clinical data reuse are discussed: motivations and challenges, privacy and ethical concerns, data integration and interoperability, data models and terminologies, unstructured data reuse, structured data mining, clinical practice and research integration, and examples of clinical data reuse (quality measurement and learning healthcare systems). Conclusion: Reuse of clinical data is a fast-growing field recognized as essential to realizing the potential for high-quality healthcare, improved healthcare management, reduced healthcare costs, population health management, and effective clinical research.

    Automatic Population of Structured Reports from Narrative Pathology Reports

    There are a number of advantages to the use of structured pathology reports: they can ensure the accuracy and completeness of pathology reporting, and it is easier for referring doctors to glean pertinent information from them. The goal of this thesis is to extract pertinent information from free-text pathology reports, automatically populate structured reports for cancer diseases, and identify the commonalities and differences in processing principles to obtain maximum accuracy. Three pathology corpora were annotated with entities and relationships between the entities in this study, namely the melanoma corpus, the colorectal cancer corpus and the lymphoma corpus. A supervised machine-learning-based approach, utilising conditional random fields learners, was developed to recognise medical entities from the corpora. By feature engineering, the best feature configurations were attained, which boosted the F-scores significantly from 4.2% to 6.8% on the training sets. Without proper negation and uncertainty detection, the quality of the structured reports will be diminished. The negation and uncertainty detection modules were built to handle this problem. The modules obtained overall F-scores ranging from 76.6% to 91.0% on the test sets. A relation extraction system was presented to extract four relations from the lymphoma corpus. The system achieved very good performance on the training set, with 100% F-score obtained by the rule-based module and 97.2% F-score attained by the support vector machines classifier. Rule-based approaches were used to generate the structured outputs and populate them to predefined templates. The rule-based system attained over 97% F-scores on the training sets. A pipeline system was implemented with an assembly of all the components described above. It achieved promising results in the end-to-end evaluations, with 86.5%, 84.2% and 78.9% F-scores on the melanoma, colorectal cancer and lymphoma test sets respectively.