173,952 research outputs found

A common type system for clinical natural language processing

Author: Becker Lee
Chapman Wendy W
Chen Pei
Chute Christopher G
Dligach Dmitriy
Kaggal Vinod C
Liu Hongfang
Masanz James J
Savova Guergana Kirilova
Wu Stephen T
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 03/01/2013

Background: One challenge in reusing clinical data stored in electronic medical records is that these data are heterogeneous. Clinical Natural Language Processing (NLP) plays an important role in transforming information in clinical text to a standard representation that is comparable and interoperable. Information may be processed and shared when a type system specifies the allowable data structures. Therefore, we aim to define a common type system for clinical NLP that enables interoperability between structured and unstructured data generated in different clinical settings. Results: We describe a common type system for clinical NLP that has an end target of deep semantics based on Clinical Element Models (CEMs), thus interoperating with structured data and accommodating diverse NLP approaches. The type system has been implemented in UIMA (Unstructured Information Management Architecture) and is fully functional in a popular open-source clinical NLP system, cTAKES (clinical Text Analysis and Knowledge Extraction System) versions 2.0 and later. Conclusions: We have created a type system that targets deep semantics, thereby allowing NLP systems to encapsulate knowledge from text and share it alongside heterogeneous clinical data sources. Rather than surface semantics that are typically the end product of NLP algorithms, CEM-based semantics explicitly build in deep clinical semantics as the point of interoperability with more structured data types.

Harvard University - DASH

Springer - Publisher Connector

University of Melbourne Institutional Repository

Knowledge Author: Facilitating user-driven, domain content development to support clinical information extraction

Author: Chapman WW
Drews FA
Liu Y
Mowery D
Scuba W
Tharp M
Tseytlin E
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 23/06/2016

Background: Clinical Natural Language Processing (NLP) systems require a semantic schema composed of domain-specific concepts, their lexical variants, and associated modifiers to accurately extract information from clinical texts. An NLP system leverages this schema to structure concepts and extract meaning from the free text. In the clinical domain, creating a semantic schema typically requires input from both a domain expert, such as a clinician, and an NLP expert who translates clinical concepts drawn from the clinician's domain expertise into a computable format usable by an NLP system. The goal of this work is to develop a web-based tool, Knowledge Author, that bridges the gap between the clinical domain expert and NLP system development by facilitating the development of domain content represented in a semantic schema for extracting information from clinical free text. Results: Knowledge Author is a web-based recommendation system that supports users in developing the domain content necessary for clinical NLP applications. Knowledge Author's schematic model leverages a set of semantic types derived from the Secondary Use Clinical Element Models and the Common Type System to allow the user to quickly create and modify domain-related concepts. Features such as collaborative development and domain content suggestions, provided by mapping concepts to the Unified Medical Language System Metathesaurus database, further support the domain content creation process. Two proof-of-concept studies were performed to evaluate the system's performance. The first study evaluated Knowledge Author's flexibility to create a broad range of concepts. Of a dataset of 115 concepts, 87 (76%) could be created using Knowledge Author. The second study evaluated the effectiveness of Knowledge Author's output in an NLP system by extracting concepts and associated modifiers representing a clinical element, carotid stenosis, from 34 clinical free-text radiology reports using Knowledge Author and an NLP system, pyConText. Knowledge Author's domain content produced high recall for concepts (targeted findings: 86%) and varied recall for modifiers (certainty: 91%, sidedness: 80%, neurovascular anatomy: 46%). Conclusion: Knowledge Author can support clinical domain content development for information extraction by supporting semantic schema creation by domain experts.

Springer - Publisher Connector

PubMed Central

D-Scholarship@Pitt

University of Melbourne Institutional Repository

Implementing a Portable Clinical NLP System with a Common Data Model - a Lisp Perspective

Author: Luo Yuan
Szolovits Peter
Publication date: 14/11/2018

This paper presents a Lisp architecture for a portable NLP system, termed LAPNLP, for processing clinical notes. LAPNLP integrates multiple standard, customized and in-house developed NLP tools. Our system facilitates portability across different institutions and data systems by incorporating an enriched Common Data Model (CDM) to standardize necessary data elements. It utilizes UMLS to perform domain adaptation when integrating generic-domain NLP tools. It also features stand-off annotations that are specified by positional reference to the original document. We built an interval-tree-based search engine to efficiently query and retrieve the stand-off annotations by specifying positional requirements. We also developed a utility to convert an inline annotation format to stand-off annotations to enable the reuse of clinical text datasets with inline annotations. We experimented with our system on several NLP-facilitated tasks, including computational phenotyping for lymphoma patients and semantic relation extraction for clinical notes. These experiments showcased the broader applicability and utility of LAPNLP.

Comment: 6 pages, accepted by IEEE BIBM 2018 as regular paper
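The two annotation ideas in this abstract, converting inline markup to stand-off annotations and retrieving annotations by position, can be illustrated with a minimal sketch. It is not LAPNLP's implementation: it is written in Python rather than Lisp, the inline format `<LABEL>text</LABEL>` and the names `Annotation`, `inline_to_standoff` and `overlapping` are assumptions made for illustration, and the query is a plain linear scan rather than the interval tree the authors describe.

```python
import re
from typing import List, NamedTuple, Tuple


class Annotation(NamedTuple):
    start: int   # offset of the first covered character in the plain text
    end: int     # offset one past the last covered character
    label: str


# Assumed inline format (hypothetical): <LABEL>covered text</LABEL>, no nesting.
INLINE_TAG = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)


def inline_to_standoff(marked_up: str) -> Tuple[str, List[Annotation]]:
    """Strip inline tags; return the plain text and its stand-off annotations."""
    plain_parts: List[str] = []
    annotations: List[Annotation] = []
    cursor = 0      # read position in the marked-up input
    plain_len = 0   # length of the plain text emitted so far
    for match in INLINE_TAG.finditer(marked_up):
        before = marked_up[cursor:match.start()]
        plain_parts.append(before)
        plain_len += len(before)
        covered = match.group(2)
        annotations.append(
            Annotation(plain_len, plain_len + len(covered), match.group(1))
        )
        plain_parts.append(covered)
        plain_len += len(covered)
        cursor = match.end()
    plain_parts.append(marked_up[cursor:])
    return "".join(plain_parts), annotations


def overlapping(annotations: List[Annotation], start: int, end: int) -> List[Annotation]:
    """Linear scan for annotations whose [start, end) span overlaps the query."""
    return [a for a in annotations if a.start < end and start < a.end]


# Made-up example sentence, not taken from any dataset.
text, anns = inline_to_standoff(
    "Patient has <DISEASE>follicular lymphoma</DISEASE> "
    "treated with <DRUG>rituximab</DRUG>."
)
for a in anns:
    print(a.label, a.start, a.end, repr(text[a.start:a.end]))
```

Because each annotation stores only offsets into the untouched text, several tools can annotate the same note without rewriting it. An interval tree would replace the scan in `overlapping` once the number of annotations per document makes linear search too slow.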

arXiv.org e-Print Archive

DSpace@MIT

Crossref

Extracting information from the text of electronic medical records to improve case detection: a systematic review

Author: Afzal
Ananthakrishnan
Baus
Cano
Carroll
Castro
Chapman
Chen
Chung
Currie
de Lusignan
DeLisle
Donia Scott
Dorr
Elizabeth Ford
Ford
Friedlin
Friedman
Graiser
Greenhalgh
Gulliford
Gundlapalli
Hanauer
Harkema
Helen E Smith
Imfeld
Jackie A Cassell
John A Carroll
Jones
Kalra
Karnik
Koeling
Kushida
Li
Liao
Lin
Lindberg
Love
Lovis
Ludvigsson
Manning
Manuel
McPeek Hinz
Mehrabi
Meystre
Nielen
Pakhomov
Powsner
Rait
Resnik
Roch
Ryan
Savova
Soler
Stein
Stone
Tange
Tate
Tsui
Uzuner
Valkhoff
Walsh
Widdifield
Wilke
Wu
Xia
Xu
Yadav
Ye
Zeng
Zheng
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2016

Background: Electronic medical records (EMRs) are revolutionizing health-related research. One key issue for study quality is the accurate identification of patients with the condition of interest. Information in EMRs can be entered as structured codes or unstructured free text. The majority of research studies have used only coded parts of EMRs for case detection, which may bias findings, miss cases, and reduce study quality. This review examines whether incorporating information from text into case-detection algorithms can improve research quality. Methods: A systematic search returned 9659 papers, 67 of which reported on the extraction of information from free text of EMRs with the stated purpose of detecting cases of a named clinical condition. Methods for extracting information from text and the technical accuracy of case-detection algorithms were reviewed. Results: Studies mainly used US hospital-based EMRs, and extracted information from text for 41 conditions using keyword searches, rule-based algorithms, and machine learning methods. There was no clear difference in case-detection algorithm accuracy between rule-based and machine learning methods of extraction. Inclusion of information from text resulted in a significant improvement in algorithm sensitivity and area under the receiver operating characteristic curve in comparison to codes alone (median sensitivity 78% (codes + text) vs 62% (codes), P = .03; median area under the receiver operating characteristic curve 95% (codes + text) vs 88% (codes), P = .025). Conclusions: Text in EMRs is accessible, especially with open-source information extraction algorithms, and significantly improves case detection when combined with codes. More harmonization of reporting within EMR studies is needed, particularly standardized reporting of algorithm accuracy metrics like positive predictive value (precision) and sensitivity (recall).
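The two accuracy metrics named in this abstract, sensitivity (recall) and positive predictive value (precision), can be computed from a set of detected cases and a reference set of true cases. The sketch below is only illustrative: the patient identifiers and the function name `sensitivity_and_ppv` are hypothetical, and the toy numbers are invented to show the arithmetic, not drawn from the review.

```python
from typing import Set, Tuple


def sensitivity_and_ppv(detected: Set[str], gold: Set[str]) -> Tuple[float, float]:
    """Return (sensitivity, PPV) for a case-detection result against a gold standard."""
    if not detected or not gold:
        raise ValueError("both the detected set and the gold set must be non-empty")
    true_pos = len(detected & gold)
    sensitivity = true_pos / len(gold)   # share of true cases that were found
    ppv = true_pos / len(detected)       # share of flagged patients who are true cases
    return sensitivity, ppv


# Invented toy data: five true cases, some found by codes, some only by text.
gold = {"p1", "p2", "p3", "p4", "p5"}
found_by_codes = {"p1", "p2", "p3"}
found_by_text = {"p3", "p4", "p9"}   # p9 is a false positive

codes_only = sensitivity_and_ppv(found_by_codes, gold)
codes_plus_text = sensitivity_and_ppv(found_by_codes | found_by_text, gold)
print("codes only:  ", codes_only)
print("codes + text:", codes_plus_text)
```

In this toy case, adding text raises sensitivity because one more true case is found, while PPV falls because text also flags a non-case. Reporting both metrics together, as the review recommends, makes that trade-off visible.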

Crossref

PubMed Central

Sussex Research Online

Evaluating openEHR for storing computable representations of electronic health record phenotyping algorithms

Author: Denaxas Spiros
Hemingway Harry
Papez Vaclav
Publication date: 27/04/2017

Electronic Health Records (EHR) are data generated during routine clinical care. EHR offer researchers unprecedented phenotypic breadth and depth and have the potential to accelerate the pace of precision medicine at scale. A main EHR use case is creating phenotyping algorithms to define disease status, onset and severity. Currently, no common machine-readable standard exists for defining phenotyping algorithms, which are often stored in human-readable formats. As a result, translating algorithms into implementation code is challenging, and sharing them across the scientific community is problematic. In this paper, we evaluate openEHR, a formal EHR data specification, for computable representations of EHR phenotyping algorithms.

Comment: 30th IEEE International Symposium on Computer-Based Medical Systems - IEEE CBMS 201

arXiv.org e-Print Archive

Crossref

UCL Discovery