J Biomed Inform
We followed a systematic approach based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) to identify existing clinical natural language processing (NLP) systems that generate structured information from unstructured free text. Seven literature databases were searched with a query combining the concepts of natural language processing and structured data capture. Two reviewers screened all records for relevance during two screening phases, and information about clinical NLP systems was collected from the final set of papers. A total of 7149 records (after removing duplicates) were retrieved and screened, and 86 were determined to fit the review criteria. These papers contained information about 71 different clinical NLP systems, which were then analyzed. The NLP systems address a wide variety of important clinical and research tasks. Certain tasks are well addressed by the existing systems, while others remain open challenges that only a small number of systems attempt, such as extraction of temporal information or normalization of concepts to standard terminologies. This review has identified many NLP systems capable of processing clinical free text and generating structured output, and the information collected and evaluated here will be important for prioritizing the development of new approaches for clinical NLP.
Identifying and reducing inappropriate use of medications using Electronic Health Records
Inappropriate use of medications (IUM) is a global problem that can lead to unnecessary harm to patients and unnecessary costs across the health care system. Identifying and reducing IUM has been a long-standing challenge, and currently no systematic, automated solution exists to address it. IUM can be identified manually by experts using medication appropriateness criteria (MAC).
In this research, I first conducted a review of approaches used to identify and reduce IUM. Next, I developed a conceptual model for representing the MAC, and then developed a tool and a workflow for translating the MAC into structured form. Because indications are an important component of the MAC, I conducted a critical appraisal of the existing knowledge sources that can be used to that end, namely medication-indication knowledge bases. Finally, I demonstrated how these structured MAC can be used to identify patients who are potentially subject to IUM and evaluated the accuracy of this approach.
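A structured MAC of the kind described above can be sketched as a small rule schema checked against a patient's medication and problem lists. The field names, the example rule, and the matching logic below are hypothetical illustrations, not the dissertation's actual representation:

```python
from dataclasses import dataclass

@dataclass
class MacRule:
    # All field names are illustrative; the actual MAC schema may differ.
    drug: str            # medication the criterion applies to
    condition: str       # patient condition or indication triggering the rule
    appropriate: bool    # whether use is considered appropriate in this context
    rationale: str       # free-text justification from the source criteria

def flag_potential_ium(rules, patient_drugs, patient_conditions):
    """Return rules suggesting potentially inappropriate use for this patient."""
    return [r for r in rules
            if not r.appropriate
            and r.drug in patient_drugs
            and r.condition in patient_conditions]

# Hypothetical example: an NSAID flagged as inappropriate given a GI-bleeding history.
rules = [MacRule("ibuprofen", "history of GI bleeding", False,
                 "NSAIDs increase risk of recurrent bleeding")]
hits = flag_potential_ium(rules, {"ibuprofen"}, {"history of GI bleeding"})
```

In practice the condition matching would run against coded EHR data rather than literal strings, which is where the medication-indication knowledge bases appraised above come in.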
This research identifies the knowledge gaps and technological challenges in identifying and reducing IUM and addresses some of these gaps through the creation of a representation for MAC, a repository of structured MAC, and a set of tools that can assist in evaluating the impact of interventions aimed at reducing IUM or assessing its downstream effects. This research also discusses the limitations of existing methods for executing computable decision support rules and proposes the solutions needed to enhance these methods so they can support implementation of the MAC.
Ontology-based Semantic Harmonization of HIV-associated Common Data Elements for Integration of Diverse HIV Research Datasets
Analysis of integrated, diverse, Human Immunodeficiency Virus (HIV)-associated datasets can increase knowledge and guide the development of novel and effective interventions for disease prevention and treatment by increasing breadth of variables and statistical power, particularly for sub-group analyses. This topic has been identified as a National Institutes of Health research priority, but few efforts have been made to integrate data across HIV studies. Our aims were to: 1) Characterize the semantic heterogeneity (SH) in the HIV research domain; 2) Identify HIV-associated common data elements (CDEs) in empirically generated and knowledge-based resources; 3) Create a formal representation of HIV-associated CDEs in the form of an HIV-associated Entities in Research Ontology (HERO); 4) Assess the feasibility of using HERO to semantically harmonize HIV research data. Our approach was guided by information/knowledge theory and the DIKW (Data Information Knowledge Wisdom) hierarchical model.
Our systematized review of the literature revealed that the purposes of synergistic use of ontologies and CDEs included integration, interoperability, data exchange, and data standardization. Moreover, methods and tools included the use of experts for CDE identification, the Unified Medical Language System, natural language processing, Extensible Markup Language, Health Level 7, and ontology development tools (e.g., Protégé). Additionally, evaluation methods included expert assessment, quantification of mapping tasks between raters, assessment of interrater reliability, and comparison to established standards. We used these findings to inform our process for achieving the study aims.
For Aim 1, we analyzed eight disparate HIV-associated data dictionaries and developed a String Metric-assisted Assessment of Semantic Heterogeneity (SMASH) method, which aided identification of 127 (13%) homogeneous data element (DE) pairs and 1,048 (87%) semantically heterogeneous DE pairs. Most heterogeneous pairs (97%) were semantically-equivalent/syntactically-different, allowing us to determine that SH in the HIV research domain was high.
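A string-metric-assisted classification of data-element pairs like the one SMASH performs can be sketched with a standard similarity ratio; the normalization, metric, and threshold below are illustrative assumptions, not the actual SMASH method:

```python
from difflib import SequenceMatcher

def normalize(name):
    """Crude normalization of a data-element name (illustrative only)."""
    return name.lower().replace("_", " ").strip()

def classify_pair(de1, de2, threshold=0.85):
    """Classify a data-element pair as homogeneous or heterogeneous by
    string similarity of their names. Threshold and metric are assumed;
    SMASH as described in the dissertation may differ."""
    score = SequenceMatcher(None, normalize(de1), normalize(de2)).ratio()
    return "homogeneous" if score >= threshold else "heterogeneous"

# e.g., "Patient_Age" vs "patient age" -> homogeneous
```

Pairs that are semantically equivalent but syntactically different (the 97% majority reported above) are exactly the cases a purely lexical metric misses, which is why expert review remains part of the workflow.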
To achieve Aim 2, we used ClinicalTrials.gov, Google Search, and text mining in R to identify HIV-associated CDEs in HIV journal articles, HIV-associated datasets, the AIDSinfo HIV/AIDS Glossary, the AIDSinfo Drug Database, Logical Observation Identifiers Names and Codes (LOINC), the Systematized Nomenclature of Medicine (SNOMED), and RxNorm (a standardized nomenclature for clinical drugs). Two HIV experts then manually reviewed DEs from the journal articles and data dictionaries to confirm DE commonality and resolved semantic discrepancies through discussion. Ultimately, we identified 2,179 unique CDEs. Of these, data-driven approaches identified 2,055 (94%): 999 from the HIV/AIDS Glossary, 398 from the Drug Database, 91 from journal articles, and 567 cumulatively from LOINC, SNOMED, and RxNorm. Expert-based approaches identified 124 (6%) unique CDEs from data dictionaries and confirmed the 91 CDEs from journal articles.
In Aim 3, we used the Protégé suite of ontology development tools and the 2,179 CDEs to develop the HERO. We modeled the ontology using the semantic structure of the Medical Entities Dictionary, available hierarchical information from the CDE knowledge resources, and expert knowledge. The ontology fulfilled most relevant criteria from Cimino’s desiderata and OntoClean ontology engineering principles, and it successfully answered eight competency questions.
Finally, for Aim 4, we assessed the feasibility of using HERO to semantically harmonize and integrate the data dictionaries from two diverse HIV-associated datasets. Two HIV experts involved in the development of HERO independently assessed each data dictionary. Of the 367 DEs in data dictionary 1 (D1), 181 (49.32%) were identified as CDEs and 186 (50.68%) were not, and of the 72 DEs in data dictionary 2 (D2), 37 (51.39%) were CDEs and 35 (48.61%) were not. The HIV experts then traversed HERO's hierarchy to map CDEs from D1 and D2 to CDEs in HERO. Of the 181 CDEs in D1, 156 (86.19%) were found in HERO and 25 (13.81%) were not. Similarly, of the 37 CDEs in D2, 32 (86.49%) were found in HERO and 5 (13.51%) were not. Interrater reliability for CDE identification, as measured by Cohen's Kappa, was 0.900 for D1 and 0.892 for D2. Cohen's Kappas for CDEs in D1 and D2 that were also identified in HERO were 0.885 and 0.688, respectively.
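The Cohen's Kappa values reported above are chance-corrected agreement between the two raters; a minimal sketch of the computation, with illustrative labels rather than the study's actual ratings:

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    assert len(rater1) == len(rater2) and rater1
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(c1[k] * c2[k] for k in c1) / (n * n)
    return (observed - expected) / (1 - expected)

# Illustrative ratings of four data elements by two raters:
r1 = ["CDE", "CDE", "CDE", "not CDE"]
r2 = ["CDE", "CDE", "not CDE", "not CDE"]
kappa = cohens_kappa(r1, r2)  # 0.5
```

Kappa near 0.9, as for D1 and D2, indicates near-perfect agreement; the lower 0.688 for D2-to-HERO mapping reflects harder judgment calls.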
Subsequently, to demonstrate the integration of the two HIV-associated datasets, a sample of semantically harmonized CDEs present in both datasets was selected by category (e.g., administrative, demographic, and behavioral), and D2 sample size increases were calculated for race (e.g., White, African American/Black, Asian/Pacific Islander, Native American/Indian, and Hispanic/Latino) and for "intravenous drug use" from the integrated datasets. The average sample size increase across the six selected D2 CDEs was 1,928%.
Despite the limitation of HERO developers also serving as evaluators, the study contributed substantially to the fields of informatics and HIV research. Confirmatory contributions include the identification of effective CDE/ontology tools and the use of data-driven and expert-based methods. Novel contributions include the development of SMASH and HERO, documentation that SH is high in HIV-associated datasets, identification of 2,179 HIV-associated CDEs, creation of two additional classifications of SH, and a demonstration that using HERO for semantic harmonization of HIV-associated data dictionaries is feasible. Our future work will build on this research by expanding the numbers and types of datasets, refining our methods and tools, and conducting an external evaluation.
Cancer
Background: Understanding of cancer outcomes is limited by data fragmentation. We analyzed the information yielded by integrating breast cancer data from three sources: the electronic medical records (EMRs) of two healthcare systems and the state registry.
Methods: We extracted diagnostic test and treatment data from the EMRs of all breast cancer patients treated from 2000–2010 in two independent California institutions: a community-based practice (Palo Alto Medical Foundation) and an academic medical center (Stanford University). We incorporated records from the population-based California Cancer Registry (CCR), and then linked the EMR-CCR datasets of Community and University patients.
Results: We initially identified 8210 University patients and 5770 Community patients; linked datasets revealed a 16% patient overlap, yielding 12,109 unique patients. The proportion of all Community patients, but not University patients, treated at both institutions increased with worsening cancer prognostic factors. Before linking datasets, Community patients appeared to receive less intervention than University patients (mastectomy: 37.6% versus 43.2%; chemotherapy: 35% versus 41.7%; magnetic resonance imaging (MRI): 10% versus 29.3%; genetic testing: 2.5% versus 9.2%). Linked Community and University datasets revealed that patients treated at both institutions received substantially more intervention (mastectomy: 55.8%; chemotherapy: 47.2%; MRI: 38.9%; genetic testing: 10.9%; p<0.001 for each three-way institutional comparison).
Conclusion: Data linkage identified the 16% of patients who were treated in two healthcare systems and who, despite comparable prognostic factors, received far more intensive treatment than others.
By integrating complementary data from EMRs and population-based registries, we obtained a more comprehensive understanding of breast cancer care and factors that drive treatment utilization.
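The overlap figure above follows from simple set arithmetic on linked patient identifiers. A sketch with hypothetical IDs (the real study linked records through the registry, not literal ID strings):

```python
# Hypothetical patient identifiers after linkage; real matching used CCR records.
university = {"P01", "P02", "P03", "P04", "P05"}
community  = {"P04", "P05", "P06", "P07"}

unique_patients = university | community   # union across both systems
overlap = university & community           # patients treated at both institutions
overlap_rate = len(overlap) / len(unique_patients)

# With the study's counts: 8210 + 5770 - 12109 = 1871 overlapping patients,
# and 1871 / 12109 ≈ 15.5%, consistent with the reported 16%.
```

Without linkage, each overlapping patient is counted once per system, which is exactly how the unlinked datasets understated the intensity of care these patients received.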
Simulating drug responses in laboratory test time series with deep generative modeling
Drug effects can be unpredictable and vary widely among patients as a function of environmental, genetic, and clinical factors. Randomized controlled trials (RCTs) are not sufficient to identify adverse drug reactions (ADRs), and electronic health records (EHRs), along with medical claims, have become an important resource for pharmacovigilance. Among all the data collected in hospitals, laboratory tests are the most documented and reliable data type in the EHR. Laboratory tests are at the core of the clinical decision process and are used by physicians for diagnosis, monitoring, screening, and research. They can be linked to drug effects either directly, through therapeutic drug monitoring (TDM), or indirectly through drug laboratory effects (DLEs) that alter surrogate tests. Unfortunately, very few automated methods use laboratory tests to inform clinical decision making and predict drug effects, partly due to the complexity of these time series, which are irregularly sampled, highly dependent on other clinical covariates, and non-stationary.
Deep learning, the branch of machine learning that relies on high-capacity artificial neural networks, has enjoyed renewed popularity over the past decade and has transformed fields such as computer vision and natural language processing. Deep learning holds the promise of better performance than established machine learning models, although it requires larger training datasets owing to its higher number of degrees of freedom. These models are more flexible with multi-modal inputs and can make sense of large numbers of features without extensive engineering. Both qualities make deep learning models ideal candidates for complex, multi-modal, noisy healthcare datasets.
With the development of novel deep learning methods such as generative adversarial networks (GANs), there is an unprecedented opportunity to learn how to augment existing clinical datasets with realistic synthetic data and increase predictive performance. Moreover, GANs have the potential to simulate the effects of individual covariates such as drug exposures by leveraging the properties of implicit generative models.
In this dissertation, I present a body of work that aims to pave the way for next-generation, laboratory test-based clinical decision support systems powered by deep learning. To this end, I organized my experiments around three building blocks: (1) the evaluation of various deep learning architectures on laboratory test time series and their covariates with a forecasting task; (2) the development of implicit generative models of laboratory test time series using the Wasserstein GAN framework; (3) the inference properties of these models for the simulation of drug effects in laboratory test time series, and their application to data augmentation. Each component has its own evaluation: the forecasting task enabled me to explore the properties and performance of different learning architectures; the Wasserstein GAN models are evaluated with both intrinsic metrics and extrinsic tasks, and I always set baselines to avoid reporting results in a "neural-network-only" frame of reference. Applied machine learning, and even more so deep learning, is an empirical science. While the datasets used in this dissertation are not publicly available due to patient privacy regulations, I describe pre-processing steps, hyper-parameter selection, and training processes with reproducibility and transparency in mind.
In the specific context of these studies involving laboratory test time series and their clinical covariates, I found that for supervised tasks, established machine learning holds up well against deep learning methods. Complex recurrent architectures such as long short-term memory (LSTM) networks do not perform well on these short time series, while convolutional neural networks (CNNs) and multi-layer perceptrons (MLPs) provide the best performance, at the cost of extensive hyper-parameter tuning. Generative adversarial networks, enabled by deep learning models, were able to generate high-fidelity laboratory test time series, and the quality of the generated samples improved with conditional models using drug exposures as auxiliary information. Interestingly, forecasting models trained exclusively on synthetic data still retain good performance, confirming the potential of GANs for privacy-oriented applications.
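The Wasserstein distance that gives WGANs their name has a closed form for one-dimensional empirical distributions, which also makes it a convenient intrinsic metric for comparing real and generated samples. A minimal sketch (not the dissertation's actual evaluation code):

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples:
    the mean absolute difference of the sorted values."""
    assert len(xs) == len(ys) and xs
    xs, ys = sorted(xs), sorted(ys)
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)

# Identical samples have distance 0; a uniform shift of +1 has distance 1.
d = wasserstein_1d([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])  # 1.0
```

In the WGAN framework itself, this distance is not computed directly; a critic network approximates it through the Kantorovich-Rubinstein dual, but the 1-D form above is handy for sanity-checking generated laboratory value distributions.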
Finally, conditional GANs demonstrated an ability to interpolate samples from drug exposure combinations not seen during training, opening the way for laboratory test simulation with larger auxiliary information spaces. In specific cases, augmenting real training sets with synthetic data improved performance on the forecasting tasks, an approach that could be extended to other applications where rare cases carry a high prediction error.
Electronic Health Record Phenotyping in Cardiovascular Epidemiology
The secondary use of EHR data for research is a cost-effective resource for a variety of research questions and domains; however, there are many challenges in using electronic health record (EHR) data for epidemiologic research. This dissertation quantified differences in prevalence for acute myocardial infarction (MI) and heart failure (HF) using phenotyping algorithms that differed in the diagnosis position of ICD-10-CM codes and in the inclusion of clinical components. The period of interest was January 1, 2016 to December 31, 2019 for UNC Clinical Data Warehouse for Health data and October 1, 2015 to December 31, 2019 for Atherosclerosis Risk in Communities (ARIC) Study data, the latter used for validation analyses. During the period of interest, 13,200 acute MI cases and 53,545 HF cases were identified in the UNC data. Age-standardized prevalence of acute MI and HF was highest using the Any Diagnosis Position algorithm and lowest using the 1st or 2nd Diagnosis Position with Lab or Procedure algorithm for acute MI and the 1st Diagnosis Position algorithm for HF. Projected differences in healthcare expenditures by algorithm, as well as patient and clinical characteristics such as event severity and mortality, were also estimated. When compared to physician-adjudicated hospitalizations in the ARIC study, the phenotyping algorithms used for the UNC analysis performed well given their simplicity. The algorithm with the highest sensitivity was Any Diagnosis Position for both acute MI and HF, at 75.5% and 70.5%, respectively. Specificity, PPV, and NPV ranged from 80-99% for all algorithms. Requiring clinical components had little effect other than slightly increasing PPV, while restricting diagnosis position to the 1st or 2nd position decreased sensitivity and increased PPV.
The impact of clinical components or diagnosis position did not differ by race, age, or sex subgroups. The results from this dissertation can be used by researchers working with EHR data for a variety of purposes, from informing their own analytic decisions to validating their study findings. The continued use of EHR data for research requires transparency to facilitate reproducibility, as well as studies focused on what we are measuring.
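A diagnosis-position-based phenotyping algorithm of the kind compared above can be sketched as a filter over a hospitalization's ordered code list; the function, its parameters, and the example codes are illustrative assumptions, not the dissertation's actual algorithms:

```python
def meets_phenotype(codes, prefixes, max_position=None):
    """Does a hospitalization meet a position-based phenotyping algorithm?
    `codes` is the ordered ICD-10-CM diagnosis list; `prefixes` are target
    code prefixes (e.g., "I21" for acute MI); `max_position` limits how deep
    in the list a match may occur (None = Any Diagnosis Position)."""
    limit = len(codes) if max_position is None else min(max_position, len(codes))
    return any(codes[i].startswith(p) for i in range(limit) for p in prefixes)

# An acute MI code in 2nd position satisfies Any Diagnosis Position
# but not a 1st-Diagnosis-Position restriction.
codes = ["I50.9", "I21.4"]
any_pos = meets_phenotype(codes, ("I21",))                    # True
first_pos = meets_phenotype(codes, ("I21",), max_position=1)  # False
```

This illustrates the sensitivity/PPV trade-off reported above: loosening the position restriction captures more true cases but admits more hospitalizations where the condition was not the primary reason for admission.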