209 research outputs found

    Automatic Prediction of Recurrence of Major Cardiovascular Events: A Text Mining Study Using Chest X-Ray Reports

    Background and Objective. Electronic health records (EHRs) contain free-text information on symptoms, diagnosis, treatment, and prognosis of diseases. However, this potential goldmine of health information cannot be easily accessed and used unless proper text mining techniques are applied. The aim of this project was to develop and evaluate a text mining pipeline in a multimodal learning architecture to demonstrate the value of medical text classification in chest radiograph reports for cardiovascular risk prediction. We sought to assess the integration of various text representation approaches and clinical structured data with state-of-the-art deep learning methods in the process of medical text mining. Methods. We used EHR data of patients included in the Second Manifestations of ARTerial disease (SMART) study. We propose a deep learning-based multimodal architecture for our text mining pipeline that integrates neural text representation with preprocessed clinical predictors for the prediction of recurrence of major cardiovascular events in cardiovascular patients. Text preprocessing, including cleaning and stemming, was first applied to filter out the unwanted texts from X-ray radiology reports. Thereafter, text representation methods were used to numerically represent unstructured radiology reports with vectors. Subsequently, these text representation methods were added to prediction models to assess their clinical relevance. In this step, we applied logistic regression, support vector machine (SVM), multilayer perceptron neural network, convolutional neural network, long short-term memory (LSTM), and bidirectional LSTM deep neural network (BiLSTM). Results. We performed various experiments to evaluate the added value of the text in the prediction of major cardiovascular events. The two main scenarios were the integration of radiology reports (1) with classical clinical predictors and (2) with only age and sex in the case of unavailable clinical predictors. 
In total, data of 5603 patients were used with 5-fold cross-validation to train the models. In the first scenario, the multimodal BiLSTM (MI-BiLSTM) model achieved an area under the curve (AUC) of 84.7%, a misclassification rate of 14.3%, and an F1 score of 83.8%. In this scenario, the SVM model, trained on clinical variables and a bag-of-words representation, achieved the lowest misclassification rate of 12.2%. In the case of unavailable clinical predictors, the MI-BiLSTM model trained on radiology reports and demographic (age and sex) variables reached an AUC, F1 score, and misclassification rate of 74.5%, 70.8%, and 20.4%, respectively. Conclusions. Using the case study of routine care chest X-ray radiology reports, we demonstrated the clinical relevance of integrating text features and classical predictors in our text mining pipeline for cardiovascular risk prediction. The MI-BiLSTM model with word embedding representation appeared to have a desirable performance when trained on text data integrated with the clinical variables from the SMART study. Our results mined from chest X-ray reports showed that models using text data in addition to laboratory values outperform those using only known clinical predictors.
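
The abstract above reports three evaluation metrics: AUC, misclassification rate, and F1 score. As a rough illustration of how such figures can be computed for a binary classifier, here is a minimal sketch in plain Python. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for this example; the abstract does not say how the authors computed or averaged their metrics, so this should not be read as their procedure.

```python
def misclassification_rate(y_true, y_pred):
    """Fraction of predictions that differ from the true labels."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. This pairwise form is quadratic in the number
    of samples, which is acceptable for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(misclassification_rate(y_true, y_pred))
print(f1_score(y_true, y_pred))
print(auc(y_true, scores))
```

Note that AUC is computed from the continuous scores, while the misclassification rate and F1 score depend on the chosen threshold, which is one reason the ranking of models can differ between metrics, as it does in the abstract.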

    Evaluating a cardiovascular disease risk management care continuum within a learning healthcare system: a prospective cohort study

    Background: Many patients now present with multimorbidity and chronicity of disease. This means that multidisciplinary management in a care continuum, integrating primary care and hospital care services, is needed to ensure high quality care. Aim: To evaluate cardiovascular risk management (CVRM) via linkage of health data sources, as an example of a multidisciplinary continuum within a learning healthcare system (LHS). Design & setting: In this prospective cohort study, data were linked from the Utrecht Cardiovascular Cohort (UCC) to the Julius General Practitioners' Network (JGPN) database. UCC offers structured CVRM at referral to the University Medical Centre (UMC) Utrecht. JGPN consists of electronic health record (EHR) data from referring GPs. Method: The cardiovascular risk factors were extracted for each patient 13 months before referral (JGPN), at UCC inclusion, and during 12 months follow-up (JGPN). The following areas were assessed: registration of risk factors; detection of risk factor(s) requiring treatment at UCC; communication of risk factors and actionable suggestions from the specialist to the GP; and change of management during follow-up. Results: In 52% of patients, >1 risk factors were registered (that is, extractable from structured fields within routine care health records) before UCC. In 12%-72% of patients, risk factor(s) existed that required (change or start of) treatment at UCC inclusion. Specialist communication included the complete risk profile in 67% of letters, but lacked actionable suggestions in 86%. In 29% of patients, at least one risk factor was registered after UCC. Change in management in GP records was seen in 21%-58% of them. Conclusion: Evaluation of a multidisciplinary LHS is possible via linkage of health data sources. Efforts have to be made to improve registration in primary care, as well as communication on findings and actionable suggestions for follow-up to bridge the gap in the CVRM continuum.

    Low-Density Lipoprotein Cholesterol Target Attainment in Patients With Established Cardiovascular Disease: Analysis of Routine Care Data

    BACKGROUND: Direct feedback on quality of care is one of the key features of a learning health care system (LHS), enabling health care professionals to improve upon the routine clinical care of their patients during practice. OBJECTIVE: This study aimed to evaluate the potential of routine care data extracted from electronic health records (EHRs) in order to obtain reliable information on low-density lipoprotein cholesterol (LDL-c) management in cardiovascular disease (CVD) patients referred to a tertiary care center. METHODS: We extracted all LDL-c measurements from the EHRs of patients with a history of CVD referred to the University Medical Center Utrecht. We assessed LDL-c target attainment at the time of referral and per year. In patients with multiple measurements, we analyzed LDL-c trajectories, truncated at 6 follow-up measurements. Lastly, we performed a logistic regression analysis to investigate factors associated with improvement of LDL-c at the next measurement. RESULTS: Between February 2003 and December 2017, 250,749 LDL-c measurements were taken from 95,795 patients, of whom 23,932 had a history of CVD. At the time of referral, 51% of patients had not reached their LDL-c target. A large proportion of patients (55%) had no follow-up LDL-c measurements. Most of the patients with repeated measurements showed no change in LDL-c levels over time: the transition probability to remain in the same category was up to 0.84. Sequence clustering analysis showed more women (odds ratio 1.18, 95% CI 1.07-1.10) in the cluster with both most measurements off target and the most LDL-c measurements furthest from the target. Timing of drug prescription was difficult to determine from our data, limiting the interpretation of results regarding medication management. CONCLUSIONS: Routine care data can be used to provide feedback on quality of care, such as LDL-c target attainment. 
These routine care data show high off-target prevalence and little change in LDL-c over time. Registrations of diagnosis; follow-up trajectory, including primary and secondary care; and medication use need to be improved in order to enhance usability of the EHR system for adequate feedback.
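
The abstract above mentions a transition probability of remaining in the same LDL-c category, with trajectories truncated at 6 follow-up measurements. One common way to estimate such probabilities is to count consecutive category pairs within each patient's sequence. The sketch below does this in plain Python. The category labels ("on" and "off" target), the toy sequences, and the reading of "6 follow-up measurements" as a baseline value plus up to six later ones are assumptions for illustration; the abstract does not specify the authors' categories or estimation method.

```python
from collections import Counter, defaultdict

# Assumption: a trajectory is the baseline measurement plus up to 6 follow-ups.
MAX_FOLLOW_UP = 6


def transition_probabilities(sequences, max_follow_up=MAX_FOLLOW_UP):
    """Estimate first-order transition probabilities between categories.

    `sequences` is an iterable of per-patient category lists in time order.
    Returns a nested dict: probs[current][next] = estimated probability.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        truncated = seq[: max_follow_up + 1]  # baseline + follow-ups
        # Count each consecutive (current, next) pair in the trajectory.
        for current, nxt in zip(truncated, truncated[1:]):
            counts[current][nxt] += 1

    probs = {}
    for state, next_counts in counts.items():
        total = sum(next_counts.values())
        probs[state] = {nxt: n / total for nxt, n in next_counts.items()}
    return probs


# Hypothetical toy trajectories, not taken from the study.
toy_sequences = [
    ["off", "off", "off"],
    ["off", "on", "on"],
    ["on", "on"],
    ["off"],  # a single measurement contributes no transitions
]

for state, row in transition_probabilities(toy_sequences).items():
    print(state, row)
```

Patients with a single measurement contribute nothing to the estimate, which matters here because the abstract reports that 55% of patients had no follow-up LDL-c measurement.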

    Recognition of Regional Water Table Patterns for Estimating Recharge Rates in Shallow Aquifers

    We propose a new method for groundwater recharge rate estimation in regions with stream-aquifer interactions, at a linear scale on the order of 10 km and more. The method is based on visual identification and quantification of classically recognized water table contour patterns. Simple quantitative analysis of these patterns can be done manually from measurements on a map, or from more complex GIS data extraction and curve fitting. Recharge rate is then estimated from the groundwater table contour parameters, streambed gradients, and aquifer transmissivity using an analytical model for groundwater flow between parallel perennial streams. Recharge estimates were obtained in three regions (areas of 1500, 2200, and 3300 km²) using available water table maps produced by different methods at different times in the area of the High Plains Aquifer in Nebraska. One region is located in the largely undeveloped Nebraska Sand Hills area, while the other two regions are located at a transition zone from the Sand Hills to a loess-covered area and include areas where groundwater is used for irrigation. The recharge rates obtained are consistent with other independent estimates. The approach is a useful and robust diagnostic tool for preliminary estimates of recharge rates, evaluation of the quality of groundwater table maps, identification of priority areas for further aquifer characterization, and expansion of groundwater monitoring networks prior to using more detailed methods. Includes supplemental materials.