48 research outputs found

    Improving Case Definition of Crohnʼs Disease and Ulcerative Colitis in Electronic Medical Records Using Natural Language Processing

    Available in PMC 2014 June 01. Background: Previous studies identifying patients with inflammatory bowel disease using administrative codes have yielded inconsistent results. Our objective was to develop a robust electronic medical record–based model for classification of inflammatory bowel disease leveraging the combination of codified data and information from clinical text notes using natural language processing. Methods: Using the electronic medical records of 2 large academic centers, we created data marts for Crohn’s disease (CD) and ulcerative colitis (UC) comprising patients with ≥1 International Classification of Diseases, 9th edition, code for each disease. We used codified (i.e., International Classification of Diseases, 9th edition codes, electronic prescriptions) and narrative data from clinical notes to develop our classification model. Model development and validation were performed in a training set of 600 randomly selected patients for each disease with medical record review as the gold standard. Logistic regression with the adaptive LASSO penalty was used to select informative variables. Results: We confirmed 399 CD cases (67%) in the CD training set and 378 UC cases (63%) in the UC training set. For both, a combined model including narrative and codified data had better accuracy (area under the curve for CD 0.95; UC 0.94) than models using only disease International Classification of Diseases, 9th edition codes (area under the curve 0.89 for CD; 0.86 for UC). Addition of natural language processing narrative terms to our final model resulted in classification of 6% to 12% more subjects with the same accuracy. Conclusions: Inclusion of narrative concepts identified using natural language processing improves the accuracy of electronic medical records case definition for CD and UC while simultaneously identifying more subjects compared with models using codified data alone.
    Funding: National Institutes of Health (U.S.) (NIH U54-LM008748); American Gastroenterological Association; National Institutes of Health (U.S.) (NIH K08 AR060257); Beth Israel Deaconess Medical Center (Katherine Swan Ginsburg Fund); National Institutes of Health (U.S.) (NIH R01-AR056768); Burroughs Wellcome Fund (Career Award for Medical Scientists); National Institutes of Health (U.S.) (NIH U01-GM092691); National Institutes of Health (U.S.) (NIH R01-AR059648)

    Normalization of Plasma 25-Hydroxy Vitamin D Is Associated with Reduced Risk of Surgery in Crohn’s Disease

    Available in PMC 2014 August 01. Background: Vitamin D may have an immunologic role in Crohn's disease (CD) and ulcerative colitis (UC). Retrospective studies suggested a weak association between vitamin D status and disease activity but have significant limitations. Methods: Using a multi-institution inflammatory bowel disease cohort, we identified all patients with CD and UC who had at least one measured plasma 25-hydroxy vitamin D (25(OH)D). Plasma 25(OH)D was considered sufficient at levels ≥30 ng/mL. Logistic regression models adjusting for potential confounders were used to identify impact of measured plasma 25(OH)D on subsequent risk of inflammatory bowel disease-related surgery or hospitalization. In a subset of patients where multiple measures of 25(OH)D were available, we examined impact of normalization of vitamin D status on study outcomes. Results: Our study included 3217 patients (55% CD; mean age, 49 yr). The median lowest plasma 25(OH)D was 26 ng/mL (interquartile range, 17–35 ng/mL). In CD, on multivariable analysis, plasma 25(OH)D =30 ng/mL. Similar estimates were also seen for UC. Furthermore, patients with CD who had initial levels <30 ng/mL but subsequently normalized their 25(OH)D had a reduced likelihood of surgery (odds ratio, 0.56; 95% confidence interval, 0.32–0.98) compared with those who remained deficient. Conclusion: Low plasma 25(OH)D is associated with increased risk of surgery and hospitalizations in both CD and UC, and normalization of 25(OH)D status is associated with a reduction in the risk of CD-related surgery. © Crohn's & Colitis Foundation of America, Inc.

    Similar Risk of Depression and Anxiety Following Surgery or Hospitalization for Crohn's Disease and Ulcerative Colitis

    OBJECTIVES: Psychiatric comorbidity is common in Crohn's disease (CD) and ulcerative colitis (UC). Inflammatory bowel disease (IBD)-related surgery or hospitalizations represent major events in the natural history of the disease. The objective of this study is to examine whether there is a difference in the risk of psychiatric comorbidity following surgery in CD and UC. METHODS: We used a multi-institution cohort of IBD patients without a diagnosis code for anxiety or depression preceding their IBD-related surgery or hospitalization. Demographic-, disease-, and treatment-related variables were retrieved. Multivariate logistic regression analysis was performed to individually identify risk factors for depression and anxiety. RESULTS: Our study included a total of 707 CD and 530 UC patients who underwent bowel resection surgery and did not have depression before surgery. The risk of depression 5 years after surgery was 16% and 11% in CD and UC patients, respectively. We found no difference in the risk of depression following surgery in the CD and UC patients (adjusted odds ratio, 1.11; 95% confidence interval, 0.84–1.47). Female gender, comorbidity, immunosuppressant use, perianal disease, stoma surgery, and early surgery within 3 years of care predicted depression after CD surgery; only female gender and comorbidity predicted depression in UC patients. Only 12% of the CD cohort had ≥4 risk factors for depression, but among them nearly 44% subsequently received a diagnosis code for depression. CONCLUSIONS: IBD-related surgery or hospitalization is associated with a significant risk for depression and anxiety, with a similar magnitude of risk in both diseases.
    Funding: National Institutes of Health (U.S.) (U54-LM008748)

    Methods to Develop an Electronic Medical Record Phenotype Algorithm to Compare the Risk of Coronary Artery Disease across 3 Chronic Disease Cohorts

    Background: Typically, algorithms to classify phenotypes using electronic medical record (EMR) data were developed to perform well in a specific patient population. There is increasing interest in analyses which can allow study of a specific outcome across different diseases. Such a study in the EMR would require an algorithm that can be applied across different patient populations. Our objectives were: (1) to develop an algorithm that would enable the study of coronary artery disease (CAD) across diverse patient populations; (2) to study the impact of adding narrative data extracted using natural language processing (NLP) in the algorithm. Additionally, we demonstrate how to implement the CAD algorithm to compare risk across 3 chronic diseases in a preliminary study. Methods and Results: We studied 3 established EMR-based patient cohorts: diabetes mellitus (DM, n = 65,099), inflammatory bowel disease (IBD, n = 10,974), and rheumatoid arthritis (RA, n = 4,453) from two large academic centers. We developed a CAD algorithm using NLP in addition to structured data (e.g. ICD9 codes) in the RA cohort and validated it in the DM and IBD cohorts. The CAD algorithm using NLP in addition to structured data achieved specificity >95% with a positive predictive value (PPV) of 90% in the training (RA) and validation sets (IBD and DM). The addition of NLP data improved the sensitivity for all cohorts, classifying an additional 17% of CAD subjects in IBD and 10% in DM while maintaining PPV of 90%. The algorithm classified 16,488 DM (26.1%), 457 IBD (4.2%), and 245 RA (5.0%) with CAD. In a cross-sectional analysis, CAD risk was 63% lower in RA and 68% lower in IBD compared to DM (p<0.0001) after adjusting for traditional cardiovascular risk factors. Conclusions: We developed and validated a CAD algorithm that performed well across diverse patient populations. The addition of NLP into the CAD algorithm improved the sensitivity of the algorithm, particularly in cohorts where the prevalence of CAD was low. Preliminary data suggest that CAD risk was significantly lower in RA and IBD compared to DM.
    Funding: National Institutes of Health (U.S.). Informatics for Integrating Biology and the Bedside Project (U54LM008748)

    Modeling Disease Severity in Multiple Sclerosis Using Electronic Health Records

    Objective: To optimally leverage the scalability and unique features of the electronic health records (EHR) for research that would ultimately improve patient care, we need to accurately identify patients and extract clinically meaningful measures. Using multiple sclerosis (MS) as a proof of principle, we showcased how to leverage routinely collected EHR data to identify patients with a complex neurological disorder and derive an important surrogate measure of disease severity heretofore only available in research settings. Methods: In a cross-sectional observational study, 5,495 MS patients were identified from the EHR systems of two major referral hospitals using an algorithm that includes codified and narrative information extracted using natural language processing. In the subset of patients who receive neurological care at an MS Center where disease measures have been collected, we used routinely collected EHR data to extract two aggregate indicators of MS severity of clinical relevance: multiple sclerosis severity score (MSSS) and brain parenchymal fraction (BPF, a measure of whole brain volume). Results: The EHR algorithm that identifies MS patients has an area under the curve of 0.958, 83% sensitivity, 92% positive predictive value, and 89% negative predictive value when a 95% specificity threshold is used. The correlation between EHR-derived and true MSSS has a mean R² = 0.38 ± 0.05, and that between EHR-derived and true BPF has a mean R² = 0.22 ± 0.08. To illustrate its clinical relevance, derived MSSS captures the expected difference in disease severity between relapsing-remitting and progressive MS patients after adjusting for sex, age of symptom onset and disease duration (p = 1.56×10⁻¹²). Conclusion: Incorporation of sophisticated codified and narrative EHR data accurately identifies MS patients and provides estimation of a well-accepted indicator of MS severity that is widely used in research settings but not part of the routine medical records. Similar approaches could be applied to other complex neurological disorders.
    Funding: National Institute of General Medical Sciences (U.S.) (NIH U54-LM008748)

    Discerning Tumor Status from Unstructured MRI Reports—Completeness of Information in Existing Reports and Utility of Automated Natural Language Processing

    Information in electronic medical records is often in an unstructured free-text format. This format presents challenges for expedient data retrieval and may fail to convey important findings. Natural language processing (NLP) is an emerging technique for rapid and efficient clinical data retrieval. While proven in disease detection, the utility of NLP in discerning disease progression from free-text reports is untested. We aimed to (1) assess whether unstructured radiology reports contained sufficient information for tumor status classification; (2) develop an NLP-based data extraction tool to determine tumor status from unstructured reports; and (3) compare NLP and human tumor status classification outcomes. Consecutive follow-up brain tumor magnetic resonance imaging reports (2000–2007) from a tertiary center were manually annotated using consensus guidelines on tumor status. Reports were randomized to NLP training (70%) or testing (30%) groups. The NLP tool utilized a support vector machines model with statistical and rule-based outcomes. Most reports had sufficient information for tumor status classification, although 0.8% did not describe status despite reference to prior examinations. Tumor size was unreported in 68.7% of documents, while 50.3% lacked data on change magnitude when there was detectable progression or regression. Using retrospective human classification as the gold standard, NLP achieved 80.6% sensitivity and 91.6% specificity for tumor status determination (mean positive predictive value, 82.4%; negative predictive value, 92.0%). In conclusion, most reports contained sufficient information for tumor status determination, though variable features were used to describe status. NLP demonstrated good accuracy for tumor status classification and may have novel application for automated disease status classification from electronic databases.

    A model study of ozone distributions over Europe during the August 2003 heat wave.

    The European summer of 2003 was characterised by intense heat, prolonged isolation and suppressed ventilation of the boundary layer which, combined with large anthropogenic emissions and strong fires, resulted in a build up of an unprecedentedly high and long-lasting photochemical smog over large parts of the continent. In this work, a global chemistry and transport model GEOS-Chem is compared with surface O3 concentrations observed in 2003 in order to examine the extent to which the model is capable of reproducing such an extreme event. The GEOS-Chem reproduces the temporal variation of O3 at the Jungfraujoch mountain site, Switzerland, including the enhanced concentrations associated with the August 2003 heat wave (r = 0.84). The spatial distribution of the enhanced surface O3 over Spain, France, Germany and Italy is also captured to some extent (r = 0.63), although the largest concentrations appear to be located over the Italian Peninsula in the model rather than over Central Europe as suggested by the surface O3 observations. In general, the observed differences between the European averaged O3 concentrations in the summer of 2003 to those in 2004 are larger in the observations than in the model, as the model reproduces relatively well the enhanced levels in 2003 but overestimates those observed in 2004. Preliminary contributions of various sources to the O3 surface concentrations over Europe during the heat wave indicate that anthropogenic emissions from Europe contribute the most to the O3 build up near the surface (40 to 50%, i.e. 30 ppb). The contribution from anthropogenic emissions from the other major source regions of the northern hemisphere, in particular North America, tends to be smaller than those of other years. The model indicates that the large fires that occurred in that year contributed up to 5% (3 ppb) to surface O3 in close proximity to the fire regions and less elsewhere in Europe. 
Biogenic volatile organic compounds (VOCs) emitted by grass and forest areas contributed up to 10% (5–6 ppb) of surface O3 over France, Germany and northern Italy, a contribution twice as large as that found in 2004. These results in terms of contributions from various sources, particularly biogenic emissions, should be seen as preliminary, as the response of vegetation to such extreme events may not be well represented in the model.

    The decision to purchase a bundled cultural pass: The role of pre-existing attitudinal and behavioural relationships with one network member

    This study focuses on passes offered as a bundle of complementary (museum/theatre) and/or substitutable cultural services (several museums). As many cultural institutions are developing defensive loyalty-building policies to retain their clients, this type of initiative introduces new strategies for encouraging multi-loyalty. The study looks at the relationship between loyalty to one network member and intention to purchase a bundled package combining different institutions. It investigates the role of the relationship established with that member as an antecedent of purchasing intention. This relationship is assessed using an attitudinal (satisfaction, trust, commitment) and behavioural (depth, breadth, duration, amount) approach. The main antecedent of clients’ intention to purchase a bundled pass is found to be satisfaction with their current museum. There are significant variations, depending on the type of museum frequented within the network. In high-end museums, intention to purchase a pass is essentially a function of satisfaction, while trust and commitment play a role in low-end museums. In both types the behavioural relationship plays a modest role: neither the amount spent nor the duration or depth of the relationship has an impact on intention to purchase; only breadth plays a positive role, and then only a modest one