138 research outputs found

    Methods to Develop an Electronic Medical Record Phenotype Algorithm to Compare the Risk of Coronary Artery Disease across 3 Chronic Disease Cohorts

    Background Typically, algorithms to classify phenotypes using electronic medical record (EMR) data were developed to perform well in a specific patient population. There is increasing interest in analyses which can allow study of a specific outcome across different diseases. Such a study in the EMR would require an algorithm that can be applied across different patient populations. Our objectives were: (1) to develop an algorithm that would enable the study of coronary artery disease (CAD) across diverse patient populations; (2) to study the impact of adding narrative data extracted using natural language processing (NLP) to the algorithm. Additionally, we demonstrate how to implement the CAD algorithm to compare risk across 3 chronic diseases in a preliminary study. Methods and Results We studied 3 established EMR-based patient cohorts: diabetes mellitus (DM, n = 65,099), inflammatory bowel disease (IBD, n = 10,974), and rheumatoid arthritis (RA, n = 4,453) from two large academic centers. We developed a CAD algorithm using NLP in addition to structured data (e.g. ICD9 codes) in the RA cohort and validated it in the DM and IBD cohorts. The CAD algorithm using NLP in addition to structured data achieved specificity >95% with a positive predictive value (PPV) of 90% in the training (RA) and validation sets (IBD and DM). The addition of NLP data improved the sensitivity for all cohorts, classifying an additional 17% of CAD subjects in IBD and 10% in DM while maintaining a PPV of 90%. The algorithm classified 16,488 DM (26.1%), 457 IBD (4.2%), and 245 RA (5.0%) with CAD. In a cross-sectional analysis, CAD risk was 63% lower in RA and 68% lower in IBD compared to DM (p<0.0001) after adjusting for traditional cardiovascular risk factors. Conclusions We developed and validated a CAD algorithm that performed well across diverse patient populations.
The addition of NLP into the CAD algorithm improved the sensitivity of the algorithm, particularly in cohorts where the prevalence of CAD was low. Preliminary data suggest that CAD risk was significantly lower in RA and IBD compared to DM. National Institutes of Health (U.S.). Informatics for Integrating Biology and the Bedside Project (U54LM008748

    The health and care of children with Down Syndrome

    Down Syndrome (DS) affects ~10,500 children in the UK. Individuals with DS continue to have poorer health outcomes compared with the general population and with other forms of intellectual disability. By systematically mapping two decades of paediatric DS literature, I found a general decline in the number of publications since 2014. The majority of publications utilised observational methodologies, with few interventional (5.6%) or qualitative/mixed-method studies (4.3%). Most publications focused on development & cognition, oncology and neurology; relatively few looked at the prevalence of morbidities and health surveillance. Using a large electronic health record dataset, I determined the prevalence of morbidities among individuals with DS (N=4,648, age range 0-75 years) and compared it with matched controls. The most prevalent morbidities in the DS cohort were hypothyroidism (30.4%), congenital cardiac disease (27.8%), epilepsy (21.9%) and hearing impairment (19.2%). I also found an increased risk of autism (aOR 7.7), chronic kidney disease (aOR 2.3), inflammatory bowel disease (aOR 2.4), non-accidental injury (aOR 1.9), sleep disordered breathing (SDB) (aOR 6.6) and vitamin-D deficiency (aOR 3.1). Finally, I explored current practice with regard to the routine health surveillance of children with DS in paediatric departments across the UK. Sixty-four departments returned a copy of their local health surveillance protocol. Practice was compared across departments and with three national guidelines. For congenital cardiac disease, hypothyroidism and hearing/visual impairment, practice appeared to be consistent and compliant with national guidelines. However, in other areas (echocardiogram at transition, SDB, vitamin-D deficiency & renal/liver function), practice was patchy and inconsistent.
The findings highlight a need for ongoing research in the field of paediatric DS, targeted at areas of greatest need and at those morbidities which are prevalent in the DS cohort. Furthermore, our findings highlight a need for a single, evidence-based guideline for the health surveillance of children with DS, to promote high-quality, consistent care

    Inflammatory bowel disease in the United Kingdom: Epidemiological trends in primary care and associations with contraception

    Background: The epidemiology of inflammatory bowel disease (IBD) in the UK is poorly described. Primary care contraceptive prescribing data published by the NHS are not linked to individual patients. Studies have linked contraceptive pills to the development of IBD. However, there is a paucity of literature on how contraceptive formulation and duration of therapy affect IBD risk. Aims: To describe changes in the incidence and prevalence of IBD in the UK from 2000 to 2018. To describe non-barrier contraceptive prescribing patterns in primary care over the same period. To investigate the associations between exposure to contraception and development of IBD. Methods: Three epidemiological studies using IQVIA™ Medical Research Data: a cohort study examining temporal trends in IBD incidence and prevalence; a repeated cross-sectional study exploring trends in contraceptive prescribing; and a nested case-control study investigating the associations between a range of contraceptives and development of IBD. Results: Overall, the incidence of IBD is falling, but prevalence continues to rise. Some of the highest recorded incidence and prevalence rates globally were observed, with a 94% rise in incidence in adolescents since the year 2000. Over the same period, combined hormonal contraception prescribing has halved whereas progestogen-only pill prescribing has more than doubled. Methods of contraception prescribed by GPs are influenced by social deprivation. Withdrawal of a pay-for-performance incentive may have adversely affected adolescent long-acting reversible contraception uptake. Results suggest that oestrogen-containing contraception is associated with development of IBD whereas progestogen-only methods have minimal to no effect. Conclusion: This thesis provides evidence relating to a wide range of temporal trends in the epidemiology of IBD and patterns of contraceptive prescribing in the UK.
Although previous associations between oral contraceptive pills and IBD have been made, this thesis provides the first epidemiological evidence that oestrogen-containing contraceptives, but not progestogen-only methods, are associated with development of IBD

    Exploring the utility of metabolic profiling in stratifying patient groups in Inflammatory Bowel Disease

    The pathogenesis of IBD, involving dynamic interactions between the microbiome, innate and adaptive immune systems, genetics and environmental factors, is a major focus of academic interest, in order to reveal more about the heterogeneous clinical course of the disease and in pursuit of improved therapeutic targets. Metabonomics has been previously used with a variety of biofluids to successfully distinguish IBD from controls, but the complex metabolic data also have potential to unlock insights into pathogenesis and better understand how to better stratify patients for personalised clinical care. In the largest urinary metabonomics IBD study to date, changes in the white European cohort confirmed previous published findings, highlighting discriminatory metabolites of gut microbial and inflammatory pathway sources. Significant metabolic differences were seen when comparing IBD patients and controls from South Asia to white North Europeans, demonstrating the influence of ethnicity on the metabolic profile and showing metabolite changes related to host-nutrition-microbiome interactions. Results from longitudinal measurements of the IBD metabolome in the same individuals over several years indicate relative stability despite the relapsing-remitting course of the disease and different treatments. This early finding suggests clinical outcomes may only have subtly discernible changes on metabolic profiles, potentially limiting its application as a disease-monitoring tool. 16S rRNA profiling, employed to characterise the microbiome, showed reduced microbial diversity in IBD and 4 key bacterial genera - Veillonella, Acidaminococcus, Lactobacillus and Streptococcus - associated with disease. Significant urinary and faecal metabolites in the same patients were correlated with these bacteria to demonstrate the feasibility of multi-omic integration in IBD. 
Furthermore, the breath VOC profiles of IBD patients obtained by SIFT-MS were distinct from those of healthy controls, with the significant compounds originating from microbial sources and inflammatory pathways, demonstrating the potential of this technology and another facet to metabolic profiling in IBD.

    Group-based trajectory modeling for longitudinal data of healthcare financial charges in patients with inflammatory bowel disease

    Inflammatory bowel disease (IBD) is a heterogeneous group of lifelong chronic inflammatory diseases with variable and unpredictable disease courses which often require significant healthcare expenditures. There exists no uniform severity measure to capture the activity and the healthcare utilization of the disease. This study seeks to identify disease trajectories for the IBD patients based on their annual financial healthcare charges over time. We performed a longitudinal study of annual financial charges using a consented, prospective, natural history registry of 2,400 IBD patients at the University of Pittsburgh Medical Center from 2009 to 2013. The annual charges were calculated as the sum of inpatient admission charges and professional service charges, with (ChargeF) or without (ChargeR) biological medicine charges. Patients who completed a five-year follow-up were included in the study. The continuous financial charges were first categorized into sections of different price range, and then the data was fitted with a latent group-based zero-inflated Poisson model to identify different homogeneous trajectory patterns of financial charges. We identified six distinct trajectory groups of total annual charges obtained from each of the two calculation methods (ChargeF and ChargeR). We further compared between these trajectories for patient characteristics, disease activity indices (Harvey-Bradshaw Index and ulcerative colitis activity index), disease activity markers (high-sensitivity C-reactive protein and erythrocyte sedimentation rate), health-related quality of life index (short inflammatory bowel disease questionnaire, SIBDQ), healthcare utilization (emergency department, hospitalization, and surgery), and corticosteroid prescriptions. We concluded that the healthcare financial charge could be a novel and uniform metric to evaluate the disease severity and the response of IBD patients to treatments. 
The present study is the first of its kind using latent group-based trajectory modeling of financial charges to identify distinct subsets of IBD patients with their response to treatments. The model could be used to determine the genetic, environmental, and other factors that influence disease severity and the patient’s response to medical therapies. It will provide important information for the development of personalized or precision medical interventions for IBD patients and the reduction of their health care cost. Public Health Relevance: This study proposed a new metric which could be an accurate reflection of classic disease activity parameters, biochemical markers of inflammation, disease activity indices, and health-related quality of life in a cohort of patients with inflammatory bowel disease. The model developed would be of great significance to exploring the risk factors that influence the response to medical interventions. It will provide important information for the development of personalized or precision medical interventions for patients with inflammatory bowel disease and the reduction of their health care cost

    Epidemiology and natural history of paediatric-onset inflammatory bowel disease in Scotland

    Background Inflammatory bowel disease (IBD) is a chronic lifelong condition which comprises Crohn’s disease (CD), ulcerative colitis (UC) and inflammatory bowel disease unclassified (IBDU). Around 8% of IBD cases diagnosed each year present in childhood (under 18 years)(1) and can cause impairment of linear growth and pubertal development, affecting education and future employment. The incidence of paediatric IBD (PIBD) is increasing both within Scotland, as evidenced by previous publications, and worldwide, as demonstrated in a recent systematic review. As the number of cases is increasing, it has become critical that effective treatments are available to manage symptoms in this patient cohort. Anti-TNF alpha antagonists have been used to treat PIBD and shown in large single-centre studies to be effective at the induction and maintenance of remission; however, these studies may not reflect the general PIBD patients that clinicians treat daily, so “real-life” experience is needed to inform clinical practice. Aims The aims of my thesis were 1) to determine if the incidence and prevalence of PIBD continue to increase worldwide and to examine the durability of any incidence rise seen in Scotland, 2) to investigate in a nationwide population-based study the incidence and natural history of IBDU, 3) to examine the efficacy, safety and long-term effects of anti-TNF alpha drugs and lastly 4) to assess the long-term risk of PIBD on cancer and mortality rates in a nationwide population-based study. Methods Data were collected from all 4 PIBD centres across Scotland (Glasgow, Edinburgh, Aberdeen and Dundee) from 2009-2014 on new cases of IBD, as well as on those diagnosed with IBDU from 2003-2013, those treated with anti-TNF drugs from 2000-2012 and cases of cancer/deaths within the PIBD population from 2003-2013. Results Thirty-six studies from 18 countries were included in the incidence systematic review, most from North America and Western Europe.
The highest incidence was 15.2 per 100,000 in Nova Scotia, Canada, with the lowest 0.47 per 100,000 in Saudi Arabia. For CD the highest incidence was 9.2 per 100,000 in Nova Scotia and the lowest was in Saudi Arabia at 0.27 per 100,000, whilst for UC rates were highest in Finland at 8 per 100,000 and lowest in Saudi Arabia at 0.2 per 100,000. In the prevalence systematic review, 27 studies were included from 12 countries, with the highest prevalence of 301 per 100,000 in Israel and the lowest in Libya at 3.6 per 100,000. CD was highest in Sweden at 41 per 100,000 and lowest in Libya at 2.0 per 100,000, which was similar for UC with a high of 30.7 per 100,000 in Sweden and a low of 1.36 in Libya. Most studies that reported on temporal trends saw an increase in PIBD, CD and UC. Significant heterogeneity existed in studies of both incidence and prevalence due to varying methodological approaches, age cut-offs and diagnostic algorithms, so meta-analysis was not performed. The incidence of PIBD in Scotland demonstrated a significant and sustained rise from 430 cases in 2003-2008 with an incidence rate of 7.6 per 100,000 (95% CI 7.1-8.6) to 582 cases in 2009-2014 and an incidence of 10.6 per 100,000 (95% CI 9.8-11.5) (p<0.001), primarily due to an increase in paediatric Crohn’s disease. When compared with historical data there was a sustained and durable increase over the last 40 years, again mostly driven by increasing CD. The incidence of IBDU also increased from 2003-2013, accounting for around 20% of new PIBD cases in Scotland. Most children with this subtype had a relatively mild disease course; however, 43% required immunosuppression and a small number escalated to anti-TNF therapy. 23% of IBDU patients had their diagnosis changed after endoscopic re-evaluation, most (62%) to CD. A Scottish nationwide registry of all children treated with anti-TNF drugs (infliximab (IFX) and adalimumab (ADA)) from 2000-2012 was created.
87% had improvement of their symptoms within 3 months post induction with IFX and 86% achieved remission with ADA. Growth was improved after one year of treatment with IFX but only in those children who responded after induction, had been diagnosed for over 2 years with IBD and were in the early stages of puberty (Tanner stage 1 and 2). Anti-TNF agents were generally safe and well tolerated, with only 13% having an acute adverse reaction to IFX; ADA was also well tolerated, with 16/57 having an adverse event. Death in children with PIBD was a rare occurrence, with only 3 cases over 10 years, 2 of which were PIBD related. Two cases of malignancy were observed; both patients had been treated with azathioprine, with one subsequent death. Conclusions The incidence and prevalence of PIBD are increasing worldwide, with the highest incidence rates from Nova Scotia, Canada and the highest prevalence rates from Israel, although there is a preponderance of data from North America and Western Europe. In these population-based studies of paediatric-onset inflammatory bowel disease in Scotland, the number of new cases continues to rise, with IBDU, as a subtype of IBD, more commonly diagnosed compared to other countries. Most children with IBDU had a mild disease course, with 23% changing diagnosis following endoscopic reassessment, most (62%) to CD. In Scotland, anti-TNF drugs are effective at managing symptoms of IBD with relatively few serious side effects, with other benefits including improved linear growth in those treated with infliximab. Finally, cancer and death are rare outcomes in children with IBD in Scotland. The continued increase in incidence of PIBD, with higher rates of IBDU observed in Scotland, may suggest environmental factors, such as urbanization or latitude, influencing the onset of PIBD. Prospective case-control studies can further explore these environmental risk factors, taking advantage of the nationwide collaborative approach to care and research within Scotland

    Repeatable and reusable research - Exploring the needs of users for a Data Portal for Disease Phenotyping

    Background: Big data research in the field of health sciences is hindered by a lack of agreement on how to identify and define different conditions and their medications. This means that researchers and health professionals often have different phenotype definitions for the same condition. This lack of agreement makes it hard to compare different study findings and hinders the ability to conduct repeatable and reusable research. Objective: This thesis aims to examine the requirements of various users, such as researchers, clinicians, machine learning experts, and managers, for both new and existing data portals for phenotypes (concept libraries). Methods: Exploratory sequential mixed methods were used in this thesis to look at which concept libraries are available, how they are used, what their characteristics are, where there are gaps, and what needs to be done in the future from the point of view of the people who use them. This thesis consists of three phases: 1) two qualitative studies, including one-to-one interviews with researchers, clinicians, machine learning experts, and senior research managers in health data science, as well as focus group discussions with researchers working with the Secured Anonymized Information Linkage databank, 2) the creation of an email survey (i.e., the Concept Library Usability Scale), and 3) a quantitative study with researchers, health professionals, and clinicians. Results: Most of the participants thought that the prototype concept library would be a very helpful resource for conducting repeatable research, but they specified that many requirements are needed before its development. Although all the participants stated that they were aware of some existing concept libraries, most of them expressed negative perceptions about them. 
The participants mentioned several facilitators that would encourage them to: 1) share their work, such as receiving citations from other researchers; and 2) reuse the work of others, such as saving a lot of time and effort, which they frequently spend on creating new code lists from scratch. They also pointed out several barriers that could inhibit them from: 1) sharing their work, such as concerns about intellectual property (e.g., if they shared their methods before publication, other researchers would use them as their own); and 2) reusing others' work, such as a lack of confidence in the quality and validity of their code lists. Participants suggested some developments that they would like to see happen in order to make research that is done with routine data more reproducible, such as the availability of a drive for more transparency in research methods documentation, such as publishing complete phenotype definitions and clear code lists. Conclusions: The findings of this thesis indicated that most participants valued a concept library for phenotypes. However, only half of the participants felt that they would contribute by providing definitions for the concept library, and they reported many barriers regarding sharing their work on a publicly accessible platform such as the CALIBER research platform. Analysis of interviews, focus group discussions, and qualitative studies revealed that different users have different requirements, facilitators, barriers, and concerns about concept libraries. This work was to investigate if we should develop concept libraries in Kuwait to facilitate the development of improved data sharing. However, at the end of this thesis the recommendation is this would be unlikely to be cost effective or highly valued by users and investment in open access research publications may be of more value to the Kuwait research/academic community

    Paediatric inflammatory bowel disease - bench to bedside and nationwide : a detailed analysis of Scottish children with IBD

    The inflammatory bowel diseases (IBDs) are a group of chronic conditions affecting the gastrointestinal tract, often presenting with non-specific clinical features such as abdominal pain, weight loss and diarrhoea. Approximately 25% of patients are diagnosed with IBD in childhood. For epidemiological studies, previously collected (1990-1995) and original (2003-2008) Scottish incidence data were used to determine national trends in newly diagnosed paediatric IBD (PIBD). A smaller, geographically defined, prospective 14-year cohort (1997-2011) in South-East Scotland (SES) was used to assess regional trends in incidence, point prevalence, disease extent, medication use and PIBD surgery rates in 326 children. For the detailed analysis of the role of ICOSLG and CRP in Scottish children with PIBD, haplotype-tagging of both genes in 448 children (and their parents) registered on the Paediatric Inflammatory bowel disease Cohort and Treatment Study (PICTS) database was performed. Further clinical information from this database and previously gathered adult mRNA microarray data were also used to inform the analysis. For the faecal calprotectin (FC) case-control study, all PIBD patients diagnosed in SES between 01.01.05 and 31.12.10 (aged 1-17 yrs) with a FC performed during initial workup were identified; controls were matched non-IBD patients who had similarly undergone endoscopy with a referral FC level available. The systematic review and meta-analysis of FC case-control studies was performed with keywords relating to IBD and calprotectin in electronic resources from 1946 to May 2012. Inclusion criteria were studies that reported FC levels prior to the endoscopic investigation of IBD in children less than 18 years old.
Laboratory work used newly derived HEK293 and HCT116 cell lines stably expressing wild-type NOD2 and the CD-associated NOD2 frameshift mutant, as well as utilising previously derived HEK293 and HCT116 cells stably expressing green fluorescent-labelled protein LC3 during the assessment of autophagy. Western blot, immunofluorescent microscopy and flow cytometry were used for analysis. There was a significant rise in PIBD incidence in Scotland since the early 1990s, with 260 new cases between 1990-1995 (4.45/100,000/year) and 436 in the 2003-2008 epoch (7.82/100,000/year) (p<0.001). A five-fold increase in Crohn's disease (CD) in the last 40 years was also demonstrated. SES was shown to have the highest recorded PIBD incidence rate in the UK for the six-year epoch from 2006-2011 (9.50/100,000/year) with a significant rise in ulcerative colitis (UC) to 2.67/100,000/year (p=0.010). Point prevalence rates for PIBD in SES had also risen significantly to 41.2/100,000 between the 2000-2005 and 2006-2011 epochs (p=0.016). With a follow up of 1577 patient years, the severe phenotype in children with PIBD was confirmed; 34% of children with CD presented with pan-enteric disease (44% at follow up), and 76% of children with UC had pancolonic disease at diagnosis (81% at follow up). 26% of patients required methotrexate and 18% were exposed to infliximab/adalimumab, with the time to first exposure of both significantly lower in children diagnosed between 2006-2011 (p=0.001 and p<0.001 respectively). A total of 70% of children were exposed to azathioprine and 20% underwent IBD-related surgery. Using a haplotype-tagging approach and transmission disequilibrium testing (TDT) in 230 PIBD case-parent trios there was significant overtransmission of the rs8126734-A single nucleotide polymorphism (SNP) in ICOSLG following correction (p=0.0467). In the CD TDT analysis the same SNP was overtransmitted (p=0.0084). 
The strongest susceptibility signal was evident across the two-marker haplotype rs762421-A / rs8126734-G (p=0.0072), suggesting that the 3-prime untranslated region in ICOSLG may be targeted for deep sequencing. mRNA microarray data from adult patients showed downregulation of ICOSLG expression in the ascending colon (p=0.023) and upregulation in the descending colon (p=0.0351) in uninflamed biopsies from CD patients and non-IBD controls; no difference in gene expression was shown in UC patients. Using a similar approach, the A allele of two SNPs tagging CRP showed significant over-transmission to affected IBD patients after correction (rs1417938, p=0.006; rs1130864, p=0.015). The six-marker haplotype (ACACAC) showed significant distortion of transmission to affected individuals (p=8×10⁻⁴). CD and UC patients demonstrated differences in rs1205 genotype (p=0.0085) and CRP haplotype (p=0.0024), with the influence of the rs1205 SNP on response to anti-tumour necrosis factor-alpha therapy also shown (p=0.021). During the FC case-control study significantly elevated FC levels at diagnosis were demonstrated compared to controls (1265 μg/g vs 65 μg/g; p<0.001). FC also outperformed commonly used blood parameters (e.g. CRP, ESR, platelets), with an area under the curve of 0.93 (95% CI 0.89-0.97) and good sensitivity (0.93 [95% CI 0.86-0.98]) and specificity (0.74 [95% CI 0.64-0.82]) when values above 200 μg/g were used. FC levels were not influenced by disease location in CD or UC. The systematic review and meta-analysis highlighted the often poor methodological quality of previous studies and concluded that across all studies FC had a pooled sensitivity of 0.98 (95% CI 0.95-1.00) and pooled specificity of 0.68 (95% CI 0.50-0.86) for PIBD at diagnosis.
Characterisation of cells stably-expressing wild-type NOD2 or the CD-associated NOD2 frameshift mutation demonstrated increased cell proliferation compared to empty vector, and an accentuated apoptotic response to serum starvation. The NOD2 frameshift protein had a shorter half-life (at 11 hours) than the wild-type protein, with degradation of the NOD2 protein shown to be mediated through a proteasome-dependent pathway, possibly through lysine residues on the CARD domain. Following the establishment of a robust method of assessing autophagy in a cell culture system, experimental work showed that muramyl dipeptide-induced autophagy is unlikely to signal through the mammalian target of rapamycin, with the intermediate filament vimentin shown to be intimately involved in this pathway; the vimentin gene (Vim) was also shown to be a candidate susceptibility gene for CD. Using a panel of PIBD drugs azathioprine was shown to induce autophagy in a dose-dependent manner through an mTOR-dependent, ERK-independent pathway. It can be seen that with the increasing incidence and prevalence of PIBD in Scotland that a greater understanding of epidemiological trends, the role of genetic susceptibility, the optimal use of biomarkers and translational functional biology are all needed to understand further the aetiopathogenesis of PIBD. This future work will undoubtedly help to inform service design and the clinical care pathways utilised to provide the best care for children in addition to targeting pathways for potential drug development, with these measures helping to prepare for the increasing disease burden generated by PIBD

    Deep Risk Prediction and Embedding of Patient Data: Application to Acute Gastrointestinal Bleeding

    Acute gastrointestinal bleeding is a common and costly condition, accounting for over 2.2 million hospital days and 19.2 billion dollars of medical charges annually. Risk stratification is a critical part of initial assessment of patients with acute gastrointestinal bleeding. Although all national and international guidelines recommend the use of risk-assessment scoring systems, they are not commonly used in practice, have sub-optimal performance, may be applied incorrectly, and are not easily updated. With the advent of widespread electronic health record adoption, longitudinal clinical data captured during the clinical encounter is now available. However, this data is often noisy, sparse, and heterogeneous. Unsupervised machine learning algorithms may be able to identify structure within electronic health record data while accounting for key issues with the data generation process: measurements missing-not-at-random and information captured in unstructured clinical note text. Deep learning tools can create electronic health record-based models that perform better than clinical risk scores for gastrointestinal bleeding and are well-suited for learning from new data. Furthermore, these models can be used to predict risk trajectories over time, leveraging the longitudinal nature of the electronic health record. The foundation of creating relevant tools is the definition of a relevant outcome measure; in acute gastrointestinal bleeding, a composite outcome of red blood cell transfusion, hemostatic intervention, and all-cause 30-day mortality is a relevant, actionable outcome that reflects the need for hospital-based intervention. However, epidemiological trends may affect the relevance and effectiveness of the outcome measure when applied across multiple settings and patient populations. 
Understanding the trends in practice, potential areas of disparities, and value proposition for using risk stratification in patients presenting to the Emergency Department with acute gastrointestinal bleeding is important in understanding how to best implement a robust, generalizable risk stratification tool. Key findings include a decrease in the rate of red blood cell transfusion since 2014 and disparities in access to upper endoscopy for patients with upper gastrointestinal bleeding by race/ethnicity across urban and rural hospitals. Projected accumulated savings of consistent implementation of risk stratification tools for upper gastrointestinal bleeding total approximately $1 billion 5 years after implementation. Most current risk scores were designed for use based on the location of the bleeding source: upper or lower gastrointestinal tract. However, the location of the bleeding source is not always clear at presentation. I develop and validate electronic health record based deep learning and machine learning tools for patients presenting with symptoms of acute gastrointestinal bleeding (e.g., hematemesis, melena, hematochezia), which is more relevant and useful in clinical practice. I show that they outperform leading clinical risk scores for upper and lower gastrointestinal bleeding, the Glasgow Blatchford Score and the Oakland score. While the best performing gradient boosted decision tree model has equivalent overall performance to the fully connected feedforward neural network model, at the very low risk threshold of 99% sensitivity the deep learning model identifies more very low risk patients. Using another deep learning model that can model longitudinal risk, the long-short-term memory recurrent neural network, need for transfusion of red blood cells can be predicted at every 4-hour interval in the first 24 hours of intensive care unit stay for high risk patients with acute gastrointestinal bleeding. 
Finally, for implementation it is important to find patients with symptoms of acute gastrointestinal bleeding in real time and characterize patients by risk using available data in the electronic health record. A decision rule-based electronic health record phenotype has equivalent performance as measured by positive predictive value compared to deep learning and natural language processing-based models, and after live implementation appears to have increased the use of the Acute Gastrointestinal Bleeding Clinical Care pathway. Patients with acute gastrointestinal bleeding but with other groups of disease concepts can be differentiated by directly mapping unstructured clinical text to a common ontology and treating the vector of concepts as signals on a knowledge graph; these patients can be differentiated using unbalanced diffusion earth mover’s distances on the graph. For electronic health record data with data missing not at random, MURAL, an unsupervised random forest-based method, handles data with missing values and generates visualizations that characterize patients with gastrointestinal bleeding. This thesis forms a basis for understanding the potential for machine learning and deep learning tools to characterize risk for patients with acute gastrointestinal bleeding. In the future, these tools may be critical in implementing integrated risk assessment to keep low risk patients out of the hospital and guide resuscitation and timely endoscopic procedures for patients at higher risk for clinical decompensation