
    Discovery of Type 2 Diabetes Trajectories from Electronic Health Records

    University of Minnesota Ph.D. dissertation. September 2020. Major: Health Informatics. Advisor: Gyorgy Simon. 1 computer file (PDF); xiii, 110 pages. Type 2 diabetes (T2D) is one of the fastest growing public health concerns in the United States. There were 30.3 million patients (9.4% of the US population) suffering from diabetes in 2015. Diabetes, which is the seventh leading cause of death in the United States, is known to be a non-reversible (incurable) chronic disease, leading to severe complications, including chronic kidney disease, amputation, blindness, and various cardiac and vascular diseases. Early identification of patients at high risk is regarded as the most effective clinical tool to prevent or delay the development of diabetes, allowing patients to change their lifestyle or to receive medication earlier. In turn, these interventions can help decrease the risk of diabetes by 30-60%. Many studies have aimed at the early identification of patients at high risk in clinical settings. These studies typically consider only the patient's current state at the time of the assessment and do not fully utilize all available information, such as the patient's medical history. Past history is important. It has been shown that laboratory results and vital signs can differ between diabetic and non-diabetic patients as many as 15-20 years before the onset of diabetes. We have also shown in our study that the order in which patients develop diabetes-related comorbidities is predictive of their diabetes risk even after adjusting for the severity of the comorbidities. In this thesis, we develop multiple novel methods to discover T2D trajectories from Electronic Health Records (EHR). We define a trajectory as the order in which diseases developed. We aim to discover typical and atypical trajectories, where typical trajectories represent predominant patterns of progression and atypical trajectories refer to the rest of the trajectories.
Revealing trajectories can allow us to divide patients into subpopulations that can uncover the underlying etiology of diabetes. More importantly, by assessing the risk correctly and by better understanding the heterogeneity of diabetes, we can provide better care. Since EHR data pose several challenges to identifying trajectories directly, we devise four specific studies to address them. First, we propose a new knowledge-driven representation for clinical data mining. Second, we demonstrate a method for estimating the onset time of slow-onset diseases from intermittently observable laboratory results in the specific context of T2D. Third, we present a method to infer trajectories, that is, the sequence of comorbidities potentially leading up to a particular disease of interest. Finally, we propose a novel method to discover multiple trajectories from EHR data. The patterns we discovered in these four studies address a clinical issue, are clinically verifiable, and are amenable to deployment in practice to improve the quality of individual patient care towards promoting public health in the United States.
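
The abstract's definition of a trajectory (the order in which diseases developed) can be illustrated with a minimal sketch. This is not the dissertation's method: the function name `disease_trajectory`, the record layout, and the sample diagnoses are all hypothetical, and the sketch assumes the first recorded date stands in for disease onset, which the abstract itself notes is hard to observe for slow-onset diseases.

```python
from datetime import date

def disease_trajectory(diagnoses):
    """Order diseases by the date each was first recorded.

    `diagnoses` is an iterable of (disease_name, record_date) pairs.
    Assumption: the earliest record date approximates onset.
    """
    first_seen = {}
    for disease, when in diagnoses:
        # Keep only the earliest date seen for each disease.
        if disease not in first_seen or when < first_seen[disease]:
            first_seen[disease] = when
    # Sort by date, breaking ties alphabetically for a stable result.
    ordered = sorted(first_seen.items(), key=lambda kv: (kv[1], kv[0]))
    return [disease for disease, _ in ordered]

# Hypothetical records for one patient (not real data).
records = [
    ("hypertension", date(2010, 3, 1)),
    ("obesity", date(2008, 6, 15)),
    ("hyperlipidemia", date(2012, 1, 20)),
    ("hypertension", date(2011, 5, 2)),
    ("T2D", date(2015, 9, 9)),
]

trajectory = disease_trajectory(records)
print(trajectory)
```

Repeated records of the same disease collapse to the earliest one, so each disease appears once in the resulting sequence.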

    Natural Language Processing of Clinical Notes on Chronic Diseases: Systematic Review

    Novel approaches that complement and go beyond evidence-based medicine are required in the domain of chronic diseases, given the growing incidence of such conditions in the worldwide population. A promising avenue is the secondary use of electronic health records (EHRs), where patient data are analyzed to conduct clinical and translational research. Methods based on machine learning to process EHRs are resulting in improved understanding of patient clinical trajectories and chronic disease risk prediction, creating a unique opportunity to derive previously unknown clinical insights. However, a wealth of clinical histories remains locked behind clinical narratives in free-form text. Consequently, unlocking the full potential of EHR data is contingent on the development of natural language processing (NLP) methods to automatically transform clinical text into structured clinical data that can guide clinical decisions and potentially delay or prevent disease onset.

    Adiabatic dynamic causal modelling

    This technical note introduces adiabatic dynamic causal modelling, a method for inferring slow changes in biophysical parameters that control fluctuations of fast neuronal states. The application domain we have in mind is inferring slow changes in variables (e.g., extracellular ion concentrations or synaptic efficacy) that underlie phase transitions in brain activity (e.g., paroxysmal seizure activity). The scheme is efficient and yet retains a biophysical interpretation, in virtue of being based on established neural mass models that are equipped with a slow dynamic on the parameters (such as synaptic rate constants or effective connectivity). In brief, we use an adiabatic approximation to summarise fast fluctuations in hidden neuronal states (and their expression in sensors) in terms of their second order statistics; namely, their complex cross spectra. This allows one to specify and compare models of slowly changing parameters (using Bayesian model reduction) that generate a sequence of empirical cross spectra of electrophysiological recordings. Crucially, we use the slow fluctuations in the spectral power of neuronal activity as empirical priors on changes in synaptic parameters. This introduces a circular causality, in which synaptic parameters underwrite fast neuronal activity that, in turn, induces activity-dependent plasticity in synaptic parameters. In this foundational paper, we describe the underlying model, establish its face validity using simulations and provide an illustrative application to a chemoconvulsant animal model of seizure activity

    Improving Diagnostics with Deep Forest Applied to Electronic Health Records

    An electronic health record (EHR) is a vital high-dimensional part of medical concepts. Discovering implicit correlations in the information of this data set and the research and informative aspects can improve the treatment and management process. The challenge of concern is the data sources’ limitations in finding a stable model to relate medical concepts and use these existing connections. This paper presents Patient Forest, a novel end-to-end approach for learning patient representations from tree-structured data for readmission and mortality prediction tasks. By leveraging statistical features, the proposed model is able to provide an accurate and reliable classifier for predicting readmission and mortality. Experiments on the MIMIC-III and eICU datasets demonstrate that Patient Forest outperforms existing machine learning models, especially when the training data are limited. Additionally, a qualitative evaluation of Patient Forest is conducted by visualising the learnt representations in 2D space using t-SNE, which further confirms the effectiveness of the proposed model in learning EHR representations.

    Machine learning in the social and health sciences

    The uptake of machine learning (ML) approaches in the social and health sciences has been rather slow, and research using ML for social and health research questions remains fragmented. This may be due to the separate development of research in the computational/data versus social and health sciences as well as a lack of accessible overviews and adequate training in ML techniques for non-data-science researchers. This paper provides a meta-mapping of research questions in the social and health sciences to appropriate ML approaches, by incorporating the necessary requirements for statistical analysis in these disciplines. We map the established classification into description, prediction, and causal inference to common research goals, such as estimating the prevalence of adverse health or social outcomes, predicting the risk of an event, and identifying risk factors or causes of adverse outcomes. This meta-mapping aims at overcoming disciplinary barriers and starting a fluid dialogue between researchers from the social and health sciences and methodologically trained researchers. Such mapping may also help to fully exploit the benefits of ML while considering domain-specific aspects relevant to the social and health sciences, and hopefully contribute to accelerating the uptake of ML applications to advance both basic and applied social and health sciences research.

    Monotonic Gaussian Process for Spatio-Temporal Disease Progression Modeling in Brain Imaging Data

    We introduce a probabilistic generative model for disentangling spatio-temporal disease trajectories from series of high-dimensional brain images. The model is based on spatio-temporal matrix factorization, where inference on the sources is constrained by anatomically plausible statistical priors. To model realistic trajectories, the temporal sources are defined as monotonic and time-reparametrized Gaussian Processes. To account for the non-stationarity of brain images, we model the spatial sources as sparse codes convolved at multiple scales. The method was tested on synthetic data, comparing favourably with standard blind source separation approaches. The application to large-scale imaging data from a clinical study allows us to disentangle differential temporal progression patterns mapping brain regions key to neurodegeneration, while revealing a disease-specific time scale associated with the clinical diagnosis.


    Predicting brain age from functional connectivity in symptomatic and preclinical Alzheimer disease

    Brain-predicted age quantifies apparent brain age compared to normative neuroimaging trajectories. Advanced brain-predicted age has been well established in symptomatic Alzheimer disease (AD), but is underexplored in preclinical AD. Prior brain-predicted age studies have typically used structural MRI, but resting-state functional connectivity (FC) remains underexplored. Our model predicted age from FC in 391 cognitively normal, amyloid-negative controls (ages 18-89). We applied the trained model to 145 amyloid-negative, 151 preclinical AD, and 156 symptomatic AD participants to test group differences. The model accurately predicted age in the training set. FC-predicted brain age gaps (FC-BAG) were significantly older in symptomatic AD and significantly younger in preclinical AD compared to controls. There was minimal correspondence between networks predictive of age and AD. Elevated FC-BAG may reflect network disruption during symptomatic AD. Reduced FC-BAG in preclinical AD was opposite to the expected direction, and may reflect a biphasic response to preclinical AD pathology or may be driven by inconsistency between age-related vs. AD-related networks. Overall, FC-predicted brain age may be a sensitive AD biomarker
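
The brain age gap referred to in this abstract is commonly computed as predicted age minus chronological age, so a positive gap means the brain appears older than expected. A minimal sketch of that arithmetic follows; the function names and the ages used are hypothetical illustrations, not values or code from the study.

```python
from statistics import mean

def brain_age_gap(predicted_age, chronological_age):
    """Gap in years; positive means the brain appears older than expected."""
    return predicted_age - chronological_age

def mean_group_gap(pairs):
    """Average gap over (predicted_age, chronological_age) pairs."""
    return mean(brain_age_gap(p, c) for p, c in pairs)

# Hypothetical (predicted, chronological) ages for two participants.
example_group = [(72.5, 70.0), (64.0, 66.0)]

print(brain_age_gap(72.5, 70.0))
print(mean_group_gap(example_group))
```

Comparing the mean gap between groups (for example, controls versus a patient group) is the kind of group-difference test the abstract describes, though the study's actual statistical procedure is not given here.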