
    Time Series Modeling of Irregularly Sampled Multivariate Clinical Data

    Building an accurate predictive model of clinical time series for a patient is critical for understanding the patient's condition, its dynamics, and optimal patient management. Unfortunately, this process is challenging because of: (1) multivariate behavior: real-world dynamics are multivariate and are better described by multivariate time series (MTS); (2) irregular sampling: sequential observations are collected at different times, and the time elapsed between two consecutive observations may vary; and (3) patient variability: clinical MTS vary from patient to patient, and an individual patient may exhibit short-term variability reflecting the different events affecting the care and patient state. In this dissertation, we investigate different ways of developing and refining forecasting models from irregularly sampled clinical MTS data. First, we focus on refinements of a popular model for MTS analysis, the linear dynamical system (LDS) (a.k.a. the Kalman filter), and its application to MTS forecasting. We propose (1) a regularized LDS learning framework which automatically shuts down the LDS's spurious and unnecessary dimensions and consequently prevents overfitting when data are scarce; and (2) a generalized LDS learning framework via matrix factorization, which allows various constraints to be easily incorporated to guide the learning process. Second, we study ways of modeling irregularly sampled univariate clinical time series. We develop a new two-layer hierarchical dynamical system model for irregularly sampled clinical time series prediction. We demonstrate that our new system adapts better to irregular samples and supports more accurate predictions. Finally, we propose, develop and experiment with two personalized forecasting frameworks for modeling and predicting the clinical MTS of an individual patient. The first approach relies on model adaptation techniques. 
It calibrates the population-based model's predictions with patient-specific residual models, which are learned from the differences between the patient observations and the population-based model's predictions. The second framework relies on adaptive model selection strategies to combine the advantages of population-based, patient-specific and short-term individualized predictive models. We demonstrate the benefits and advantages of the aforementioned frameworks on synthetic data sets, public time series data sets and clinical data extracted from EHRs.
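The residual-calibration idea in the first framework can be sketched in a few lines. This is a hedged toy illustration, not the dissertation's actual model: the population model, the patient's bias, and the use of a mean residual as the "patient-specific residual model" are all simplifying assumptions.

```python
import numpy as np

def population_forecast(t):
    # assumed population-level model: a shared linear trend
    return 2.0 * t + 1.0

# toy patient history: the true dynamics deviate by a constant offset
t_hist = np.arange(10.0)
y_hist = 2.0 * t_hist + 1.0 + 3.0  # patient-specific bias of +3

# residuals between patient observations and population predictions
residuals = y_hist - population_forecast(t_hist)

# patient-specific residual model: here simply the mean residual
bias = residuals.mean()

def calibrated_forecast(t):
    # population prediction corrected by the learned residual model
    return population_forecast(t) + bias

print(calibrated_forecast(10.0))  # population says 21.0; calibrated says 24.0
```

A real residual model would itself be a time-series model (e.g. another LDS) fit to the residual sequence, but the correction step has the same additive form.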

    Counterfactual Image Synthesis for Discovery of Personalized Predictive Image Markers

    The discovery of patient-specific imaging markers that are predictive of future disease outcomes can help us better understand individual-level heterogeneity of disease evolution. In fact, deep learning models that can provide data-driven personalized markers are much more likely to be adopted in medical practice. In this work, we demonstrate that data-driven biomarker discovery can be achieved through a counterfactual synthesis process. We show how a deep conditional generative model can be used to perturb local imaging features in baseline images that are pertinent to subject-specific future disease evolution, resulting in a counterfactual image that is expected to have a different future outcome. Candidate biomarkers, therefore, result from examining the set of features that are perturbed in this process. Through several experiments on a large-scale, multi-scanner, multi-center multiple sclerosis (MS) clinical trial magnetic resonance imaging (MRI) dataset of relapsing-remitting MS (RRMS) patients, we demonstrate that our model produces counterfactuals with changes in imaging features that reflect established clinical markers predictive of future MRI lesional activity at the population level. Additional qualitative results illustrate that our model has the potential to discover novel and subject-specific predictive markers of future activity. Comment: Accepted to the MIABID workshop at MICCAI 202
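The counterfactual-marker idea can be illustrated with a deliberately tiny analogue. The sketch below is an assumption-laden stand-in for the abstract's deep generative model: a linear "outcome" score replaces the classifier, and the minimal perturbation that flips the predicted outcome plays the role of the counterfactual; the features whose values change most are the candidate markers. All weights and inputs are invented.

```python
import numpy as np

w = np.array([0.1, 2.0, 0.05])   # linear outcome model; feature 1 dominates
b = -1.0
x = np.array([1.0, 0.2, 1.0])    # baseline "image" features

score = w @ x + b                # negative -> predicted "no future activity"

# counterfactual: smallest L2 perturbation moving the score to +0.5
target = 0.5
delta = (target - score) * w / (w @ w)
x_cf = x + delta

# candidate marker = feature with the largest counterfactual change
marker = int(np.argmax(np.abs(x_cf - x)))
print(marker)  # feature 1, matching the dominant weight
```

In the paper's setting, the perturbation lives in image space and is produced by a conditional generator rather than a closed-form projection, but the discovery principle (inspect what must change to alter the outcome) is the same.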

    Personalized Pancreatic Tumor Growth Prediction via Group Learning

    Tumor growth prediction, a highly challenging task, has long been viewed as a mathematical modeling problem, where the tumor growth pattern is personalized based on the imaging and clinical data of a target patient. Though mathematical models yield promising results, their prediction accuracy may be limited by the absence of population trend data and personalized clinical characteristics. In this paper, we propose a statistical group learning approach that incorporates both the population trend and personalized data to predict the tumor growth pattern, discovering high-level features from multimodal imaging data. A deep convolutional neural network approach is developed to model the voxel-wise spatio-temporal tumor progression. The deep features are combined with the time intervals and the clinical factors to feed a process of feature selection. Our predictive model is pretrained on a group data set and personalized on the target patient's data to estimate the future spatio-temporal progression of the patient's tumor. Multimodal imaging data at multiple time points are used in the learning, personalization and inference stages. Our method achieves a Dice similarity coefficient (DSC) of 86.8% ± 3.6% and a relative volume difference (RVD) of 7.9% ± 5.4% on a pancreatic tumor data set, outperforming the DSC of 84.4% ± 4.0% and RVD of 13.9% ± 9.8% obtained by a previous state-of-the-art model-based method.

    Learning Tasks for Multitask Learning: Heterogenous Patient Populations in the ICU

    Machine learning approaches have been effective in predicting adverse outcomes in different clinical settings. These models are often developed and evaluated on datasets with heterogeneous patient populations. However, good predictive performance on the aggregate population does not imply good performance for specific groups. In this work, we present a two-step framework to 1) learn relevant patient subgroups, and 2) predict an outcome for separate patient populations in a multi-task framework, where each population is a separate task. We demonstrate how to discover relevant groups in an unsupervised way with a sequence-to-sequence autoencoder. We show that using these groups in a multi-task framework leads to better predictive performance for in-hospital mortality both across groups and overall. We also highlight the need for more granular evaluation of performance when dealing with heterogeneous populations. Comment: KDD 201
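The two-step structure can be sketched minimally. In this hedged toy version, fixed centroids in a 2-D embedding space stand in for clustering the autoencoder's latent codes (step 1), and a per-group outcome rate stands in for each task head (step 2); the embeddings, centroids, and labels are all invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy patient embeddings: two well-separated clusters around 0 and 10
emb = np.concatenate([rng.normal(0.0, 0.5, (20, 2)),
                      rng.normal(10.0, 0.5, (20, 2))])
outcome = np.concatenate([np.zeros(20), np.ones(20)])  # mortality label

# step 1: subgroup assignment, standing in for clustering the
# sequence-to-sequence autoencoder's latent representations
centroids = np.array([[0.0, 0.0], [10.0, 10.0]])
groups = np.argmin(((emb[:, None, :] - centroids) ** 2).sum(-1), axis=1)

# step 2: one simple predictor ("task head") per discovered subgroup,
# here just the group's observed outcome rate
heads = {int(g): outcome[groups == g].mean() for g in np.unique(groups)}

print(heads)  # each subgroup gets its own calibrated predictor
```

The point of the multi-task setup is exactly what the toy shows: a single aggregate rate (0.5 here) would serve both groups poorly, while per-group heads fit each subpopulation.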

    Identifying the presence and severity of dementia by applying interpretable machine learning techniques on structured clinical records

    Background: Dementia develops as cognitive abilities deteriorate, and early detection is critical for effective preventive interventions. However, mainstream diagnostic tests and screening tools, such as CAMCOG and MMSE, often fail to detect dementia accurately. Various graph-based or feature-dependent prediction and progression models have been proposed. Whenever these models exploit information in patients’ Electronic Medical Records, they represent promising options for identifying the presence and severity of dementia more precisely. Methods: The methods presented in this paper aim to address two problems related to dementia: (a) basic diagnosis: identifying the presence of dementia in individuals, and (b) severity diagnosis: predicting the presence of dementia, as well as the severity of the disease. We formulate these two tasks as classification problems and address them using machine learning models based on random forests and decision trees, analysing structured clinical data from an elderly population cohort. We perform a hybrid data curation strategy in which a dementia expert is involved to verify that curation decisions are meaningful. We then employ the machine learning algorithms to classify individual episodes into a specific dementia class. Decision trees are also used to enhance the explainability of the decisions made by the prediction models, allowing medical experts to identify the most crucial patient features and their threshold values for the classification of dementia. Results: Our experimental results show that baseline arithmetic or cognitive tests, along with demographic features, can predict dementia and its severity with high accuracy. Specifically, our prediction models reached an average F1-score of 0.93 and 0.81 for problems (a) and (b), respectively. Moreover, the decision trees produced for the two problems support the interpretability of the prediction models. 
Conclusions: This study shows that the existence and severity of dementia can be accurately estimated by analysing various electronic medical record features and cognitive tests from episodes of the elderly population. Moreover, a set of decision rules may comprise the building blocks for efficient patient classification. Relevant clinical and screening test features (e.g. simple arithmetic or animal fluency tasks) represent precise predictors without calculating the scores of mainstream cognitive tests such as MMSE and CAMCOG. Such a predictive model can identify not only meaningful features but also justifications for its classifications. As a result, the predictive power of machine learning models over curated clinical data is demonstrated, paving the path toward a more accurate diagnosis of dementia.
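The threshold rules the abstract describes are the kind a single decision-tree split produces. The sketch below searches for such a cut-off on an invented cognitive-test score; the scores, labels, and the resulting rule are illustrative, not figures from the study.

```python
def best_threshold(scores, labels):
    """Find the score cut-off that best separates the two classes,
    yielding a human-readable rule of the form 'score < t -> dementia'."""
    best_t, best_acc = None, -1.0
    for t in sorted(set(scores)):
        preds = [1 if s < t else 0 for s in scores]  # low score -> dementia
        acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

# toy cognitive-test scores with labels (1 = dementia)
scores = [3, 4, 5, 6, 14, 15, 16, 17]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

t, acc = best_threshold(scores, labels)
print(t, acc)  # the rule "score < 14" classifies this toy set perfectly
```

A full decision tree applies such splits recursively over several features, but each path through the tree remains a conjunction of threshold rules that a clinician can read and verify, which is the explainability property the paper exploits.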