
    Advancing Statistical Inference For Population Studies In Neuroimaging Using Machine Learning

    Modern neuroimaging techniques allow us to investigate the brain in vivo and in high resolution, providing us with high-dimensional information regarding the structure and the function of the brain in health and disease. Statistical analysis techniques transform this rich imaging information into accessible and interpretable knowledge that can be used for investigative as well as diagnostic and prognostic purposes. A prevalent area of research in neuroimaging is group comparison, i.e., the comparison of the imaging data of two groups (e.g., patients vs. healthy controls, or people who respond to treatment vs. people who don't) to identify discriminative imaging patterns that characterize different conditions. In recent years, the neuroimaging community has adopted techniques from mathematics, statistics, and machine learning to introduce novel methodologies targeting the improvement of our understanding of various neuropsychiatric and neurodegenerative disorders. However, existing statistical methods are limited by their reliance on ad hoc assumptions regarding the homogeneity of disease effect, the spatial properties of the underlying signal, and the covariate structure of the data, which imposes certain constraints on the sampling of datasets. 1. First, the overarching assumption behind most analytical tools commonly used in neuroimaging studies is that there is a single disease effect that differentiates patients from controls. In reality, however, the disease effect may be heterogeneously expressed across the patient population. As a consequence, when searching for a single imaging pattern that characterizes the difference between healthy controls and patients, we may only get a partial or incomplete picture of the disease effect. 2. Second, and importantly, most analyses assume a uniform shape and size of disease effect. As a consequence, a common step in most neuroimaging analyses is to apply uniform smoothing of the data to aggregate regional information at each voxel and improve the signal-to-noise ratio. However, the shape and size of the disease patterns may not be uniformly represented across the brain. 3. Lastly, in practical scenarios, imaging datasets commonly include variations due to multiple covariates, which often have effects that overlap with the disease effects being sought. To minimize the covariate effects, studies are carefully designed by appropriately matching the populations under observation. The difficulty of this task is further exacerbated by the advent of big data analyses that often entail the aggregation of large datasets collected across many clinical sites. The goal of this thesis is to address each of the aforementioned assumptions and limitations by introducing robust mathematical formulations, founded on multivariate machine learning techniques that integrate discriminative and generative approaches. Specifically: 1. First, we introduce an algorithm termed HYDRA, which stands for heterogeneity through discriminative analysis. This method parses the heterogeneity in neuroimaging studies by simultaneously performing clustering and classification through the use of piecewise linear decision boundaries. 2. Second, we propose regionally linear multivariate discriminative statistical mapping (MIDAS) to find the optimal level of variable smoothing across the brain anatomy and tease out group differences in neuroimaging datasets. This method makes use of overlapping regional discriminative filters to approximate a matched filter that best delineates the underlying disease effect. 3. Lastly, we develop a method termed generative discriminative machines (GDM) to reduce the effect of confounds in biased samples. The proposed method solves for a discriminative model that can also optimally generate the data when taking the covariate structure into account. We extensively validated the performance of the developed frameworks in the presence of diverse types of simulated scenarios. Furthermore, we applied our methods to a large number of clinical datasets that included structural and functional neuroimaging data as well as genetic data. Specifically, HYDRA was used to identify distinct subtypes of Alzheimer's Disease. MIDAS was applied to identify the optimally discriminative patterns that differentiated between truth-telling and lying functional tasks. GDM was applied in a multi-site prediction setting with severely confounded samples. Our promising results demonstrate the potential of our methods to advance neuroimaging analysis beyond the set of assumptions that limit its capacity and to improve statistical power.
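
    To make the HYDRA idea concrete, the following is a minimal sketch of an alternating clustering-and-classification scheme of the kind the abstract describes; it is an illustrative simplification (the function name, parameters, and reassignment rule are assumptions), not the authors' implementation. Each of K linear SVMs separates one candidate patient subtype from all controls, and patients are reassigned to the hyperplane that classifies them most confidently until the assignment stabilizes.

```python
# Illustrative sketch only: HYDRA-like alternating clustering and
# classification with piecewise linear decision boundaries.
import numpy as np
from sklearn.svm import LinearSVC

def hydra_sketch(X_controls, X_patients, K=3, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    assign = rng.integers(0, K, size=len(X_patients))  # random init
    svms = [None] * K
    for _ in range(n_iter):
        # Fit one linear SVM per subtype: its patients vs. all controls.
        for k in range(K):
            Xk = X_patients[assign == k]
            if len(Xk) == 0:
                continue
            X = np.vstack([X_controls, Xk])
            y = np.r_[np.zeros(len(X_controls)), np.ones(len(Xk))]
            svms[k] = LinearSVC(C=1.0, dual=False).fit(X, y)
        # Reassign each patient to the hyperplane that places it
        # deepest on the patient side (largest decision value).
        scores = np.column_stack([
            s.decision_function(X_patients) if s is not None
            else np.full(len(X_patients), -np.inf)
            for s in svms])
        new_assign = scores.argmax(axis=1)
        if np.array_equal(new_assign, assign):
            break
        assign = new_assign
    return svms, assign
```

    The combined patient-vs-control boundary is then the maximum over the K hyperplanes, i.e., a piecewise linear surface that can accommodate a heterogeneous patient group where a single hyperplane cannot.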

    Computational Methods for Analysis of Resting State Functional Connectivity and Their Application to Study of Aging

    The functional organization of the brain and its variability over the life-span can be studied using resting state functional MRI (rsfMRI). It can be used to define a 'macro-connectome' describing functional interactions in the brain at the scale of major brain regions, facilitating the description of large-scale functional systems and their change over the lifespan. The connectome typically consists of thousands of links between hundreds of brain regions, making subsequent group-level analyses difficult. Furthermore, existing methods for group-level analyses are not equipped to identify heterogeneity in patient or otherwise affected populations. In this thesis, we incorporated recent advances in sparse representations for modeling spatial patterns of functional connectivity. We show that the resulting Sparse Connectivity Patterns (SCPs) are reproducible and capture major directions of variance in the data. Each SCP is associated with a scalar value that is proportional to the average connectivity within all the regions of that SCP. Thus, the SCP framework provides an interpretable basis for subsequent group-level analyses. Traditional univariate approaches are limited in their ability to detect heterogeneity in diseased/aging populations in a two-group comparison framework. To address this issue, we developed a Mixture-Of-Experts (MOE) method that combines unsupervised modeling of mixtures of distributions with supervised learning of classifiers, allowing discovery of multiple disease/aging phenotypes and the affected individuals associated with each pattern. We applied our methods to the Baltimore Longitudinal Study of Aging (BLSA) to find multiple advanced aging phenotypes. We built normative trajectories of functional and structural brain aging, which were used to identify individuals who seem resilient to aging, as well as individuals who show advanced signs of aging. Using MOE, we discovered five distinct patterns of advanced aging. Combined with neuro-cognitive data, we were able to further characterize one group as consisting of individuals with early-stage dementia. Another group had focal hippocampal atrophy, yet had higher levels of connectivity and somewhat higher cognitive performance, suggesting these individuals were recruiting their cognitive reserve to compensate for structural losses. These results demonstrate the utility of the developed methods, and pave the way for a broader understanding of the complexity of brain aging.
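
    As a rough illustration of how sparse patterns over connectomes might be extracted in practice, here is a sketch using off-the-shelf sparse dictionary learning on vectorized connectivity matrices. This is a generic stand-in under my own assumptions (input layout, sklearn's DictionaryLearning), not the thesis's SCP algorithm.

```python
# Generic stand-in: sparse decomposition of subject connectomes.
import numpy as np
from sklearn.decomposition import DictionaryLearning

def fit_patterns(connectomes, n_patterns=10):
    """connectomes: (n_subjects, n_edges) array, each row the
    vectorized upper triangle of a subject's correlation matrix."""
    dl = DictionaryLearning(n_components=n_patterns, alpha=1.0,
                            transform_algorithm="lasso_lars",
                            random_state=0)
    loadings = dl.fit_transform(connectomes)  # per-subject coefficients
    patterns = dl.components_                 # patterns in edge space
    return patterns, loadings
```

    The per-subject loadings play the role of the scalar values described above, giving each subject one interpretable number per pattern for group-level analysis.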

    Fast Machine Learning Algorithms for Massive Datasets with Applications in the Biomedical Domain

    The continuous increase in the size of datasets introduces computational challenges for machine learning algorithms. In this dissertation, we cover machine learning algorithms and applications for large-scale data analysis in manufacturing and healthcare. We begin by introducing a multilevel framework to scale the support vector machine (SVM), a popular supervised learning algorithm with a few tunable hyperparameters and highly accurate prediction. The computational complexity of the nonlinear SVM is prohibitive on large-scale datasets compared to the linear SVM, which is more scalable for massive datasets. However, the nonlinear SVM has been shown to produce significantly higher classification quality on complex and highly imbalanced datasets, at the cost of a computationally expensive quadratic programming solver and extra kernel parameters for model selection. We introduce a generalized fast multilevel framework for regular, weighted, and instance-weighted SVM that achieves similar or better classification quality compared to state-of-the-art SVM libraries such as LIBSVM. Our framework improves the runtime by more than two orders of magnitude for some of the well-known benchmark datasets. We cover multiple versions of our proposed framework and its implementation in detail. The framework is implemented using the PETSc library, which allows easy integration with scientific computing tasks. Next, we propose an adaptive multilevel learning framework for SVM to reduce the variance between prediction qualities across the levels, improve the overall prediction accuracy, and boost the runtime. We implement multi-threaded support to speed up the parameter fitting, which results in more than an order of magnitude speed-up. We design an early stopping criterion to reduce the extra computational cost once the expected prediction quality is achieved. This approach provides significant speed-up, especially for massive datasets. Finally, we propose an efficient low-dimensional feature extraction over massive knowledge networks. Knowledge networks are becoming more popular in the biomedical domain for knowledge representation. Each layer in a knowledge network can store the information from one or multiple sources of data, and the relationships between concepts or between layers represent valuable information. The proposed feature engineering approach provides efficient and highly accurate prediction of the relationships between biomedical concepts on massive datasets. Our approach utilizes semantics and probabilities to reduce the potential search space for the exploration and learning of machine learning algorithms. The calculation of probabilities is highly scalable with the size of the knowledge network, and the number of features is fixed and equivalent to the number of relationships or classes in the data. A comprehensive comparison of well-known classifiers such as random forest, SVM, and deep learning over various features extracted from the same dataset provides an overview of the performance and computational trade-offs. Our source code, documentation, and parameters will be available at https://github.com/esadr/
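
    The multilevel strategy can be pictured with a two-level toy version. This sketch uses my own simplifications (random sampling as the coarsening step and a margin band around the coarse boundary for refinement) rather than the dissertation's PETSc-based framework, but it conveys why the approach saves time: the expensive nonlinear solver only ever sees a coarse subset and boundary-adjacent points.

```python
# Two-level toy version of multilevel nonlinear SVM training.
import numpy as np
from sklearn.svm import SVC

def multilevel_svm_sketch(X, y, coarse_size=2000, band=1.0, seed=0):
    rng = np.random.default_rng(seed)
    # Coarsen: train an RBF SVM on a random subset only.
    idx = rng.choice(len(X), size=min(coarse_size, len(X)), replace=False)
    coarse = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X[idx], y[idx])
    # Refine: retrain using only points near the coarse decision
    # boundary, where the fine-level model actually matters.
    margin = np.abs(coarse.decision_function(X))
    near = margin < band
    if near.sum() < 2 or len(np.unique(y[near])) < 2:
        return coarse
    return SVC(kernel="rbf", C=1.0, gamma="scale").fit(X[near], y[near])
```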

    Discriminative Representations for Heterogeneous Images and Multimodal Data

    Histology images of tumor tissue are an important diagnostic and prognostic tool for pathologists. Recently developed molecular methods group tumors into subtypes to further guide treatment decisions, but they are not routinely performed on all patients. A lower-cost and repeatable method to predict tumor subtypes from histology could bring benefits to more cancer patients. Further, combining imaging and genomic data types provides a more complete view of the tumor and may improve prognostication and treatment decisions. While molecular and genomic methods capture the state of a small sample of the tumor, histological image analysis provides a spatial view and can identify multiple subtypes in a single tumor. This intra-tumor heterogeneity has yet to be fully understood, and its quantification may lead to future insights into tumor progression. In this work, I develop methods to learn appropriate features directly from images using dictionary learning or deep learning. I use multiple instance learning to account for intra-tumor variations in subtype during training, improving subtype predictions and providing insights into tumor heterogeneity. I also integrate image and genomic features to learn a projection to a shared space that is also discriminative. This method can be used for cross-modal classification or to improve predictions from images by also learning from genomic data during training, even if only image data is available at test time. Doctor of Philosophy
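
    To illustrate the multiple-instance setup for whole slides, here is a toy max-pooling variant in the spirit of MI-SVM-style relabeling: a slide is a bag of tile features carrying one bag-level label, and the bag score is driven by its strongest tile, so a single strongly subtyped region can determine the prediction. The function names and the one-positive-tile heuristic are my assumptions, not the methods developed in this work.

```python
# Toy multiple-instance learner: bags of tile features, bag labels.
import numpy as np
from sklearn.linear_model import LogisticRegression

def mil_max_pool_fit(bags, bag_labels, n_rounds=5):
    """bags: list of (n_tiles_i, n_features) arrays; bag_labels: 0/1."""
    # Initialize by giving every tile its bag's label.
    X = np.vstack(bags)
    y = np.concatenate([[lab] * len(b) for lab, b in zip(bag_labels, bags)])
    clf = LogisticRegression(max_iter=1000)
    for _ in range(n_rounds):
        clf.fit(X, y)
        # Relabel: in each positive bag, keep only the top-scoring
        # tile positive; negative bags stay fully negative.
        y = np.zeros(len(X), dtype=int)
        start = 0
        for lab, b in zip(bag_labels, bags):
            if lab == 1:
                scores = clf.predict_proba(b)[:, 1]
                y[start + scores.argmax()] = 1
            start += len(b)
    return clf

def mil_predict_bag(clf, bag):
    # Bag prediction = max over tile scores.
    return clf.predict_proba(bag)[:, 1].max()
```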

    Elucidating the efficacy and response to social cognitive training in recent-onset psychosis

    Neurocognitive deficits are one of the core features of psychosis spectrum disorders (PSD), and they are predictive of poor functional outcome and negative symptoms many years later (Green, Kern, Braff, & Mintz, 2000). Neurocognitive interventions (NCIs) have emerged in the last two decades as a strong potential supplementary treatment option to address the cognitive deficits and decline in functioning affecting patients with PSD. Social cognitive training (SCT), involving e.g., facial stimuli, has gained considerably more attention in recent studies than computerized NCIs that use basic visual or auditory stimuli. This is due to the complex character of social cognition (SC), which draws on multiple brain structures involved in behaviors and perception beyond default cognitive function. SC is also tightly interlinked with psychosocial functioning. Although they are cost-effective and largely independent of clinical staff, technological approaches such as SCT are currently not integrated into routine clinical practice. Recent studies have mapped the effects of SCT in task-based studies onto multiple brain regions such as the amygdala, putamen, medial prefrontal cortex, and postcentral gyrus (Ramsay & MacDonald III, 2015). Yet, the degree to which alterations in brain function are associated with response to such interventions is still poorly understood. Importantly, resting-state functional connectivity (rsFC) may be a viable neuromarker, as it has shown greater sensitivity in distinguishing patients from healthy controls (HC) across neuroimaging studies, and is relatively easy to administer, especially in patients with acute symptoms (Kambeitz et al., 2015). In this dissertation, we employed 1) a univariate statistical approach to elucidate the efficacy of a 10-hour SCT in improving cognition, symptoms, functioning, and the restoration of rsFC in patients undergoing SCT as compared to the treatment-as-usual (TAU) group, and 2) multivariate methods. In particular, we used a Support Vector Machine (SVM) approach to neuromonitor the recovery of rsFC in the SCT group compared to TAU. We also investigated the potential utility of rsFC as a baseline (T0) neuromarker capable of predicting role functioning approximately 2 months later. First, current findings suggest a 10-hour SCT is capable of improving role functioning in recent-onset psychosis (ROP) patients. Second, we have shown intervention-specific rsFC changes within parts of the default mode and social cognitive networks. Moreover, patients with worse SC performance at T0 showed greater rsFC changes following the intervention, suggestive of a greater degree of rsFC restoration potential in patients with worse social cognitive deficits. Third, regarding the neuromonitoring results, only a greater transition from ROP to “HC-like” SVM decision scores, based on the resting-state modality, was paralleled by intervention-specific, significantly greater improvement in global cognition and attention. Finally, we were able to show that the early prediction of good versus poor role functioning is feasible at the individual subject level using an rsFC-based linear SVM classifier with a Balanced Accuracy (BAC) of 74%. This dissertation sheds light on the effects and feasibility of a relatively short computerized SCT, and the potential utility of multivariate pattern analysis (MVPA) for better clinical stratification of predicted treatment response based on rsFC neuromarkers.
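
    For orientation, a classification analysis of the kind reported here (a linear SVM on vectorized rsFC features, scored by cross-validated balanced accuracy) can be sketched in a few lines. This is a generic illustration of the analysis type, not the dissertation's actual pipeline.

```python
# Generic rsFC classification sketch: linear SVM + balanced accuracy.
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

def rsfc_svm_bac(X, y):
    """X: (n_subjects, n_edges) rsFC features; y: binary outcome
    (e.g., good vs. poor role functioning)."""
    model = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
    scores = cross_val_score(model, X, y, cv=5,
                             scoring="balanced_accuracy")
    return scores.mean()
```

    Balanced accuracy, the mean of sensitivity and specificity, is the natural cross-validation metric here because patient and control groups are rarely of equal size.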

    Data Mining the Brain to Decode the Mind

    In recent years, neuroscience has begun to transform itself into a “big data” enterprise with the importation of computational and statistical techniques from machine learning and informatics. In addition to their translational applications, such as brain-computer interfaces and early diagnosis of neuropathology, these tools promise to advance new solutions to longstanding theoretical quandaries. Here I critically assess whether these promises will pay off, focusing on the application of multivariate pattern analysis (MVPA) to the problem of reverse inference. I argue that MVPA does not inherently provide a new answer to classical worries about reverse inference, and that the method faces pervasive interpretive problems of its own. Further, the epistemic setting of MVPA and other decoding methods contributes to a potentially worrisome shift towards prediction and away from explanation in fundamental neuroscience.

    Techniques for Analysis and Motion Correction of Arterial Spin Labelling (ASL) Data from Dementia Group Studies

    This investigation examines how Arterial Spin Labelling (ASL) Magnetic Resonance Imaging can be optimised to assist in the early diagnosis of diseases which cause dementia, by considering group study analysis and control of motion artefacts. ASL can produce quantitative cerebral blood flow maps noninvasively, without a radioactive or paramagnetic contrast agent being injected. ASL studies have already shown perfusion changes which correlate with the metabolic changes measured by Positron Emission Tomography in the early stages of dementia, before structural changes are evident. But the clinical use of ASL for dementia diagnosis is not yet widespread, due to a combination of a lack of protocol consistency, a lack of accepted biomarkers, and sensitivity to motion artefacts. Applying ASL to improve early diagnosis of dementia may allow emerging treatments to be administered earlier, and thus with greater effect. In this project, ASL data acquired from two separate patient cohorts ((i) the Young Onset Alzheimer’s Disease (YOAD) study, acquired at Queen Square; and (ii) the Incidence and RISk of dementia (IRIS) study, acquired in Rotterdam) were analysed using a pipeline optimised for each acquisition protocol, with several statistical approaches considered, including support-vector machine learning. Machine learning was also applied to improve the compatibility of the two studies, and to demonstrate a novel method to disentangle perfusion changes measured by ASL from grey matter atrophy. Also in this project, retrospective motion correction techniques for specific ASL sequences were developed, based on autofocusing and on exploiting parallel imaging algorithms. These were tested using a specially developed simulation of the 3D GRASE ASL protocol, which is capable of modelling motion. The parallel-imaging-based approach was verified by performing a specifically designed MRI experiment involving deliberate motion, then applying the algorithm to demonstrably reduce motion artefacts retrospectively.

    Statistical Learning Methods for Electronic Health Record Data

    In the current era of electronic health records (EHR), use of data to make informed clinical decisions is at an all-time high. Although the collection, upkeep, and accessibility of EHR data continue to grow, statistical methodology focused on aiding real-time clinical decision making is lacking. Improved decision-making tools generally lead to improved patient outcomes and lower healthcare costs. In this dissertation, we propose three statistical learning methods to improve clinical decision making based on EHR data. In the first chapter we propose a new classifier, SVM-CART, which combines features of Support Vector Machines (SVM) and Classification and Regression Trees (CART) to produce a flexible classifier that outperforms either method in terms of prediction accuracy and ease of use. The method is especially powerful in situations where the disease-exposure mechanisms may be different across subgroups of the population. Through simulation, under settings with high levels of interaction, the SVM-CART classifier achieved significant prediction accuracy improvements. We illustrate our method by diagnosing neuropathy using various components of the metabolic syndrome. In predicting neuropathy, SVM-CART outperformed CART in terms of prediction accuracy and provided improved interpretability compared to SVM. In the second chapter, we develop regression tree and ensemble methods for multivariate outcomes. We propose two general approaches to developing multivariate regression trees: (1) minimizing within-node heterogeneity, and (2) maximizing between-node separation. Within-node homogeneity is measured using the average Mahalanobis distance and the determinant of the covariance matrix. For between-node separation, we propose using the Mahalanobis and Euclidean distances. The proposed multivariate regression trees are illustrated using two clinical datasets of neuropathy and pediatric cardiac surgery. In high-variance scenarios or when the dimension of the outcome was large, the Mahalanobis distance split trees had the best prediction performance. The determinant split trees generally had a simple structure, and the Euclidean distance metrics performed well in large-sample settings. In both applications, the resulting multivariate trees improve usability and validity compared to predictions made using multiple univariate regression trees. In the third chapter we develop a sequential method to make predictions using shallow (large-scale EHR) data in tandem with deep (health-system-specific) patient data. Specifically, we utilize machine learning based methods to first give a prediction based on a large-scale EHR, and then, for a select group of patients, refine the prediction based on the deep EHR data. We develop a novel framework, which is time- and cost-effective, for identifying patient subgroups that would most benefit from a second-stage prediction refinement. The final tandem prediction is obtained by combining predictions from both the first- and second-stage classifiers. We apply our tandem approach to predict extubation failure for pediatric patients who have undergone a critical cardiac operation, using shallow data from a national registry and deep continuously streamed data captured in the intensive care unit. Using these two EHR data sources in tandem increased our ability to identify extubation failures in terms of the area under the ROC curve (AUC: 0.639) compared to using just the national registry (AUC: 0.607) or physiologic ICU data (AUC: 0.634) alone. Additionally, identifying a specific patient subgroup for second-stage prediction refinement resulted in additional prediction improvement, as opposed to giving each patient a deep-data prediction (AUC: 0.682). PhD, Biostatistics, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/149829/1/evanlr_1.pd
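
    As a concrete reading of the within-node homogeneity criterion in the second chapter, here is a small sketch (my interpretation, with a ridge term added for invertibility, not the authors' code): a candidate split is scored by the size-weighted average Mahalanobis distance of each child node's multivariate outcomes to the node mean, with lower scores indicating more homogeneous children.

```python
# Sketch of a Mahalanobis-distance split criterion for
# multivariate-outcome regression trees.
import numpy as np

def avg_mahalanobis(Y):
    """Y: (n_samples, n_outcomes) outcomes falling in one node."""
    mu = Y.mean(axis=0)
    # Ridge term keeps the covariance invertible in small nodes.
    cov = np.cov(Y, rowvar=False) + 1e-6 * np.eye(Y.shape[1])
    inv = np.linalg.inv(cov)
    d = Y - mu
    # Per-sample Mahalanobis distance to the node mean, averaged.
    return np.mean(np.sqrt(np.einsum("ij,jk,ik->i", d, inv, d)))

def split_score(Y_left, Y_right):
    """Size-weighted homogeneity of a candidate split; lower is better."""
    n = len(Y_left) + len(Y_right)
    return (len(Y_left) * avg_mahalanobis(Y_left)
            + len(Y_right) * avg_mahalanobis(Y_right)) / n
```

    A tree builder would evaluate split_score over candidate splits of each covariate and keep the minimizer, exactly as CART does with univariate variance.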

    Studies on using data-driven decision support systems to improve personalized medicine processes

    This dissertation looks at how new sources of information should be incorporated into medical decision-making processes to improve patient outcomes and reduce costs. There are three fundamental challenges that must be overcome to use personalized medicine effectively; we need to understand: 1) how best to designate which patients will receive the greatest value from these processes; 2) how physicians and caregivers interpret additional patient-specific information and how that affects their decision-making processes; and finally, 3) how to account for a patient’s ability to engage in their own healthcare decisions. The first study looks at how we can infer which patients will receive the most value from genomic testing. The difficult statistical problem is how to separate the distribution of patients, based on ex-ante factors, to identify the best candidates for personalized testing. A model was constructed to infer a healthcare provider’s decision on whether this test would provide beneficial information in selecting a patient’s medication. Model analysis shows that healthcare providers’ primary focus is to maximize patient health outcomes while considering the impact on the patient’s economic welfare. The second study focuses on understanding how technology-enabled continuity of care (TECC) for Chronic Obstructive Pulmonary Disease (COPD) and Congestive Heart Failure (CHF) patients can be utilized to improve patient engagement, measured in terms of patient activation. We shed light on the fact that different types of patients garnered different levels of value from the use of TECC. The third study looks at how data-driven decision support systems can allow physicians to more accurately understand which patients are at high risk of readmission. We look at how we can use available patient-specific information for patients admitted with CHF to more accurately identify which patients are most likely to be readmitted, and also why: whether for condition-related reasons or for non-related reasons, allowing physicians to suggest different patient-specific readmission prevention strategies. Taken together, these three studies allow us to build a robust theory to tackle the challenges, both operational and policy-related, that need to be addressed for physicians to take advantage of the growing availability of patient-specific information to improve personalized medicine processes.
