87 research outputs found

    Clustered blockwise PCA for representing visual data

    Covariance and correlation analysis of resting state functional magnetic resonance imaging data acquired in a clinical trial of mindfulness-based stress reduction and exercise in older individuals

    We describe and apply a novel methodology for whole-brain analysis of resting state fMRI functional connectivity data, combining conventional multi-channel Pearson correlation with covariance analysis. Unlike correlation, covariance analysis preserves signal amplitude information, a feature of fMRI time series that may carry physiological significance. Additionally, we demonstrate that dimensionality reduction of the fMRI data offers several computational advantages, including projection onto a space of manageable dimension, enabling linear operations on functional connectivity measures, and exclusion of variance unrelated to resting state network structure. We show that group-averaged, dimensionality reduced, covariance and correlation matrices are related, to reasonable approximation, by a single scalar factor. We apply this methodology to the analysis of a large, resting state fMRI data set acquired in a prospective, controlled study of mindfulness training and exercise in older, sedentary participants at risk for developing cognitive decline. Results show marginally significant effects of both mindfulness training and exercise in both covariance and correlation measures of functional connectivity.
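The claim that covariance and correlation matrices are related by a single scalar factor can be illustrated with a small sketch. The abstract does not say how the factor is estimated; the least-squares fit below is an assumption, and the 3 × 3 matrices are made-up toy values, not data from the study.

```python
from math import sqrt

def best_scalar(cov, corr):
    """Least-squares s minimising ||cov - s * corr|| (Frobenius norm).

    Assumption: the abstract does not specify an estimator; this is one
    plausible choice, s = <cov, corr> / <corr, corr>.
    """
    num = sum(c * r for c_row, r_row in zip(cov, corr)
              for c, r in zip(c_row, r_row))
    den = sum(r * r for r_row in corr for r in r_row)
    return num / den

def relative_residual(cov, corr, s):
    """Size of cov - s * corr relative to the size of cov."""
    err = sum((c - s * r) ** 2 for c_row, r_row in zip(cov, corr)
              for c, r in zip(c_row, r_row))
    ref = sum(c * c for c_row in cov for c in c_row)
    return sqrt(err / ref)

# Toy group-averaged matrices (hypothetical values).
corr = [[1.0, 0.5, 0.2],
        [0.5, 1.0, 0.3],
        [0.2, 0.3, 1.0]]
cov = [[4.1, 2.0, 0.7],
       [2.0, 3.9, 1.2],
       [0.7, 1.2, 4.0]]

factor = best_scalar(cov, corr)
residual = relative_residual(cov, corr, factor)
print(f"scalar factor: {factor:.3f}, relative residual: {residual:.3f}")
```

A small relative residual indicates that one scalar captures most of the difference between the two matrices in this toy case.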

    Cocaine Use Prediction with Tensor-based Machine Learning on Multimodal MRI Connectome Data

    This paper considers the use of machine learning algorithms for predicting cocaine use based on magnetic resonance imaging (MRI) connectomic data. The study utilized functional MRI (fMRI) and diffusion MRI (dMRI) data collected from 275 individuals, which was then parcellated into 246 regions of interest (ROIs) using the Brainnetome atlas. After data preprocessing, the datasets were transformed into tensor form. We developed a tensor-based unsupervised machine learning algorithm to reduce the size of the data tensor from 275 (individuals) × 2 (fMRI and dMRI) × 246 (ROIs) × 246 (ROIs) to 275 (individuals) × 2 (fMRI and dMRI) × 6 (clusters) × 6 (clusters). This was achieved by applying the high-order Lloyd algorithm to group the ROI data into 6 clusters. Features were extracted from the reduced tensor and combined with demographic features (age, gender, race, and HIV status). The resulting dataset was used to train a CatBoost model using subsampling and nested cross-validation techniques, which achieved a prediction accuracy of 0.857 for identifying cocaine users. The model was also compared with other models, and the feature importance of the model was presented. Overall, this study highlights the potential for using tensor-based machine learning algorithms to predict cocaine use based on MRI connectomic data and presents a promising approach for identifying individuals at risk of substance abuse.
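The reduction from a 246 × 246 ROI matrix to a 6 × 6 cluster matrix can be sketched as follows, for one individual and one modality. This is a minimal illustration under stated assumptions: the cluster labels are taken as given (the high-order Lloyd algorithm that would produce them is not implemented here), block averaging is assumed as the reduction rule because the abstract does not specify one, and the 4 × 4 matrix with 2 clusters is a toy stand-in.

```python
def block_means(matrix, labels, n_clusters):
    """Average a square ROI-by-ROI matrix over cluster blocks.

    labels[i] is the cluster index of ROI i. Entry (r, c) of the result is
    the mean of all matrix entries whose row ROI is in cluster r and whose
    column ROI is in cluster c. Empty blocks are reported as 0.0.
    """
    sums = [[0.0] * n_clusters for _ in range(n_clusters)]
    counts = [[0] * n_clusters for _ in range(n_clusters)]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            r, c = labels[i], labels[j]
            sums[r][c] += value
            counts[r][c] += 1
    return [[sums[r][c] / counts[r][c] if counts[r][c] else 0.0
             for c in range(n_clusters)]
            for r in range(n_clusters)]

# Toy connectome for 4 ROIs (hypothetical values).
connectome = [[1.0, 0.8, 0.1, 0.2],
              [0.8, 1.0, 0.3, 0.0],
              [0.1, 0.3, 1.0, 0.6],
              [0.2, 0.0, 0.6, 1.0]]
labels = [0, 0, 1, 1]  # assumed output of a clustering step

reduced = block_means(connectome, labels, n_clusters=2)
for row in reduced:
    print(row)
```

Applying the same function to every individual and both modalities, with 6 clusters over 246 ROIs, would give a tensor of the reduced shape described in the abstract.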

    Blockwise simple component analysis via rotation, constraints or penalties, with an application to product × attribute × panelist data

    Sensory profiling data consisting of judgements on a number of products with respect to a number of attributes by a number of panelists can be summarized in various ways. Besides finding components describing the main product features, there is an interest in individual panelist behavior. Earlier methods identify this by means of separate PCAs, Procrustes analyses, or three-way component methods, but these give only global comparisons of panelists. In the present paper, methods that can distinguish panelist behavior related to separate attributes are described. These methods model the data in such a way that blocks of loadings pertaining to the attributes are either small or large. At the same time, one can zoom in on the loadings for panelists within each block of loadings associated with an attribute to inspect differences in panelist behavior. Two types of methods have been proposed for this earlier (rotation to simple blocks and penalizing blocks of loadings), and a third one is proposed in the present paper (constraining blocks of loadings to zero). The new approach is compared here to the other two methods. It is found that the rotation and constraints approaches work about equally well and better than the penalty approach. However, the rotation approach offers richer panelist behavior information, as is illustrated by the analysis of empirical data. It is also shown how, in this example, the reliability of idiosyncratic panelist behavior indicators can be evaluated.

    Paragraph: A graph-based structural variant genotyper for short-read sequence data

    Accurate detection and genotyping of structural variations (SVs) from short-read data is a long-standing area of development in genomics research and clinical sequencing pipelines. We introduce Paragraph, an accurate genotyper that models SVs using sequence graphs and SV annotations. We demonstrate the accuracy of Paragraph on whole-genome sequence data from three samples using long-read SV calls as the truth set, and then apply Paragraph at scale to a cohort of 100 short-read sequenced samples of diverse ancestry. Our analysis shows that Paragraph has better accuracy than other existing genotypers and can be applied to population-scale studies. © 2019 The Author(s)

    EXPLOITING LOW-DIMENSIONAL STRUCTURES IN MOTION PROBLEMS.

    Ph.D., Doctor of Philosophy

    Covariance-based vs. correlation-based functional connectivity dissociates healthy aging from Alzheimer disease

    Prior studies of aging and Alzheimer disease have evaluated resting state functional connectivity (FC) using either seed-based correlation (SBC) or independent component analysis (ICA), with a focus on particular functional systems. SBC and ICA both are insensitive to differences in signal amplitude. At the same time, accumulating evidence indicates that the amplitude of spontaneous BOLD signal fluctuations is physiologically meaningful. We systematically compared covariance-based FC, which is sensitive to amplitude, vs. correlation-based FC, which is not, in affected individuals and controls drawn from two cohorts of participants including autosomal dominant Alzheimer disease (ADAD), late onset Alzheimer disease (LOAD), and age-matched controls. Functional connectivity was computed over 222 regions of interest and group differences were evaluated in terms of components projected onto a space of lower dimension. Our principal observations are: (1) Aging is associated with global loss of resting state fMRI signal amplitude that is approximately uniform across resting state networks. (2) Thus, covariance FC measures decrease with age whereas correlation FC is relatively preserved in healthy aging. (3) In contrast, symptomatic ADAD and LOAD both lead to loss of spontaneous activity amplitude as well as severely degraded correlation structure. These results demonstrate a double dissociation between age vs. Alzheimer disease and the amplitude vs. correlation structure of resting state BOLD signals. Modeling results suggest that the AD-associated loss of correlation structure is attributable to a relative increase in the fraction of locally restricted as opposed to widely shared variance
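The distinction between amplitude-sensitive covariance and amplitude-insensitive correlation can be checked with a short sketch. The two time series are made-up toy values, and the uniform scale factor of 0.5 is an arbitrary stand-in for a global loss of signal amplitude; neither comes from the study.

```python
from math import sqrt

def covariance(x, y):
    """Sample covariance of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Pearson correlation: covariance normalised by both standard deviations."""
    return covariance(x, y) / sqrt(covariance(x, x) * covariance(y, y))

# Toy BOLD-like time series for two regions (hypothetical values).
roi_a = [0.2, -0.5, 1.1, 0.4, -0.9, 0.3]
roi_b = [0.1, -0.2, 0.9, 0.6, -1.0, 0.0]

# Model a uniform amplitude loss by scaling both signals by the same factor.
scale = 0.5
roi_a_low = [scale * v for v in roi_a]
roi_b_low = [scale * v for v in roi_b]

cov_full = covariance(roi_a, roi_b)
cov_low = covariance(roi_a_low, roi_b_low)
corr_full = correlation(roi_a, roi_b)
corr_low = correlation(roi_a_low, roi_b_low)

print(f"covariance:  {cov_full:.4f} -> {cov_low:.4f}")
print(f"correlation: {corr_full:.4f} -> {corr_low:.4f}")
```

Under uniform scaling the covariance shrinks by the square of the scale factor while the correlation is unchanged, which is the pattern the abstract associates with healthy aging.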

    A Novel Hybrid Dimensionality Reduction Method using Support Vector Machines and Independent Component Analysis

    Due to the increasing demand for high-dimensional data analysis in applications such as electrocardiogram signal analysis and gene expression analysis for cancer detection, dimensionality reduction has become a viable way to extract essential information from data, so that high-dimensional data can be represented in a more condensed form with much lower dimensionality, both improving classification accuracy and reducing computational complexity. Conventional dimensionality reduction methods can be categorized into stand-alone and hybrid approaches. The stand-alone method utilizes a single criterion from either a supervised or an unsupervised perspective. The hybrid method, on the other hand, integrates both criteria. Compared with a variety of stand-alone dimensionality reduction methods, the hybrid approach is promising because it takes advantage of both the supervised criterion for better classification accuracy and the unsupervised criterion for better data representation. However, several issues challenge the efficiency of the hybrid approach, including (1) the difficulty of finding a subspace that seamlessly integrates both criteria in a single hybrid framework, (2) the robustness of the performance with respect to noisy data, and (3) nonlinear data representation capability. This dissertation presents a new hybrid dimensionality reduction method that seeks a projection through optimization of both structural risk (the supervised criterion) from the Support Vector Machine (SVM) and data independence (the unsupervised criterion) from Independent Component Analysis (ICA). The projection from SVM contributes directly to classification performance from a supervised perspective, whereas ICA, by maximizing independence among features, constructs a projection that improves classification accuracy indirectly through better intrinsic data representation from an unsupervised perspective.
    For the linear dimensionality reduction model, I introduce orthogonality to interrelate the projections from SVM and ICA, while a redundancy removal process eliminates part of the projection vectors from SVM, leading to more effective dimensionality reduction. The orthogonality-based linear hybrid dimensionality reduction method is extended to an uncorrelatedness-based algorithm with nonlinear data representation capability. In the proposed approach, SVM and ICA are integrated into a single framework by the uncorrelated subspace based on kernel implementation. Experimental results show that the proposed approaches give higher classification performance with better robustness in relatively lower dimensions than conventional methods for high-dimensional datasets.