56 research outputs found

    Multiple co-clustering based on nonparametric mixture models with heterogeneous marginal distributions

    We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data.

    Results for data 2 of the facial image data.

    Contingency table of the true labels (useid) and yielded clustering of the multiple co-clustering (Mul), COALA, decorrelated K-means (DecK), and restricted multiple (rMul) method from (a) to (d). T1 and T2 are true classifications (an2i, at33); C1, C2, C3 and C4 are yielded clusters.

    Prediction of clinical depression scores and detection of changes in whole-brain using resting-state functional MRI data with partial least squares regression

    In diagnostic applications of statistical machine learning methods to brain imaging data, common problems include high dimensionality and collinearity of the data, which often cause over-fitting and instability. To overcome these problems, we applied partial least squares (PLS) regression to resting-state functional magnetic resonance imaging (rs-fMRI) data, creating a low-dimensional representation that relates symptoms to brain activity and that predicts clinical measures. Our experimental results, based upon data from clinically depressed patients and healthy controls, demonstrated that PLS and its kernel variants provided significantly better prediction of clinical measures than ordinary linear regression. Subsequent classification using predicted clinical scores distinguished depressed patients from healthy controls with 80% accuracy. Moreover, loading vectors for latent variables enabled us to identify brain regions relevant to depression, including the default mode network, the right superior frontal gyrus, and the superior motor area.

    Samples from image datasets for person ‘an2i’ and ‘at33’.

    Pixels surrounded by color boxes are selected features that yielded relevant sample clustering to useid in data2. Image configurations are (‘an2i’, non sunglass, straight), (‘at33’, non sunglass, straight), (‘an2i’, sunglass, left), and (‘at33’, sunglass, left), respectively. Expression is neutral for all samples. In these examples, the multiple clustering method correctly identified these persons.

    Relative performance of gLASSO, sgLASSO and SVM depended on the dataset, while sLASSO and Random Forest were generally outperformed by the other algorithms.

    (a) semantic verbal fluency, (b) phonological verbal fluency and (c) combined datasets. Classification performance was significantly different between all algorithms and significantly higher for each algorithm with the combined dataset (p < 0.001, U-test).

    Comparison of predicted performance by means of the root mean squared errors.

    Linear and kernel variants of PLS achieved significantly better performance than did OLS for all clinical scores. Subject age was used as the output along with clinical scores (output-age model).