2,202 research outputs found

    Element-centric clustering comparison unifies overlaps and hierarchy

    Full text link
    Clustering is one of the most universal approaches for understanding complex data. A pivotal aspect of clustering analysis is quantitatively comparing clusterings; clustering comparison is the basis for many tasks such as clustering evaluation, consensus clustering, and tracking the temporal evolution of clusters. In particular, the extrinsic evaluation of clustering methods requires comparing the uncovered clusterings to planted clusterings or known metadata. Yet, as we demonstrate, existing clustering comparison measures have critical biases which undermine their usefulness, and no measure accommodates both overlapping and hierarchical clusterings. Here we unify the comparison of disjoint, overlapping, and hierarchically structured clusterings by proposing a new element-centric framework: elements are compared based on the relationships induced by the cluster structure, as opposed to the traditional cluster-centric philosophy. We demonstrate that, in contrast to standard clustering similarity measures, our framework does not suffer from critical biases and naturally provides unique insights into how the clusterings differ. We illustrate the strengths of our framework by revealing new insights into the organization of clusters in two applications: the improved classification of schizophrenia based on the overlapping and hierarchical community structure of fMRI brain networks, and the disentanglement of various social homophily factors in Facebook social networks. The universality of clustering suggests far-reaching impact of our framework throughout all areas of science

    Choosing Wavelet Methods, Filters, and Lengths for Functional Brain Network Construction

    Full text link
    Wavelet methods are widely used to decompose fMRI, EEG, or MEG signals into time series representing neurophysiological activity in fixed frequency bands. Using these time series, one can estimate frequency-band specific functional connectivity between sensors or regions of interest, and thereby construct functional brain networks that can be examined from a graph theoretic perspective. Despite their common use, however, practical guidelines for the choice of wavelet method, filter, and length have remained largely undelineated. Here, we explicitly explore the effects of wavelet method (MODWT vs. DWT), wavelet filter (Daubechies Extremal Phase, Daubechies Least Asymmetric, and Coiflet families), and wavelet length (2 to 24) - each essential parameters in wavelet-based methods - on the estimated values of network diagnostics and in their sensitivity to alterations in psychiatric disease. We observe that the MODWT method produces less variable estimates than the DWT method. We also observe that the length of the wavelet filter chosen has a greater impact on the estimated values of network diagnostics than the type of wavelet chosen. Furthermore, wavelet length impacts the sensitivity of the method to detect differences between health and disease and tunes classification accuracy. Collectively, our results suggest that the choice of wavelet method and length significantly alters the reliability and sensitivity of these methods in estimating values of network diagnostics drawn from graph theory. They furthermore demonstrate the importance of reporting the choices utilized in neuroimaging studies and support the utility of exploring wavelet parameters to maximize classification accuracy in the development of biomarkers of psychiatric disease and neurological disorders.Comment: working pape

    Non-parametric Bayesian modeling of complex networks

    Full text link
    Modeling structure in complex networks using Bayesian non-parametrics makes it possible to specify flexible model structures and infer the adequate model complexity from the observed data. This paper provides a gentle introduction to non-parametric Bayesian modeling of complex networks: Using an infinite mixture model as running example we go through the steps of deriving the model as an infinite limit of a finite parametric model, inferring the model parameters by Markov chain Monte Carlo, and checking the model's fit and predictive performance. We explain how advanced non-parametric models for complex networks can be derived and point out relevant literature

    Topics in social network analysis and network science

    Full text link
    This chapter introduces statistical methods used in the analysis of social networks and in the rapidly evolving parallel-field of network science. Although several instances of social network analysis in health services research have appeared recently, the majority involve only the most basic methods and thus scratch the surface of what might be accomplished. Cutting-edge methods using relevant examples and illustrations in health services research are provided

    Latent class analysis by regularized spectral clustering

    Full text link
    The latent class model is a powerful tool for identifying latent classes within populations that share common characteristics for categorical data in social, psychological, and behavioral sciences. In this article, we propose two new algorithms to estimate a latent class model for categorical data. Our algorithms are developed by using a newly defined regularized Laplacian matrix calculated from the response matrix. We provide theoretical convergence rates of our algorithms by considering a sparsity parameter and show that our algorithms stably yield consistent latent class analysis under mild conditions. Additionally, we propose a metric to capture the strength of latent class analysis and several procedures designed based on this metric to infer how many latent classes one should use for real-world categorical data. The efficiency and accuracy of our algorithms are verified by extensive simulated experiments, and we further apply our algorithms to real-world categorical data with promising results.Comment: 22 pages, 7 figures, 2 table
    • …
    corecore