Element-centric clustering comparison unifies overlaps and hierarchy
Clustering is one of the most universal approaches for understanding complex
data. A pivotal aspect of clustering analysis is quantitatively comparing
clusterings; clustering comparison is the basis for many tasks such as
clustering evaluation, consensus clustering, and tracking the temporal
evolution of clusters. In particular, the extrinsic evaluation of clustering
methods requires comparing the uncovered clusterings to planted clusterings or
known metadata. Yet, as we demonstrate, existing clustering comparison measures
have critical biases which undermine their usefulness, and no measure
accommodates both overlapping and hierarchical clusterings. Here we unify the
comparison of disjoint, overlapping, and hierarchically structured clusterings
by proposing a new element-centric framework: elements are compared based on
the relationships induced by the cluster structure, as opposed to the
traditional cluster-centric philosophy. We demonstrate that, in contrast to
standard clustering similarity measures, our framework does not suffer from
critical biases and naturally provides unique insights into how the clusterings
differ. We illustrate the strengths of our framework by revealing new insights
into the organization of clusters in two applications: the improved
classification of schizophrenia based on the overlapping and hierarchical
community structure of fMRI brain networks, and the disentanglement of various
social homophily factors in Facebook social networks. The universality of
clustering suggests far-reaching impact of our framework throughout all areas
of science
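
To make the element-centric idea concrete, the following minimal sketch scores each element by how well its co-clustered neighbours agree across two clusterings. It is not the measure proposed in the abstract above: a plain Jaccard overlap of co-member sets is swapped in as a simplified stand-in, and the names `co_members` and `element_scores` are hypothetical. Clusterings are assumed to be lists of sets, so overlapping clusters are allowed.

```python
def co_members(clustering):
    """Map each element to the set of elements sharing at least one cluster with it."""
    neighbors = {}
    for cluster in clustering:
        for element in cluster:
            neighbors.setdefault(element, set()).update(cluster)
    return neighbors


def element_scores(clustering_a, clustering_b):
    """Per-element Jaccard overlap of co-member sets (a simplified stand-in,
    not the similarity measure defined in the paper)."""
    na = co_members(clustering_a)
    nb = co_members(clustering_b)
    scores = {}
    # Only elements present in both clusterings can be compared.
    for element in na.keys() & nb.keys():
        shared = na[element] & nb[element]
        combined = na[element] | nb[element]
        scores[element] = len(shared) / len(combined)
    return scores


a = [{1, 2, 3}, {4, 5}]
b = [{1, 2}, {3, 4, 5}]
scores = element_scores(a, b)
# Element 3 changes neighbourhood entirely, so it receives the lowest score.
worst = min(scores, key=scores.get)
```

Because the scores are attached to elements rather than clusters, they can be inspected individually to see where two clusterings disagree, which is the kind of insight the abstract attributes to the element-centric view.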
Choosing Wavelet Methods, Filters, and Lengths for Functional Brain Network Construction
Wavelet methods are widely used to decompose fMRI, EEG, or MEG signals into
time series representing neurophysiological activity in fixed frequency bands.
Using these time series, one can estimate frequency-band specific functional
connectivity between sensors or regions of interest, and thereby construct
functional brain networks that can be examined from a graph theoretic
perspective. Despite their common use, however, practical guidelines for the
choice of wavelet method, filter, and length have remained largely
undelineated. Here, we explicitly explore the effects of wavelet method (MODWT
vs. DWT), wavelet filter (Daubechies Extremal Phase, Daubechies Least
Asymmetric, and Coiflet families), and wavelet length (2 to 24), each an
essential parameter in wavelet-based methods, on the estimated values of
network diagnostics and on their sensitivity to alterations in psychiatric
disease. We observe that the MODWT method produces less variable estimates than
the DWT method. We also observe that the length of the wavelet filter chosen
has a greater impact on the estimated values of network diagnostics than the
type of wavelet chosen. Furthermore, wavelet length impacts the sensitivity of
the method to detect differences between health and disease and tunes
classification accuracy. Collectively, our results suggest that the choice of
wavelet method and length significantly alters the reliability and sensitivity
of these methods in estimating values of network diagnostics drawn from graph
theory. They furthermore demonstrate the importance of reporting the choices
utilized in neuroimaging studies and support the utility of exploring wavelet
parameters to maximize classification accuracy in the development of biomarkers
of psychiatric disease and neurological disorders. Comment: working pape
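
As a rough illustration of the DWT versus MODWT distinction discussed in the abstract above, the sketch below computes a single decomposition level with the Haar filter (the length-2 Daubechies extremal phase filter). It assumes periodic boundary handling and one common sign convention; the function names are hypothetical, and this is not the pipeline used in the study.

```python
import math


def haar_dwt_level1(x):
    """One DWT level: filter, then downsample by 2 (needs an even-length signal)."""
    if len(x) % 2:
        raise ValueError("DWT sketch requires an even-length signal")
    s = math.sqrt(2.0)
    half = len(x) // 2
    wavelet = [(x[2 * k + 1] - x[2 * k]) / s for k in range(half)]
    scaling = [(x[2 * k + 1] + x[2 * k]) / s for k in range(half)]
    return wavelet, scaling


def haar_modwt_level1(x):
    """One MODWT level: filters rescaled by 1/sqrt(2), no downsampling,
    circular boundary (x[-1] wraps to the last sample when t == 0)."""
    n = len(x)
    wavelet = [(x[t] - x[t - 1]) / 2.0 for t in range(n)]
    scaling = [(x[t] + x[t - 1]) / 2.0 for t in range(n)]
    return wavelet, scaling


def energy(values):
    """Sum of squared values."""
    return sum(v * v for v in values)


signal = [2.0, 4.0, 6.0, 8.0, 1.0, 3.0]
dwt_w, dwt_v = haar_dwt_level1(signal)
modwt_w, modwt_v = haar_modwt_level1(signal)
```

The DWT halves the number of coefficients at each level, while the MODWT keeps one coefficient per time point; both preserve the signal energy in this setting. The redundancy of the MODWT is one plausible reason for the less variable estimates reported in the abstract, though the abstract itself does not state the cause.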
Non-parametric Bayesian modeling of complex networks
Modeling structure in complex networks using Bayesian non-parametrics makes
it possible to specify flexible model structures and infer the adequate model
complexity from the observed data. This paper provides a gentle introduction to
non-parametric Bayesian modeling of complex networks: using an infinite mixture
model as a running example, we go through the steps of deriving the model as an
infinite limit of a finite parametric model, inferring the model parameters by
Markov chain Monte Carlo, and checking the model's fit and predictive
performance. We explain how advanced non-parametric models for complex networks
can be derived and point out relevant literature
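
The infinite limit mentioned in the abstract above can be sketched through the Chinese restaurant process, the partition prior that commonly arises when the number of components in a finite mixture with a symmetric Dirichlet prior goes to infinity. The abstract does not name this process, so treating it as the relevant prior is an assumption, and `sample_crp_partition` and `concentration` are hypothetical names.

```python
import random


def sample_crp_partition(n_items, concentration, rng):
    """Sequentially assign items to clusters under a Chinese restaurant process.

    An item joins existing cluster k with probability proportional to its
    current size, or opens a new cluster with probability proportional to
    `concentration` (which must be positive).
    """
    sizes = []   # sizes[k] = number of items currently in cluster k
    labels = []  # labels[i] = cluster index assigned to item i
    for _ in range(n_items):
        weights = sizes + [concentration]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)  # open a new cluster
        else:
            sizes[k] += 1
        labels.append(k)
    return labels, sizes


labels, sizes = sample_crp_partition(50, 1.0, random.Random(0))
```

The number of clusters is not fixed in advance and grows with the data, which is the sense in which model complexity is inferred rather than specified. In a full model this prior would be combined with a likelihood for the network and sampled by Markov chain Monte Carlo, which this sketch does not attempt.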
Topics in social network analysis and network science
This chapter introduces statistical methods used in the analysis of social
networks and in the rapidly evolving parallel field of network science.
Although several instances of social network analysis in health services
research have appeared recently, the majority involve only the most basic
methods and thus scratch the surface of what might be accomplished.
Cutting-edge methods using relevant examples and illustrations in health
services research are provided
Latent class analysis by regularized spectral clustering
The latent class model is a powerful tool for identifying latent classes
within populations that share common characteristics for categorical data in
social, psychological, and behavioral sciences. In this article, we propose two
new algorithms to estimate a latent class model for categorical data. Our
algorithms are developed by using a newly defined regularized Laplacian matrix
calculated from the response matrix. We provide theoretical convergence rates
of our algorithms by considering a sparsity parameter and show that our
algorithms stably yield consistent latent class analysis under mild conditions.
Additionally, we propose a metric to capture the strength of latent class
analysis and several procedures designed based on this metric to infer how many
latent classes one should use for real-world categorical data. The efficiency
and accuracy of our algorithms are verified by extensive simulated experiments,
and we further apply our algorithms to real-world categorical data with
promising results. Comment: 22 pages, 7 figures, 2 table