Community detection in networks: Structural communities versus ground truth
Algorithms to find communities in networks rely just on structural
information and search for cohesive subsets of nodes. On the other hand, most
scholars implicitly or explicitly assume that structural communities represent
groups of nodes with similar (non-topological) properties or functions. This
hypothesis could not be verified, so far, because of the lack of network
datasets with information on the classification of the nodes. We show that
traditional community detection methods fail to find the metadata groups in
many large networks. Our results show that there is a marked separation between
structural communities and metadata groups, in line with recent findings. That
means that either our current modeling of community structure has to be
substantially modified, or that metadata groups may not be recoverable from
topology alone.
Comment: 21 pages, 19 figures
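As an illustration of how a detected partition can be scored against metadata groups, the sketch below computes normalized mutual information (NMI) between two labelings of the same nodes. NMI is one common similarity measure for partitions; the abstract does not say which measure the authors used, so treat this choice, and the function name `normalized_mutual_information`, as assumptions.

```python
from collections import Counter
from math import log


def normalized_mutual_information(labels_a, labels_b):
    """Return NMI = 2 * I(A; B) / (H(A) + H(B)) for two labelings of the same nodes.

    labels_a[i] and labels_b[i] are the group labels of node i, for example a
    detected community and a metadata class. This is an illustrative helper,
    not code from the paper.
    """
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("labelings must be non-empty and of equal length")

    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))

    def entropy(counts):
        # Shannon entropy of the group-size distribution.
        return -sum((c / n) * log(c / n) for c in counts.values())

    h_a = entropy(count_a)
    h_b = entropy(count_b)

    # Mutual information from the joint and marginal group counts.
    mutual = sum(
        (c / n) * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in count_ab.items()
    )

    if h_a + h_b == 0:
        # Both labelings put every node in a single group, so they agree trivially.
        return 1.0
    return 2 * mutual / (h_a + h_b)
```

A value near 1 means the structural communities reproduce the metadata groups up to relabeling, while a value near 0 corresponds to the kind of separation the abstract reports.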
Benchmark model to assess community structure in evolving networks
Detecting the time evolution of the community structure of networks is
crucial to identify major changes in the internal organization of many complex
systems, which may undergo important endogenous or exogenous events. This
analysis can be done in two ways: considering each snapshot as an independent
community detection problem or taking into account the whole evolution of the
network. In the first case, one can apply static methods on the temporal
snapshots, which correspond to configurations of the system in short time
windows, and afterwards match the communities across layers. Alternatively, one
can develop dedicated dynamic procedures, so that multiple snapshots are
simultaneously taken into account while detecting communities, which allows us
to keep memory of the flow. To check how well a method of any kind could
capture the evolution of communities, suitable benchmarks are needed. Here we
propose a model for generating simple dynamic benchmark graphs, based on
stochastic block models. In them, the time evolution consists of a periodic
oscillation of the system's structure between configurations with built-in
community structure. We also propose the extension of quality comparison
indices to the dynamic scenario.
Comment: 11 pages, 7 figures, 3 tables
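A minimal sketch of the idea, assuming a two-block stochastic block model whose cross-block edge probability follows a triangular wave between `p_out` (two separate communities) and `p_in` (one merged community). The abstract does not give the exact processes or parameters of the proposed benchmark, so the wave shape, the function name and all parameter names here are hypothetical.

```python
import random


def oscillating_sbm_snapshot(n, t, period, p_in, p_out, seed=None):
    """Sample one snapshot of a two-block SBM with periodically varying structure.

    The cross-block edge probability rises linearly from p_out at t = 0 to p_in
    at t = period / 2, then falls back to p_out at t = period. Illustrative
    assumption only, not the benchmark definition from the paper.
    """
    rng = random.Random(seed)

    # Triangular wave x in [0, 1] with the given period.
    phase = (t % period) / period
    x = 2 * phase if phase < 0.5 else 2 * (1 - phase)
    p_cross = p_out + x * (p_in - p_out)

    # Planted partition: first half of the nodes in block 0, the rest in block 1.
    block = [0 if i < n // 2 else 1 for i in range(n)]

    # Draw each undirected edge independently with its block-dependent probability.
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if block[i] == block[j] else p_cross
            if rng.random() < p:
                edges.append((i, j))
    return block, edges, p_cross


# Usage: one snapshot per time step over a full period.
snapshots = [
    oscillating_sbm_snapshot(40, t, period=10, p_in=0.5, p_out=0.05, seed=t)
    for t in range(10)
]
```

Each snapshot carries its planted block labels, so either a static method applied per snapshot or a dynamic method applied to the whole sequence can be compared against the built-in structure at every time step.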
Inference of hidden structures in complex physical systems by multi-scale clustering
We survey the application of a relatively new branch of statistical
physics, "community detection", to data mining. In particular, we focus on the
diagnosis of materials and automated image segmentation. Community detection
describes the quest of partitioning a complex system involving many elements
into optimally decoupled subsets or communities of such elements. We review a
multiresolution variant which is used to ascertain structures at different
spatial and temporal scales. Significant patterns are obtained by examining the
correlations between different independent solvers. Similar to other
combinatorial optimization problems in the NP complexity class, community
detection exhibits several phases. Typically, illuminating orders are revealed
by choosing parameters that lead to extremal information theory correlations.
Comment: 25 pages, 16 Figures; a review of earlier work
Trans-ancestry genome-wide association meta-analysis of prostate cancer identifies new susceptibility loci and informs genetic risk prediction.
Prostate cancer is a highly heritable disease with large disparities in incidence rates across ancestry populations. We conducted a multiancestry meta-analysis of prostate cancer genome-wide association studies (107,247 cases and 127,006 controls) and identified 86 new genetic risk variants independently associated with prostate cancer risk, bringing the total to 269 known risk variants. The top genetic risk score (GRS) decile was associated with odds ratios that ranged from 5.06 (95% confidence interval (CI), 4.84-5.29) for men of European ancestry to 3.74 (95% CI, 3.36-4.17) for men of African ancestry. Men of African ancestry were estimated to have a mean GRS that was 2.18-times higher (95% CI, 2.14-2.22), and men of East Asian ancestry 0.73-times lower (95% CI, 0.71-0.76), than men of European ancestry. These findings support the role of germline variation contributing to population differences in prostate cancer risk, with the GRS offering an approach for personalized risk prediction.
Characterizing prostate cancer risk through multi-ancestry genome-wide discovery of 187 novel risk variants.
The transferability and clinical value of genetic risk scores (GRSs) across populations remain limited due to an imbalance in genetic studies across ancestrally diverse populations. Here we conducted a multi-ancestry genome-wide association study of 156,319 prostate cancer cases and 788,443 controls of European, African, Asian and Hispanic men, reflecting a 57% increase in the number of non-European cases over previous prostate cancer genome-wide association studies. We identified 187 novel risk variants for prostate cancer, increasing the total number of risk variants to 451. An externally replicated multi-ancestry GRS was associated with risk that ranged from 1.8 (per standard deviation) in African ancestry men to 2.2 in European ancestry men. The GRS was associated with a greater risk of aggressive versus non-aggressive disease in men of African ancestry (P = 0.03). Our study presents novel prostate cancer susceptibility loci and a GRS with effective risk stratification across ancestry groups.