
    Community detection in networks: Structural communities versus ground truth

    Algorithms to find communities in networks rely just on structural information and search for cohesive subsets of nodes. On the other hand, most scholars implicitly or explicitly assume that structural communities represent groups of nodes with similar (non-topological) properties or functions. This hypothesis could not be verified, so far, because of the lack of network datasets with information on the classification of the nodes. We show that traditional community detection methods fail to find the metadata groups in many large networks. Our results show that there is a marked separation between structural communities and metadata groups, in line with recent findings. That means that either our current modeling of community structure has to be substantially modified, or that metadata groups may not be recoverable from topology alone.
    Comment: 21 pages, 19 figures

    Benchmark model to assess community structure in evolving networks

    Detecting the time evolution of the community structure of networks is crucial to identify major changes in the internal organization of many complex systems, which may undergo important endogenous or exogenous events. This analysis can be done in two ways: considering each snapshot as an independent community detection problem or taking into account the whole evolution of the network. In the first case, one can apply static methods on the temporal snapshots, which correspond to configurations of the system in short time windows, and match afterwards the communities across layers. Alternatively, one can develop dedicated dynamic procedures, so that multiple snapshots are simultaneously taken into account while detecting communities, which allows us to keep memory of the flow. To check how well a method of any kind could capture the evolution of communities, suitable benchmarks are needed. Here we propose a model for generating simple dynamic benchmark graphs, based on stochastic block models. In them, the time evolution consists of a periodic oscillation of the system's structure between configurations with built-in community structure. We also propose the extension of quality comparison indices to the dynamic scenario.
    Comment: 11 pages, 7 figures, 3 tables

    Inference of hidden structures in complex physical systems by multi-scale clustering

    We survey the application of a relatively new branch of statistical physics, "community detection", to data mining. In particular, we focus on the diagnosis of materials and automated image segmentation. Community detection describes the quest of partitioning a complex system involving many elements into optimally decoupled subsets or communities of such elements. We review a multiresolution variant which is used to ascertain structures at different spatial and temporal scales. Significant patterns are obtained by examining the correlations between different independent solvers. Similar to other combinatorial optimization problems in the NP complexity class, community detection exhibits several phases. Typically, illuminating orders are revealed by choosing parameters that lead to extremal information theory correlations.
    Comment: 25 pages, 16 figures; a review of earlier work

    Trans-ancestry genome-wide association meta-analysis of prostate cancer identifies new susceptibility loci and informs genetic risk prediction.

    Prostate cancer is a highly heritable disease with large disparities in incidence rates across ancestry populations. We conducted a multiancestry meta-analysis of prostate cancer genome-wide association studies (107,247 cases and 127,006 controls) and identified 86 new genetic risk variants independently associated with prostate cancer risk, bringing the total to 269 known risk variants. The top genetic risk score (GRS) decile was associated with odds ratios that ranged from 5.06 (95% confidence interval (CI), 4.84-5.29) for men of European ancestry to 3.74 (95% CI, 3.36-4.17) for men of African ancestry. Men of African ancestry were estimated to have a mean GRS that was 2.18-times higher (95% CI, 2.14-2.22), and men of East Asian ancestry 0.73-times lower (95% CI, 0.71-0.76), than men of European ancestry. These findings support the role of germline variation contributing to population differences in prostate cancer risk, with the GRS offering an approach for personalized risk prediction.

    Characterizing prostate cancer risk through multi-ancestry genome-wide discovery of 187 novel risk variants.

    The transferability and clinical value of genetic risk scores (GRSs) across populations remain limited due to an imbalance in genetic studies across ancestrally diverse populations. Here we conducted a multi-ancestry genome-wide association study of 156,319 prostate cancer cases and 788,443 controls of European, African, Asian and Hispanic men, reflecting a 57% increase in the number of non-European cases over previous prostate cancer genome-wide association studies. We identified 187 novel risk variants for prostate cancer, increasing the total number of risk variants to 451. An externally replicated multi-ancestry GRS was associated with risk that ranged from 1.8 (per standard deviation) in African ancestry men to 2.2 in European ancestry men. The GRS was associated with a greater risk of aggressive versus non-aggressive disease in men of African ancestry (P = 0.03). Our study presents novel prostate cancer susceptibility loci and a GRS with effective risk stratification across ancestry groups.