637 research outputs found

    Faster unfolding of communities: speeding up the Louvain algorithm

    Many complex networks exhibit a modular structure of densely connected groups of nodes. Usually, such a modular structure is uncovered by the optimization of some quality function. Although flawed, modularity remains one of the most popular quality functions. The Louvain algorithm was originally developed for optimizing modularity, but has been applied to a variety of methods. As such, speeding up the Louvain algorithm enables the analysis of larger graphs in a shorter time for various methods. We here suggest to consider moving nodes to a random neighbor community, instead of the best neighbor community. Although incredibly simple, it reduces the theoretical runtime complexity from $\mathcal{O}(m)$ to $\mathcal{O}(n \log \langle k \rangle)$ in networks with a clear community structure. In benchmark networks, it speeds up the algorithm roughly 2-3 times, while in some real networks it even reaches 10 times faster runtimes. This improvement is due to two factors: (1) a random neighbor is likely to be in a "good" community; and (2) random neighbors are likely to be hubs, helping the convergence. Finally, the performance gain only slightly diminishes the quality, especially for modularity, thus providing a good quality-performance ratio. However, these gains are less pronounced, or even disappear, for some other measures such as significance or surprise.
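The random-neighbor idea in the abstract above can be illustrated with a minimal sketch. It assumes an unweighted, undirected graph without self-loops, stored as a dictionary of adjacency lists; the function names `modularity` and `random_neighbor_move` are hypothetical and not taken from the paper. It shows only the selection rule (try the community of one random neighbor, accept the move if modularity increases) and makes no attempt to reproduce the paper's runtime behavior or its aggregation phase.

```python
import random
from collections import defaultdict


def modularity(adj, comm):
    """Modularity of a partition `comm` (node -> label) of graph `adj`."""
    two_m = sum(len(nbrs) for nbrs in adj.values())
    internal = defaultdict(int)  # each internal edge is counted twice
    degree = defaultdict(int)    # total degree per community
    for u, nbrs in adj.items():
        degree[comm[u]] += len(nbrs)
        internal[comm[u]] += sum(1 for v in nbrs if comm[v] == comm[u])
    return sum(internal[c] / two_m - (degree[c] / two_m) ** 2 for c in degree)


def random_neighbor_move(adj, seed=0, max_sweeps=100):
    """Local moving phase where each node only considers the community
    of one randomly chosen neighbor (illustrative sketch)."""
    rng = random.Random(seed)
    two_m = sum(len(nbrs) for nbrs in adj.values())
    comm = {u: u for u in adj}            # start from singletons
    tot = {u: len(adj[u]) for u in adj}   # total degree per community label
    nodes = list(adj)
    for _ in range(max_sweeps):
        moved = False
        rng.shuffle(nodes)
        for u in nodes:
            if not adj[u]:
                continue  # isolated node: nothing to try
            old, new = comm[u], comm[rng.choice(adj[u])]
            if new == old:
                continue
            k_u = len(adj[u])
            k_old = sum(1 for v in adj[u] if comm[v] == old)
            k_new = sum(1 for v in adj[u] if comm[v] == new)
            # Modularity change of moving u from `old` to `new`, scaled by m.
            gain = (k_new - k_old) - k_u * (tot[new] - (tot[old] - k_u)) / two_m
            if gain > 0:
                comm[u] = new
                tot[old] -= k_u
                tot[new] += k_u
                moved = True
        if not moved:
            break  # heuristic stop: no accepted move in a full sweep
    return comm


# Two triangles joined by a single edge.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
comm = random_neighbor_move(adj, seed=1)
print(comm, modularity(adj, comm))
```

Because only moves with a strictly positive gain are accepted, the returned partition never has lower modularity than the singleton start; the exact partition depends on the random seed.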

    Systematic analysis of agreement between metrics and peer review in the UK REF

    When performing a national research assessment, some countries rely on citation metrics whereas others, such as the UK, primarily use peer review. In the influential Metric Tide report, a low agreement between metrics and peer review in the UK Research Excellence Framework (REF) was found. However, earlier studies observed much higher agreement between metrics and peer review in the REF and argued in favour of using metrics. This shows that there is considerable ambiguity in the discussion on agreement between metrics and peer review. We provide clarity in this discussion by considering four important points: (1) the level of aggregation of the analysis; (2) the use of either a size-dependent or a size-independent perspective; (3) the suitability of different measures of agreement; and (4) the uncertainty in peer review. In the context of the REF, we argue that agreement between metrics and peer review should be assessed at the institutional level rather than at the publication level. Both a size-dependent and a size-independent perspective are relevant in the REF. The interpretation of correlations may be problematic and as an alternative we therefore use measures of agreement that are based on the absolute or relative differences between metrics and peer review. To get an idea of the uncertainty in peer review, we rely on a model to bootstrap peer review outcomes. We conclude that particularly in Physics, Clinical Medicine, and Public Health, metrics agree quite well with peer review and may offer an alternative to peer review.

    From Louvain to Leiden: guaranteeing well-connected communities

    Community detection is often used to understand the structure of large and complex networks. One of the most popular algorithms for uncovering community structure is the so-called Louvain algorithm. We show that this algorithm has a major defect that largely went unnoticed until now: the Louvain algorithm may yield arbitrarily badly connected communities. In the worst case, communities may even be disconnected, especially when running the algorithm iteratively. In our experimental analysis, we observe that up to 25% of the communities are badly connected and up to 16% are disconnected. To address this problem, we introduce the Leiden algorithm. We prove that the Leiden algorithm yields communities that are guaranteed to be connected. In addition, we prove that, when the Leiden algorithm is applied iteratively, it converges to a partition in which all subsets of all communities are locally optimally assigned. Furthermore, by relying on a fast local move approach, the Leiden algorithm runs faster than the Louvain algorithm. We demonstrate the performance of the Leiden algorithm for several benchmark and real-world networks. We find that the Leiden algorithm is faster than the Louvain algorithm and uncovers better partitions, in addition to providing explicit guarantees.
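The defect described in the abstract above (communities that are internally disconnected) can be checked for any given partition. The following is a minimal sketch of such a check, not the Leiden algorithm itself; the function name `disconnected_communities` and the dictionary-based graph representation are assumptions made for illustration.

```python
from collections import defaultdict, deque


def disconnected_communities(adj, comm):
    """Return labels of communities whose induced subgraph is not connected.

    adj:  dict mapping node -> iterable of neighbors (undirected graph)
    comm: dict mapping node -> community label
    """
    members = defaultdict(set)
    for node, label in comm.items():
        members[label].add(node)

    bad = []
    for label, nodes in members.items():
        start = next(iter(nodes))
        seen = {start}
        queue = deque([start])
        # Breadth-first search restricted to nodes of this community.
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in nodes and v not in seen:
                    seen.add(v)
                    queue.append(v)
        if len(seen) < len(nodes):
            bad.append(label)
    return bad


# Path 0 - 1 - 2: nodes 0 and 2 share a label but are only linked through node 1.
path = {0: [1], 1: [0, 2], 2: [1]}
print(disconnected_communities(path, {0: "a", 1: "b", 2: "a"}))  # → ['a']
```

Running a check like this on the output of a community detection method gives the fraction of disconnected communities, the kind of quantity the abstract reports.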

    Significant Scales in Community Structure

    Many complex networks show signs of modular structure, uncovered by community detection. Although many methods succeed in revealing various partitions, it remains difficult to detect at what scale some partition is significant. This problem shows foremost in multi-resolution methods. We here introduce an efficient method for scanning for resolutions in one such method. Additionally, we introduce the notion of "significance" of a partition, based on subgraph probabilities. Significance is independent of the exact method used, so could also be applied in other methods, and can be interpreted as the gain in encoding a graph by making use of a partition. Using significance, we can determine "good" resolution parameters, which we demonstrate on benchmark networks. Moreover, optimizing significance itself also shows excellent performance. We demonstrate our method on voting data from the European Parliament. Our analysis suggests the European Parliament has become increasingly ideologically divided and that nationality plays no role. Comment: To appear in Scientific Reports

    Detecting communities using asymptotical Surprise

    Nodes in real-world networks are repeatedly observed to form dense clusters, often referred to as communities. Methods to detect these groups of nodes usually maximize an objective function, which implicitly contains the definition of a community. We here analyze a recently proposed measure called surprise, which assesses the quality of the partition of a network into communities. In its current form, the formulation of surprise is rather difficult to analyze. We here therefore develop an accurate asymptotic approximation. This allows for the development of an efficient algorithm for optimizing surprise. Incidentally, this leads to a straightforward extension of surprise to weighted graphs. Additionally, the approximation makes it possible to analyze surprise more closely and compare it to other methods, especially modularity. We show that surprise is (nearly) unaffected by the well known resolution limit, a particular problem for modularity. However, surprise may tend to overestimate the number of communities, whereas they may be underestimated by modularity. In short, surprise works well in the limit of many small communities, whereas modularity works better in the limit of few large communities. In this sense, surprise is more discriminative than modularity, and may find communities where modularity fails to discern any structure.
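The abstract above does not state the approximation itself. As a hedged illustration, the sketch below assumes the commonly cited form of asymptotic surprise, S = m · D(q ‖ ⟨q⟩), where q is the observed fraction of edges inside communities, ⟨q⟩ is the fraction of node pairs inside communities, and D is the binary Kullback-Leibler divergence. This form is an assumption brought in for illustration and should be checked against the paper; the function names are hypothetical.

```python
import math
from collections import Counter


def binary_kl(x, y):
    """Binary Kullback-Leibler divergence D(x || y), with 0 * log(0) = 0."""
    total = 0.0
    if x > 0:
        total += x * math.log(x / y)
    if x < 1:
        total += (1 - x) * math.log((1 - x) / (1 - y))
    return total


def asymptotic_surprise(adj, comm):
    """Assumed form S = m * D(q || <q>) for an unweighted, undirected graph.

    adj:  dict mapping node -> list of neighbors (no self-loops)
    comm: dict mapping node -> community label
    """
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    # Each internal edge is seen from both endpoints, hence the division by 2.
    m_int = sum(1 for u in adj for v in adj[u] if comm[u] == comm[v]) // 2
    sizes = Counter(comm.values())
    pairs_int = sum(s * (s - 1) // 2 for s in sizes.values())
    q = m_int / m                             # observed internal edge fraction
    q_exp = pairs_int / (n * (n - 1) // 2)    # expected internal fraction
    return m * binary_kl(q, q_exp)


# Two disjoint triangles, each triangle its own community.
tri = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
labels = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(asymptotic_surprise(tri, labels))
```

In this example all 6 edges are internal (q = 1) while only 6 of the 15 node pairs are internal (⟨q⟩ = 0.4), so the value equals 6 · ln(2.5) under the assumed formula.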

    Early school-leaving in the Netherlands

    The role of student-, family- and school factors for early school-leaving in lower secondary education. Most studies on early school-leaving address only partial causes of why some students leave school early. This study aims to develop a more elaborate model to explain early school-leaving in lower secondary education, taking into account individual, family and school factors at the same time. By using a longitudinal dataset we are able to attribute clear causal relations between the different factors. We distinguish four groups of school-leavers, separating ‘dropouts’ (those without any qualification) from those who left school after attaining a diploma in lower secondary education (‘low qualified’), those who pursued education as an apprentice (‘apprentices’) and the ones who continued education and received a full upper secondary qualification (‘full qualification’). Discerning these four groups shows clear differences in the background of different types of early school-leavers and in the effects of school factors.
    labour market entry and occupational careers