    Hierarchies of Predominantly Connected Communities

    We consider communities whose vertices are predominantly connected, i.e., the vertices in each community are more strongly connected to other members of the same community than to vertices outside the community. Flake et al. introduced a hierarchical clustering algorithm that finds such predominantly connected communities of different coarseness depending on an input parameter. We present a simple and efficient method for constructing a clustering hierarchy according to Flake et al. that supersedes the necessity of choosing feasible parameter values and guarantees the completeness of the resulting hierarchy, i.e., the hierarchy contains all clusterings that can be constructed by the original algorithm for any parameter value. However, predominantly connected communities are not organized in a single hierarchy. Thus, we develop a framework that, after precomputing at most 2(n−1) maximum flows, admits a linear time construction of a clustering C(S) of predominantly connected communities that contains a given community S and is maximum in the sense that any further clustering of predominantly connected communities that also contains S is hierarchically nested in C(S). We further generalize this construction yielding a clustering with similar properties for k given communities in O(kn) time. This admits the analysis of a network's structure with respect to various communities in different hierarchies.
    Comment: to appear (WADS 2013

    Labor Market Effects of Immigration – Evidence from Neighborhood Data

    This paper combines individual-level data from the German Socio-Economic Panel (SOEP) with economic and demographic postcode-level data from administrative records to analyze the effects of immigration on wages and unemployment probabilities of high- and low-skilled natives. Employing an instrumental variable strategy and utilizing the variation in the population share of foreigners across regions and time, we find no support for the hypothesis of adverse labor market effects of immigration.
    Keywords: International migration; effects of immigration

    Push-Pull Block Puzzles are Hard

    This paper proves that push-pull block puzzles in 3D are PSPACE-complete to solve, and push-pull block puzzles in 2D with thin walls are NP-hard to solve, settling an open question by Zubaran and Ritt. Push-pull block puzzles are a type of recreational motion planning problem, similar to Sokoban, that involve moving a 'robot' on a square grid with 1×1 obstacles. The obstacles cannot be traversed by the robot, but some can be pushed and pulled by the robot into adjacent squares. Thin walls prevent movement between two adjacent squares. This work follows in a long line of algorithms and complexity work on similar problems. The 2D push-pull block puzzle shows up in the video games Pukoban as well as The Legend of Zelda: A Link to the Past, giving another proof of hardness for the latter. This variant of block-pushing puzzles is of particular interest because of its connections to reversibility, since any action (e.g., push or pull) can be inverted by another valid action (e.g., pull or push).
    Comment: Full version of CIAC 2017 paper. 17 page

    Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities

    Many complex networks display a mesoscopic structure with groups of nodes sharing many links with the other nodes in their group and comparatively few with nodes of different groups. This feature is known as community structure and encodes precious information about the organization and the function of the nodes. Many algorithms have been proposed but it is not yet clear how they should be tested. Recently we have proposed a general class of undirected and unweighted benchmark graphs, with heterogeneous distributions of node degree and community size. Increasing attention has recently been devoted to developing algorithms able to consider the direction and the weight of the links, which require suitable benchmark graphs for testing. In this paper we extend the basic ideas behind our previous benchmark to generate directed and weighted networks with built-in community structure. We also consider the possibility that nodes belong to more than one community, a feature occurring in real systems such as social networks. As a practical application, we show how modularity optimization performs on our new benchmark.
    Comment: 9 pages, 13 figures. Final version published in Physical Review E. The code to create the benchmark graphs can be freely downloaded from http://santo.fortunato.googlepages.com/inthepress

    Finding local community structure in networks

    Although the inference of global community structure in networks has recently become a topic of great interest in the physics community, all such algorithms require that the graph be completely known. Here, we define both a measure of local community structure and an algorithm that infers the hierarchy of communities that enclose a given vertex by exploring the graph one vertex at a time. This algorithm runs in time O(d*k^2) for general graphs, where d is the mean degree and k is the number of vertices to be explored. For graphs where exploring a new vertex is time-consuming, the running time is linear, O(k). We show that on computer-generated graphs this technique compares favorably to algorithms that require global knowledge. We also use this algorithm to extract meaningful local clustering information in the large recommender network of an online retailer and show the existence of mesoscopic structure.
    Comment: 7 pages, 6 figure

    Finding community structure in very large networks

    The discovery and analysis of community structure in networks is a topic of considerable recent interest within the physics community, but most methods proposed so far are unsuitable for very large networks because of their computational cost. Here we present a hierarchical agglomeration algorithm for detecting community structure which is faster than many competing algorithms: its running time on a network with n vertices and m edges is O(m d log n), where d is the depth of the dendrogram describing the community structure. Many real-world networks are sparse and hierarchical, with m ~ n and d ~ log n, in which case our algorithm runs in essentially linear time, O(n log^2 n). As an example of the application of this algorithm we use it to analyze a network of items for sale on the website of a large online retailer, items in the network being linked if they are frequently purchased by the same buyer. The network has more than 400,000 vertices and 2 million edges. We show that our algorithm can extract meaningful communities from this network, revealing large-scale patterns present in the purchasing habits of customers.
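The "essentially linear" bound quoted in this abstract follows by direct substitution. As a short worked step, assuming m = Θ(n) and d = Θ(log n), which is how the abstract characterizes sparse, hierarchical networks:

```latex
\[
O(m\, d \log n) \;=\; O\bigl(n \cdot \log n \cdot \log n\bigr) \;=\; O(n \log^2 n).
\]
```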

    Identifying network communities with a high resolution

    Community structure is an important property of complex networks. An automatic discovery of such structure is a fundamental task in many disciplines, including sociology, biology, engineering, and computer science. Recently, several community discovery algorithms have been proposed based on the optimization of a quantity called modularity (Q). However, the problem of modularity optimization is NP-hard, and the existing approaches often suffer from prohibitively long running time or poor quality. Furthermore, it has been recently pointed out that algorithms based on optimizing Q will have a resolution limit, i.e., communities below a certain scale may not be detected. In this research, we first propose an efficient heuristic algorithm, Qcut, which combines spectral graph partitioning and local search to optimize Q. Using both synthetic and real networks, we show that Qcut can find higher modularities and is more scalable than the existing algorithms. Furthermore, using Qcut as an essential component, we propose a recursive algorithm, HQcut, to solve the resolution limit problem. We show that HQcut can successfully detect communities at a much finer scale and with a higher accuracy than the existing algorithms. Finally, we apply Qcut and HQcut to study a protein-protein interaction network, and show that the combination of the two algorithms can reveal interesting biological results that may be otherwise undetectable.
    Comment: 14 pages, 5 figures. 1 supplemental file at http://cic.cs.wustl.edu/qcut/supplemental.pd
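For readers unfamiliar with the quantity Q that several of these abstracts optimize, the following is a minimal sketch of the standard Newman-Girvan modularity for an undirected, unweighted graph, Q = Σ_c [ l_c/m − (d_c/2m)² ], where l_c is the number of edges inside community c, d_c the total degree of its vertices, and m the number of edges. It only evaluates Q for a given partition; it is not Qcut or HQcut, and the function name `modularity` and its input layout are illustrative assumptions.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity Q of a partition (illustrative helper).

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every vertex to a community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")

    internal = defaultdict(int)    # l_c: edges with both endpoints in c
    degree_sum = defaultdict(int)  # d_c: total degree of vertices in c

    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1

    # Q = sum over communities of (l_c / m) - (d_c / 2m)^2
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {v: "all" for v in range(6)}

q_split = modularity(edges, two_groups)   # 5/14, about 0.357
q_merged = modularity(edges, one_group)   # 0 for the trivial partition
```

Splitting the graph at the bridge scores higher than lumping everything together, which is the behaviour a modularity optimizer exploits.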

    Individual 'trace' in knowledge space: a novel design approach for human-systems interaction

    Data mining design is an approach through which the search and retrieval of data can be operationally improved. This study explores optimisation processes, including data harvesting, analytics and visualisation, and covers a wide range of efforts, including identifying the growing need for 'making sense' of data, which requires contextual understanding. In both cyberspace and physical-world experiences, challenges and linkages between cyber-physical knowledge spaces are emerging alongside excessive amounts of raw data. To improve user-interface design through better visualisation infographics, this study proposes a novel mapping approach called 'Trace' in the Knowledge Space, enabling design opportunities that help articulate unique human-system interaction and that offer potential for re-imagining and re-structuring uses of interaction and user experience. These are experienced through the design, use and context of languages, enabling the building of new interactive apparatus, algorithms and dynamics in collective intelligence.

    Coexistence of opposite opinions in a network with communities

    The Majority Rule is applied to a topology that consists of two coupled random networks, thereby mimicking the modular structure observed in social networks. We calculate analytically the asymptotic behaviour of the model and derive a phase diagram that depends on the frequency of random opinion flips and on the inter-connectivity between the two communities. It is shown that three regimes may take place: a disordered regime, where no collective phenomenon takes place; a symmetric regime, where the nodes in both communities reach the same average opinion; and an asymmetric regime, where the nodes in each community reach an opposite average opinion. The transition from the asymmetric regime to the symmetric regime is shown to be discontinuous.
    Comment: 14 pages, 4 figure

    Benchmark graphs for testing community detection algorithms

    Community structure is one of the most important features of real networks and reveals the internal organization of the nodes. Many algorithms have been proposed but the crucial issue of testing, i.e. the question of how good an algorithm is with respect to others, is still open. Standard tests include the analysis of simple artificial graphs with a built-in community structure that the algorithm has to recover. However, the special graphs adopted in actual tests have a structure that does not reflect the real properties of nodes and communities found in real networks. Here we introduce a new class of benchmark graphs that account for the heterogeneity in the distributions of node degrees and of community sizes. We use this new benchmark to test two popular methods of community detection, modularity optimization and Potts model clustering. The results show that the new benchmark poses a much more severe test to algorithms than standard benchmarks, revealing limits that may not be apparent at a first analysis.
    Comment: 6 pages, 8 figures. Extended version published in Physical Review E. The code to build the new benchmark graphs can be downloaded from http://santo.fortunato.googlepages.com/inthepress