Hierarchies of Predominantly Connected Communities
We consider communities whose vertices are predominantly connected, i.e., the
vertices in each community are more strongly connected to other members of
the same community than to vertices outside the community. Flake et al.
introduced a hierarchical clustering algorithm that finds such predominantly
connected communities of different coarseness depending on an input parameter.
We present a simple and efficient method for constructing a clustering
hierarchy according to Flake et al. that removes the need to choose
feasible parameter values and guarantees the completeness of the resulting
hierarchy, i.e., the hierarchy contains all clusterings that can be constructed
by the original algorithm for any parameter value. However, predominantly
connected communities are not organized in a single hierarchy. Thus, we develop
a framework that, after precomputing at most maximum flows, admits a
linear time construction of a clustering \C(S) of predominantly connected
communities that contains a given community S and is maximum in the sense
that any further clustering of predominantly connected communities that also
contains S is hierarchically nested in \C(S). We further generalize this
construction yielding a clustering with similar properties for given
communities in time. This admits the analysis of a network's structure
with respect to various communities in different hierarchies.
Comment: to appear (WADS 2013
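As an illustration of the definition in this abstract, the following minimal sketch checks whether a vertex set is predominantly connected in a weighted undirected graph. The adjacency-dict representation, the function name `is_predominantly_connected`, and the strict inequality are assumptions made for this example, not details taken from the paper.

```python
def is_predominantly_connected(adj, community):
    """Return True if every vertex in `community` has strictly more edge
    weight to other members than to vertices outside the community.

    `adj` maps each vertex to a dict {neighbor: weight} (hypothetical format).
    """
    members = set(community)
    for v in members:
        inside = sum(w for u, w in adj[v].items() if u in members)
        outside = sum(w for u, w in adj[v].items() if u not in members)
        if inside <= outside:
            return False
    return True


# Two triangles joined by a single edge (c - d).
adj = {
    "a": {"b": 1, "c": 1},
    "b": {"a": 1, "c": 1},
    "c": {"a": 1, "b": 1, "d": 1},
    "d": {"c": 1, "e": 1, "f": 1},
    "e": {"d": 1, "f": 1},
    "f": {"d": 1, "e": 1},
}
print(is_predominantly_connected(adj, {"a", "b", "c"}))  # True
print(is_predominantly_connected(adj, {"c", "d"}))       # False
```

Each triangle passes the check, while the pair {c, d} fails because both endpoints of the bridge have more weight leaving the pair than staying inside it.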
Labor Market Effects of Immigration – Evidence from Neighborhood Data
This paper combines individual-level data from the German Socio-Economic Panel (SOEP) with economic and demographic postcode-level data from administrative records to analyze the effects of immigration on wages and unemployment probabilities of high- and low-skilled natives. Employing an instrumental variable strategy and utilizing the variation in the population share of foreigners across regions and time, we find no support for the hypothesis of adverse labor market effects of immigration.
International migration; effects of immigration
Push-Pull Block Puzzles are Hard
This paper proves that push-pull block puzzles in 3D are PSPACE-complete to
solve, and push-pull block puzzles in 2D with thin walls are NP-hard to solve,
settling an open question by Zubaran and Ritt. Push-pull block puzzles are a
type of recreational motion planning problem, similar to Sokoban, that involve
moving a 'robot' on a square grid with obstacles. The obstacles
cannot be traversed by the robot, but some can be pushed and pulled by the
robot into adjacent squares. Thin walls prevent movement between two adjacent
squares. This work follows in a long line of algorithms and complexity work on
similar problems. The 2D push-pull block puzzle shows up in the video games
Pukoban as well as The Legend of Zelda: A Link to the Past, giving another
proof of hardness for the latter. This variant of block-pushing puzzles is of
particular interest because of its connections to reversibility, since any
action (e.g., push or pull) can be inverted by another valid action (e.g., pull
or push).
Comment: Full version of CIAC 2017 paper. 17 pages
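The reversibility property mentioned in this abstract can be made concrete with a small sketch of the move rules. The `State` type, the coordinate convention, the unbounded grid, and the treatment of fixed obstacles as a set of blocked cells are all assumptions for this example; thin walls between adjacent squares are not modelled.

```python
from typing import FrozenSet, NamedTuple, Optional, Tuple

Cell = Tuple[int, int]


class State(NamedTuple):
    robot: Cell
    blocks: FrozenSet[Cell]  # positions of movable blocks


def _add(a: Cell, b: Cell) -> Cell:
    return (a[0] + b[0], a[1] + b[1])


def push(state: State, d: Cell, walls: FrozenSet[Cell]) -> Optional[State]:
    """Robot steps in direction d, pushing the block directly ahead of it."""
    ahead = _add(state.robot, d)
    beyond = _add(ahead, d)
    if ahead not in state.blocks or beyond in state.blocks or beyond in walls:
        return None  # nothing to push, or the block has nowhere to go
    return State(ahead, (state.blocks - {ahead}) | {beyond})


def pull(state: State, d: Cell, walls: FrozenSet[Cell]) -> Optional[State]:
    """Robot steps in direction d, dragging the block directly behind it."""
    ahead = _add(state.robot, d)
    behind = _add(state.robot, (-d[0], -d[1]))
    if behind not in state.blocks or ahead in state.blocks or ahead in walls:
        return None  # nothing to pull, or the robot cannot step forward
    return State(ahead, (state.blocks - {behind}) | {state.robot})


EAST, WEST = (1, 0), (-1, 0)
start = State(robot=(0, 0), blocks=frozenset({(1, 0)}))
pushed = push(start, EAST, frozenset())
restored = pull(pushed, WEST, frozenset())
print(restored == start)  # True
```

Pushing a block east and then pulling it west returns the puzzle to its starting configuration, which is the sense in which every action has a valid inverse.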
Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities
Many complex networks display a mesoscopic structure with groups of nodes
sharing many links with the other nodes in their group and comparatively few
with nodes of different groups. This feature is known as community structure
and encodes precious information about the organization and the function of the
nodes. Many algorithms have been proposed but it is not yet clear how they
should be tested. Recently we have proposed a general class of undirected and
unweighted benchmark graphs, with heterogeneous distributions of node degree and
community size. Increasing attention has recently been devoted to developing
algorithms able to consider the direction and the weight of the links, which
require suitable benchmark graphs for testing. In this paper we extend the
basic ideas behind our previous benchmark to generate directed and weighted
networks with built-in community structure. We also consider the possibility
that nodes belong to more than one community, a feature occurring in real systems
such as social networks. As a practical application, we show how
modularity optimization performs on our new benchmark.
Comment: 9 pages, 13 figures. Final version published in Physical Review E.
The code to create the benchmark graphs can be freely downloaded from
http://santo.fortunato.googlepages.com/inthepress
Finding local community structure in networks
Although the inference of global community structure in networks has recently
become a topic of great interest in the physics community, all such algorithms
require that the graph be completely known. Here, we define both a measure of
local community structure and an algorithm that infers the hierarchy of
communities that enclose a given vertex by exploring the graph one vertex at a
time. This algorithm runs in time O(d*k^2) for general graphs, where d is the
mean degree and k is the number of vertices to be explored. For graphs where
exploring a new vertex is time-consuming, the running time is linear, O(k). We
show that on computer-generated graphs this technique compares favorably to
algorithms that require global knowledge. We also use this algorithm to extract
meaningful local clustering information in the large recommender network of an
online retailer and show the existence of mesoscopic structure.
Comment: 7 pages, 6 figures
Finding community structure in very large networks
The discovery and analysis of community structure in networks is a topic of
considerable recent interest within the physics community, but most methods
proposed so far are unsuitable for very large networks because of their
computational cost. Here we present a hierarchical agglomeration algorithm for
detecting community structure which is faster than many competing algorithms:
its running time on a network with n vertices and m edges is O(m d log n) where
d is the depth of the dendrogram describing the community structure. Many
real-world networks are sparse and hierarchical, with m ~ n and d ~ log n, in
which case our algorithm runs in essentially linear time, O(n log^2 n). As an
example of the application of this algorithm we use it to analyze a network of
items for sale on the web-site of a large online retailer, items in the network
being linked if they are frequently purchased by the same buyer. The network
has more than 400,000 vertices and 2 million edges. We show that our algorithm
can extract meaningful communities from this network, revealing large-scale
patterns present in the purchasing habits of customers.
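Substituting the sparse, hierarchical regime stated in this abstract into the general bound gives the quoted running time:

```latex
\[
  O(m\,d\,\log n) \;=\; O(n \cdot \log n \cdot \log n) \;=\; O(n \log^2 n)
  \qquad \mbox{when } m \sim n \mbox{ and } d \sim \log n .
\]
```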
Identifying network communities with a high resolution
Community structure is an important property of complex networks. An
automatic discovery of such structure is a fundamental task in many
disciplines, including sociology, biology, engineering, and computer science.
Recently, several community discovery algorithms have been proposed based on
the optimization of a quantity called modularity (Q). However, the problem of
modularity optimization is NP-hard, and the existing approaches often suffer
from prohibitively long running time or poor quality. Furthermore, it has been
recently pointed out that algorithms based on optimizing Q will have a
resolution limit, i.e., communities below a certain scale may not be detected.
In this research, we first propose an efficient heuristic algorithm, Qcut,
which combines spectral graph partitioning and local search to optimize Q.
Using both synthetic and real networks, we show that Qcut can find higher
modularities and is more scalable than the existing algorithms. Furthermore,
using Qcut as an essential component, we propose a recursive algorithm, HQcut,
to solve the resolution limit problem. We show that HQcut can successfully
detect communities at a much finer scale and with a higher accuracy than the
existing algorithms. Finally, we apply Qcut and HQcut to study a
protein-protein interaction network, and show that the combination of the two
algorithms can reveal interesting biological results that may be otherwise
undetectable.
Comment: 14 pages, 5 figures. 1 supplemental file at
http://cic.cs.wustl.edu/qcut/supplemental.pd
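For readers unfamiliar with the quantity being optimized, the sketch below computes the standard Newman-Girvan modularity Q of a given partition of an unweighted, undirected graph. It is not Qcut or HQcut; the edge-list input format and the function name `modularity` are assumptions for this example.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Q = sum over communities c of (e_c / m) - (d_c / (2m))^2,
    where e_c is the number of edges inside c, d_c is the total degree
    of its vertices, and m is the number of edges in the graph."""
    m = len(edges)
    internal = defaultdict(int)  # e_c per community
    degree = defaultdict(int)    # d_c per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Two triangles joined by one bridge edge (vertices 2 and 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_group = {v: "all" for v in range(6)}
print(round(modularity(edges, two_groups), 4))  # 0.3571
print(modularity(edges, one_group))             # 0.0
```

Splitting the graph at the bridge scores Q = 5/14, while putting every vertex in a single community scores zero.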
Individual 'trace' in knowledge space : a novel design approach for human-systems interaction
Data mining design is an approach through which the search and retrieval of data can be improved operationally. This study explores optimisation processes, including data harvesting, analytics and visualisation, and covers a wide range of efforts, including the growing need to 'make sense' of data, which requires contextual understanding. In both cyberspace and the physical world, challenges and linkages between cyber-physical knowledge spaces are emerging alongside excessive amounts of raw data. To improve user-interface design through better visualisation and infographics, this study proposes a novel mapping approach called 'Trace' in the Knowledge Space, enabling design opportunities that help articulate unique human-system interaction and offer potential for re-imagining and re-structuring interaction and user experience. These are experienced through the design, use and context of languages, enabling the building of new interactive apparatus, algorithms and dynamics in collective intelligence
Coexistence of opposite opinions in a network with communities
The Majority Rule is applied to a topology that consists of two coupled
random networks, thereby mimicking the modular structure observed in social
networks. We calculate analytically the asymptotic behaviour of the model and
derive a phase diagram that depends on the frequency of random opinion flips
and on the inter-connectivity between the two communities. It is shown that
three regimes may take place: a disordered regime, where no collective
phenomenon takes place; a symmetric regime, where the nodes in both communities
reach the same average opinion; an asymmetric regime, where the nodes in each
community reach an opposite average opinion. The transition from the asymmetric
regime to the symmetric regime is shown to be discontinuous.
Comment: 14 pages, 4 figures
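A simulation of this kind of model might be sketched as follows. The exact update rule is an assumption, not taken from the abstract: a randomly chosen node either takes a random opinion with probability `q`, or forms a triplet with two of its neighbours, and all three adopt the triplet's majority opinion. The helper names, the link probabilities `p_in` and `p_cross`, and all parameter values are hypothetical.

```python
import random


def coupled_random_networks(n, p_in, p_cross, rng):
    """Two groups of n nodes; each pair is linked with probability p_in
    inside a group and p_cross across groups."""
    neighbors = {v: set() for v in range(2 * n)}
    for u in range(2 * n):
        for v in range(u + 1, 2 * n):
            same_group = (u < n) == (v < n)
            if rng.random() < (p_in if same_group else p_cross):
                neighbors[u].add(v)
                neighbors[v].add(u)
    return neighbors


def majority_rule_step(opinions, neighbors, q, rng):
    """Update `opinions` (values +1 / -1) in place for one random node."""
    v = rng.randrange(len(opinions))
    if rng.random() < q:
        opinions[v] = rng.choice((-1, 1))  # random opinion flip
        return
    if len(neighbors[v]) < 2:
        return  # not enough neighbours to form a triplet
    a, b = rng.sample(sorted(neighbors[v]), 2)
    majority = 1 if opinions[v] + opinions[a] + opinions[b] > 0 else -1
    for node in (v, a, b):
        opinions[node] = majority


rng = random.Random(0)
n = 50
neighbors = coupled_random_networks(n, p_in=0.2, p_cross=0.01, rng=rng)
opinions = [1] * n + [-1] * n  # the two groups start with opposite opinions
for _ in range(10_000):
    majority_rule_step(opinions, neighbors, q=0.05, rng=rng)
# Average opinion in each community after the run.
print(sum(opinions[:n]) / n, sum(opinions[n:]) / n)
```

Varying `q` and `p_cross` in such a sketch is one way to explore the flip-frequency and inter-connectivity axes of the phase diagram that the abstract describes analytically.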
Benchmark graphs for testing community detection algorithms
Community structure is one of the most important features of real networks
and reveals the internal organization of the nodes. Many algorithms have been
proposed but the crucial issue of testing, i.e. the question of how good an
algorithm is, with respect to others, is still open. Standard tests include the
analysis of simple artificial graphs with a built-in community structure, that
the algorithm has to recover. However, the special graphs adopted in actual
tests have a structure that does not reflect the real properties of nodes and
communities found in real networks. Here we introduce a new class of benchmark
graphs, that account for the heterogeneity in the distributions of node degrees
and of community sizes. We use this new benchmark to test two popular methods
of community detection, modularity optimization and Potts model clustering. The
results show that the new benchmark poses a much more severe test to algorithms
than standard benchmarks, revealing limits that may not be apparent at a first
analysis.
Comment: 6 pages, 8 figures. Extended version published in Physical Review E.
The code to build the new benchmark graphs can be downloaded from
http://santo.fortunato.googlepages.com/inthepress