Optimal random search for a single hidden target
A single target is hidden at a location chosen from a predetermined
probability distribution. Then, a searcher must find a second probability
distribution from which random search points are sampled such that the target
is found in the minimum number of trials. Here it will be shown that if the
searcher must get very close to the target to find it, then the best search
distribution is proportional to the square root of the target distribution. For
a Gaussian target distribution, the optimum search distribution is
approximately a Gaussian with a standard deviation that varies inversely with
how close the searcher must be to the target to find it. For a network, where
the searcher randomly samples nodes and looks for the fixed target along edges,
the optimum is to either sample a node with probability proportional to the
square root of the out-degree plus one or not at all.
Comment: 13 pages, 5 figures
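As a minimal illustration of the square-root rule, consider a discrete simplification (an assumption, not the paper's continuous setting): the target sits at site i with probability p_i, each trial samples site i with probability q_i, and the target is found only when the sampled site is the target site. The expected number of trials is then the sum of p_i / q_i, which is minimized by q_i proportional to sqrt(p_i). The function names below are illustrative.

```python
import math


def expected_trials(p, q):
    """Expected number of trials when the target follows p and the searcher samples from q.

    Given the target at site i, the number of trials is geometric with mean 1 / q[i].
    """
    return sum(pi / qi for pi, qi in zip(p, q) if pi > 0)


def sqrt_search(p):
    """Search distribution proportional to the square root of the target distribution."""
    roots = [math.sqrt(pi) for pi in p]
    total = sum(roots)
    return [r / total for r in roots]


# Hypothetical target distribution over four sites.
p = [0.7, 0.2, 0.05, 0.05]
q_sqrt = sqrt_search(p)
q_uniform = [0.25] * 4

cost_sqrt = expected_trials(p, q_sqrt)        # equals (sum of sqrt(p_i)) ** 2
cost_match = expected_trials(p, p)            # sampling from p itself costs n trials
cost_uniform = expected_trials(p, q_uniform)  # uniform sampling also costs n trials
```

In this discrete setting, sampling from the target distribution itself does no better than uniform sampling: both need as many trials on average as there are sites, while the square-root distribution needs fewer.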
Ranking and clustering of nodes in networks with smart teleportation
Random teleportation is a necessary evil for ranking and clustering directed
networks based on random walks. Teleportation enables ergodic solutions, but
the solutions must necessarily depend on the exact implementation and
parametrization of the teleportation. For example, in the commonly used
PageRank algorithm, the teleportation rate must trade off a heavily biased
solution with a uniform solution. Here we show that teleportation to links
rather than nodes enables a much smoother trade-off and effectively more robust
results. We also show that, by not recording the teleportation steps of the
random walker, we can further reduce the effect of teleportation with dramatic
effects on clustering.
Comment: 10 pages, 7 figures
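For reference, the trade-off in standard PageRank can be sketched as follows. This is plain node teleportation with a uniform target, not the link teleportation proposed in the abstract; the function name, the toy graph, and the uniform handling of dangling nodes are assumptions made for illustration.

```python
def pagerank(out_links, teleport=0.15, iters=200):
    """Power iteration for PageRank with uniform teleportation to nodes.

    out_links[u] lists the nodes that u points to. With probability `teleport`
    the walker jumps to a uniformly random node; otherwise it follows a random
    out-link (or jumps uniformly if u has no out-links).
    """
    n = len(out_links)
    rank = [1.0 / n] * n
    for _ in range(iters):
        new = [teleport / n] * n
        for u, targets in enumerate(out_links):
            if targets:
                share = (1.0 - teleport) * rank[u] / len(targets)
                for v in targets:
                    new[v] += share
            else:
                # Dangling node: spread its remaining mass uniformly (an assumption).
                share = (1.0 - teleport) * rank[u] / n
                for v in range(n):
                    new[v] += share
        rank = new
    return rank


# Hypothetical toy graph: a 3-cycle plus node 3, which points in but has no in-links.
graph = [[1], [2], [0], [0]]
biased = pagerank(graph, teleport=0.15)
uniform = pagerank(graph, teleport=1.0)
```

At a teleportation rate of 1 the ranking is uniform regardless of the links, and node 3, which nothing points to, receives only the teleportation mass at any rate. This is the dependence on the teleportation parameter that the abstract describes.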
Coexistence of opposite opinions in a network with communities
The Majority Rule is applied to a topology that consists of two coupled
random networks, thereby mimicking the modular structure observed in social
networks. We calculate analytically the asymptotic behaviour of the model and
derive a phase diagram that depends on the frequency of random opinion flips
and on the inter-connectivity between the two communities. It is shown that
three regimes may occur: a disordered regime, where no collective
phenomenon takes place; a symmetric regime, where the nodes in both communities
reach the same average opinion; and an asymmetric regime, where the two
communities reach opposite average opinions. The transition from the asymmetric
regime to the symmetric regime is shown to be discontinuous.
Comment: 14 pages, 4 figures
Map equation for link community
Community structure exists in many real-world networks and has been reported
to be related to several functional properties of the networks. The
conventional approach partitions nodes into communities, while some
recent studies partition links instead of nodes to find overlapping
communities of nodes efficiently. We extend the map equation method, which
was originally developed for node communities, to find link communities in
networks. This method is tested on various kinds of networks and compared with
the metadata of the networks, and the results show that our method can identify
the overlapping role of nodes effectively. The advantage of this method is that
the node community scheme and link community scheme can be compared
quantitatively by measuring the unknown information left in the networks
besides the community structure. It can be used to decide quantitatively
whether or not the link community scheme should be used instead of the node
community scheme. Furthermore, this method can be easily extended to
directed and weighted networks since it is based on random walks.
Comment: 9 pages, 5 figures
Organizational Chart Inference
To facilitate communication and cooperation among employees, many companies
have adopted a new family of online social networks called "enterprise social
networks" (ESNs). ESNs provide employees with various professional services to
help them deal with daily work issues. Meanwhile, employees in companies are
usually organized into hierarchies according to the relative ranks of their
positions. A company's internal management structure can be outlined visually
with an organizational chart, which is normally kept confidential out of privacy
and security concerns. In this paper, we study the IOC (Inference of
Organizational Chart) problem: identifying a company's internal organizational
chart from the heterogeneous online ESN launched in it. IOC is very challenging
to address because, to guarantee smooth operations, the internal organizational
charts of companies need to meet certain structural requirements (on their
depth and width). To solve the IOC problem, a novel unsupervised method, Create
(ChArT REcovEr), is proposed in this paper. It consists of 3 steps: (1)
social stratification of ESN users into different social classes, (2)
supervision link inference from managers to subordinates, and (3) consecutive
social classes matching to prune the redundant supervision links. Extensive
experiments conducted on a real-world online ESN dataset demonstrate that Create
performs very well in addressing the IOC problem.
Comment: 10 pages, 9 figures, 1 table. The paper is accepted by KDD 201
Power-law distributions in empirical data
Power-law distributions occur in many situations of scientific interest and
have significant consequences for our understanding of natural and man-made
phenomena. Unfortunately, the detection and characterization of power laws is
complicated by the large fluctuations that occur in the tail of the
distribution -- the part of the distribution representing large but rare events
-- and by the difficulty of identifying the range over which power-law behavior
holds. Commonly used methods for analyzing power-law data, such as
least-squares fitting, can produce substantially inaccurate estimates of
parameters for power-law distributions, and even in cases where such methods
return accurate answers they are still unsatisfactory because they give no
indication of whether the data obey a power law at all. Here we present a
principled statistical framework for discerning and quantifying power-law
behavior in empirical data. Our approach combines maximum-likelihood fitting
methods with goodness-of-fit tests based on the Kolmogorov-Smirnov statistic
and likelihood ratios. We evaluate the effectiveness of the approach with tests
on synthetic data and give critical comparisons to previous approaches. We also
apply the proposed methods to twenty-four real-world data sets from a range of
different disciplines, each of which has been conjectured to follow a power-law
distribution. In some cases we find these conjectures to be consistent with the
data while in others the power law is ruled out.
Comment: 43 pages, 11 figures, 7 tables, 4 appendices; code available at
http://www.santafe.edu/~aaronc/powerlaws
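The two ingredients named in the abstract, maximum-likelihood fitting and the Kolmogorov-Smirnov statistic, can be sketched for the simplest case: continuous data with a known lower cutoff x_min. The estimator used is alpha = 1 + n / sum(ln(x_i / x_min)). This is only a sketch under those assumptions, not the authors' released code; it does not estimate x_min or compute p-values, and all function names are illustrative.

```python
import math
import random


def sample_power_law(n, x_min, alpha, rng):
    """Draw n values from p(x) ~ x^(-alpha), x >= x_min, by inverse-transform sampling."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def fit_alpha(data, x_min):
    """Maximum-likelihood estimate of the exponent for continuous data above x_min."""
    tail = [x for x in data if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)


def ks_distance(data, x_min, alpha):
    """Largest gap between the empirical CDF and the fitted power-law CDF."""
    xs = sorted(x for x in data if x >= x_min)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs):
        model = 1.0 - (x / x_min) ** (1.0 - alpha)
        # Compare against the empirical CDF just below and just at x.
        worst = max(worst, abs(model - i / n), abs(model - (i + 1) / n))
    return worst


# Synthetic check with a hypothetical exponent of 2.5 and a fixed seed.
rng = random.Random(0)
data = sample_power_law(20000, 1.0, 2.5, rng)
alpha_hat = fit_alpha(data, 1.0)
distance = ks_distance(data, 1.0, alpha_hat)
```

On synthetic data drawn from a true power law, the estimate should land close to the generating exponent and the KS distance should be small. A large distance on real data would be a first hint that the power-law hypothesis deserves scrutiny.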
Assortative mixing in networks
A network is said to show assortative mixing if the nodes in the network that
have many connections tend to be connected to other nodes with many
connections. We define a measure of assortative mixing for networks and use it
to show that social networks are often assortatively mixed, but that
technological and biological networks tend to be disassortative. We propose a
model of an assortative network, which we study both analytically and
numerically. Within the framework of this model we find that assortative
networks tend to percolate more easily than their disassortative counterparts
and that they are also more robust to vertex removal.
Comment: 5 pages, 1 table, 1 figure
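A common way to quantify assortative mixing by degree in an undirected network is the Pearson correlation between the degrees found at the two ends of each edge. The sketch below assumes a simple undirected edge list without self-loops or repeated edges; the function name is illustrative, and the regular-graph case (zero variance) is returned as NaN by choice.

```python
import math


def degree_assortativity(edges):
    """Pearson correlation of the degrees at either end of an undirected edge."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Each undirected edge contributes both orientations, so the measure is symmetric.
    pairs = [(degree[u], degree[v]) for u, v in edges]
    pairs += [(b, a) for a, b in pairs]
    m = len(pairs)
    mean = sum(x for x, _ in pairs) / m
    cov = sum((x - mean) * (y - mean) for x, y in pairs) / m
    var = sum((x - mean) ** 2 for x, _ in pairs) / m
    if var == 0:
        return math.nan  # all edge ends have the same degree; correlation undefined
    return cov / var


# A star is maximally disassortative: the hub only touches degree-1 leaves.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
r_star = degree_assortativity(star)

# A 4-node path mixes degree-1 ends with degree-2 interior nodes.
path = [(0, 1), (1, 2), (2, 3)]
r_path = degree_assortativity(path)
```

Positive values indicate that high-degree nodes tend to attach to other high-degree nodes, and negative values indicate the opposite.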
Handling oversampling in dynamic networks using link prediction
Oversampling is a common characteristic of data representing dynamic
networks. It introduces noise into representations of dynamic networks, but
there has been little work so far to compensate for it. Oversampling can affect
the quality of results for many important algorithmic problems on dynamic networks,
including link prediction. Link prediction seeks to predict edges that will be
added to the network given previous snapshots. We show that not only does
oversampling affect the quality of link prediction, but that we can use link
prediction to recover from the effects of oversampling. We also introduce a
novel generative model of noise in dynamic networks that represents
oversampling. We demonstrate the results of our approach on both synthetic and
real-world data.
Comment: ECML/PKDD 201
Asymptotic behavior of the Kleinberg model
We study Kleinberg navigation (the search for a target in a d-dimensional
lattice, where each site is connected to one other random site at distance r,
with probability proportional to r^{-a}) by means of an exact master equation
for the process. We show that the delivery time T to a target at distance L
scales asymptotically as (ln L)^2 when a=d, and otherwise as L^x, with
x=(d-a)/(d+1-a) for a<d, x=a-d for d<a<d+1, and x=1 for a>d+1. These
values of x exceed the rigorous lower-bounds established by Kleinberg. We also
address the situation where there is a finite probability for the message to
get lost along its way and find short delivery times (conditioned upon arrival)
for a wide range of values of a.
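The navigation process can be illustrated with a small simulation in one dimension. This sketch assumes a ring of n sites and greedy forwarding (the message always moves to the known neighbor closest to the target), which is the usual routing rule in this model but is not spelled out in the abstract. Because a greedy walker never revisits a site, each site's single long-range link can be drawn on arrival. All names and parameter values are illustrative.

```python
import itertools
import random


def greedy_delivery_time(n, a, start, target, rng):
    """Number of greedy steps from start to target on a ring of n sites.

    Each visited site gets one long-range link of length r, drawn with
    probability proportional to r^(-a), in a random direction.
    """
    lengths = range(1, n // 2 + 1)
    cum = list(itertools.accumulate(r ** (-a) for r in lengths))

    def dist(u, v):
        d = abs(u - v) % n
        return min(d, n - d)

    pos, steps = start, 0
    while pos != target:
        r = rng.choices(lengths, cum_weights=cum)[0]
        shortcut = (pos + rng.choice((-1, 1)) * r) % n
        candidates = [(pos - 1) % n, (pos + 1) % n, shortcut]
        # Greedy rule: forward to whichever neighbor is closest to the target.
        pos = min(candidates, key=lambda v: dist(v, target))
        steps += 1
    return steps


# Hypothetical run: ring of 1001 sites, exponent a = d = 1, target at distance 500.
rng = random.Random(1)
steps = greedy_delivery_time(1001, 1.0, 0, 500, rng)
```

Averaging the delivery time over many runs and several values of L gives an empirical estimate of the scaling discussed in the abstract. A single run only guarantees that the time is at most L, since every greedy step reduces the distance by at least one.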