Analysis of the Web Graph Aggregated by Host and Pay-Level Domain
In this paper the web is analyzed as a graph aggregated by host and pay-level
domain (PLD). The web graph datasets, publicly available, have been released by
the Common Crawl Foundation and are based on a web crawl performed during the
period May-June-July 2017. The host graph has 1.3 billion nodes and
5.3 billion arcs. The PLD graph has 91 million nodes and 1.1
billion arcs. We study the distributions of degree and of the sizes of strongly
and weakly connected components (SCC/WCC), focusing on detecting power laws
with statistical methods. The statistical plausibility of the power-law model is
compared with that of several alternative distributions. While there is no
evidence of power-law tails at the host level, they emerge under PLD aggregation
for the indegree, SCC and WCC size distributions. Finally, we analyze distance-related
features by studying the cumulative distributions of the shortest path lengths,
and estimate the diameters of the graphs.
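As a rough illustration of what fitting a power-law tail involves, the sketch below uses the standard continuous maximum-likelihood estimator for the exponent above a cutoff. It is not taken from the paper: the function names, the cutoff `x_min`, and the synthetic sample are all assumptions for illustration, and degree data is discrete, so the continuous estimator is only an approximation there. Comparing against alternative distributions, as the abstract describes, is not shown.

```python
import math
import random


def powerlaw_alpha_mle(values, x_min):
    """Continuous MLE of the exponent alpha for p(x) ~ x^(-alpha), x >= x_min.

    Returns (alpha, standard_error). x_min is assumed to be given; choosing
    it from the data is a separate step that this sketch does not cover.
    """
    tail = [x for x in values if x >= x_min]
    if not tail:
        raise ValueError("no observations at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0.0:
        raise ValueError("all tail observations equal x_min")
    alpha = 1.0 + len(tail) / log_sum
    stderr = (alpha - 1.0) / math.sqrt(len(tail))
    return alpha, stderr


def sample_powerlaw(alpha, x_min, n, seed=0):
    """Draw n synthetic power-law samples by inverse-transform sampling."""
    rng = random.Random(seed)
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]


if __name__ == "__main__":
    data = sample_powerlaw(alpha=2.5, x_min=1.0, n=20000, seed=1)
    alpha_hat, err = powerlaw_alpha_mle(data, x_min=1.0)
    print(f"estimated alpha = {alpha_hat:.3f} +/- {err:.3f}")
```

On synthetic data the estimate should land close to the exponent used for sampling; on real degree sequences the result depends strongly on the choice of `x_min`.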
Eigenvector-Based Centrality Measures for Temporal Networks
Numerous centrality measures have been developed to quantify the importances
of nodes in time-independent networks, and many of them can be expressed as the
leading eigenvector of some matrix. With the increasing availability of network
data that changes in time, it is important to extend such eigenvector-based
centrality measures to time-dependent networks. In this paper, we introduce a
principled generalization of network centrality measures that is valid for any
eigenvector-based centrality. We consider a temporal network with N nodes as a
sequence of T layers that describe the network during different time windows,
and we couple centrality matrices for the layers into a supra-centrality matrix
of size NT × NT whose dominant eigenvector gives the centrality of each node i at
each time t. We refer to this eigenvector and its components as a joint
centrality, as it reflects the importances of both the node i and the time
layer t. We also introduce the concepts of marginal and conditional
centralities, which facilitate the study of centrality trajectories over time.
We find that the strength of coupling between layers is important for
determining multiscale properties of centrality, such as localization phenomena
and the time scale of centrality changes. In the strong-coupling regime, we
derive expressions for time-averaged centralities, which are given by the
zeroth-order terms of a singular perturbation expansion. We also study
first-order terms to obtain first-order-mover scores, which concisely describe
the magnitude of nodes' centrality changes over time. As examples, we apply our
method to three empirical temporal networks: the United States Ph.D. exchange
in mathematics, costarring relationships among top-billed actors during the
Golden Age of Hollywood, and citations of decisions from the United States
Supreme Court.
Comment: 38 pages, 7 figures, and 5 tables
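The supra-centrality construction described in this abstract can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the coupling form (an identity link of weight `omega` between each node and its own copy in adjacent time layers), the function names, and the normalization are all illustrative choices. The marginal sums in the usage example are one plausible reading of "marginal" centralities.

```python
import numpy as np


def supra_centrality(layer_matrices, omega):
    """Couple T centrality matrices (each N x N) into one NT x NT matrix.

    Assumption: each node is linked only to its own copy in the previous
    and next time layer, with coupling weight omega.
    """
    T = len(layer_matrices)
    N = layer_matrices[0].shape[0]
    supra = np.zeros((N * T, N * T))
    # Diagonal blocks: the centrality matrix of each time layer.
    for t, C in enumerate(layer_matrices):
        supra[t * N:(t + 1) * N, t * N:(t + 1) * N] = C
    # Off-diagonal blocks: omega * identity between adjacent layers.
    chain = np.diag(np.ones(T - 1), 1) + np.diag(np.ones(T - 1), -1)
    supra += omega * np.kron(chain, np.eye(N))
    return supra


def joint_centrality(layer_matrices, omega):
    """Return W with W[i, t] = joint centrality of node i in layer t."""
    T = len(layer_matrices)
    N = layer_matrices[0].shape[0]
    supra = supra_centrality(layer_matrices, omega)
    eigvals, eigvecs = np.linalg.eig(supra)
    v = np.abs(eigvecs[:, np.argmax(eigvals.real)].real)
    v /= v.sum()  # normalize so that all entries sum to 1
    return v.reshape(T, N).T


if __name__ == "__main__":
    # Two layers on 3 nodes: a star centred on node 0, then on node 2.
    # Adjacency matrices serve as the centrality matrices here.
    A1 = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)
    A2 = np.array([[0, 0, 1], [0, 0, 1], [1, 1, 0]], dtype=float)
    W = joint_centrality([A1, A2], omega=1.0)
    print("joint centrality W[i, t]:\n", W)
    print("sum over time for each node:", W.sum(axis=1))
    print("sum over nodes for each layer:", W.sum(axis=0))
```

Varying `omega` in this sketch is a simple way to see the effect of coupling strength that the abstract mentions: weak coupling leaves each layer close to its own centrality ranking, while strong coupling pulls a node's values in different layers toward each other.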
A novel approach to study realistic navigations on networks
We consider navigation or search schemes on networks which are realistic in
the sense that not all search chains can be completed. We show that the
quantity $\mu = \rho/s_d$, where $s_d$ is the average dynamic shortest distance
and $\rho$ the success rate of completion of a search, is a consistent measure
for the quality of a search strategy. Taking the example of realistic searches
on scale-free networks, we find that $\mu$ scales with the system size $N$ as
$N^{-\delta}$, where $\delta$ decreases as the searching strategy is improved.
This measure is also shown to be sensitive to the distinguishing
characteristics of networks. In this new approach, a dynamic small world (DSW)
effect is said to exist when $\delta \approx 0$. We show that such a DSW indeed
exists in social networks in which the linking probability is dependent on
social distances.
Comment: Text revised, references added; accepted version in Journal of
Statistical Mechanics
Detecting Community Structure in Dynamic Social Networks Using the Concept of Leadership
Detecting community structure in social networks is a fundamental problem
empowering us to identify groups of actors with similar interests. There has
been extensive work on finding communities in static networks; in reality,
however, social networks are dynamic and evolve continuously. Ignoring this
dynamic aspect prevents us both from capturing the evolutionary behavior of
the network and from predicting the future status of individuals. Aside from
being dynamic, another significant
characteristic of real-world social networks is the presence of leaders, i.e.
nodes with high degree centrality having a high attraction to absorb other
members and hence to form a local community. In this paper, we devised an
efficient method to incrementally detect communities in highly dynamic social
networks using the intuitive idea of importance and persistence of community
leaders over time. Our proposed method is able to find new communities based on
the previous structure of the network without recomputing them from scratch.
This unique feature enables us to detect and track communities efficiently
over time. Experimental results on synthetic and real-world social
networks demonstrate that our method is both effective and efficient in
discovering communities in dynamic social networks.
The developmental dynamics of terrorist organizations
We identify robust statistical patterns in the frequency and severity of
violent attacks by terrorist organizations as they grow and age. Using
group-level static and dynamic analyses of terrorist events worldwide from
1968-2008 and a simulation model of organizational dynamics, we show that the
production of violent events tends to accelerate with increasing size and
experience. This coupling of frequency, experience and size arises from a
fundamental positive feedback loop in which attacks lead to growth which leads
to increased production of new attacks. In contrast, event severity is
independent of both size and experience. Thus larger, more experienced
organizations are more deadly because they attack more frequently, not because
their attacks are more deadly, and large events are equally likely to come from
large and small organizations. These results hold across political ideologies
and time, suggesting that the frequency and severity of terrorism may be
constrained by fundamental processes.
Comment: 28 pages, 8 figures, 4 tables, supplementary material
Semi-Supervised Overlapping Community Finding based on Label Propagation with Pairwise Constraints
Algorithms for detecting communities in complex networks are generally
unsupervised, relying solely on the structure of the network. However, these
methods can often fail to uncover meaningful groupings that reflect the
underlying communities in the data, particularly when those structures are
highly overlapping. One way to improve the usefulness of these algorithms is by
incorporating additional background information, which can be used as a source
of constraints to direct the community detection process. In this work, we
explore the potential of semi-supervised strategies to improve algorithms for
finding overlapping communities in networks. Specifically, we propose a new
method, based on label propagation, for finding communities using a limited
number of pairwise constraints. Evaluations on synthetic and real-world
datasets demonstrate the potential of this approach for uncovering meaningful
community structures in cases where each node can potentially belong to more
than one community.
Comment: Fix table
Consensus clustering in complex networks
The community structure of complex networks reveals both their organization
and hidden relationships among their constituents. Most community detection
methods currently available are not deterministic, and their results typically
depend on the specific random seeds, initial conditions and tie-break rules
adopted for their execution. Consensus clustering is used in data analysis to
generate stable results out of a set of partitions delivered by stochastic
methods. Here we show that consensus clustering can be combined with any
existing method in a self-consistent way, enhancing considerably both the
stability and the accuracy of the resulting partitions. This framework is also
particularly suitable to monitor the evolution of community structure in
temporal networks. An application of consensus clustering to a large citation
network of physics papers demonstrates its capability to keep track of the
birth, death and diversification of topics.
Comment: 11 pages, 12 figures. Published in Scientific Reports
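The central object in consensus clustering is a matrix that records how often two nodes are placed together across repeated runs of a stochastic method. The sketch below shows one common construction; the function names and the thresholding step with parameter `tau` are assumptions for illustration and are not stated in the abstract. The full iterative procedure, in which the clustering method is re-applied to the consensus matrix, is not shown.

```python
import numpy as np


def consensus_matrix(partitions):
    """Fraction of partitions in which each pair of nodes shares a cluster.

    partitions: array-like of shape (n_runs, n_nodes); entry [r, i] is the
    cluster label of node i in run r. Labels need not match across runs.
    """
    partitions = np.asarray(partitions)
    n_runs, n_nodes = partitions.shape
    D = np.zeros((n_nodes, n_nodes))
    for labels in partitions:
        # Boolean co-membership matrix for this run, added as 0/1.
        D += labels[:, None] == labels[None, :]
    return D / n_runs


def threshold(D, tau):
    """Zero out weak co-memberships (hypothetical filtering step)."""
    filtered = D.copy()
    filtered[filtered < tau] = 0.0
    return filtered


if __name__ == "__main__":
    runs = [
        [0, 0, 1, 1],
        [0, 0, 0, 1],
        [1, 1, 0, 0],  # same grouping as the first run, labels swapped
    ]
    D = consensus_matrix(runs)
    print(D)
    print(threshold(D, tau=0.5))
```

Because only co-membership is counted, runs that agree on the grouping but use different label names contribute identically, which is why the first and third runs above reinforce each other.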
Hawkes process as a model of social interactions: a view on video dynamics
We study by computer simulation the "Hawkes process" that was proposed in a
recent paper by Crane and Sornette (Proc. Nat. Acad. Sci. USA 105, 15649
(2008)) as a plausible model for the dynamics of YouTube video viewing numbers.
We test the claims made there that robust identification is possible for
classes of dynamic response following activity bursts. Our simulated time series
for the Hawkes process indeed fall into the different categories predicted by
Crane and Sornette. However, the Hawkes process gives a much narrower spread of
decay exponents than the YouTube data, suggesting limits to the universality of
the Hawkes-based analysis.
Comment: Added errors to parameter estimates and further description. IOP
style, 13 pages, 5 figures
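For readers unfamiliar with simulating a Hawkes process, the following is a minimal sketch using Ogata's thinning algorithm. It deliberately swaps in an exponential memory kernel for simplicity, which is an assumption of this sketch and not the kernel or code used in the paper. The parameter names `mu` (background rate), `alpha` (branching ratio) and `beta` (decay rate) are likewise illustrative.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, t_max, seed=0):
    """Simulate event times of a self-exciting Hawkes process on [0, t_max].

    Intensity: lambda(t) = mu + sum_i alpha * beta * exp(-beta * (t - t_i)),
    summed over past events t_i. Requires mu > 0; alpha < 1 keeps the
    process from exploding.
    """
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0  # self-exciting part of the intensity at time t
    while True:
        # The intensity only decays until the next event, so its current
        # value is a valid upper bound for the thinning step.
        lam_bar = mu + excitation
        wait = rng.expovariate(lam_bar)
        t += wait
        if t > t_max:
            break
        excitation *= math.exp(-beta * wait)  # decay to the candidate time
        if rng.random() * lam_bar <= mu + excitation:
            events.append(t)             # accept the candidate as an event
            excitation += alpha * beta   # each event raises the intensity
    return events


if __name__ == "__main__":
    times = simulate_hawkes(mu=1.0, alpha=0.5, beta=2.0, t_max=100.0, seed=42)
    print(f"{len(times)} events; first few: {[round(x, 2) for x in times[:5]]}")
```

Setting `alpha` to 0 reduces the sketch to a plain Poisson process with rate `mu`, which is a convenient sanity check before studying burst and decay behavior.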