A New Methodology for Generalizing Unweighted Network Measures
Several important complex network measures that helped discover common
patterns across real-world networks ignore edge weights, which carry important
information in real-world networks. We propose a new methodology for
generalizing measures of unweighted networks through a generalization of the
cardinality concept of a set of weights. The key observation here is that many
measures of unweighted networks use the cardinality (the size) of some subset
of edges in their computation. For example, the node degree is the number of
edges incident to a node. We define the effective cardinality, a new metric
that quantifies how many edges are effectively being used, assuming that an
edge's weight reflects the amount of interaction across that edge. We prove
that a generalized measure, using our method, reduces to the original
unweighted measure if there is no disparity between weights, which ensures that
the laws that govern the original unweighted measure will also govern the
generalized measure when the weights are equal. We also prove that our
generalization ensures a partial ordering (among sets of weighted edges) that
is consistent with the original unweighted measure, unlike previously developed
generalizations. We illustrate the applicability of our method by generalizing
four unweighted network measures. As a case study, we analyze four real-world
weighted networks using our generalized degree and clustering coefficient. The
analysis shows that the generalized degree distribution is consistent with the
power-law hypothesis but with steeper decline and that there is a common
pattern governing the ratio between the generalized degree and the traditional
degree. The analysis also shows that nodes with more uniform weights tend to
cluster with nodes that also have more uniform weights among themselves. Comment: 23 pages, 10 figures
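The effective-cardinality idea above can be sketched with a perplexity-style "effective number" of edges, i.e. the exponential of the Shannon entropy of the normalized weights. This is an illustrative assumption, not necessarily the paper's exact definition, but it exhibits the key property the abstract proves: when all weights are equal, the measure reduces to the plain cardinality.

```python
import math

def effective_cardinality(weights):
    """Sketch: effective number of edges 'in use', computed as the
    perplexity (exp of Shannon entropy) of the normalized weights.
    Illustrative assumption; the paper's definition may differ."""
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return math.exp(entropy)

# Equal weights: reduces to the plain cardinality (here, degree 4).
print(effective_cardinality([1, 1, 1, 1]))
# Skewed weights: fewer edges are effectively being used.
print(effective_cardinality([10, 1, 1, 1]))
```

The second call returns a value below 4, reflecting that one heavy edge dominates the interaction.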
Assessing Code Authorship: The Case of the Linux Kernel
Code authorship is key information in large-scale open source systems.
Among other uses, it allows maintainers to assess the division of work and identify key
collaborators. Interestingly, open-source communities lack guidelines on how to
manage authorship. This could be mitigated by setting out to build an empirical
body of knowledge on how authorship-related measures evolve in successful
open-source communities. Towards that direction, we perform a case study on the
Linux kernel. Our results show that: (a) only a small portion of developers
(26%) makes significant contributions to the code base; (b) the distribution of
the number of files per author is highly skewed: a small group of top
authors (3%) is responsible for hundreds of files, while most authors (75%)
are responsible for at most 11 files; (c) most authors (62%) have a specialist
profile; (d) authors with a high number of co-authorship connections tend to
collaborate with others with fewer connections. Comment: Accepted at the 13th International Conference on Open Source Systems
(OSS). 12 pages
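The files-per-author distribution described above can be computed in a few lines; the records here are hypothetical stand-ins for data one would mine from `git log`, with made-up author and file names.

```python
from collections import Counter

# Hypothetical (file, main_author) records; in practice these would be
# mined from the repository history (e.g. via `git log`).
authorship = [
    ("kernel/sched.c", "alice"), ("kernel/fork.c", "alice"),
    ("mm/slab.c", "alice"), ("fs/ext4/inode.c", "bob"),
    ("net/ipv4/tcp.c", "carol"),
]

# Count how many files each author is responsible for; a skewed
# distribution means a few authors own many files.
files_per_author = Counter(author for _, author in authorship)
for author, n in files_per_author.most_common():
    print(author, n)
```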
The Routing of Complex Contagion in Kleinberg's Small-World Networks
In Kleinberg's small-world network model, strong ties are modeled as
deterministic edges in the underlying base grid and weak ties are modeled as
random edges connecting remote nodes. The probability of connecting a node
$u$ with a node $v$ through a weak tie is proportional to $d(u,v)^{-\alpha}$, where
$d(u,v)$ is the grid distance between $u$ and $v$ and $\alpha$ is the
parameter of the model. Complex contagion refers to the propagation mechanism
in a network where each node is activated only after at least $k$ of its
neighbors are activated.
In this paper, we propose the concept of routing of complex contagion (or
complex routing), where we can activate one node at one time step with the goal
of activating the targeted node in the end. We consider a decentralized routing
scheme in which only the weak ties from the activated nodes are revealed. We study
the routing time of complex contagion and compare the result with simple
routing and complex diffusion (the diffusion of complex contagion, where all
nodes that could be activated are activated immediately in the same step with
the goal of activating all nodes in the end).
We show that for decentralized complex routing, the routing time is lower
bounded by a polynomial in $n$ (the number of nodes in the network) for the
whole range of the parameter $\alpha$, both in expectation and with high
probability, while the routing time of simple contagion has a polylogarithmic
upper bound when $\alpha = 2$. Our results indicate that complex routing is
harder than complex diffusion and that the routing time of complex contagion
differs exponentially from that of simple contagion at the sweet spot. Comment: Conference version will appear in COCOON 201
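The complex-diffusion process described above can be sketched directly: a node activates once at least $k$ of its neighbours are active, and every eligible node activates in the same round. The threshold $k = 2$ and the toy adjacency list are illustrative choices, not the paper's experimental setup.

```python
def complex_diffusion(adj, seeds, k=2):
    """Fixed point of complex contagion: repeatedly activate every node
    with at least k active neighbours until nothing changes. Sketch."""
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, nbrs in adj.items():
            if node not in active and sum(n in active for n in nbrs) >= k:
                active.add(node)
                changed = True
    return active

# Toy graph: node 4 hangs off node 3 by a single tie.
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2, 4], 4: [3]}
print(complex_diffusion(adj, {0, 1}))
```

Note that node 4, with only one tie, can never accumulate $k = 2$ active neighbours, illustrating why complex contagion is harder to spread than simple contagion.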
Graph Metrics for Temporal Networks
Temporal networks, i.e., networks in which the interactions among a set of
elementary units change over time, can be modelled in terms of time-varying
graphs, which are time-ordered sequences of graphs over a set of nodes. In such
graphs, the concepts of node adjacency and reachability crucially depend on the
exact temporal ordering of the links. Consequently, all the concepts and
metrics proposed and used for the characterisation of static complex networks
have to be redefined or appropriately extended to time-varying graphs, in order
to take into account the effects of time ordering on causality. In this chapter
we discuss how to represent temporal networks and we review the definitions of
walks, paths, connectedness and connected components valid for graphs in which
the links fluctuate over time. We then focus on temporal node-node distance,
and we discuss how to characterise link persistence and the temporal
small-world behaviour in this class of networks. Finally, we discuss the
extension of classic centrality measures, including closeness, betweenness and
spectral centrality, to the case of time-varying graphs, and we review the work
on temporal motifs analysis and the definition of modularity for temporal
graphs. Comment: 26 pages, 5 figures. Chapter in Temporal Networks (Petter Holme and
Jari Saramäki, editors). Springer, Berlin, Heidelberg, 201
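The time-respecting reachability discussed above can be sketched as a single pass over time-ordered contacts. This is a simplified model (instantaneous undirected contacts, no per-hop delay), but it exposes the asymmetry that a static graph would miss.

```python
def temporal_reachable(events, source):
    """events: list of (time, u, v) undirected contacts.
    Returns the set of nodes reachable from `source` via a
    time-respecting path (each hop uses a contact no earlier
    than the previous one). Simplified sketch."""
    reached = {source}
    for t, u, v in sorted(events):
        # A contact spreads reachability if either endpoint is
        # already reached by this time.
        if u in reached:
            reached.add(v)
        if v in reached:
            reached.add(u)
    return reached

events = [(1, "a", "b"), (2, "b", "c")]
print(temporal_reachable(events, "a"))  # a reaches c via b
print(temporal_reachable(events, "c"))  # c cannot reach a: a-b is in the past
```

From "a" the contagion can follow a-b (t=1) then b-c (t=2), but from "c" the edge b-c at t=2 arrives after a-b at t=1, so "a" is unreachable: temporal ordering breaks the symmetry of static reachability.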
Detecting rich-club ordering in complex networks
Uncovering the hidden regularities and organizational principles of networks
arising in physical systems ranging from the molecular level to the scale of
large communication infrastructures is the key issue for the understanding of
their fabric and dynamical properties [1-5]. The "rich-club" phenomenon
refers to the tendency of nodes with high centrality, the dominant elements of
the system, to form tightly interconnected communities, and it is one of the
crucial properties accounting for the formation of dominant communities in both
computer and social sciences [4-8]. Here we provide the analytical expression
and the correct null models which allow for a quantitative discussion of the
rich-club phenomenon. The presented analysis enables the measurement of the
rich-club ordering and its relation with the function and dynamics of networks
in examples drawn from the biological, social and technological domains. Comment: 1 table, 3 figures
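The un-normalized rich-club coefficient, the edge density among nodes of degree greater than k, can be sketched as follows. As the abstract stresses, a meaningful assessment also requires comparison against a degree-preserving randomized null model, which this sketch omits.

```python
from collections import Counter

def rich_club_coefficient(edges, k):
    """phi(k): density of the subgraph induced by nodes of degree > k.
    Un-normalized sketch; divide by the null-model value for the
    rich-club *ordering* the abstract describes."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    rich = {n for n, d in deg.items() if d > k}
    if len(rich) < 2:
        return 0.0
    e_rich = sum(1 for u, v in edges if u in rich and v in rich)
    return 2 * e_rich / (len(rich) * (len(rich) - 1))

# Toy graph: a clique on {1,2,3,4} plus a pendant node 5.
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (1, 5)]
print(rich_club_coefficient(edges, 2))  # the degree->2 nodes form a full clique
```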
Outlier Edge Detection Using Random Graph Generation Models and Applications
Outliers are samples generated by mechanisms different from those of
normal data samples. Graphs, in particular social network graphs, may contain
nodes and edges that are made by scammers, malicious programs or mistakenly by
normal users. Detecting outlier nodes and edges is important for data mining
and graph analytics. However, previous research in the field has mainly focused
on detecting outlier nodes. In this article, we study the properties of edges
and propose outlier edge detection algorithms using two random graph generation
models. We found that the edge-ego-network, which can be defined as the induced
graph that contains two end nodes of an edge, their neighboring nodes and the
edges that link these nodes, contains critical information to detect outlier
edges. We evaluated the proposed algorithms by injecting outlier edges into
some real-world graph data. Experiment results show that the proposed
algorithms can effectively detect outlier edges. In particular, the algorithm
based on the Preferential Attachment Random Graph Generation model consistently
gives good performance regardless of the test graph data. Furthermore, the
proposed algorithms are not limited to the area of outlier edge detection. We
demonstrate three different applications that benefit from the proposed
algorithms: 1) a preprocessing tool that improves the performance of graph
clustering algorithms; 2) an outlier node detection algorithm; and 3) a novel
noisy data clustering algorithm. These applications show the great potential of
the proposed outlier edge detection techniques. Comment: 14 pages, 5 figures, journal paper
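The edge-ego-network defined above (the subgraph induced by an edge's two endpoints and all of their neighbours) can be extracted as below; the scoring algorithms that examine this subgraph are not reproduced here.

```python
def edge_ego_network(edges, e):
    """Return the edges of the subgraph induced by the endpoints of
    edge e together with their neighbours. Sketch of the structure
    described in the abstract."""
    u, v = e
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    nodes = {u, v} | adj.get(u, set()) | adj.get(v, set())
    return [(a, b) for a, b in edges if a in nodes and b in nodes]

edges = [(1, 2), (2, 3), (3, 4), (4, 5), (1, 3)]
print(edge_ego_network(edges, (1, 2)))  # the triangle around edge (1, 2)
```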
Network 'small-world-ness': a quantitative method for determining canonical network equivalence
Background: Many technological, biological, social, and information networks fall into the broad class of 'small-world' networks: they have tightly interconnected clusters of nodes, and a mean shortest-path length that is similar to that of a matched random graph (same number of nodes and edges). This semi-quantitative definition leads to a categorical distinction ('small/not-small') rather than a quantitative, continuous grading of networks, and can lead to uncertainty about a network's small-world status. Moreover, systems described by small-world networks are often studied using an equivalent canonical network model, the Watts-Strogatz (WS) model. However, the process of establishing an equivalent WS model is imprecise and there is a pressing need to discover ways in which this equivalence may be quantified.
Methodology/Principal Findings: We defined a precise measure of 'small-world-ness' S based on the trade-off between high local clustering and short path length. A network is now deemed a 'small-world' if S > 1, an assertion which may be tested statistically. We then examined the behavior of S on a large dataset of real-world systems. We found that all these systems were linked by a linear relationship between their S values and the network size n. Moreover, we show a method for assigning a unique Watts-Strogatz (WS) model to any real-world network, and show analytically that the WS models associated with our sample of networks also show linearity between S and n. Linearity between S and n is not, however, inevitable, and neither is S maximal for an arbitrary network of given size. Linearity may, however, be explained by a common limiting growth process.
Conclusions/Significance: We have shown how the notion of a small-world network may be quantified. Several key properties of the metric are described and the use of WS canonical models is placed on a more secure footing.
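The measure S reduces to a ratio of ratios: clustering relative to the matched-random expectation, divided by path length relative to the matched-random expectation. A sketch with hypothetical example numbers (the clustering coefficient C, mean path length L, and their random-graph counterparts would come from measurement):

```python
def small_world_ness(C, L, C_rand, L_rand):
    """S = (C / C_rand) / (L / L_rand); the network is deemed a
    small-world when S > 1. Sketch of the definition above."""
    return (C / C_rand) / (L / L_rand)

# Hypothetical values: clustering far above the random expectation,
# path length about the same as the random graph's.
S = small_world_ness(C=0.3, L=3.2, C_rand=0.05, L_rand=3.0)
print(S)  # well above 1: a small-world network under this criterion
```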
Community Aliveness: Discovering Interaction Decay Patterns in Online Social Communities
Online Social Communities (OSCs) provide a medium for connecting people,
sharing news, eliciting information, and finding jobs, among others. The
dynamics of the interaction among the members of OSCs is not always a growth
dynamics. Instead, a decay (or inactivity) dynamics often takes place, which
makes an OSC obsolete. Understanding the behavior and the characteristics of
the members of an inactive community helps to sustain the growth dynamics of
these communities and, possibly, prevents them from going out of service. In
this work, we provide two prediction models for predicting the interaction
decay of community members, namely: a Simple Threshold Model (STM) and a
supervised machine learning classification framework. We conducted evaluation
experiments for our prediction models, supported by a dataset of decayed communities extracted from the StackExchange platform. The
results of the experiments revealed that it is possible, with satisfactory
prediction performance in terms of the F1-score and the accuracy, to predict
the decay of the activity of the members of these communities using
network-based attributes and network-exogenous attributes of the members. The
upper bound of the prediction performance of the methods we used was
satisfactory for both the F1-score and the accuracy. These results indicate
that network-based attributes are correlated with the activity of the members
and that we can find decay patterns in terms of these attributes. The results
also showed that the structure of the decayed communities can be used to
support the alive communities by discovering inactive members. Comment: pre-print for the 4th European Network Intelligence Conference,
11-12 September 2017, Duisburg, Germany
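A Simple Threshold Model of the kind named above can be sketched as a cutoff on observed interaction counts; the threshold value here is an illustrative assumption, not the paper's fitted parameter, and the member names are made up.

```python
def stm_predict(activity_counts, threshold=2):
    """Simple Threshold Model sketch: flag a member as decaying when
    their interaction count in the observation window falls below the
    threshold. The threshold is an illustrative assumption."""
    return {member: count < threshold
            for member, count in activity_counts.items()}

preds = stm_predict({"ann": 0, "ben": 5, "cat": 1})
print(preds)  # ann and cat flagged as decaying, ben as active
```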
Worldwide food recall patterns over an eleven month period: A country perspective.
Background: Following the World Health Organization Forum in November 2007, the Beijing Declaration recognized the importance of food safety along with the rights of all individuals to a safe and adequate diet. The aim of this study is to retrospectively analyze the patterns in food alerts and recalls by countries to identify the principal hazard generators and gatekeepers of food safety in the eleven months leading up to the Declaration.
Methods: The food recall data set was collected by the Laboratory of the Government Chemist (LGC, UK) over the period from January to November 2007. Statistics were computed with a focus on reporting patterns by the 117 countries. The complexity of the recorded interrelations was depicted as a network constructed from structural properties contained in the data. The analysed network properties included degrees, weighted degrees, modularity and k-core decomposition. Network analyses of the reports, based on 'country making report' (detector) and 'country reported on' (transgressor), revealed that the network is organized around a dominant core.
Results: Ten countries were reported for sixty per cent of all faulty products marketed, with the top 5 countries having received between 100 and 281 reports. Further analysis of the dominant core revealed that out of the top five transgressors three made no reports (in the order China > Turkey > Iran). The top ten detectors account for three quarters of reports, with three making more than 300 (Italy: 406, Germany: 340, United Kingdom: 322).
Conclusion: Of the 117 countries studied, the vast majority of food reports are made by 10 countries, with EU countries predominating. The majority of the faulty foodstuffs originate in ten countries, with four major producers making no reports. This pattern is very distant from that proposed by the Beijing Declaration, which urges all countries to take responsibility for the provision of safe and adequate diets for their nationals.
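The detector and transgressor roles above map naturally onto out- and in-degrees in a directed report network. A toy sketch with hypothetical records (not the study's data):

```python
from collections import Counter

# Hypothetical (detector, transgressor) report records; country names
# are only for illustration, not the study's counts.
reports = [("Italy", "China"), ("Germany", "China"),
           ("UK", "Turkey"), ("Italy", "Turkey"), ("Italy", "Iran")]

# Out-degree = reports a country made (detector role);
# in-degree = reports a country received (transgressor role).
out_deg = Counter(d for d, _ in reports)
in_deg = Counter(t for _, t in reports)
print(out_deg.most_common(1))  # most active detector
print(in_deg.most_common(1))   # most reported-on transgressor
```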