Inferring offline hierarchical ties from online social networks
Social networks can represent many different types of relationships between actors, some explicit and some implicit. For example, email communications between users may be represented explicitly in a network, while managerial relationships may not. In this paper we focus on analyzing explicit interactions among actors in order to detect hierarchical social relationships that may be implicit. We start by employing three well-known ranking-based methods, PageRank, Degree Centrality, and Rooted-PageRank (RPR), to infer such implicit relationships from interactions between actors. Then we propose two novel approaches which take into account the time dimension of interactions in the process of detecting hierarchical ties. We experiment on two datasets: the Enron email dataset, to infer manager-subordinate relationships from email exchanges, and a scientific publication co-authorship dataset, to detect PhD advisor-advisee relationships from paper co-authorships. Our experiments show that time-based methods perform considerably better than ranking-based methods. In the Enron dataset, they detect 48% of manager-subordinate ties versus 32% found by Rooted-PageRank. Similarly, in the co-authorship dataset, they detect 62% of advisor-advisee ties compared to only 39% by Rooted-PageRank.
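To make the ranking-based baseline concrete, here is a minimal sketch of two of the named measures, in-degree and PageRank, computed on a directed interaction graph. It is not the paper's procedure: the toy email list, the function names, and the idea of reading the top-ranked node as the likely superior are all assumptions made for illustration.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=100):
    """Plain power-iteration PageRank on a directed edge list."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            if out_links[src]:
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        rank = new_rank
    return rank


def in_degree(edges):
    """Number of incoming edges per node (unnormalized degree centrality)."""
    counts = defaultdict(int)
    for _, dst in edges:
        counts[dst] += 1
    return dict(counts)


# Hypothetical toy data: (sender, recipient) pairs, not taken from Enron.
emails = [("ann", "boss"), ("bob", "boss"), ("cat", "boss"),
          ("boss", "ann"), ("ann", "bob")]

scores = pagerank(emails)
likely_superior = max(scores, key=scores.get)
print(likely_superior, in_degree(emails))
```

Both measures ignore when each message was sent, which is the information the abstract's time-based approaches add.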
Centrality Metric for Dynamic Networks
Centrality is an important notion in network analysis and is used to measure
the degree to which network structure contributes to the importance of a node
in a network. While many different centrality measures exist, most of them
apply to static networks. Most networks, on the other hand, are dynamic in
nature, evolving over time through the addition or deletion of nodes and edges.
A popular approach to analyzing such networks represents them by a static
network that aggregates all edges observed over some time period. This
approach, however, under- or overestimates the centrality of some nodes. We address
this problem by introducing a novel centrality metric for dynamic network
analysis. This metric exploits an intuition that in order for one node in a
dynamic network to influence another over some period of time, there must exist
a path that connects the source and destination nodes through intermediaries at
different times. We demonstrate on an example network that the proposed metric
leads to a very different ranking than analysis of an equivalent static
network. We use dynamic centrality to study a dynamic citation network and
contrast the results with those reached by static network analysis.
Comment: in KDD workshop on Mining and Learning in Graphs (MLG
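The intuition about paths through intermediaries at different times can be illustrated with a small reachability sketch. It compares which nodes a source can reach in the aggregated static network with which it can reach along paths whose edge times strictly increase. This is only an illustration of the time-respecting path idea, not the paper's centrality metric; the edge list and function names are hypothetical.

```python
from collections import defaultdict, deque


def static_reach(edges, source):
    """Nodes reachable from source when edge timestamps are ignored."""
    neighbours = defaultdict(set)
    for u, v, _ in edges:
        neighbours[u].add(v)
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {source}


def temporal_reach(edges, source):
    """Nodes reachable from source along paths with strictly increasing times."""
    earliest = {source: float("-inf")}  # earliest arrival time per node
    for u, v, t in sorted(edges, key=lambda e: e[2]):
        # An edge is usable only if we arrived at u before it occurs.
        if u in earliest and earliest[u] < t < earliest.get(v, float("inf")):
            earliest[v] = t
    return set(earliest) - {source}


# Hypothetical timestamped edges (source, target, time).
edges = [("a", "b", 2), ("b", "c", 1), ("c", "d", 3)]

print(static_reach(edges, "a"))    # b, c and d
print(temporal_reach(edges, "a"))  # only b: the b->c edge happened too early
```

In the aggregated network node a appears to influence three nodes, while the time-respecting view leaves it with one, which is the kind of overestimation the abstract describes.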
Early identification of important patents through network centrality
One of the most challenging problems in technological forecasting is to
identify as early as possible those technologies that have the potential to
lead to radical changes in our society. In this paper, we use the US patent
citation network (1926-2010) to test our ability to identify, at an early stage, a list of
historically significant patents through citation network analysis. We show
that in order to effectively uncover these patents shortly after they are
issued, we need to go beyond raw citation counts and take into account both the
citation network topology and temporal information. In particular, an
age-normalized measure of patent centrality, called rescaled PageRank, allows
us to identify the significant patents earlier than citation count and PageRank
score. In addition, we find that while high-impact patents tend to rely on
other high-impact patents in a similar way as scientific papers, the patents'
citation dynamics are significantly slower than those of papers, which makes the
early identification of significant patents more challenging than that of
significant papers.
Comment: 14 page
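The abstract describes rescaled PageRank only as an age-normalized centrality measure. One plausible way to normalize by age, sketched below, is to z-score each item against a window of items closest to it in age. The window size, the use of mean and standard deviation, and the toy scores are assumptions for illustration, not details confirmed by the abstract.

```python
from statistics import mean, pstdev


def rescale_by_age(scores, window=4):
    """z-score each score against the `window` items closest in age.

    `scores` must be ordered from oldest to newest item.
    """
    n = len(scores)
    rescaled = []
    for i in range(n):
        # Centre the window on item i, shifting it inwards at the ends.
        lo = max(0, min(i - window // 2, n - window))
        hi = min(n, lo + window)
        peers = scores[lo:hi]
        mu, sigma = mean(peers), pstdev(peers)
        rescaled.append((scores[i] - mu) / sigma if sigma > 0 else 0.0)
    return rescaled


# Hypothetical raw centrality scores, oldest first: old items had more
# time to accumulate citations, so their raw scores are larger.
raw = [10.0, 9.0, 8.0, 7.0, 2.0, 1.0, 1.5, 4.0]
rescaled = rescale_by_age(raw)

top_raw = max(range(len(raw)), key=raw.__getitem__)
top_rescaled = max(range(len(raw)), key=rescaled.__getitem__)
print(top_raw, top_rescaled)
```

In this toy data the raw score favours the oldest item, while the age-normalized score favours the newest item, which stands out among its young peers.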
When is a Network a Network? Multi-Order Graphical Model Selection in Pathways and Temporal Networks
We introduce a framework for the modeling of sequential data capturing
pathways of varying lengths observed in a network. Such data are important,
e.g., when studying click streams in information networks, travel patterns in
transportation systems, information cascades in social networks, biological
pathways or time-stamped social interactions. While it is common to apply graph
analytics and network analysis to such data, recent works have shown that
temporal correlations can invalidate the results of such methods. This raises a
fundamental question: when is a network abstraction of sequential data
justified? Addressing this open question, we propose a framework which combines
Markov chains of multiple, higher orders into a multi-layer graphical model
that captures temporal correlations in pathways at multiple length scales
simultaneously. We develop a model selection technique to infer the optimal
number of layers of such a model and show that it outperforms previously used
Markov order detection techniques. An application to eight real-world data sets
on pathways and temporal networks shows that it allows us to infer graphical
models which capture both topological and temporal characteristics of such
data. Our work highlights fallacies of network abstractions and provides a
principled answer to the open question of when they are justified. Generalizing
network representations to multi-order graphical models, it opens perspectives
for new data mining and knowledge discovery algorithms.
Comment: 10 pages, 4 figures, 1 table, companion Python package pathpy available on gitHu
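A small example shows why a first-order network abstraction can misrepresent pathway data. The sketch below estimates Markov transition probabilities of a chosen order from observed paths. It uses only the standard library rather than the pathpy package, and it covers neither the multi-layer model nor the model selection step. The paths and the function name are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probs(paths, order):
    """Maximum-likelihood transition probabilities of a Markov chain of
    the given order, keyed by the tuple of the last `order` nodes."""
    counts = defaultdict(Counter)
    for path in paths:
        for i in range(order, len(path)):
            memory = tuple(path[i - order:i])
            counts[memory][path[i]] += 1
    return {
        memory: {nxt: c / sum(ctr.values()) for nxt, c in ctr.items()}
        for memory, ctr in counts.items()
    }


# Hypothetical pathways: whoever arrives at c from a continues to d,
# whoever arrives from b continues to e.
paths = [("a", "c", "d")] * 5 + [("b", "c", "e")] * 5

first = transition_probs(paths, order=1)
second = transition_probs(paths, order=2)

print(first[("c",)])       # d and e look equally likely
print(second[("a", "c")])  # d is certain once the previous step is known
```

The first-order model, which corresponds to the usual network view, suggests that paths a, c, e exist, although none was observed. The second-order model retains that temporal correlation.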
Network-based ranking in social systems: three challenges
Ranking algorithms are pervasive in our increasingly digitized societies,
with important real-world applications including recommender systems, search
engines, and influencer marketing practices. From a network science
perspective, network-based ranking algorithms solve fundamental problems
related to the identification of vital nodes for the stability and dynamics of
a complex system. Despite the ubiquitous and successful applications of these
algorithms, we argue that our understanding of their performance and their
applications to real-world problems face three fundamental challenges: (1)
rankings might be biased by various factors; (2) their effectiveness might be
limited to specific problems; and (3) agents' decisions driven by rankings
might result in potentially vicious feedback mechanisms and unhealthy systemic
consequences. Methods rooted in network science and agent-based modeling can
help us to understand and overcome these challenges.
Comment: Perspective article. 9 pages, 3 figure