Study of Heterogeneous Academic Networks
Academic networks are derived from scholarly data. They are heterogeneous in the sense that different types of nodes are involved, such as papers and authors. This dissertation studies such heterogeneous networks for measuring academic influence and learning vector representations of authors. Academic influence has traditionally been measured by the citation count and metrics derived from it. PageRank-based algorithms have been used to give higher weight to citations from more influential papers. A better approach is to add authors to the citation network so that the importance of authors and papers is evaluated recursively within the same framework. Based on such heterogeneous academic networks, we propose a new algorithm for ranking authors. Tested on two large networks, our method outperforms 10 other methods in terms of the number of award winners among top-ranked authors. We further improve the method by identifying and addressing the long reference issue. Moreover, we identify the mutual citation issue in paper networks and the self-citation issue in author networks. Our new method reduces the impact of these three issues and identifies more rising stars. To learn efficient author representations from heterogeneous academic networks, we propose a new embedding method called Stratified Embedding for Heterogeneous Networks (SEHN), based on Skip-Gram Negative Sampling (SGNS). We conduct random walks to generate traces that represent the structure of the network, then separate the traces into different layers so that each layer contains nodes of one type only. Such stratification improves on embeddings derived from the mixed traces by a large margin. SEHN improves on the state-of-the-art Metapath2vec by up to 24% at a certain point. The efficacy of stratification is also demonstrated on two classic network embedding algorithms, DeepWalk and Node2vec. The results are validated on two heterogeneous networks.
We also demonstrate that SEHN outperforms the embedding of homogeneous author networks that are induced from their corresponding heterogeneous networks.
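The stratification step described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the walk strategy or the training setup, so the toy author-paper network, the uniform random walk, and the helper names `random_walk`, `stratify`, and `node_type` are all assumptions made for illustration. The per-type sequences produced here would then be passed to an SGNS trainer, which is not shown.

```python
import random
from collections import defaultdict


def random_walk(adj, start, length, rng):
    """Uniform random walk of at most `length` nodes (an assumed walk strategy)."""
    walk = [start]
    while len(walk) < length:
        neighbors = adj[walk[-1]]
        if not neighbors:
            break
        walk.append(rng.choice(neighbors))
    return walk


def stratify(walk, node_type):
    """Split one mixed trace into per-type traces, preserving visit order."""
    layers = defaultdict(list)
    for node in walk:
        layers[node_type(node)].append(node)
    return dict(layers)


def node_type(node):
    # Hypothetical naming convention: authors start with "a", papers with "p".
    return "author" if node.startswith("a") else "paper"


# Toy bipartite author-paper network (illustrative data only).
edges = [("a1", "p1"), ("a2", "p1"), ("a2", "p2"), ("a3", "p2")]
adj = defaultdict(list)
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

rng = random.Random(0)
walk = random_walk(adj, "a1", 9, rng)
layers = stratify(walk, node_type)
print(walk)
print(layers)
```

Each layer contains nodes of a single type, so a skip-gram context window over the author layer only ever pairs authors with authors.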
Centrality Metric for Dynamic Networks
Centrality is an important notion in network analysis and is used to measure
the degree to which network structure contributes to the importance of a node
in a network. While many different centrality measures exist, most of them
apply to static networks. Most networks, on the other hand, are dynamic in
nature, evolving over time through the addition or deletion of nodes and edges.
A popular approach to analyzing such networks represents them by a static
network that aggregates all edges observed over some time period. This
approach, however, under- or overestimates the centrality of some nodes. We address
this problem by introducing a novel centrality metric for dynamic network
analysis. This metric exploits an intuition that in order for one node in a
dynamic network to influence another over some period of time, there must exist
a path that connects the source and destination nodes through intermediaries at
different times. We demonstrate on an example network that the proposed metric
leads to a very different ranking than analysis of an equivalent static
network. We use dynamic centrality to study a dynamic citation network and
contrast the results with those reached by static network analysis.
Comment: in KDD workshop on Mining and Learning in Graphs (MLG)
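The intuition behind this abstract, that influence requires a path whose edges occur at successive times, can be sketched as follows. This is not the paper's centrality metric, which the abstract does not define. It only contrasts time-respecting reachability with reachability in the aggregated static network, assuming directed edges given as hypothetical `(time, source, target)` tuples and strictly increasing times along a path.

```python
def temporal_reachable(edges, source):
    """Nodes reachable from `source` along paths with strictly increasing edge times."""
    arrival = {source: float("-inf")}  # earliest time each node is reached
    for t, u, v in sorted(edges):
        # An edge is usable only if its tail was reached strictly before time t.
        if u in arrival and arrival[u] < t and v not in arrival:
            arrival[v] = t
    return set(arrival) - {source}


def static_reachable(edges, source):
    """Nodes reachable from `source` when all edges are aggregated and time is ignored."""
    adj = {}
    for _, u, v in edges:
        adj.setdefault(u, []).append(v)
    seen, stack = {source}, [source]
    while stack:
        for v in adj.get(stack.pop(), []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen - {source}


# B -> C happens at time 1, before A -> B at time 2 (illustrative data only).
edges = [(1, "B", "C"), (2, "A", "B")]
print(temporal_reachable(edges, "A"))
print(static_reachable(edges, "A"))
```

In this toy example the static view lets A reach C through B, while the time-ordered view does not, because the B-to-C edge has already occurred by the time A reaches B.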
Network-based ranking in social systems: three challenges
Ranking algorithms are pervasive in our increasingly digitized societies,
with important real-world applications including recommender systems, search
engines, and influencer marketing practices. From a network science
perspective, network-based ranking algorithms solve fundamental problems
related to the identification of vital nodes for the stability and dynamics of
a complex system. Despite the ubiquitous and successful applications of these
algorithms, we argue that our understanding of their performance and their
applications to real-world problems face three fundamental challenges: (1)
rankings might be biased by various factors; (2) their effectiveness might be
limited to specific problems; and (3) agents' decisions driven by rankings
might result in potentially vicious feedback mechanisms and unhealthy systemic
consequences. Methods rooted in network science and agent-based modeling can
help us to understand and overcome these challenges.
Comment: Perspective article. 9 pages, 3 figures
Measuring academic influence: Not all citations are equal
The importance of a research article is routinely measured by counting how
many times it has been cited. However, treating all citations with equal weight
ignores the wide variety of functions that citations perform. We want to
automatically identify the subset of references in a bibliography that have a
central academic influence on the citing paper. For this purpose, we examine
the effectiveness of a variety of features for determining the academic
influence of a citation. By asking authors to identify the key references in
their own work, we created a data set in which citations were labeled according
to their academic influence. Using automatic feature selection with supervised
machine learning, we found a model for predicting academic influence that
achieves good performance on this data set using only four features. The best
features, among those we evaluated, were those based on the number of times a
reference is mentioned in the body of a citing paper. The performance of these
features inspired us to design an influence-primed h-index (the hip-index).
Unlike the conventional h-index, it weights citations by how many times a
reference is mentioned. According to our experiments, the hip-index is a better
indicator of researcher performance than the conventional h-index.
Collective Influence of Multiple Spreaders Evaluated by Tracing Real Information Flow in Large-Scale Social Networks
Identifying the most influential spreaders that maximize information flow is
a central question in network theory. Recently, a scalable method called
"Collective Influence (CI)" has been put forward through collective influence
maximization. In contrast to heuristic methods evaluating nodes' significance
separately, the CI method inspects the collective influence of multiple spreaders.
Although CI applies to the influence maximization problem in the percolation
model, it is still important to examine its efficacy in realistic information
spreading. Here, we examine real-world information flow in various social and
scientific platforms including American Physical Society, Facebook, Twitter and
LiveJournal. Since empirical data cannot be directly mapped to ideal
multi-source spreading, we leverage the behavioral patterns of users extracted
from data to construct "virtual" information spreading processes. Our results
demonstrate that the set of spreaders selected by CI can induce a larger scale of
information propagation. Moreover, local measures such as the number of connections
or citations are not necessarily the determining factors of nodes' importance
in realistic information spreading. This result has significance for ranking
scientists in scientific networks like the APS, where the commonly used number
of citations can be a poor indicator of the collective influence of authors in
the community.
Comment: 11 pages, 4 figures
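For orientation, the CI score is usually stated in the CI literature as CI_l(i) = (k_i - 1) * sum of (k_j - 1) over all nodes j at distance exactly l from i, where k is the node degree. This formula is not given in the abstract and is included here as background. The sketch below computes it with a breadth-first search on a small undirected toy graph. The graph and the function name `collective_influence` are illustrative assumptions, and the adaptive node-removal procedure that CI-based selection normally uses is not shown.

```python
from collections import deque


def collective_influence(adj, node, radius):
    """(k_node - 1) times the sum of (k_j - 1) over nodes j at distance exactly `radius`."""
    dist = {node: 0}
    queue = deque([node])
    frontier = []
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            frontier.append(u)
            continue  # do not expand beyond the ball boundary
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return (len(adj[node]) - 1) * sum(len(adj[j]) - 1 for j in frontier)


# Two degree-3 hubs (0 and 4) joined through a degree-2 bridge (1). Illustrative data only.
adj = {
    0: [1, 2, 3],
    1: [0, 4],
    2: [0],
    3: [0],
    4: [1, 5, 6],
    5: [4],
    6: [4],
}

for n in (0, 1, 4):
    print(n, len(adj[n]), collective_influence(adj, n, 1))
```

In this toy graph the bridge node has the lowest degree of the three but the highest CI at radius 1, which is consistent with the abstract's point that local measures alone do not necessarily determine a node's importance.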