
    Study of Heterogeneous Academic Networks

    Academic networks are derived from scholarly data. They are heterogeneous in the sense that different types of nodes are involved, such as papers and authors. This dissertation studies such heterogeneous networks for measuring academic influence and learning vector representations of authors. Academic influence has traditionally been measured by the citation count and metrics derived from it. PageRank-based algorithms have been used to give higher weight to citations from more influential papers. A better approach is to add authors into the citation network so that the importance of authors and papers is evaluated recursively within the same framework. Based on such heterogeneous academic networks, we propose a new algorithm for ranking authors. Tested on two large networks, we find that our method outperforms the other 10 methods in terms of the number of award winners among top-ranked authors. We further improve the method by finding and dealing with the long reference issue. Moreover, we identify the mutual citation issue in paper networks and the self-citation issue in author networks. Our new method can reduce the impact of the above three issues and identify more rising stars. To learn efficient author representations from heterogeneous academic networks, we propose a new embedding method called Stratified Embedding for Heterogeneous Networks (SEHN) based on Skip-Gram Negative Sampling (SGNS). We conduct Random Walks to generate the traces that represent the structure of the network, then separate the traces into different layers so that each layer contains the nodes of one type only. Such stratification improves embeddings that are derived from the mixed traces by a large margin. SEHN improves the state-of-the-art Metapath2vec by up to 24% at a certain point. The efficacy of stratification is also demonstrated on two classic network embedding algorithms, DeepWalk and Node2vec. The results are validated in two heterogeneous networks. We also demonstrate that SEHN outperforms the embedding of homogeneous author networks that are induced from their corresponding heterogeneous networks.
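
The stratification step described in this abstract (separating mixed random-walk traces into layers that each hold one node type) might look roughly like the sketch below. The abstract gives no implementation details, so the function name `stratify_traces`, the `node_type` lookup, the toy traces, and the choice to drop single-node sub-traces are all assumptions made for illustration, not the dissertation's method.

```python
from collections import defaultdict

def stratify_traces(traces, node_type):
    """Split each mixed-type walk into one sub-trace per node type.

    Node order within each sub-trace follows the order in the original walk.
    Sub-traces with a single node are dropped (an assumption: they carry no
    context pairs for a Skip-Gram style model).
    """
    layers = defaultdict(list)
    for trace in traces:
        per_type = defaultdict(list)
        for node in trace:
            per_type[node_type[node]].append(node)
        for kind, sub_trace in per_type.items():
            if len(sub_trace) > 1:
                layers[kind].append(sub_trace)
    return dict(layers)

# Hypothetical toy walks over authors (a*) and papers (p*).
node_type = {"a1": "author", "a2": "author", "a3": "author",
             "p1": "paper", "p2": "paper", "p3": "paper"}
traces = [["a1", "p1", "a2", "p2", "a3"], ["p2", "a3", "p3"]]
layers = stratify_traces(traces, node_type)
print(layers["author"])
print(layers["paper"])
```

Each layer could then be passed to any Skip-Gram trainer as a separate corpus; how the abstract's method combines or trains the layers is not stated here.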

    Centrality Metric for Dynamic Networks

    Centrality is an important notion in network analysis and is used to measure the degree to which network structure contributes to the importance of a node in a network. While many different centrality measures exist, most of them apply to static networks. Most networks, on the other hand, are dynamic in nature, evolving over time through the addition or deletion of nodes and edges. A popular approach to analyzing such networks represents them by a static network that aggregates all edges observed over some time period. This approach, however, under- or overestimates the centrality of some nodes. We address this problem by introducing a novel centrality metric for dynamic network analysis. This metric exploits an intuition that in order for one node in a dynamic network to influence another over some period of time, there must exist a path that connects the source and destination nodes through intermediaries at different times. We demonstrate on an example network that the proposed metric leads to a very different ranking than analysis of an equivalent static network. We use dynamic centrality to study a dynamic citation network and contrast results to those reached by static network analysis. Comment: in KDD workshop on Mining and Learning in Graphs (MLG)
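
The intuition in this abstract, that influence needs a path whose edges occur at successive times, can be illustrated with a small reachability sketch. This is not the paper's centrality metric. It only contrasts time-respecting reachability with reachability in the aggregated static network. The function names, the `(u, v, t)` edge format, and the requirement that times strictly increase along a path are assumptions.

```python
def temporal_reachable(edges, source):
    """Nodes reachable from source along paths with strictly increasing edge times.

    edges: iterable of directed contacts (u, v, t).
    """
    earliest = {source: float("-inf")}  # earliest known arrival time per node
    for u, v, t in sorted(edges, key=lambda e: e[2]):
        # Edges are scanned in time order, so the first arrival at v is the earliest.
        if u in earliest and earliest[u] < t and v not in earliest:
            earliest[v] = t
    return set(earliest) - {source}

def static_reachable(edges, source):
    """Nodes reachable from source when all edges are aggregated and times ignored."""
    adj = {}
    for u, v, _ in edges:
        adj.setdefault(u, set()).add(v)
    seen, stack = {source}, [source]
    while stack:
        for nxt in adj.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen - {source}

# Hypothetical contacts: b -> c happens before a -> b, so a cannot influence c.
edges = [("a", "b", 2), ("b", "c", 1)]
temporal = temporal_reachable(edges, "a")
static = static_reachable(edges, "a")
print(temporal, static)
```

In this toy case the aggregated network suggests that `a` reaches `c`, while the time-ordered view does not, which is the kind of overestimate the abstract attributes to static aggregation.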

    Network-based ranking in social systems: three challenges

    Ranking algorithms are pervasive in our increasingly digitized societies, with important real-world applications including recommender systems, search engines, and influencer marketing practices. From a network science perspective, network-based ranking algorithms solve fundamental problems related to the identification of vital nodes for the stability and dynamics of a complex system. Despite the ubiquitous and successful applications of these algorithms, we argue that our understanding of their performance and their applications to real-world problems faces three fundamental challenges: (1) rankings might be biased by various factors; (2) their effectiveness might be limited to specific problems; and (3) agents' decisions driven by rankings might result in potentially vicious feedback mechanisms and unhealthy systemic consequences. Methods rooted in network science and agent-based modeling can help us to understand and overcome these challenges. Comment: Perspective article. 9 pages, 3 figures

    Measuring academic influence: Not all citations are equal

    The importance of a research article is routinely measured by counting how many times it has been cited. However, treating all citations with equal weight ignores the wide variety of functions that citations perform. We want to automatically identify the subset of references in a bibliography that have a central academic influence on the citing paper. For this purpose, we examine the effectiveness of a variety of features for determining the academic influence of a citation. By asking authors to identify the key references in their own work, we created a data set in which citations were labeled according to their academic influence. Using automatic feature selection with supervised machine learning, we found a model for predicting academic influence that achieves good performance on this data set using only four features. The best features, among those we evaluated, were those based on the number of times a reference is mentioned in the body of a citing paper. The performance of these features inspired us to design an influence-primed h-index (the hip-index). Unlike the conventional h-index, it weights citations by how many times a reference is mentioned. According to our experiments, the hip-index is a better indicator of researcher performance than the conventional h-index.
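
A rough sketch of the contrast between the conventional h-index and a mention-weighted variant follows. The abstract only says that the hip-index weights citations by how many times a reference is mentioned, so the exact weighting used here (each citation counts as its number of in-text mentions) and the names `h_index`, `hip_index`, and `mentions_per_paper` are assumptions.

```python
def h_index(scores):
    """Largest h such that at least h papers have a score of at least h."""
    h = 0
    for rank, score in enumerate(sorted(scores, reverse=True), start=1):
        if score >= rank:
            h = rank
        else:
            break
    return h

def hip_index(mentions_per_paper):
    """h-index over mention-weighted citation counts (assumed weighting).

    mentions_per_paper: one list per paper by the researcher, holding the
    number of in-text mentions that paper receives in each citing paper.
    """
    return h_index(sum(mentions) for mentions in mentions_per_paper)

# Hypothetical researcher with three papers.
mentions_per_paper = [[3, 1], [2, 1], [4]]
conventional = h_index(len(m) for m in mentions_per_paper)  # plain citation counts
weighted = hip_index(mentions_per_paper)
print(conventional, weighted)
```

Here the third paper has a single citation, but that citing paper mentions it four times, which lifts the weighted index above the conventional one.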

    Collective Influence of Multiple Spreaders Evaluated by Tracing Real Information Flow in Large-Scale Social Networks

    Identifying the most influential spreaders that maximize information flow is a central question in network theory. Recently, a scalable method called "Collective Influence (CI)" has been put forward through collective influence maximization. In contrast to heuristic methods that evaluate nodes' significance separately, the CI method inspects the collective influence of multiple spreaders. Although CI applies to the influence maximization problem in the percolation model, it is still important to examine its efficacy in realistic information spreading. Here, we examine real-world information flow in various social and scientific platforms including American Physical Society, Facebook, Twitter and LiveJournal. Since empirical data cannot be directly mapped to ideal multi-source spreading, we leverage the behavioral patterns of users extracted from data to construct "virtual" information spreading processes. Our results demonstrate that the set of spreaders selected by CI can induce a larger scale of information propagation. Moreover, local measures such as the number of connections or citations are not necessarily the determining factors of nodes' importance in realistic information spreading. This result has significance for ranking scientists in scientific networks like the APS, where the commonly used number of citations can be a poor indicator of the collective influence of authors in the community. Comment: 11 pages, 4 figures