15 research outputs found
On rank statistics of PageRank and MarkovRank
The well-known statistic PageRank was created in 1998 by co-founders of
Google, Sergey Brin and Larry Page, to optimize the ranking of websites for
their search engine outcomes. It is computed using an iterative algorithm,
based on the idea that nodes with a larger number of incoming edges are more
important. Google's PageRank involves some information from "aliens": 15% of
the information is regarded as connections from outside the network system
under consideration. Without this information from "aliens", Google's PageRank
could not be well-defined.
In this paper, seeking a stable statistic which is "close" to an "intrinsic"
version of PageRank, we introduce a new statistic called MarkovRank. Special
attention is paid to the comparison of rank statistics among
standard-PageRank, "intrinsic-PageRank" and MarkovRank, and our conclusion is
that the rank statistic of MarkovRank, which is always well-defined, is
identical to that of "intrinsic-PageRank", as far as the latter is
well-defined.
Comment: 16 pages, 4 figures
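The iterative computation of standard PageRank mentioned in the abstract above can be sketched as a power iteration. This is a minimal illustration, not the paper's MarkovRank: it assumes a damping factor of 0.85 (so the remaining 15% is the uniform "outside" contribution), uniform redistribution of the mass held by nodes without outgoing edges, and a hypothetical four-node toy graph. The function name `pagerank` is chosen here for illustration.

```python
def pagerank(out_links, damping=0.85, tol=1e-12, max_iter=1000):
    """Power iteration for PageRank on a dict {node: [successor, ...]}."""
    nodes = sorted(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Assumption: nodes without out-links spread their mass uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        # The (1 - damping) share is the uniform contribution from "outside".
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in out_links[v]:
                new_rank[w] += damping * rank[v] / len(out_links[v])
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical toy graph: "c" collects the most incoming edges, "d" has none.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_graph)
ranking = sorted(scores, key=scores.get, reverse=True)
```

The rank statistic discussed in the abstract corresponds to `ranking`, the ordering induced by the scores, rather than to the score values themselves.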
Local Ranking Problem on the BrowseGraph
The "Local Ranking Problem" (LRP) is related to the computation of a
centrality-like rank on a local graph, where the scores of the nodes could
significantly differ from the ones computed on the global graph. Previous work
has studied LRP on the hyperlink graph but never on the BrowseGraph, namely a
graph where nodes are webpages and edges are browsing transitions. Recently,
this graph has received more and more attention in many different tasks such as
ranking, prediction and recommendation. However, a web-server has only the
browsing traffic performed on its pages (local BrowseGraph) and, as a
consequence, the local computation can lead to estimation errors, which hinders
the increasing number of applications in the state of the art. Also, although
the divergence between the local and global ranks has been measured, the
possibility of estimating such divergence using only local knowledge has been
mainly overlooked. These aspects are of great interest for online service
providers who want to: (i) gauge their ability to correctly assess the
importance of their resources only based on their local knowledge, and (ii)
take into account real user browsing fluxes that better capture the actual user
interest than the static hyperlink network. We study the LRP on a
BrowseGraph from a large news provider, considering as subgraphs the
aggregations of browsing traces of users coming from different domains. We show
that the distance between rankings can be accurately predicted based only on
structural information of the local graph, achieving an average
rank correlation as high as 0.8.
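To make the notion of divergence between local and global rankings concrete, the following sketch computes a rank correlation between two score assignments. The abstract does not say which coefficient is used; Kendall's tau (the tau-a variant, which ignores ties) is assumed here purely for illustration, and the page names and scores are hypothetical.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall tau-a between two score dicts, over the nodes they share."""
    common = sorted(set(scores_a) & set(scores_b))
    concordant = discordant = 0
    for u, v in combinations(common, 2):
        sign = (scores_a[u] - scores_a[v]) * (scores_b[u] - scores_b[v])
        if sign > 0:
            concordant += 1   # both rankings order the pair the same way
        elif sign < 0:
            discordant += 1   # the rankings disagree on this pair
    pairs = len(common) * (len(common) - 1) // 2
    return (concordant - discordant) / pairs


# Hypothetical scores for four pages on the global and on a local graph.
global_scores = {"p1": 0.40, "p2": 0.30, "p3": 0.20, "p4": 0.10}
local_scores = {"p1": 0.35, "p2": 0.15, "p3": 0.30, "p4": 0.20}
tau = kendall_tau(global_scores, local_scores)
```

A value of 1 means the local ranking reproduces the global one exactly, while values near 0 indicate that the local view carries little information about the global order.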
A local PageRank algorithm for evaluating the importance of scientific articles
We define a modified PageRank algorithm and the PR-score to measure
the influence of a single article by using its local co-citation network. We
also calculate the reaching probability and RP-score of a paper starting at
an arbitrary article of its co-citation network for the same purpose. We
highlight the advantages of our methods by applying them to the celebrated
paper of Jenő Egerváry that is underrated by the standard indices.
Keywords: Scientometric, PageRank, Ranking algorithms, Co-citation network
The agroecological transition in Senegal: transnational links and uneven empowerment.
Senegal is among the few African countries that have an important agroecological movement. This movement is strongly backed by a network of transnational partnerships and has recently matured into an advocacy coalition that promotes an agroecological transition at the national scale. In this article, we investigate the role of transnational links in the empowerment potential of agroecology. Combining the multi-level perspective of socio-technical transitions and Bourdieu's theory of practices, we conceptualize the agroecological network as a niche shaped by the circulation of different types of capital. Using social network analysis, we investigate the existing flows of resources and knowledge, as well as membership and advocacy links, to critically address within-niche empowerment processes. We show that transnational ties play a key role in building the niche's protective space and reveal a financial dependency of the agroecological niche on NGOs and international cooperation programmes based in Europe and North America. This configuration tends to favor the empowerment of NGOs instead of farmer unions, which only play a peripheral role in the network. However, the multiple-innovation focus of agroecology may open up prospects for more gradual but potentially radical change. Based on our findings, we suggest including core-periphery dynamics more explicitly in transition studies involving North-South relations, including circulation of capital, ideas and norms.
Supplementary Information
The online version contains supplementary material available at 10.1007/s10460-021-10247-5
Maximizing Routing Throughput with Applications to Delay Tolerant Networks
Many applications, such as providing healthcare to remote communities and spreading related information in Mobile Social Networks (MSNs), require efficient data routing and dissemination in Delay Tolerant Networks (DTNs) in order to maximize the throughput of data in the network. In this thesis, the feasibility of using boats in the Amazon Delta Riverine region as data mule nodes is investigated, and a robust data routing algorithm based on a fountain code approach is designed to ensure fast and timely data delivery despite unpredictable boat delays, breakdowns, and frequent transmission failures. Then, the scenario of providing healthcare in the Amazon Delta Region is extended to a general All-or-Nothing (Splittable) Multicommodity Flow (ANF) problem, and a polynomial-time constant-factor approximation algorithm based on a randomized rounding scheme is designed for the maximum throughput routing problem, with applications to DTNs. In an MSN, message content is closely related to users' preferences and can significantly affect the performance of data dissemination. An interest- and content-based algorithm is developed in which the contents of the messages, along with the structural information of the network, are taken into consideration when making message relay decisions in order to maximize data throughput in an MSN. Extensive experiments show the effectiveness of the proposed data dissemination algorithm by comparing it with state-of-the-art techniques.
Dissertation/Thesis
Doctoral Dissertation Computer Science 201
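The fountain code idea mentioned in the abstract above can be illustrated with a small random linear fountain code over GF(2). This is a generic textbook-style sketch, not the thesis's algorithm: the sender keeps emitting XOR combinations of the source blocks, and the receiver decodes as soon as the symbols that survived the lossy channel span all blocks. The function names, the 50% loss rate, the seed, and the four one-byte blocks are all assumptions made for illustration.

```python
import random


def encode_symbol(blocks, rng):
    """XOR a random non-empty subset of the source blocks (subset as bitmask)."""
    k = len(blocks)
    mask = rng.randrange(1, 1 << k)
    payload = 0
    for i in range(k):
        if (mask >> i) & 1:
            payload ^= blocks[i]
    return mask, payload


def try_decode(symbols, k):
    """Gaussian elimination over GF(2); returns the blocks, or None if rank < k."""
    pivots = {}  # highest set bit -> (mask, payload)
    for mask, payload in symbols:
        for bit in reversed(range(k)):
            if not (mask >> bit) & 1:
                continue
            if bit in pivots:
                pivot_mask, pivot_payload = pivots[bit]
                mask ^= pivot_mask
                payload ^= pivot_payload
            else:
                pivots[bit] = (mask, payload)
                break
    if len(pivots) < k:
        return None
    blocks = [0] * k
    for bit in range(k):  # back-substitute from the lowest block upwards
        mask, payload = pivots[bit]
        for lower in range(bit):
            if (mask >> lower) & 1:
                payload ^= blocks[lower]
        blocks[bit] = payload
    return blocks


rng = random.Random(7)
source = [0x48, 0x65, 0x6C, 0x70]  # hypothetical one-byte source blocks
received = []
decoded = None
while decoded is None:
    symbol = encode_symbol(source, rng)
    if rng.random() < 0.5:  # assumed loss: the symbol never arrives
        continue
    received.append(symbol)
    decoded = try_decode(received, len(source))
```

The receiver never needs to request a specific lost symbol: any sufficiently large set of received symbols will do, which is why this family of codes suits links with unpredictable delays and failures.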
Distributed Graph Neural Network Training: A Survey
Graph neural networks (GNNs) are a type of deep learning model that is
trained on graphs and has been successfully applied in various domains.
Despite the effectiveness of GNNs, it is still challenging for GNNs to
efficiently scale to large graphs. As a remedy, distributed computing is a
promising solution for training large-scale GNNs, since it is able to provide
abundant computing resources. However, dependencies in the graph structure
increase the difficulty of achieving high-efficiency distributed GNN training,
which suffers from massive communication and workload imbalance. In recent
years, many efforts have been made on distributed GNN training, and an array of
training algorithms and systems have been proposed. Yet, there is a lack of
systematic review on the optimization techniques for the distributed execution
of GNN training. In this survey, we analyze three major challenges in
distributed GNN training: massive feature communication, the loss of
model accuracy, and workload imbalance. Then we introduce a new taxonomy for the
optimization techniques in distributed GNN training that address the above
challenges. The new taxonomy classifies existing techniques into four
categories: GNN data partition, GNN batch generation, GNN execution
model, and GNN communication protocol. We carefully discuss the techniques in
each category. In the end, we summarize existing distributed GNN systems for
multi-GPUs, GPU-clusters and CPU-clusters, respectively, and discuss
future directions for distributed GNN training.
Graph Deep Learning: Methods and Applications
The past few years have seen the growing prevalence of deep neural networks in various application domains, including image processing, computer vision, speech recognition, machine translation, self-driving cars, game playing, social networks, bioinformatics, and healthcare. Due to its broad applications and strong performance, deep learning, a subfield of machine learning and artificial intelligence, is changing everyone's life. Graph learning, which learns knowledge from graph-structured data, has been another hot field in the machine learning and data mining communities. Examples of graph learning range from social network analysis such as community detection and link prediction, to relational machine learning such as knowledge graph completion and recommender systems, to multi-graph tasks such as graph classification and graph generation. An emerging new field, graph deep learning, aims at applying deep learning to graphs. To deal with graph-structured data, graph neural networks (GNNs), which directly take graphs as input and output graph/node representations, have been invented in recent years. Although GNNs have shown performance superior to traditional methods in tasks such as semi-supervised node classification, there still exists a wide range of other important graph learning problems where either the applicability of GNNs has not been explored or GNNs perform less satisfactorily. In this dissertation, we dive deeper into the field of graph deep learning. By developing new algorithms, architectures and theories, we push the boundaries of graph neural networks to a much wider range of graph learning problems.
The problems we have explored include: 1) graph classification; 2) medical ontology embedding; 3) link prediction; 4) recommender systems; 5) graph generation; and 6) graph structure optimization. We first focus on two graph representation learning problems: graph classification and medical ontology embedding. For graph classification, we develop a novel deep GNN architecture which aggregates node features through a novel SortPooling layer that replaces the simple summing used in previous works. We demonstrate its state-of-the-art graph classification performance on benchmark datasets. For medical ontology embedding, we propose a novel hierarchical attention propagation model, which uses an attention mechanism to learn embeddings of medical concepts from hierarchically structured medical ontologies such as ICD-9 and CCS. We validate the learned embeddings on sequential procedure/diagnosis prediction tasks with real patient data. Then we investigate the potential of GNNs for predicting relations, specifically link prediction and recommender systems. For link prediction, we first develop a theory unifying various traditional link prediction heuristics, and then design a framework to automatically learn suitable heuristics from a given network based on GNNs. Our model shows unprecedentedly strong link prediction performance, significantly outperforming all traditional methods. For recommender systems, we propose a novel graph-based matrix completion model, which uses a GNN to learn graph structure features from the bipartite graph formed by user and item interactions. Our model not only outperforms various matrix completion baselines, but also demonstrates excellent transfer learning ability: a model trained on MovieLens can be directly used to predict Douban movie ratings with high performance. Finally, we explore the applicability of GNNs to graph generation and graph structure optimization.
We focus on a specific type of graph which usually carries computations, namely directed acyclic graphs (DAGs). We develop a variational autoencoder (VAE) for DAGs and prove that it can injectively map computations into a latent space. This injectivity allows us to perform optimization in the continuous latent space instead of the original discrete structure space. We then apply our VAE to two types of DAGs, neural network architectures and Bayesian networks. Experiments show that our model not only generates novel and valid DAGs, but also finds high-quality neural architectures and Bayesian networks by performing Bayesian optimization in its latent space.
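The abstract above mentions a SortPooling layer that replaces simple summing of node features. A rough, framework-free sketch of the general idea follows. It assumes that nodes are ordered by their last feature channel in descending order and that the result is truncated or zero-padded to a fixed number of rows `k`; tie-breaking rules and the learned layers around it are omitted. The function name `sort_pooling` and the example features are hypothetical.

```python
def sort_pooling(node_features, k):
    """Order node feature rows by their last channel (descending), then
    truncate or zero-pad so that exactly k rows are returned."""
    if not node_features:
        raise ValueError("node_features must contain at least one row")
    width = len(node_features[0])
    ordered = sorted(node_features, key=lambda row: row[-1], reverse=True)
    kept = ordered[:k]
    # Zero-pad when the graph has fewer than k nodes.
    kept += [[0.0] * width for _ in range(k - len(kept))]
    return kept


# Hypothetical node features for a three-node graph (two channels per node).
features = [[1.0, 0.2], [2.0, 0.9], [3.0, 0.5]]
top_two = sort_pooling(features, 2)
padded = sort_pooling(features, 4)
```

Unlike summing, which collapses all nodes into one vector, this keeps a fixed-size, consistently ordered set of node rows, so graphs of different sizes yield inputs of the same shape for later layers.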