Distributed Temporal Link Prediction Algorithm Based on Label Propagation
Link prediction has steadily become an important research topic in the area of complex networks. However, current link prediction algorithms typically neglect the evolution process, and they tend to exhibit low accuracy and scalability when applied to large-scale networks. In this article, we propose a novel distributed temporal link prediction algorithm based on label propagation (DTLPLP), governed by the dynamical properties of the interactions between nodes. In particular, nodes are associated with labels, which include details of their sources and the corresponding similarity value. When such labels are propagated across neighbouring nodes, they are updated based on the weights of the incident links, and the values from the same source node are aggregated to evaluate the scores of links in the predicted network. Furthermore, DTLPLP has been designed to be distributed and parallelised, and is thus suitable for large-scale network analysis. As part of the validation process, we have designed a prototype system developed in Pregel, which is a distributed network analysis framework. Experiments are conducted on the Enron e-mails and the General Relativity and Quantum Cosmology Scientific Collaboration networks. The experimental results show that, compared to most link prediction algorithms, DTLPLP offers enhanced accuracy, stability and scalability.
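The propagation step described in this abstract can be illustrated with a minimal single-machine sketch. It is not the DTLPLP algorithm itself: it ignores the temporal component and the Pregel distribution, and the multiplicative update rule, the function name `propagate_labels`, and the `steps` parameter are all assumptions made for illustration. Each node starts with one label naming itself as the source; labels are passed to neighbours, scaled by the link weight, and values arriving from the same source are summed into a link score.

```python
from collections import defaultdict


def propagate_labels(weights, steps=2):
    """Score node pairs by propagating (source, value) labels.

    weights: dict mapping an undirected edge (u, v) to a weight.
    Returns a dict mapping (source, target) to an aggregated score.
    Hypothetical simplification, not the published DTLPLP update rule.
    """
    adj = defaultdict(dict)
    for (u, v), w in weights.items():
        adj[u][v] = w
        adj[v][u] = w

    # Every node initially holds a single label: itself, with value 1.0.
    labels = {n: {n: 1.0} for n in adj}
    scores = defaultdict(float)

    for _ in range(steps):
        incoming = {n: defaultdict(float) for n in adj}
        for u, held in labels.items():
            for v, w in adj[u].items():
                for source, value in held.items():
                    if source != v:  # do not send a label back to its source
                        # Assumed update: scale by the incident link weight
                        # and sum values that share the same source.
                        incoming[v][source] += value * w
        labels = {n: dict(d) for n, d in incoming.items()}
        for n, held in labels.items():
            for source, value in held.items():
                scores[(source, n)] += value
    return dict(scores)


# Path graph a - b - c: the unlinked pair (a, c) receives a score
# through the two-step path via b.
scores = propagate_labels({("a", "b"): 0.5, ("b", "c"): 0.5})
```

In this sketch, pairs that are not yet linked but are reachable within `steps` hops receive a non-zero score, which is the quantity a link predictor would rank.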
DHLP 1&2: Giraph based distributed label propagation algorithms on heterogeneous drug-related networks
Background and Objective: Heterogeneous complex networks are large graphs
consisting of different types of nodes and edges. The knowledge extraction from
these networks is complicated. Moreover, the scale of these networks is
steadily increasing. Thus, scalable methods are required. Methods: In this
paper, two distributed label propagation algorithms for heterogeneous networks,
namely DHLP-1 and DHLP-2, have been introduced. Biological networks are one type
of the heterogeneous complex networks. As a case study, we have measured the
efficiency of our proposed DHLP-1 and DHLP-2 algorithms on a biological network
consisting of drugs, diseases, and targets. The subject we have studied in this
network is drug repositioning but our algorithms can be used as general methods
for heterogeneous networks other than the biological network. Results: We
compared the proposed algorithms with similar non-distributed versions, namely
MINProp and Heter-LP. The experiments revealed the good performance of
the algorithms in terms of running time and accuracy. Comment: Source code available for Apache Giraph on Hadoop
On the Troll-Trust Model for Edge Sign Prediction in Social Networks
In the problem of edge sign prediction, we are given a directed graph
(representing a social network), and our task is to predict the binary labels
of the edges (i.e., the positive or negative nature of the social
relationships). Many successful heuristics for this problem are based on the
troll-trust features, estimating at each node the fraction of outgoing and
incoming positive/negative edges. We show that these heuristics can be
understood, and rigorously analyzed, as approximators to the Bayes optimal
classifier for a simple probabilistic model of the edge labels. We then show
that the maximum likelihood estimator for this model approximately corresponds
to the predictions of a Label Propagation algorithm run on a transformed
version of the original social graph. Extensive experiments on a number of
real-world datasets show that this algorithm is competitive against
state-of-the-art classifiers in terms of both accuracy and scalability.
Finally, we show that troll-trust features can also be used to derive online
learning algorithms which have theoretical guarantees even when edges are
adversarially labeled. Comment: v5: accepted to AISTATS 201
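As a rough illustration of the troll-trust features mentioned in this abstract, the sketch below computes, for each node, the fraction of its outgoing edges that are negative and the fraction of its incoming edges that are negative. The names `troll_trust_features` and `predict_sign`, the `+1`/`-1` sign encoding, and the thresholding rule are assumptions made for this example; they are not taken from the paper.

```python
from collections import defaultdict


def troll_trust_features(signed_edges):
    """Compute per-node negative-edge fractions.

    signed_edges: iterable of (source, target, sign) with sign in {+1, -1}.
    Returns (trollness, untrustworthiness):
      trollness[u]         = fraction of u's outgoing edges that are negative
      untrustworthiness[v] = fraction of v's incoming edges that are negative
    """
    out_neg, out_total = defaultdict(int), defaultdict(int)
    in_neg, in_total = defaultdict(int), defaultdict(int)
    for u, v, sign in signed_edges:
        out_total[u] += 1
        in_total[v] += 1
        if sign < 0:
            out_neg[u] += 1
            in_neg[v] += 1
    trollness = {u: out_neg[u] / out_total[u] for u in out_total}
    untrustworthiness = {v: in_neg[v] / in_total[v] for v in in_total}
    return trollness, untrustworthiness


def predict_sign(u, v, trollness, untrustworthiness, threshold=0.5):
    """Hypothetical heuristic: predict -1 when the mean of the two
    features reaches the threshold; unseen nodes default to 0.5."""
    score = (trollness.get(u, 0.5) + untrustworthiness.get(v, 0.5)) / 2
    return -1 if score >= threshold else 1


edges = [("a", "b", -1), ("a", "c", 1), ("d", "b", -1)]
trollness, untrustworthiness = troll_trust_features(edges)
```

With these three training edges, node `d` has only negative outgoing edges and node `b` only negative incoming edges, so the heuristic predicts a negative sign for a new edge from `d` to `b`.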
Link-Prediction Enhanced Consensus Clustering for Complex Networks
Many real networks that are inferred or collected from data are incomplete
due to missing edges. Missing edges can be inherent to the dataset (Facebook
friend links will never be complete) or the result of sampling (one may only
have access to a portion of the data). The consequence is that downstream
analyses that consume the network will often yield less accurate results than
if the edges were complete. Community detection algorithms, in particular,
often suffer when critical intra-community edges are missing. We propose a
novel consensus clustering algorithm to enhance community detection on
incomplete networks. Our framework utilizes existing community detection
algorithms that process networks imputed by our link prediction based
algorithm. The framework then merges their multiple outputs into a final
consensus output. On average our method boosts performance of existing
algorithms by 7% on artificial data and 17% on ego networks collected from
Facebook.