Link Prediction in Complex Networks: A Survey
Link prediction in complex networks has attracted increasing attention from
both physical and computer science communities. The algorithms can be used to
extract missing information, identify spurious interactions, evaluate network
evolving mechanisms, and so on. This article summarizes recent progress on
link prediction algorithms, emphasizing the contributions from physical
perspectives and approaches, such as the random-walk-based methods and the
maximum likelihood methods. We also introduce three typical applications:
reconstruction of networks, evaluation of network evolving mechanisms, and
classification of partially labelled networks. Finally, we introduce some
applications and outline future challenges of link prediction algorithms.
Comment: 44 pages, 5 figures
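As a rough illustration of the similarity-based family of link prediction algorithms that such a survey covers, the sketch below scores every unconnected node pair with two well-known local indices, common neighbors and resource allocation, and ranks the pairs by score. The toy graph and the helper names (`build_adjacency`, `rank_missing_links`) are assumptions made for this example and do not come from the survey.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbors(adj, x, y):
    """Common-neighbors index: number of shared neighbors of x and y."""
    return len(adj[x] & adj[y])


def resource_allocation(adj, x, y):
    """Resource-allocation index: shared neighbors weighted by 1/degree."""
    return sum(1.0 / len(adj[z]) for z in adj[x] & adj[y])


def rank_missing_links(adj, score):
    """Rank all currently unconnected pairs by descending similarity score."""
    candidates = [
        (x, y) for x, y in combinations(sorted(adj), 2) if y not in adj[x]
    ]
    return sorted(candidates, key=lambda pair: score(adj, *pair), reverse=True)


# Hypothetical toy network, used only for illustration.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
adj = build_adjacency(edges)
ranking = rank_missing_links(adj, common_neighbors)
print(ranking[0])  # the pair judged most likely to be a missing link
```

Passing `resource_allocation` instead of `common_neighbors` changes only the scoring function, which is why these indices are usually compared under one shared ranking procedure.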
ALPINE: Active Link Prediction using Network Embedding
Many real-world problems can be formalized as predicting links in a partially observed network. Examples include Facebook friendship suggestions, consumer-product recommendations, and the identification of hidden interactions between actors in a crime network. Several link prediction algorithms, notably those recently introduced using network embedding, are capable of doing this by just relying on the observed part of the network.
Often, the link status of a node pair can be queried, which can be used as additional information by the link prediction algorithm. Unfortunately, such queries can be expensive or time-consuming, mandating the careful consideration of which node pairs to query. In this paper we estimate the improvement in link prediction accuracy after querying any particular node pair, to use in an active learning setup.
Specifically, we propose ALPINE (Active Link Prediction usIng Network Embedding), the first method to achieve this for link prediction based on network embedding. To this end, we generalized the notion of V-optimality from experimental design to this setting, as well as more basic active learning heuristics originally developed in standard classification settings. Empirical results on real data show that ALPINE is scalable, and boosts link prediction accuracy with far fewer queries.
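The active learning loop described above can be sketched with one of the basic heuristics from standard classification, uncertainty sampling. This is not ALPINE's V-optimality criterion: the code simply queries the node pairs whose predicted link probability is closest to 0.5. The probabilities, the `oracle` callable, and all function names are hypothetical placeholders for whatever link predictor and query mechanism is actually used.

```python
def select_queries(link_probs, budget):
    """Return the `budget` unobserved pairs with probability closest to 0.5."""
    ranked = sorted(link_probs, key=lambda pair: abs(link_probs[pair] - 0.5))
    return ranked[:budget]


def query_and_update(observed, link_probs, oracle, budget):
    """Query the selected pairs and move them from unobserved to observed."""
    for pair in select_queries(link_probs, budget):
        observed[pair] = oracle(pair)  # True if the link exists, else False
        del link_probs[pair]           # the pair is no longer unobserved
    return observed


# Hypothetical predicted probabilities for four unobserved node pairs.
link_probs = {(0, 1): 0.95, (0, 2): 0.52, (1, 3): 0.10, (2, 3): 0.45}
true_links = {(0, 1), (0, 2)}          # ground truth, assumed for the demo
oracle = lambda pair: pair in true_links

chosen = select_queries(link_probs, budget=2)
observed = query_and_update({}, link_probs, oracle, budget=2)
print(chosen, observed)
```

In a full loop, the link predictor would be retrained on `observed` after each round and the probabilities recomputed before the next batch of queries.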
A Model of Consistent Node Types in Signed Directed Social Networks
Signed directed social networks, in which the relationships between users can
be either positive (indicating relations such as trust) or negative (indicating
relations such as distrust), are increasingly common. Thus the interplay
between positive and negative relationships in such networks has become an
important research topic. Most recent investigations focus upon edge sign
inference using structural balance theory or social status theory. Neither of
these two theories, however, can explain an observed edge sign well when the
two nodes connected by this edge do not share a common neighbor (e.g., common
friend). In this paper we develop a novel approach to handle this situation by
applying a new model for node types. Initially, we analyze the local node
structure in a fully observed signed directed network, inferring underlying
node types. The sign of an edge between two nodes must be consistent with their
types; this explains edge signs well even when there are no common neighbors.
We show, moreover, that our approach can be extended to incorporate directed
triads, when they exist, just as in models based upon structural balance or
social status theory. We compute Bayesian node types within empirical studies
based upon partially observed Wikipedia, Slashdot, and Epinions networks in
which the largest network (Epinions) has 119K nodes and 841K edges. Our
approach yields better performance than state-of-the-art approaches for these
three signed directed networks.
Comment: To appear in the IEEE/ACM International Conference on Advances in
Social Network Analysis and Mining (ASONAM), 201
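To make the limitation mentioned in this abstract concrete, the sketch below predicts an edge sign with a simplified, undirected form of structural balance: each common neighbor votes with the product of its two edge signs. It returns `None` when the two nodes share no neighbor, which is exactly the case the abstract says balance-based inference cannot handle. This is not the paper's node-type model; the function name and the toy signs are assumptions for illustration.

```python
def predict_sign_by_balance(signs, u, v):
    """Predict the sign of edge (u, v) from triads through common neighbors.

    `signs` maps undirected pairs (a, b) to +1 or -1. Returns +1, -1, or
    None when no common neighbor exists or the votes cancel out.
    """
    nbrs = {}
    for (a, b), s in signs.items():
        nbrs.setdefault(a, {})[b] = s
        nbrs.setdefault(b, {})[a] = s

    common = set(nbrs.get(u, {})) & set(nbrs.get(v, {}))
    if not common:
        return None  # balance theory has no triad to reason about

    # A balanced triad has a positive product of its three edge signs,
    # so each common neighbor w votes for sign(u, w) * sign(v, w).
    vote = sum(nbrs[u][w] * nbrs[v][w] for w in common)
    if vote == 0:
        return None
    return 1 if vote > 0 else -1


# Hypothetical signed network: a trusts b, b distrusts c, c trusts d.
signs = {("a", "b"): 1, ("b", "c"): -1, ("c", "d"): 1}
print(predict_sign_by_balance(signs, "a", "c"))  # one triad through b
print(predict_sign_by_balance(signs, "a", "d"))  # no common neighbor
```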
Uncovering missing links with cold ends
To evaluate the performance of prediction of missing links, the known data
are randomly divided into two parts, the training set and the probe set. We
argue that this straightforward and standard method may lead to severe bias,
since in real biological and information networks, missing links are more
likely to be links connecting low-degree nodes. We therefore study how to
uncover missing links between low-degree nodes; that is, the links in the probe
set have lower degree products than those in a random sample. Experimental analysis on ten
local similarity indices and four disparate real networks reveals a surprising
result that the Leicht-Holme-Newman index [E. A. Leicht, P. Holme, and M. E. J.
Newman, Phys. Rev. E 73, 026120 (2006)] performs the best, although it was
known to be one of the worst indices if the probe set is a random sampling of
all links. We further propose a parameter-dependent index, which considerably
improves the prediction accuracy. Finally, we show the relevance of the
proposed index on three real sampling methods.
Comment: 16 pages, 5 figures, 6 tables
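The sketch below illustrates the two ingredients of this evaluation under stated assumptions. The Leicht-Holme-Newman index is taken in its local form, the number of common neighbors divided by the product of the two degrees. The probe set is built by simply taking the links with the smallest degree products, which is a simplification chosen for this example and not necessarily the sampling procedure used in the paper. The toy graph and function names are hypothetical.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def lhn_index(adj, x, y):
    """Local LHN index: common neighbors divided by the degree product."""
    nx, ny = adj.get(x, set()), adj.get(y, set())
    if not nx or not ny:
        return 0.0  # an isolated node has no similarity under this index
    return len(nx & ny) / (len(nx) * len(ny))


def cold_probe_split(edges, probe_size):
    """Split edges into (training, probe), probing the lowest degree products.

    Assumed simplification: the probe set is the `probe_size` links whose
    endpoint degree product is smallest, measured on the full network.
    """
    adj = build_adjacency(edges)
    ordered = sorted(edges, key=lambda e: len(adj[e[0]]) * len(adj[e[1]]))
    return ordered[probe_size:], ordered[:probe_size]


# Hypothetical toy network with one hub "h" and several low-degree nodes.
edges = [("h", "a"), ("h", "b"), ("h", "c"), ("h", "d"),
         ("a", "b"), ("c", "x"), ("d", "x")]
adj = build_adjacency(edges)

# Both pairs share two neighbors, but LHN favors the low-degree pair.
print(lhn_index(adj, "c", "d"), lhn_index(adj, "h", "x"))

train, probe = cold_probe_split(edges, probe_size=2)
print(probe)
```

Dividing by the degree product penalizes pairs that involve hubs, which gives one intuition for why such an index can do well when the probe links connect low-degree nodes.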