Entropy-based approach to missing-links prediction
Link-prediction is an active research field within network theory, aiming at
uncovering missing connections or predicting the emergence of future
relationships from the observed network structure. This paper represents our
contribution to the stream of research concerning missing links prediction.
Here, we propose an entropy-based method to predict a given percentage of
missing links, by identifying them with the most probable non-observed ones.
The probability coefficients are computed by solving suitably defined
null-models over the accessible network structure. Upon comparing our
likelihood-based, local method with the most popular algorithms over a set of
economic, financial and food networks, we find ours to perform best, as pointed
out by a number of statistical indicators (e.g. the precision and the area under
the ROC curve). Moreover, the entropy-based formalism adopted in the
present paper allows us to straightforwardly extend the link-prediction
exercise to directed networks as well, thus overcoming one of the main
limitations of current algorithms. The higher accuracy achievable by employing
these methods, together with their greater flexibility, makes them strong
competitors to available link-prediction algorithms.
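
The ranking idea in this abstract (score every non-observed pair with a null-model connection probability, then predict the most probable ones) can be illustrated with a minimal sketch. It is not the authors' method: instead of solving a maximum-entropy null model, it assumes the simple degree-based approximation p_ij = k_i k_j / (2L), capped at 1. The function name `rank_missing_links` and the parameter `top_fraction` (taken here as a fraction of the observed link count) are hypothetical.

```python
from itertools import combinations


def rank_missing_links(edges, top_fraction=0.1):
    """Rank non-observed pairs of an undirected graph by a degree-based score.

    Assumption: p_ij = k_i * k_j / (2L) stands in for the probabilities that
    a properly solved entropy-based null model would provide.
    """
    observed = {tuple(sorted(e)) for e in edges}

    # Node degrees in the observed network.
    degree = {}
    for u, v in observed:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    two_l = 2 * len(observed)
    scores = {}
    for u, v in combinations(sorted(degree), 2):
        if (u, v) not in observed:
            scores[(u, v)] = min(1.0, degree[u] * degree[v] / two_l)

    # Most probable pairs first; ties are broken by node labels.
    ranked = sorted(scores, key=lambda pair: (-scores[pair], pair))
    n_pred = max(1, round(top_fraction * len(observed)))
    return ranked[:n_pred], scores


edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
predicted, scores = rank_missing_links(edges, top_fraction=0.4)
print(predicted)
```

Capping the score at 1 is needed because the degree product can exceed 2L for hubs, which is one reason a properly solved null model is preferable to this approximation.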
Conditional network embeddings
Network Embeddings (NEs) map the nodes of a given network into $d$-dimensional Euclidean space $\mathbb{R}^d$. Ideally, this mapping places 'similar' nodes at nearby points, so that the NE can be used for purposes such as link prediction (if 'similar' means being 'more likely to be connected') or classification (if 'similar' means 'being more likely to have the same label'). In recent years various methods for NE have been introduced, all following a similar strategy: defining a notion of similarity between nodes (typically some distance measure within the network), a distance measure in the embedding space, and a loss function that penalizes large distances for similar nodes and small distances for dissimilar nodes.
A difficulty faced by existing methods is that certain networks are fundamentally hard to embed due to their structural properties: (approximate) multipartiteness, certain degree distributions, assortativity, etc. To overcome this, we introduce a conceptual innovation to the NE literature and propose to create Conditional Network Embeddings (CNEs): embeddings that maximally add information with respect to given structural properties (e.g. node degrees, block densities, etc.). We use a simple Bayesian approach to achieve this, and propose a block stochastic gradient descent algorithm for fitting it efficiently.
We demonstrate that CNEs are superior for link prediction and multi-label classification when compared to state-of-the-art methods, without adding significant mathematical or computational complexity. Finally, we illustrate the potential of CNE for network visualization.
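
The common strategy this abstract attributes to existing NE methods (a loss that penalizes large distances for similar nodes and small distances for dissimilar nodes) can be sketched as follows. This is a generic contrastive loss chosen for illustration, not the CNE objective, which the abstract describes as Bayesian; the names `contrastive_loss` and `margin` are hypothetical.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def contrastive_loss(embedding, similar_pairs, dissimilar_pairs, margin=1.0):
    """Sum of squared distances for similar pairs plus a squared hinge
    penalty for dissimilar pairs that lie closer than `margin`."""
    loss = 0.0
    for u, v in similar_pairs:
        loss += euclidean(embedding[u], embedding[v]) ** 2
    for u, v in dissimilar_pairs:
        gap = margin - euclidean(embedding[u], embedding[v])
        loss += max(0.0, gap) ** 2
    return loss


embedding = {"a": (0.0, 0.0), "b": (0.0, 0.0), "c": (3.0, 4.0)}
# 'a' and 'b' coincide and 'c' is far away, so this labelling costs nothing.
print(contrastive_loss(embedding, [("a", "b")], [("a", "c")]))
# The opposite labelling is penalized on both terms.
print(contrastive_loss(embedding, [("a", "c")], [("a", "b")]))
```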
Benchmarking Network Embedding Models for Link Prediction: Are We Making Progress?
Network embedding methods map a network's nodes to vectors in an embedding
space, in such a way that these representations are useful for estimating some
notion of similarity or proximity between pairs of nodes in the network. The
quality of these node representations is then showcased through results of
downstream prediction tasks. Commonly used benchmark tasks such as link
prediction, however, present complex evaluation pipelines and an abundance of
design choices. This, together with a lack of standardized evaluation setups,
can obscure the real progress in the field. In this paper, we aim to shed light
on the state of the art of network embedding methods for link prediction and
show, using a consistent evaluation pipeline, that only limited progress has been
made in recent years. The newly conducted benchmark that we present here,
including 17 embedding methods, also shows that many approaches are
outperformed even by simple heuristics. Finally, we argue that standardized
evaluation tools can remedy this situation and boost future progress in this
field.
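
To make the notion of a "simple heuristic" baseline and a consistent evaluation concrete, here is a minimal sketch that scores node pairs by their number of common neighbours and evaluates the scores with the area under the ROC curve on held-out links. The abstract does not say which heuristics or pipeline the benchmark uses, so the choice of common neighbours, the toy graph, and all function names are assumptions.

```python
def build_adjacency(edges):
    """Adjacency sets of an undirected graph given as (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbours(adj, u, v):
    """Number of neighbours shared by u and v in the training graph."""
    return len(adj.get(u, set()) & adj.get(v, set()))


def auc(pos_scores, neg_scores):
    """Probability that a held-out link outscores a non-link (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


train_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
held_out = [(1, 3)]          # links removed before training
non_links = [(0, 4), (0, 3)]  # pairs assumed to be true non-links

adj = build_adjacency(train_edges)
pos = [common_neighbours(adj, u, v) for u, v in held_out]
neg = [common_neighbours(adj, u, v) for u, v in non_links]
print(auc(pos, neg))
```

Fixing the train/test split and the sampled non-links once, and reusing them for every method, is the kind of design choice a standardized pipeline would pin down.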