ALPINE : Active Link Prediction using Network Embedding
Many real-world problems can be formalized as predicting links in a partially observed network. Examples include Facebook friendship suggestions, consumer-product recommendations, and the identification of hidden interactions between actors in a crime network. Several link prediction algorithms, notably those recently introduced using network embedding, are capable of doing this by just relying on the observed part of the network.
Often, the link status of a node pair can be queried, which can be used as additional information by the link prediction algorithm. Unfortunately, such queries can be expensive or time-consuming, mandating the careful consideration of which node pairs to query. In this paper we estimate the improvement in link prediction accuracy after querying any particular node pair, to use in an active learning setup.
Specifically, we propose ALPINE (Active Link Prediction usIng Network Embedding), the first method to achieve this for link prediction based on network embedding. To this end, we generalize the notion of V-optimality from experimental design to this setting, as well as more basic active learning heuristics originally developed in standard classification settings. Empirical results on real data show that ALPINE is scalable, and boosts link prediction accuracy with far fewer queries.
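The "more basic active learning heuristics" mentioned in the abstract can be illustrated with uncertainty sampling, a standard heuristic from classification. The sketch below is not ALPINE's V-optimality criterion; it only assumes that some link predictor has already produced a probability for each unobserved node pair. The function name and the input dictionary are hypothetical.

```python
def most_uncertain_pairs(link_probs, k):
    """Return the k node pairs whose predicted link probability is closest to 0.5.

    link_probs: dict mapping an unobserved node pair (u, v) to a predicted
    link probability in [0, 1] (assumed to come from any link predictor).
    """
    # Uncertainty sampling: a pair is most informative to query when the
    # model is least sure, i.e. when its probability is nearest to 0.5.
    ranked = sorted(link_probs, key=lambda pair: abs(link_probs[pair] - 0.5))
    return ranked[:k]


if __name__ == "__main__":
    probs = {(0, 1): 0.9, (0, 2): 0.55, (1, 2): 0.1, (2, 3): 0.48}
    print(most_uncertain_pairs(probs, 2))  # [(2, 3), (0, 2)]
```

In an active learning loop, the selected pairs would be queried, their true link status added to the observed network, and the predictor retrained before the next round.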
Predicting Social Links for New Users across Aligned Heterogeneous Social Networks
Online social networks have gained great success in recent years and many of
them involve multiple kinds of nodes and complex relationships. Among these
relationships, social links among users are of great importance. Many existing
link prediction methods focus on predicting social links that will appear in
the future among all users based upon a snapshot of the social network. In
real-world social networks, many new users join the service every
day. Predicting links for new users is more important. Different from
conventional link prediction problems, link prediction for new users is more
challenging due to the following reasons: (1) differences in information
distributions between new users and the existing active users (i.e., old
users); (2) lack of information from the new users in the network. We propose a
link prediction method called SCAN-PS (Supervised Cross Aligned Networks link
prediction with Personalized Sampling), to solve the link prediction problem
for new users with information transferred from both the existing active users
in the target network and other source networks through aligned accounts. We
propose a within-target-network personalized sampling method to process the
existing active users' information in order to accommodate the differences in
information distributions before the intra-network knowledge transfer. SCAN-PS
can also exploit information in other source networks, where the user accounts
are aligned with the target network. In this way, SCAN-PS can solve the
cold-start problem when information about these new users is totally absent
from the target network.
Comment: 11 pages, 10 figures, 4 tables
Principled Multilayer Network Embedding
Multilayer network analysis has become a vital tool for understanding
different relationships and their interactions in a complex system, where each
layer in a multilayer network depicts the topological structure of a group of
nodes corresponding to a particular relationship. The interactions among
different layers indicate how the interplay of different relations shapes the topology
of each layer. For a single-layer network, network embedding methods have been
proposed to project the nodes in a network into a continuous vector space with
a relatively small number of dimensions, where the space embeds the social
representations among nodes. These algorithms have been shown to perform
well on a variety of regular graph analysis tasks, such as link
prediction and multi-label classification. In this paper, by extending a
standard graph mining method to multilayer networks, we propose three methods
("network aggregation," "results aggregation" and "layer co-analysis") to
project a multilayer network into a continuous vector space. Our
evaluation shows that, compared with regular link prediction methods,
"layer co-analysis" achieves the best performance on most of the datasets,
while "network aggregation" and "results aggregation" also perform
better than regular link prediction methods.
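As a rough illustration of what a "network aggregation" step could look like, the sketch below merges the edge lists of all layers into one weighted, undirected edge set, which a single-layer embedding method could then consume. The abstract does not specify the merging rule, so weighting each edge by the number of layers it appears in is an assumption, and the function name is hypothetical.

```python
from collections import Counter


def aggregate_layers(layers):
    """Merge per-layer edge lists into a single weighted, undirected edge set.

    layers: iterable of edge lists, one per layer; each edge is a pair (u, v).
    Returns a dict mapping a canonical (min, max) edge to the number of
    layers that contain it (an assumed weighting, not taken from the paper).
    """
    weights = Counter()
    for edges in layers:
        # Ignore edge direction and count each edge at most once per layer.
        for edge in {tuple(sorted(e)) for e in edges}:
            weights[edge] += 1
    return dict(weights)


if __name__ == "__main__":
    friendship = [(1, 2), (2, 3)]
    coworker = [(2, 1), (3, 4)]
    print(aggregate_layers([friendship, coworker]))
    # {(1, 2): 2, (2, 3): 1, (3, 4): 1}
```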
Algebraic shortcuts for leave-one-out cross-validation in supervised network inference
Supervised machine learning techniques have traditionally been very successful at reconstructing biological networks, such as protein-ligand interaction, protein-protein interaction and gene regulatory networks. Many supervised techniques for network prediction use linear models on a possibly nonlinear pairwise feature representation of edges. Recently, much emphasis has been placed on the correct evaluation of such supervised models. It is vital to distinguish between using a model to predict new interactions in a given network and using it to predict interactions for a new vertex not present in the original network. This distinction matters because (i) the performance might differ dramatically between the prediction settings and (ii) tuning the model hyperparameters to obtain the best possible model depends on the setting of interest. Specific cross-validation schemes need to be used to assess the performance in such different prediction settings. In this work we discuss a state-of-the-art kernel-based network inference technique called two-step kernel ridge regression. We show that this regression model can be trained efficiently, with a time complexity scaling with the number of vertices rather than the number of edges. Furthermore, this framework leads to a series of cross-validation shortcuts that allow one to rapidly estimate the model performance for any relevant network prediction setting. This allows computational biologists to fully assess the capabilities of their models.
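A minimal sketch of one such shortcut, for the "new vertex" setting, follows. It assumes the two-step predictions take the form F = H_K Y H_G, with hat matrices H_K = K (K + lam_u I)^{-1} and H_G = G (G + lam_v I)^{-1} built from vertex kernels K and G; this form, and all function and parameter names, are assumptions for illustration rather than the paper's own code. The shortcut applies the classical leave-one-out identity for kernel ridge regression to the row step and is checked against explicit retraining.

```python
import numpy as np


def hat_matrix(K, lam):
    """H = K (K + lam I)^{-1} for a symmetric PSD kernel matrix K, lam > 0."""
    n = K.shape[0]
    # K and (K + lam I)^{-1} commute, so solving from the left is equivalent.
    return np.linalg.solve(K + lam * np.eye(n), K)


def two_step_krr_predict(K, G, Y, lam_u, lam_v):
    """In-sample two-step KRR predictions, assumed to be H_K Y H_G."""
    return hat_matrix(K, lam_u) @ Y @ hat_matrix(G, lam_v)


def loo_new_row_vertex(K, G, Y, lam_u, lam_v):
    """Leave-one-row-out predictions without retraining.

    Row i of the result is the prediction for row vertex i from a model
    trained with that entire row of Y held out.
    """
    H_k = hat_matrix(K, lam_u)
    Z = Y @ hat_matrix(G, lam_v)   # column step, applied row by row
    h = np.diag(H_k)[:, None]      # leverages of the row step
    # Classical KRR identity: f_i^{(-i)} = (f_i - h_ii z_i) / (1 - h_ii).
    return (H_k @ Z - h * Z) / (1.0 - h)


def brute_force_row(K, G, Y, lam_u, lam_v, i):
    """Reference: retrain the row step without vertex i, then predict row i."""
    keep = np.arange(K.shape[0]) != i
    Z = Y[keep] @ hat_matrix(G, lam_v)
    K_train = K[np.ix_(keep, keep)]
    alpha = np.linalg.solve(K_train + lam_u * np.eye(keep.sum()), Z)
    return K[i, keep] @ alpha


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, W = rng.normal(size=(6, 3)), rng.normal(size=(5, 2))
    K, G = X @ X.T, W @ W.T        # linear kernels on synthetic features
    Y = rng.normal(size=(6, 5))    # synthetic interaction matrix
    fast = loo_new_row_vertex(K, G, Y, 0.5, 0.7)
    slow = np.vstack([brute_force_row(K, G, Y, 0.5, 0.7, i) for i in range(6)])
    print(np.allclose(fast, slow))
```

The shortcut needs only the two hat matrices, each sized by the number of vertices, whereas the brute-force check retrains once per held-out vertex.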
Modelling the permeability of polymers: a neural network approach
In this short communication, the prediction of the permeability of carbon dioxide through different polymers using a neural network is studied. A neural network is a numeric-mathematical construction that can model complex non-linear relationships. Here it is used to correlate the IR spectrum of a polymer with its permeability. The underlying assumption is that the chemical information hidden in the IR spectrum is sufficient for the prediction. The best neural network investigated so far does indeed show predictive capabilities.
