84,279 research outputs found

    ALPINE : Active Link Prediction using Network Embedding

    Get PDF
    Many real-world problems can be formalized as predicting links in a partially observed network. Examples include Facebook friendship suggestions, consumer-product recommendations, and the identification of hidden interactions between actors in a crime network. Several link prediction algorithms, notably those recently introduced using network embedding, are capable of doing this by just relying on the observed part of the network. Often, the link status of a node pair can be queried, which can be used as additional information by the link prediction algorithm. Unfortunately, such queries can be expensive or time-consuming, mandating the careful consideration of which node pairs to query. In this paper we estimate the improvement in link prediction accuracy after querying any particular node pair, to use in an active learning setup. Specifically, we propose ALPINE (Active Link Prediction usIng Network Embedding), the first method to achieve this for link prediction based on network embedding. To this end, we generalized the notion of V-optimality from experimental design to this setting, as well as more basic active learning heuristics originally developed in standard classification settings. Empirical results on real data show that ALPINE is scalable, and boosts link prediction accuracy with far fewer queries

    Deep Learning for Link Prediction in Dynamic Networks using Weak Estimators

    Full text link
    Link prediction is the task of evaluating the probability that an edge exists in a network, and it has useful applications in many domains. Traditional approaches rely on measuring the similarity between two nodes in a static context. Recent research has focused on extending link prediction to a dynamic setting, predicting the creation and destruction of links in networks that evolve over time. Though a difficult task, the employment of deep learning techniques have shown to make notable improvements to the accuracy of predictions. To this end, we propose the novel application of weak estimators in addition to the utilization of traditional similarity metrics to inexpensively build an effective feature vector for a deep neural network. Weak estimators have been used in a variety of machine learning algorithms to improve model accuracy, owing to their capacity to estimate changing probabilities in dynamic systems. Experiments indicate that our approach results in increased prediction accuracy on several real-world dynamic networks

    An Automated Social Graph De-anonymization Technique

    Full text link
    We present a generic and automated approach to re-identifying nodes in anonymized social networks which enables novel anonymization techniques to be quickly evaluated. It uses machine learning (decision forests) to matching pairs of nodes in disparate anonymized sub-graphs. The technique uncovers artefacts and invariants of any black-box anonymization scheme from a small set of examples. Despite a high degree of automation, classification succeeds with significant true positive rates even when small false positive rates are sought. Our evaluation uses publicly available real world datasets to study the performance of our approach against real-world anonymization strategies, namely the schemes used to protect datasets of The Data for Development (D4D) Challenge. We show that the technique is effective even when only small numbers of samples are used for training. Further, since it detects weaknesses in the black-box anonymization scheme it can re-identify nodes in one social network when trained on another.Comment: 12 page
    corecore