Link Prediction by De-anonymization: How We Won the Kaggle Social Network Challenge
This paper describes the winning entry to the IJCNN 2011 Social Network
Challenge run by Kaggle.com. The goal of the contest was to promote research on
real-world link prediction, and the dataset was a graph obtained by crawling
the popular Flickr social photo sharing website, with user identities scrubbed.
By de-anonymizing much of the competition test set using our own Flickr crawl,
we were able to effectively game the competition. Our attack represents a new
application of de-anonymization to gaming machine learning contests, suggesting
changes in how future competitions should be run.
We introduce a new simulated annealing-based weighted graph matching
algorithm for the seeding step of de-anonymization. We also show how to combine
de-anonymization with link prediction---the latter is required to achieve good
performance on the portion of the test set not de-anonymized---for example by
training the predictor on the de-anonymized portion of the test set, and
combining probabilistic predictions from de-anonymization and link prediction.
Comment: 11 pages, 13 figures; submitted to IJCNN'201
Spectral Graph Forge: Graph Generation Targeting Modularity
Community structure is an important property that captures inhomogeneities
common in large networks, and modularity is one of the most widely used metrics
for such community structure. In this paper, we introduce a principled
methodology, the Spectral Graph Forge, for generating random graphs that
preserve community structure from a real network of interest, in terms of
modularity. Our approach leverages the fact that the spectral structure of
matrix representations of a graph encodes global information about community
structure. The Spectral Graph Forge uses a low-rank approximation of the
modularity matrix to generate synthetic graphs that match a target modularity
within a user-selectable degree of accuracy, while allowing other aspects of
structure to vary. We show that the Spectral Graph Forge outperforms
state-of-the-art techniques in terms of accuracy in targeting the modularity
and randomness of the realizations, while also preserving other local
structural properties and node attributes. We discuss extensions of the
Spectral Graph Forge to target other properties beyond modularity, and its
applications to anonymization.
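
As a reference point for the quantity this abstract targets, the sketch below computes the standard Newman modularity matrix B = A - k k^T / (2m) and the modularity Q of a given partition. It is not the Spectral Graph Forge itself: the low-rank approximation and graph sampling steps are omitted. The function names and the toy graph (two triangles joined by one edge) are illustrative assumptions.

```python
def adjacency_from_edges(n, edges):
    """Build a symmetric 0/1 adjacency matrix for an undirected simple graph."""
    adj = [[0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] = 1
        adj[v][u] = 1
    return adj


def modularity_matrix(adj):
    """Return B with B[i][j] = A[i][j] - k_i * k_j / (2m)."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)  # each undirected edge is counted twice
    return [
        [adj[i][j] - degrees[i] * degrees[j] / two_m for j in range(n)]
        for i in range(n)
    ]


def modularity(adj, communities):
    """Q = (1 / 2m) * sum of B[i][j] over pairs in the same community."""
    n = len(adj)
    b = modularity_matrix(adj)
    two_m = sum(sum(row) for row in adj)
    same_community_total = sum(
        b[i][j]
        for i in range(n)
        for j in range(n)
        if communities[i] == communities[j]
    )
    return same_community_total / two_m


# Hypothetical toy graph: triangles {0,1,2} and {3,4,5} joined by edge (2,3).
TOY_EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
TOY_ADJ = adjacency_from_edges(6, TOY_EDGES)
TOY_PARTITION = [0, 0, 0, 1, 1, 1]

if __name__ == "__main__":
    print(modularity(TOY_ADJ, TOY_PARTITION))
```

A generator that targets modularity would aim to produce graphs whose Q under the reference partition stays close to this value while other structure is allowed to vary.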
Anonymizing Social Graphs via Uncertainty Semantics
Rather than anonymizing social graphs by generalizing them to super
nodes/edges or adding/removing nodes and edges to satisfy given privacy
parameters, recent methods exploit the semantics of uncertain graphs to achieve
privacy protection of participating entities and their relationships. These
techniques anonymize a deterministic graph by converting it into an uncertain
form. In this paper, we propose a generalized obfuscation model based on
uncertain adjacency matrices that keep expected node degrees equal to those in
the unanonymized graph. We analyze two recently proposed schemes and show how
they fit into the model. We also point out disadvantages in each method and
present several elegant techniques to fill the gap between them. Finally, to
support fair comparisons, we develop a new tradeoff-quantifying framework by
leveraging the concept of incorrectness in location privacy research.
Experiments on large social graphs demonstrate the effectiveness of our
schemes.
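
To make the degree constraint in this abstract concrete: in an uncertain graph, each potential edge (i, j) carries an existence probability p_ij, and the expected degree of node i is the sum of p_ij over all j. The sketch below checks whether an uncertain adjacency matrix keeps expected degrees equal to those of the original graph, and samples one possible world from it. The matrices and function names are hypothetical illustrations, not the paper's construction.

```python
import random


def expected_degrees(prob_adj):
    """Expected degree of node i is the sum of row i of the probability matrix."""
    return [sum(row) for row in prob_adj]


def preserves_expected_degrees(adj, prob_adj, tol=1e-9):
    """True if every node's expected degree matches its original degree."""
    original = [sum(row) for row in adj]
    expected = expected_degrees(prob_adj)
    return all(abs(d - e) <= tol for d, e in zip(original, expected))


def sample_world(prob_adj, rng):
    """Draw one deterministic graph: each edge (i, j), i < j, is kept with p_ij."""
    n = len(prob_adj)
    world = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < prob_adj[i][j]:
                world[i][j] = 1
                world[j][i] = 1
    return world


# Hypothetical original graph: the 4-cycle 0-1-2-3-0 (every degree is 2).
ORIGINAL = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]

# Hypothetical uncertain version: cycle edges kept with probability 0.75,
# the two diagonals added with probability 0.5, so each row still sums to 2.
UNCERTAIN = [
    [0.0, 0.75, 0.5, 0.75],
    [0.75, 0.0, 0.75, 0.5],
    [0.5, 0.75, 0.0, 0.75],
    [0.75, 0.5, 0.75, 0.0],
]

if __name__ == "__main__":
    print(expected_degrees(UNCERTAIN))
    print(preserves_expected_degrees(ORIGINAL, UNCERTAIN))
    print(sample_world(UNCERTAIN, random.Random(0)))
```

Any single sampled world may have degrees that differ from the original; only the expectation over worlds is constrained here.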
Generating realistic scaled complex networks
Research on generative models is a central project in the emerging field of
network science, and it studies how statistical patterns found in real networks
could be generated by formal rules. Output from these generative models is then
the basis for designing and evaluating computational methods on networks, and
for verification and simulation studies. During the last two decades, a variety
of models has been proposed with an ultimate goal of achieving comprehensive
realism for the generated networks. In this study, we (a) introduce a new
generator, termed ReCoN; (b) explore how ReCoN and some existing models can be
fitted to an original network to produce a structurally similar replica; (c)
use ReCoN to produce networks much larger than the original exemplar, and
finally (d) discuss open problems and promising research directions. In a
comparative experimental study, we find that ReCoN is often superior to many
other state-of-the-art network generation methods. We argue that ReCoN is a
scalable and effective tool for modeling a given network while preserving
important properties at both micro- and macroscopic scales, and for scaling the
exemplar data by orders of magnitude in size.
Comment: 26 pages, 13 figures, extended version, a preliminary version of the
paper was presented at the 5th International Workshop on Complex Networks and
their Application