4,536 research outputs found
Publishing Community-Preserving Attributed Social Graphs with a Differential Privacy Guarantee
We present a novel method for publishing differentially private synthetic
attributed graphs. Unlike preceding approaches, our method is able to preserve
the community structure of the original graph without sacrificing the ability
to capture global structural properties. Our proposal relies on C-AGM, a new
community-preserving generative model for attributed graphs. We equip C-AGM
with efficient methods for attributed graph sampling and parameter estimation.
For the latter, we introduce differentially private computation methods, which
allow us to release community-preserving synthetic attributed social graphs
with a strong formal privacy guarantee. Through comprehensive experiments, we
show that our new model outperforms its most relevant counterparts in
synthesising differentially private attributed social graphs that preserve the
community structure of the original graph, as well as degree sequences and
clustering coefficients
SoK: Chasing Accuracy and Privacy, and Catching Both in Differentially Private Histogram Publication
Histograms and synthetic data are of key importance in data analysis. However, researchers have shown that even aggregated data such as histograms, containing no obvious sensitive attributes, can result in privacy leakage. To enable data analysis, a strong notion of privacy is required to avoid risking unintended privacy violations.Such a strong notion of privacy is differential privacy, a statistical notion of privacy that makes privacy leakage quantifiable. The caveat regarding differential privacy is that while it has strong guarantees for privacy, privacy comes at a cost of accuracy. Despite this trade-off being a central and important issue in the adoption of differential privacy, there exists a gap in the literature regarding providing an understanding of the trade-off and how to address it appropriately. Through a systematic literature review (SLR), we investigate the state-of-the-art within accuracy improving differentially private algorithms for histogram and synthetic data publishing. Our contribution is two-fold: 1) we identify trends and connections in the contributions to the field of differential privacy for histograms and synthetic data and 2) we provide an understanding of the privacy/accuracy trade-off challenge by crystallizing different dimensions to accuracy improvement. Accordingly, we position and visualize the ideas in relation to each other and external work, and deconstruct each algorithm to examine the building blocks separately with the aim of pinpointing which dimension of accuracy improvement each technique/approach is targeting. Hence, this systematization of knowledge (SoK) provides an understanding of in which dimensions and how accuracy improvement can be pursued without sacrificing privacy
2.5K-Graphs: from Sampling to Generation
Understanding network structure and having access to realistic graphs plays a
central role in computer and social networks research. In this paper, we
propose a complete, and practical methodology for generating graphs that
resemble a real graph of interest. The metrics of the original topology we
target to match are the joint degree distribution (JDD) and the
degree-dependent average clustering coefficient (). We start by
developing efficient estimators for these two metrics based on a node sample
collected via either independence sampling or random walks. Then, we process
the output of the estimators to ensure that the target properties are
realizable. Finally, we propose an efficient algorithm for generating
topologies that have the exact target JDD and a close to the
target. Extensive simulations using real-life graphs show that the graphs
generated by our methodology are similar to the original graph with respect to,
not only the two target metrics, but also a wide range of other topological
metrics; furthermore, our generator is order of magnitudes faster than
state-of-the-art techniques
- …