34 research outputs found
Using Metrics Suites to Improve the Measurement of Privacy in Graphs
The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link.Social graphs are widely used in research (e.g., epidemiology) and business (e.g., recommender systems). However, sharing these graphs poses privacy risks because they contain sensitive information about individuals. Graph anonymization techniques aim to protect individual users in a graph, while graph de-anonymization aims to re-identify users. The effectiveness of anonymization and de-anonymization algorithms is usually evaluated with privacy metrics. However, it is unclear how strong existing privacy metrics are when they are used in graph privacy. In this paper, we study 26 privacy metrics for graph anonymization and de-anonymization and evaluate their strength in terms of three criteria: monotonicity indicates whether the metric indicates lower privacy for stronger adversaries; for within-scenario comparisons, evenness indicates whether metric values are spread evenly; and for between-scenario comparisons, shared value range indicates whether metrics use a consistent value range across scenarios. Our extensive experiments indicate that no single metric fulfills all three criteria perfectly. We therefore use methods from multi-criteria decision analysis to aggregate multiple metrics in a metrics suite, and we show that these metrics suites improve monotonicity compared to the best individual metric. This important result enables more monotonic, and thus more accurate, evaluations of new graph anonymization and de-anonymization algorithms
Seeded Graph Matching: Efficient Algorithms and Theoretical Guarantees
In this paper, a new information theoretic framework for graph matching is
introduced. Using this framework, the graph isomorphism and seeded graph
matching problems are studied. The maximum degree algorithm for graph
isomorphism is analyzed and sufficient conditions for successful matching are
rederived using type analysis. Furthermore, a new seeded matching algorithm
with polynomial time complexity is introduced. The algorithm uses `typicality
matching' and techniques from point-to-point communications for reliable
matching. Assuming an Erdos-Renyi model on the correlated graph pair, it is
shown that successful matching is guaranteed when the number of seeds grows
logarithmically with the number of vertices in the graphs. The logarithmic
coefficient is shown to be inversely proportional to the mutual information
between the edge variables in the two graphs
GLOVE: towards privacy-preserving publishing of record-level-truthful mobile phone trajectories
Datasets of mobile phone trajectories collected by network operators offer an unprecedented opportunity to discover new knowledge from the activity of large populations of millions. However, publishing such trajectories also raises significant privacy concerns, as they contain personal data in the form of individual movement patterns. Privacy risks induce network operators to enforce restrictive confidential agreements in the rare occasions when they grant access to collected trajectories, whereas a less involved circulation of these data would fuel research and enable reproducibility in many disciplines. In this work, we contribute a building block toward the design of privacy-preserving datasets of mobile phone trajectories that are truthful at the record level. We present GLOVE, an algorithm that implements k-anonymity, hence solving the crucial unicity problem that affects this type of data while ensuring that the anonymized trajectories correspond to real-life users. GLOVE builds on original insights about the root causes behind the undesirable unicity of mobile phone trajectories, and leverages generalization and suppression to remove them. Proof-of-concept validations with large-scale real-world datasets demonstrate that the approach adopted by GLOVE allows preserving a substantial level of accuracy in the data, higher than that granted by previous methodologies.This work was supported by the Atracción de Talento Investigador program of the Comunidad de Madrid under Grant No. 2019-T1/TIC-16037 NetSense