Graph embeddings have been proposed to map graph data to a low-dimensional
space for downstream processing (e.g., node classification or link prediction).
With the increasing collection of personal data, graph embeddings can be
trained on private and sensitive data. For the first time, we quantify the
privacy leakage in graph embeddings through three inference attacks targeting
Graph Neural Networks. We propose a membership inference attack to infer
whether a graph node corresponding to an individual user's data was a member of
the model's training set. We consider a blackbox setting where the adversary
exploits the output prediction scores, and a whitebox setting where the
adversary also has access to the released node embeddings. This attack achieves
an accuracy up to 28% (blackbox) and 36% (whitebox) above random guessing by
exploiting the distinguishable footprint between train and test data records
left by the graph embedding. We propose a graph reconstruction attack where the
adversary aims to reconstruct the target graph given the corresponding graph
embeddings. Here, the adversary can reconstruct the graph with more than 80%
accuracy and infer a link between two nodes with around 30% more confidence
than a random guess. We then propose an attribute inference attack where the
adversary aims to infer a sensitive attribute. We show that graph embeddings
are strongly correlated with node attributes, which lets the adversary infer
sensitive information (e.g., gender or location).

Comment: 11 pages
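To make the blackbox membership inference setting concrete, here is a minimal sketch of a generic confidence-threshold attack. It is not the authors' implementation: the function names, the threshold value, and the toy prediction scores are all hypothetical. It assumes the adversary only sees per-node prediction score vectors, that training nodes tend to receive more confident predictions than unseen nodes, and that the evaluation set is balanced, so random guessing scores 50%.

```python
from typing import Sequence


def infer_membership(scores: Sequence[float], threshold: float) -> bool:
    """Guess 'member' when the top prediction score reaches the threshold.

    Assumption: models are more confident on nodes seen during training.
    """
    return max(scores) >= threshold


def attack_advantage(
    member_scores: Sequence[Sequence[float]],
    nonmember_scores: Sequence[Sequence[float]],
    threshold: float,
) -> float:
    """Attack accuracy minus 0.5, the random-guess accuracy on a balanced set."""
    correct = sum(infer_membership(s, threshold) for s in member_scores)
    correct += sum(not infer_membership(s, threshold) for s in nonmember_scores)
    total = len(member_scores) + len(nonmember_scores)
    return correct / total - 0.5


# Hypothetical prediction scores for three training nodes and three unseen nodes.
members = [[0.97, 0.02, 0.01], [0.90, 0.05, 0.05], [0.55, 0.30, 0.15]]
nonmembers = [[0.60, 0.30, 0.10], [0.40, 0.35, 0.25], [0.92, 0.05, 0.03]]

advantage = attack_advantage(members, nonmembers, threshold=0.8)
print(f"accuracy above random guess: {advantage:.1%}")
```

On this toy data the threshold classifies four of the six nodes correctly. A whitebox variant would, under the same assumptions, feed the released node embeddings to the attack instead of, or in addition to, the prediction scores.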

Boutet, Antoine

Duddu, Vasisht

Shejwalkar, Virat

English

arXiv

International audience

CCS CONCEPTS • Security and privacy → Privacy protections; • Computing methodologies → Machine learning

INRIA a CCSD electronic archive server

Quantifying Privacy Leakage in Graph Embedding

https://hal.inria.fr/hal-03013638/file/hal.pdf
