Preserving Node-level Privacy in Graph Neural Networks
Differential privacy (DP) has been widely applied to learning on tabular,
image, and sequential data, where instance-level privacy is the concern. In
learning on graphs, by contrast, work on node-level privacy remains scarce.
Challenges arise because existing DP protocols do not readily apply to the
message-passing mechanism of Graph Neural Networks (GNNs).
In this study, we propose a solution that specifically addresses the issue of
node-level privacy. Our protocol consists of two main components: 1) a sampling
routine called HeterPoisson, which employs a specialized node sampling strategy
and a series of tailored operations to generate a batch of sub-graphs with
desired properties, and 2) a randomization routine that utilizes symmetric
multivariate Laplace (SML) noise instead of the commonly used Gaussian noise.
Our privacy accounting shows this particular combination provides a non-trivial
privacy guarantee. In addition, our protocol enables GNN learning with good
performance, as demonstrated by experiments on five real-world datasets;
compared with existing baselines, our method shows significant advantages,
especially in the high privacy regime. Experimentally, we also 1) perform
membership inference attacks against our protocol and 2) apply privacy audit
techniques to confirm our protocol's privacy integrity.
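The abstract does not specify how the SML noise is parameterized, so the
following is a rough illustration only: symmetric multivariate Laplace noise
can be sampled via the standard Gaussian scale-mixture construction,
X = sqrt(W) * Z with W ~ Exp(1) and Z Gaussian. A minimal Python sketch,
where the dimension, scale, and the gradient being perturbed are all
hypothetical stand-ins rather than the paper's actual settings:

    import numpy as np

    def sample_sml_noise(dim, scale, rng=None):
        # Symmetric multivariate Laplace via a Gaussian scale mixture:
        # X = sqrt(W) * Z, with W ~ Exp(1) and Z ~ N(0, scale^2 * I).
        rng = rng or np.random.default_rng()
        w = rng.exponential(1.0)              # mixing variable W ~ Exp(1)
        z = rng.normal(0.0, scale, size=dim)  # Gaussian component Z
        return np.sqrt(w) * z

    # Hypothetical usage: perturb an aggregated (clipped) gradient.
    grad = np.zeros(128)  # stand-in for a real GNN gradient
    noisy_grad = grad + sample_sml_noise(dim=128, scale=1.0)

Compared with Gaussian noise, the scale mixture gives heavier tails, which is
the distributional difference the paper's privacy accounting exploits.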
In the sequel, we present a study of a seemingly appealing approach
\cite{sajadmanesh2023gap} (USENIX'23) that protects node-level privacy via
differentially private node/instance embeddings. Unfortunately, this approach
has fundamental privacy flaws, which we identify through a thorough case
study. More importantly, we prove that it is impossible to achieve both
(strong) privacy and (acceptable) utility through private instance
embeddings. The implication is that such an approach faces intrinsic utility
barriers when enforcing differential privacy.
Towards a Fault-Injection Benchmarking Suite
Soft errors in memories and logic circuits are known to disturb program
execution. In this context, the research community has proposed a plethora
of fault-tolerance (FT) solutions over the last decades, as well as
fault-injection (FI) approaches to test, measure, and compare them. However,
there is no agreed-upon benchmarking suite for demonstrating FT or FI
approaches. As a substitute, authors pick benchmarks from other domains,
e.g., embedded systems. This leads to little comparability across
publications, and to behavioral overlap among benchmarks that were never
selected for orthogonality in the FT/FI domain.
In this paper, we want to initiate a discussion on what a benchmarking suite
for the FT/FI domain should look like, and propose criteria for benchmark
selection.
Comment: S. Bernardi, T. Zoppi (Editors), "Fast Abstracts and Student Forum
Proceedings - EDCC 2024 - 19th European Dependable Computing Conference,
Leuven, Belgium, 8-11 April 2024"
Differentially Private Vertical Federated Clustering
In many applications, multiple parties hold private data about the same set
of users but on disjoint sets of attributes, and a server wants to leverage
the data to train a model. To enable model learning while protecting the
privacy of the data subjects, we need vertical federated learning (VFL)
techniques, where the data parties share only the information needed to
train the model rather than the private data itself. However, it is
challenging to ensure that the shared information preserves privacy while
still allowing accurate models to be learned. To the
best of our knowledge, the algorithm proposed in this paper is the first
practical solution for differentially private vertical federated k-means
clustering, where the server can obtain a set of global centers with a provable
differential privacy guarantee. Our algorithm assumes an untrusted central
server that aggregates differentially private local centers and membership
encodings from local data parties. It builds a weighted grid as the synopsis of
the global dataset based on the received information. Final centers are
generated by running any k-means algorithm on the weighted grid. Our approach
for grid weight estimation uses a novel, lightweight, and differentially
private set intersection cardinality estimation algorithm based on the
Flajolet-Martin sketch. To improve the estimation accuracy in settings with
more than two data parties, we further propose a refined version of the
weight estimation algorithm and a parameter tuning strategy that bring the
final k-means utility close to that of the central private setting. We provide
theoretical utility analysis and experimental evaluation results for the
cluster centers computed by our algorithm, and show that our approach
performs better, both theoretically and empirically, than the two baselines
based on existing techniques.
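As background for the cardinality estimator: the underlying Flajolet-Martin
sketch is a classic distinct-count primitive. The Python sketch below shows
a plain, non-private FM estimator with averaging over multiple hash
functions; the paper's differentially private layer and its intersection
logic are not reproduced here, and the hash construction and sketch count
are illustrative choices.

    import hashlib

    PHI = 0.77351  # Flajolet-Martin bias-correction constant

    def _rho(h):
        # Number of trailing zeros of h (position of lowest set bit).
        return (h & -h).bit_length() - 1

    def fm_cardinality(items, num_sketches=64):
        # Classic Flajolet-Martin distinct-count estimate: each sketch
        # records which rho values occurred; R_k is its lowest unset bit
        # position, and 2^(mean R) / PHI estimates the cardinality.
        bitmaps = [0] * num_sketches
        for x in items:
            for k in range(num_sketches):
                digest = hashlib.sha256(f"{k}|{x}".encode()).digest()
                h = int.from_bytes(digest[:8], "big")
                if h:
                    bitmaps[k] |= 1 << _rho(h)
        r_sum = 0
        for bm in bitmaps:
            r = 0
            while bm & (1 << r):
                r += 1
            r_sum += r
        return 2 ** (r_sum / num_sketches) / PHI

From such sketches, an intersection cardinality is commonly obtained via
inclusion-exclusion over union sketches (FM sketches merge by OR-ing
bitmaps); whether the paper follows exactly this route is not stated in the
abstract.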
Practical Differentially Private and Byzantine-resilient Federated Learning
Privacy and Byzantine resilience are two indispensable requirements for a
federated learning (FL) system. Although both have been studied extensively
in their own tracks, solutions that address them jointly remain scarce. This
is due to the difficulty of reconciling privacy-preserving and
Byzantine-resilient algorithms.
In this work, we propose a solution to this two-fold issue. We use our
version of the differentially private stochastic gradient descent (DP-SGD)
algorithm to preserve privacy and then apply our Byzantine-resilient
algorithms. We note that while existing works follow this general approach,
an in-depth analysis of the interplay between DP and Byzantine resilience
has been missing, leading to unsatisfactory performance. Specifically,
previous works strive to reduce the impact of the random noise introduced by
DP on Byzantine aggregation. In contrast, we leverage this noise to
construct an aggregation rule that effectively rejects many existing
Byzantine attacks.
We provide both theoretical proofs and empirical experiments to show that
our protocol is effective: it retains high accuracy while preserving the DP
guarantee and Byzantine resilience. Compared with previous work, our
protocol 1) achieves significantly higher accuracy even in the high privacy
regime; and 2) works well even when up to 90% of the distributed workers are
Byzantine.
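The abstract does not detail the authors' DP-SGD variant or their
noise-aware aggregation rule, so the following is only a generic
per-sample-clipping DP-SGD step in Python/NumPy; the clip norm and noise
multiplier are illustrative placeholders, not the paper's settings.

    import numpy as np

    def dp_sgd_step(per_sample_grads, clip_norm=1.0,
                    noise_multiplier=1.0, rng=None):
        # Generic DP-SGD aggregation: clip each per-example gradient to
        # L2 norm clip_norm, sum, add Gaussian noise calibrated to the
        # clipping bound, and average over the batch.
        rng = rng or np.random.default_rng()
        norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
        factors = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
        clipped = per_sample_grads * factors
        noise = rng.normal(0.0, noise_multiplier * clip_norm,
                           size=per_sample_grads.shape[1])
        return (clipped.sum(axis=0) + noise) / len(per_sample_grads)

In the FL setting, each worker would compute such a noisy update locally;
per the abstract, the paper's server-side aggregation then leverages this
injected noise, rather than fighting it, to reject Byzantine updates.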