Federated learning (FL) enables multiple parties to collaboratively train a
machine learning model without sharing their data; rather, they train their own
model locally and send updates to a central server for aggregation. Depending
on how the data is distributed among the participants, FL can be classified
into Horizontal (HFL) and Vertical (VFL). In VFL, the participants share the
same set of training instances, but each hosts a different, non-overlapping
subset of the feature space. In HFL, by contrast, the participants share the
same set of features, while the training instances are split into locally
owned subsets.
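As an illustration only, the two partitioning schemes can be sketched on a toy dataset. The values, the two-party setting, and the helper names `horizontal_split` and `vertical_split` are assumptions made for this sketch and are not taken from the paper.

```python
# Toy dataset: 4 training instances (rows) x 4 features (columns).
# Values and the two-party setting are illustrative assumptions.
dataset = [
    [0.1, 0.2, 0.3, 0.4],
    [1.1, 1.2, 1.3, 1.4],
    [2.1, 2.2, 2.3, 2.4],
    [3.1, 3.2, 3.3, 3.4],
]


def horizontal_split(rows, num_parties):
    """HFL-style split: each party gets a disjoint subset of the rows,
    and every row keeps all of its features."""
    return [rows[p::num_parties] for p in range(num_parties)]


def vertical_split(rows, feature_groups):
    """VFL-style split: each party gets every row, but only the feature
    columns listed in its own (non-overlapping) group."""
    return [
        [[row[j] for j in group] for row in rows]
        for group in feature_groups
    ]


hfl_parts = horizontal_split(dataset, 2)
vfl_parts = vertical_split(dataset, [[0, 1], [2, 3]])

# HFL: fewer rows per party, full feature width.
print(len(hfl_parts[0]), len(hfl_parts[0][0]))
# VFL: all rows per party, reduced feature width.
print(len(vfl_parts[0]), len(vfl_parts[0][0]))
```

In the horizontal case each party could train a full model on its own rows, whereas in the vertical case no single party sees a complete feature vector.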
VFL is increasingly used in applications like financial fraud detection;
nonetheless, very little work has analyzed its security. In this paper, we
focus on robustness in VFL, in particular, on backdoor attacks, whereby an
adversary attempts to manipulate the aggregate model during the training
process to trigger misclassifications. Performing backdoor attacks in VFL is
more challenging than in HFL, because the adversary i) does not have access to
the labels during training and ii) cannot change the labels, since she only has access
to the feature embeddings. We present a first-of-its-kind clean-label backdoor
attack in VFL, which consists of two phases: a label inference phase and a
backdoor phase. We demonstrate the effectiveness of the attack on three different
datasets, investigate the factors involved in its success, and discuss
countermeasures to mitigate its impact.