1,209 research outputs found

    Privet: A Privacy-Preserving Vertical Federated Learning Service for Gradient Boosted Decision Tables

    Full text link
    Vertical federated learning (VFL) has recently emerged as an appealing distributed paradigm empowering multi-party collaboration for training high-quality models over vertically partitioned datasets. Gradient boosting has been popularly adopted in VFL, which builds an ensemble of weak learners (typically decision trees) to achieve promising prediction performance. Recently there have been growing interests in using decision table as an intriguing alternative weak learner in gradient boosting, due to its simpler structure, good interpretability, and promising performance. In the literature, there have been works on privacy-preserving VFL for gradient boosted decision trees, but no prior work has been devoted to the emerging case of decision tables. Training and inference on decision tables are different from that the case of generic decision trees, not to mention gradient boosting with decision tables in VFL. In light of this, we design, implement, and evaluate Privet, the first system framework enabling privacy-preserving VFL service for gradient boosted decision tables. Privet delicately builds on lightweight cryptography and allows an arbitrary number of participants holding vertically partitioned datasets to securely train gradient boosted decision tables. Extensive experiments over several real-world datasets and synthetic datasets demonstrate that Privet achieves promising performance, with utility comparable to plaintext centralized learning.Comment: Accepted in IEEE Transactions on Services Computing (TSC

    Privacy-Preserving Federated Learning over Vertically and Horizontally Partitioned Data for Financial Anomaly Detection

    Full text link
    The effective detection of evidence of financial anomalies requires collaboration among multiple entities who own a diverse set of data, such as a payment network system (PNS) and its partner banks. Trust among these financial institutions is limited by regulation and competition. Federated learning (FL) enables entities to collaboratively train a model when data is either vertically or horizontally partitioned across the entities. However, in real-world financial anomaly detection scenarios, the data is partitioned both vertically and horizontally and hence it is not possible to use existing FL approaches in a plug-and-play manner. Our novel solution, PV4FAD, combines fully homomorphic encryption (HE), secure multi-party computation (SMPC), differential privacy (DP), and randomization techniques to balance privacy and accuracy during training and to prevent inference threats at model deployment time. Our solution provides input privacy through HE and SMPC, and output privacy against inference time attacks through DP. Specifically, we show that, in the honest-but-curious threat model, banks do not learn any sensitive features about PNS transactions, and the PNS does not learn any information about the banks' dataset but only learns prediction labels. We also develop and analyze a DP mechanism to protect output privacy during inference. Our solution generates high-utility models by significantly reducing the per-bank noise level while satisfying distributed DP. To ensure high accuracy, our approach produces an ensemble model, in particular, a random forest. This enables us to take advantage of the well-known properties of ensembles to reduce variance and increase accuracy. Our solution won second prize in the first phase of the U.S. Privacy Enhancing Technologies (PETs) Prize Challenge.Comment: Prize Winner in the U.S. Privacy Enhancing Technologies (PETs) Prize Challeng
    • …
    corecore