
    FedVS: Straggler-Resilient and Privacy-Preserving Vertical Federated Learning for Split Models

    In a vertical federated learning (VFL) system consisting of a central server and many distributed clients, the training data are vertically partitioned such that different features are privately stored on different clients. The problem of split VFL is to train a model split between the server and the clients. This paper aims to address two major challenges in split VFL: 1) performance degradation due to straggling clients during training; and 2) data and model privacy leakage from clients' uploaded data embeddings. We propose FedVS to simultaneously address these two challenges. The key idea of FedVS is to design secret sharing schemes for the local data and models, such that information-theoretic privacy against colluding clients and a curious server is guaranteed, and the aggregation of all clients' embeddings is reconstructed losslessly by decrypting computation shares from the non-straggling clients. Extensive experiments on various types of VFL datasets (including tabular, CV, and multi-view) demonstrate the universal advantages of FedVS in straggler mitigation and privacy protection over baseline protocols.

    Comment: This paper was accepted to ICML 2023.
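    As a concrete illustration of the secret-sharing aggregation idea in this abstract, here is a minimal Python sketch, assuming Shamir sharing over a prime field and scalar embeddings; it is not the FedVS protocol itself, and all names and parameters are illustrative. Because Shamir shares are additively homomorphic, the server can recover the sum of embeddings from any k of n clients, tolerating n - k stragglers.

    ```python
    import random

    P = 2**61 - 1  # large prime field modulus (illustrative choice)

    def share(secret, n, k):
        """Split `secret` into n Shamir shares with reconstruction threshold k."""
        coeffs = [secret] + [random.randrange(P) for _ in range(k - 1)]
        return [(x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
                for x in range(1, n + 1)]

    def reconstruct(shares):
        """Lagrange interpolation at 0 to recover the shared value."""
        total = 0
        for i, (xi, yi) in enumerate(shares):
            num, den = 1, 1
            for j, (xj, _) in enumerate(shares):
                if i != j:
                    num = num * (-xj) % P
                    den = den * (xi - xj) % P
            total = (total + yi * num * pow(den, P - 2, P)) % P
        return total

    # Each client secret-shares its (quantized) embedding; summing the shares
    # held at each evaluation point yields shares of the aggregate embedding.
    n, k = 5, 3
    embeddings = [7, 11, 23, 5, 2]                 # toy scalar embeddings
    all_shares = [share(e, n, k) for e in embeddings]
    summed = [(x, sum(s[i][1] for s in all_shares) % P)
              for i, (x, _) in enumerate(all_shares[0])]
    # The server hears back from only k = 3 non-straggling clients:
    assert reconstruct(summed[:k]) == sum(embeddings)
    ```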

    Conclave: secure multi-party computation on big data (extended TR)

    Secure Multi-Party Computation (MPC) allows mutually distrusting parties to run joint computations without revealing private data. Current MPC algorithms scale poorly with data size, which makes MPC on "big data" prohibitively slow and inhibits its practical use. Many relational analytics queries can maintain MPC's end-to-end security guarantee without using cryptographic MPC techniques for all operations. Conclave is a query compiler that accelerates such queries by transforming them into a combination of data-parallel, local cleartext processing and small MPC steps. When parties trust others with specific subsets of the data, Conclave applies new hybrid MPC-cleartext protocols to run additional steps outside of MPC and improve scalability further. Our Conclave prototype generates code for cleartext processing in Python and Spark, and for secure MPC using the Sharemind and Obliv-C frameworks. Conclave scales to data sets between three and six orders of magnitude larger than state-of-the-art MPC frameworks support on their own. Thanks to its hybrid protocols, Conclave also substantially outperforms SMCQL, the most similar existing system.

    Comment: Extended technical report for EuroSys 2019 paper.
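    To make the hybrid cleartext/MPC split concrete, here is a toy Python sketch, assuming a group-by-sum query and additive secret sharing as a stand-in for the MPC step; the party data and names are invented for the example, and this is not Conclave's generated code. The point is that each party pre-aggregates its own rows in cleartext, so only a handful of per-group totals ever enter the expensive secure step.

    ```python
    import random
    from collections import defaultdict

    MOD = 2**64

    def local_groupby_sum(rows):
        """Cleartext, data-parallel step each party runs privately."""
        out = defaultdict(int)
        for key, val in rows:
            out[key] += val
        return out

    def additive_shares(value, n_parties):
        """Split a value into n additive shares mod 2^64 (MPC-input stand-in)."""
        shares = [random.randrange(MOD) for _ in range(n_parties - 1)]
        shares.append((value - sum(shares)) % MOD)
        return shares

    parties = [
        [("a", 3), ("b", 4), ("a", 1)],   # party 1's private rows
        [("a", 10), ("b", 5)],            # party 2's private rows
    ]
    locals_ = [local_groupby_sum(rows) for rows in parties]

    # "MPC" step: only |groups| small values per party enter the secure protocol.
    result = {}
    for key in sorted(set().union(*locals_)):
        pooled = [additive_shares(loc.get(key, 0), len(parties)) for loc in locals_]
        # Each share holder sums its column locally; only the sum is revealed.
        col_sums = [sum(col) % MOD for col in zip(*pooled)]
        result[key] = sum(col_sums) % MOD
    assert result == {"a": 14, "b": 9}
    ```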

    Bounded Collusion Protocols, Cylinder-Intersection Extractors and Leakage-Resilient Secret Sharing

    In this work we study bounded collusion protocols (BCPs), recently introduced in the context of secret sharing by Kumar, Meka, and Sahai (FOCS 2019). These are multi-party communication protocols on $n$ parties where in each round a subset of $p$ parties (the collusion bound) collude together and write a function of their inputs on a public blackboard. BCPs interpolate elegantly between the well-studied number-in-hand (NIH) model ($p=1$) and the number-on-forehead (NOF) model ($p=n-1$). Motivated by questions in communication complexity, secret sharing, and pseudorandomness, we investigate BCPs more thoroughly, answering several questions about them:
    * We prove a polynomial (in the input length) lower bound for an explicit function against BCPs where any constant fraction of players can collude. Previously, nontrivial lower bounds were known only when the collusion bound was at most logarithmic in the input length (owing to bottlenecks in NOF lower bounds).
    * For all $t \leq n$, we construct efficient $t$-out-of-$n$ secret sharing schemes where the secret remains hidden even given the transcript of a BCP with collusion bound $O(t/\log t)$. Prior work could only handle collusions of size $O(\log n)$. Along the way, we construct leakage-resilient schemes against disjoint and adaptive leakage, resolving a question asked by Goyal and Kumar (STOC 2018).
    * An explicit $n$-source cylinder intersection extractor whose output is close to uniform even when given the transcript of a BCP with a constant fraction of parties colluding. The min-entropy rate we require is $0.3$ (independent of the collusion bound $p \ll n$).
    Our results rely on a new class of exponential sums that interpolate between the ones considered in additive combinatorics by Bourgain (Geometric and Functional Analysis 2009) and Petridis and Shparlinski (Journal d'Analyse Mathématique 2019).
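    The following small Python simulation, under illustrative assumptions, shows what a bounded collusion protocol's transcript looks like: in each round some $p$-subset of the parties colludes and writes one symbol of a joint function of their pooled inputs on the blackboard. Setting p = 1 recovers the NIH model and p = n - 1 the NOF model. This models the communication pattern only; it is not a construction from the paper.

    ```python
    import random

    def run_bcp(inputs, p, rounds, round_fn):
        """round_fn(colluding_inputs, blackboard) -> one transcript symbol."""
        n = len(inputs)
        blackboard = []
        for _ in range(rounds):
            subset = random.sample(range(n), p)      # which p parties collude
            view = [inputs[i] for i in subset]       # they pool their inputs
            blackboard.append(round_fn(view, tuple(blackboard)))
        return blackboard

    # Example leakage: parity of the colluders' input bits, one bit per round.
    inputs = [random.randrange(2) for _ in range(8)]
    transcript = run_bcp(inputs, p=3, rounds=5,
                         round_fn=lambda view, bb: sum(view) % 2)
    print(inputs, transcript)
    ```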

    Turbo-Aggregate: Breaking the Quadratic Aggregation Barrier in Secure Federated Learning

    Federated learning is a distributed framework for training machine learning models over the data residing at mobile devices, while protecting the privacy of individual users. A major bottleneck in scaling federated learning to a large number of users is the overhead of secure model aggregation across many users. In particular, the overhead of the state-of-the-art protocols for secure model aggregation grows quadratically with the number of users. In this paper, we propose the first secure aggregation framework, named Turbo-Aggregate, that in a network with $N$ users achieves a secure aggregation overhead of $O(N\log N)$, as opposed to $O(N^2)$, while tolerating a user dropout rate of up to $50\%$. Turbo-Aggregate employs a multi-group circular strategy for efficient model aggregation, and leverages additive secret sharing and novel coding techniques for injecting aggregation redundancy in order to handle user dropouts while guaranteeing user privacy. We experimentally demonstrate that Turbo-Aggregate achieves a total running time that grows almost linearly in the number of users, and provides up to a $40\times$ speedup over the state-of-the-art protocols with up to $N=200$ users. Our experiments also demonstrate the impact of model size and bandwidth on the performance of Turbo-Aggregate.
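    A bare-bones Python sketch of the multi-group circular aggregation idea follows, assuming scalar models and pairwise-cancelling masks; the real Turbo-Aggregate additionally uses additive secret sharing and coding to survive dropouts, so this illustrates the ring structure only. Users are split into groups, and each group adds its masked local models to a running partial aggregate passed along the ring of groups; masks within a group cancel, so no individual model is exposed.

    ```python
    import random

    MOD = 2**32

    def pairwise_masks(group_size, seed=0):
        """Masks with m[i][j] = -m[j][i] mod MOD, so they cancel in a group sum."""
        rng = random.Random(seed)
        m = [[0] * group_size for _ in range(group_size)]
        for i in range(group_size):
            for j in range(i + 1, group_size):
                r = rng.randrange(MOD)
                m[i][j], m[j][i] = r, (-r) % MOD
        return m

    def group_contribution(models, masks):
        """Each user uploads model + sum of its masks; masks cancel in the sum."""
        masked = [(x + sum(masks[i])) % MOD for i, x in enumerate(models)]
        return sum(masked) % MOD

    groups = [[5, 7], [1, 2, 3], [10]]      # toy scalar models per group
    running = 0
    for g, models in enumerate(groups):
        masks = pairwise_masks(len(models), seed=g)
        running = (running + group_contribution(models, masks)) % MOD
    assert running == sum(sum(g) for g in groups) % MOD
    ```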

    Privacy-Preserving Distributed Optimization via Subspace Perturbation: A General Framework

    As the modern world becomes increasingly digitized and interconnected, distributed signal processing has proven effective in processing its large volume of data. However, a main challenge limiting the broad use of distributed signal processing techniques is the issue of privacy in handling sensitive data. To address this privacy issue, we propose a novel yet general subspace perturbation method for privacy-preserving distributed optimization, which allows each node to obtain the desired solution while protecting its private data. In particular, we show that the dual variables introduced by each distributed optimizer will not converge in a certain subspace determined by the graph topology. Additionally, the optimization variable is guaranteed to converge to the desired solution, because it is orthogonal to this non-convergent subspace. We therefore propose to insert noise in the non-convergent subspace through the dual variable, such that the private data are protected and the accuracy of the desired solution is completely unaffected. Moreover, the proposed method is shown to be secure under two widely used adversary models: passive and eavesdropping. Furthermore, we consider several distributed optimizers, such as ADMM and PDMM, to demonstrate the general applicability of the proposed method. Finally, we test the performance through a set of applications. Numerical tests indicate that the proposed method is superior to existing methods in terms of estimation accuracy, privacy level, communication cost, and convergence rate.
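    The core geometric claim, that noise confined to a subspace orthogonal to the one driving the primal update leaves the solution untouched, can be checked in a few lines of NumPy; the matrices below are toy assumptions, not the paper's ADMM/PDMM derivation.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((6, 3))       # primal update only "sees" A^T @ dual
    dual = rng.standard_normal(6)         # dual variable exchanged between nodes

    # Orthonormal basis of null(A^T): the non-convergent subspace (dim 3 here).
    _, _, Vt = np.linalg.svd(A.T)
    null_basis = Vt[3:]

    noise = null_basis.T @ rng.standard_normal(3) * 100.0
    perturbed = dual + noise              # what an eavesdropper observes

    # The primal update depends on the dual only through A^T @ dual,
    # so large perturbations in the null space change nothing:
    assert np.allclose(A.T @ dual, A.T @ perturbed)
    print("dual perturbation norm:", np.linalg.norm(perturbed - dual))
    ```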

    Leakage-resilient coin tossing

    Proceedings of the 25th International Symposium, DISC 2011, Rome, Italy, September 20-22, 2011.

    The ability to collectively toss a common coin among n parties in the presence of faults is an important primitive in the arsenal of randomized distributed protocols. In the case of dishonest majority, it was shown to be impossible to achieve less than 1/r bias in O(r) rounds (Cleve STOC '86). In the case of honest majority, in contrast, unconditionally secure O(1)-round protocols for generating common unbiased coins follow from general completeness theorems on multi-party secure protocols in the secure channels model (e.g., BGW, CCD STOC '88). However, in the O(1)-round protocols with honest majority, parties generate and hold secret values which are assumed to be perfectly hidden from malicious parties: an assumption which is crucial to proving the resulting common coin is unbiased. This assumption unfortunately does not seem to hold in practice, as attackers can launch side-channel attacks on the local state of honest parties and leak information on their secrets. In this work, we present an O(1)-round protocol for collectively generating an unbiased common coin, in the presence of leakage on the local state of the honest parties. We tolerate t ≤ (1/3 − ε)n computationally unbounded Byzantine faults and in addition an Ω(1)-fraction leakage on each (honest) party's secret state. Our results hold in the memory leakage model (of Akavia, Goldwasser, Vaikuntanathan '08) adapted to the distributed setting. Additional contributions of our work are the tools we introduce to achieve the collective coin toss: a procedure for disjoint committee election, and leakage-resilient verifiable secret sharing.

    National Defense Science and Engineering Graduate Fellowship; National Science Foundation (U.S.) (CCF-1018064)
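    For contrast with the leakage-resilient construction, here is the classic honest-majority baseline the abstract builds on, sketched in Python under illustrative assumptions: every party XOR-shares a random bit among all parties, and the coin is the XOR of all contributions, so a single honest party's uniform bit keeps the coin unbiased (absent leakage, which is exactly the assumption the paper removes).

    ```python
    import random
    from functools import reduce
    from operator import xor

    def xor_share(bit, n):
        """Split a bit into n XOR shares that reconstruct to the bit."""
        shares = [random.randrange(2) for _ in range(n - 1)]
        shares.append(bit ^ reduce(xor, shares, 0))
        return shares

    n = 7
    contributions = [random.randrange(2) for _ in range(n)]   # each party's bit
    # Every party deals shares of its bit; party j holds column j of the matrix.
    share_matrix = [xor_share(b, n) for b in contributions]
    # Each party XORs the shares it holds, then the local results are combined.
    local = [reduce(xor, (share_matrix[i][j] for i in range(n)), 0)
             for j in range(n)]
    coin = reduce(xor, local, 0)
    assert coin == reduce(xor, contributions, 0)
    print("common coin:", coin)
    ```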

    Leakage-Resilient Extractors and Secret-Sharing against Bounded Collusion Protocols

    In a recent work, Kumar, Meka, and Sahai (FOCS 2019) introduced the notion of bounded collusion protocols (BCPs), in which $N$ parties wish to compute some joint function $f:(\{0,1\}^n)^N \to \{0,1\}$ using a public blackboard, but such that only $p$ parties may collude at a time. This generalizes well-studied models in multiparty communication complexity, such as the number-in-hand (NIH) and number-on-forehead (NOF) models, which are just endpoints on this rich spectrum. We construct explicit hard functions against this spectrum, and achieve a tradeoff between collusion and complexity. Using this, we obtain improved leakage-resilient secret sharing schemes against bounded collusion protocols. Our main tool in obtaining hard functions against BCPs is explicit constructions of leakage-resilient extractors against BCPs for a wide range of parameters. Kumar et al. (FOCS 2019) studied such extractors and called them cylinder intersection extractors. In fact, such extractors directly yield correlation bounds against BCPs. We focus on the following setting: the input to the extractor consists of $N$ independent sources of length $n$, and the leakage function $\mathsf{Leak}:(\{0,1\}^n)^N \to \{0,1\}^\mu \in \mathcal{F}$ is a BCP with some collusion bound $p$ and leakage (output length) $\mu$. While our extractor constructions are very general, we highlight some interesting parameter settings:
    1. In the case when the input sources are uniform, and $p=0.99N$ parties collude, our extractor can handle $n^{\Omega(1)}$ bits of leakage, regardless of the dependence between $N$ and $n$. The best NOF lower bound (i.e., $p=N-1$) on the other hand requires $N < \log n$ even to handle $1$ bit of leakage.
    2. Next, we show that for the same setting as above, we can drop the entropy requirement to $k = \mathrm{polylog}\, n$, while still handling polynomial leakage for $p=0.99N$. This resolves an open question about cylinder intersection extractors raised by Kumar et al. (FOCS 2019), and we find an application of such low-entropy extractors in a new type of secret sharing.
    We also provide an explicit compiler that transforms any function with high NOF (distributional) communication complexity into a leakage-resilient extractor that can handle polylogarithmic entropy and substantially more leakage against BCPs. Thus any improvement of NOF lower bounds will immediately yield better leakage-resilient extractors. Using our extractors against BCPs, we obtain improved $N$-out-of-$N$ leakage-resilient secret sharing schemes. The previous best scheme, from Kumar et al. (FOCS 2019), required the share size to grow exponentially in the collusion bound, and thus cannot efficiently handle $p=\omega(\log N)$. Our schemes have no dependence of this form, and can thus handle collusion size $p=0.99N$.
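    As a taste of the NOF-hard functions such a compiler would start from, here is a tiny Python implementation of generalized inner product (GIP) over GF(2), a standard example of a function with high NOF communication complexity; it illustrates the function family only, not the paper's extractor construction.

    ```python
    import random

    def gip(sources):
        """GIP over GF(2): XOR over positions of the AND of all parties' bits."""
        n = len(sources[0])
        out = 0
        for i in range(n):
            bit = 1
            for src in sources:
                bit &= src[i]
            out ^= bit
        return out

    N, n = 4, 16
    sources = [[random.randrange(2) for _ in range(n)] for _ in range(N)]
    print("GIP output bit:", gip(sources))
    ```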

    A Hybrid Approach to Privacy-Preserving Federated Learning

    Federated learning facilitates the collaborative training of models without the sharing of raw data. However, recent attacks demonstrate that simply maintaining data locality during training does not provide sufficient privacy guarantees. Rather, we need a federated learning system capable of preventing inference over both the messages exchanged during training and the final trained model, while ensuring the resulting model also has acceptable predictive accuracy. Existing federated learning approaches either use secure multiparty computation (SMC), which is vulnerable to inference, or differential privacy, which can lead to low accuracy given a large number of parties with relatively small amounts of data each. In this paper, we present an alternative approach that utilizes both differential privacy and SMC to balance these trade-offs. Combining differential privacy with secure multiparty computation enables us to reduce the growth of noise injection as the number of parties increases, without sacrificing privacy, while maintaining a pre-defined rate of trust. Our system is therefore a scalable approach that protects against inference threats and produces models with high accuracy. Additionally, our system can be used to train a variety of machine learning models, which we validate with experimental results on 3 different machine learning algorithms. Our experiments demonstrate that our approach outperforms state-of-the-art solutions.
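    A schematic NumPy sketch of the DP-plus-SMC combination follows, with illustrative noise calibration rather than the paper's exact mechanism: because the server only ever sees the securely aggregated sum, each party can scale its Gaussian noise down by a factor of sqrt(N), and the summed noise still meets the target standard deviation sigma.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    N, sigma = 100, 4.0                   # parties and target total noise std
    updates = rng.standard_normal(N)      # each party's (clipped) model update

    # Local DP noise, scaled down because noises add up inside secure aggregation:
    noisy = updates + rng.normal(0.0, sigma / np.sqrt(N), size=N)

    # Secure-aggregation stand-in: zero-sum masks hide individual contributions.
    masks = rng.integers(-10**6, 10**6, size=N).astype(float)
    masks -= masks.mean()                 # toy zero-sum masks (assumption)
    aggregate = np.sum(noisy + masks)     # masks cancel in the sum

    # Total injected noise has std ~ sigma, not sigma * sqrt(N):
    print("aggregate:", aggregate, "vs clean sum:", updates.sum())
    ```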