23 research outputs found

    Formalising oblivious transfer in the semi-honest and malicious model in CryptHOL

    Get PDF
    Multi-Party Computation (MPC) allows multiple parties to compute a function together while keeping their inputs private. Large scale implementations of MPC protocols are becoming practical thus it is important to have strong guarantees for the whole development process, from the underlying cryptography to the implementation. Computer aided proofs are a way to provide such guarantees. We use CryptHOL to formalise a framework for reasoning about two party protocols using the security definitions for MPC. In particular we consider protocols for 1-out-of-2 Oblivious Transfer (OT21OT^1_2) --- a fundamental MPC protocol --- in both the semi-honest and malicious models. We then extend our semi-honest formalisation to OT41OT^1_4 which is a building block for our proof of security for the two party GMW protocol --- a protocol that can securely compute any Boolean circuit. The semi-honest OT21OT^1_2 protocol we formalise is constructed from Extended Trapdoor Permutations (ETP), we first prove the general construction secure and then instantiate for the RSA collection of functions --- a known ETP. Our general proof assumes only the existence of ETPs, meaning any instantiated results come without needing to prove any security properties, only that the requirements of an ETP are met

    TAPAS : tricks to accelerate (encrypted) prediction as a service

    Get PDF
    Machine learning methods are widely used for a variety of prediction problems. Prediction as a service is a paradigm in which service providers with technological expertise and computational resources may perform predictions for clients. However, data privacy severely restricts the applicability of such services, unless measures to keep client data private (even from the service provider) are designed. Equally important is to minimize the nature of computation and amount of communication required between client and server. Fully homomorphic encryption offers a way out, whereby clients may encrypt their data, and on which the server may perform arithmetic computations. The one drawback of using fully homomorphic encryption is the amount of time required to evaluate large machine learning models on encrypted data. We combine several ideas from the machine learning literature, particularly work on quantization and sparsification of neural networks, together with algorithmic tools to speed-up and parallelize computation using encrypted data

    Private federated discovery of out-of-vocabulary words for Gboard

    Full text link
    The vocabulary of language models in Gboard, Google's keyboard application, plays a crucial role for improving user experience. One way to improve the vocabulary is to discover frequently typed out-of-vocabulary (OOV) words on user devices. This task requires strong privacy protection due to the sensitive nature of user input data. In this report, we present a private OOV discovery algorithm for Gboard, which builds on recent advances in private federated analytics. The system offers local differential privacy (LDP) guarantees for user contributed words. With anonymous aggregation, the final released result would satisfy central differential privacy guarantees with ε=0.315,δ=10−10\varepsilon = 0.315, \delta = 10^{-10} for OOV discovery in en-US (English in United States)

    Make Some ROOM for the Zeros: Data Sparsity in Secure Distributed Machine Learning

    Get PDF
    Exploiting data sparsity is crucial for the scalability of many data analysis tasks. However, while there is an increasing interest in efficient secure computation protocols for distributed machine learning, data sparsity has so far not been considered in a principled way in that setting. We propose sparse data structures together with their corresponding secure computation protocols to address common data analysis tasks while utilizing data sparsity. In particular, we define a Read-Only Oblivious Map primitive (ROOM) for accessing elements in sparse structures, and present several instantiations of this primitive with different trade-offs. Then, using ROOM as a building block, we propose protocols for basic linear algebra operations such as Gather, Scatter, and multiple variants of sparse matrix multiplication. Our protocols are easily composable by using secret sharing. We leverage this, at the highest level of abstraction, to build secure end-to-end protocols for non-parametric models (kk-nearest neighbors and naive Bayes classification) and parametric models (logistic regression) that enable secure analysis on high-dimensional datasets. The experimental evaluation of our protocol implementations demonstrates a manyfold improvement in the efficiency over state-of-the-art techniques across all applications. Our system is designed and built mirroring the modular architecture in scientific computing and machine learning frameworks, and inspired by the Sparse BLAS standard

    Reconstructing Genotypes in Private Genomic Databases from Genetic Risk Scores

    Get PDF
    Some organizations such as 23andMe and the UK Biobank have large genomic databases that they re-use for multiple different genome-wide association studies. Even research studies that compile smaller genomic databases often utilize these databases to investigate many related traits. It is common for the study to report a genetic risk score (GRS) model for each trait within the publication. Here, we show that under some circumstances, these GRS models can be used to recover the genetic variants of individuals in these genomic databases—a reconstruction attack. In particular, if two GRS models are trained by using a largely overlapping set of participants, it is often possible to determine the genotype for each of the individuals who were used to train one GRS model, but not the other. We demonstrate this theoretically and experimentally by analyzing the Cornell Dog Genome database. The accuracy of our reconstruction attack depends on how accurately we can estimate the rate of co-occurrence of pairs of single nucleotide polymorphisms within the private database, so if this aggregate information is ever released, it would drastically reduce the security of a private genomic database. Caution should be applied when using the same database for multiple analysis, especially when a small number of individuals are included or excluded from one part of the study

    Secure Stateful Aggregation: A Practical Protocol with Applications in Differentially-Private Federated Learning

    Get PDF
    Recent advances in differentially private federated learning (DPFL) algorithms have found that using correlated noise across the rounds of federated learning (DP-FTRL) yields provably and empirically better accuracy than using independent noise (DP-SGD). While DP-SGD is well-suited to federated learning with a single untrusted central server using lightweight secure aggregation protocols, secure aggregation is not conducive to implementing modern DP-FTRL techniques without assuming a trusted central server. DP-FTRL based approaches have already seen widespread deployment in industry, albeit with a trusted central curator who provides and applies the correlated noise. To realize a fully private, single untrusted server DP-FTRL federated learning protocol, we introduce secure stateful aggregation: a simple append-only data structure that allows for the private storage of aggregate values and reading linear functions of the aggregates. Assuming Ring Learning with Errors, we provide a lightweight and scalable realization of this protocol for high-dimensional data in a new security/resource model, Federated MPC: where a powerful persistent server interacts with weak, ephemeral clients. We observe that secure stateful aggregation suffices for realizing DP-FTRL-based private federated learning: improving DPFL utility guarantees over the state of the art while maintaining privacy with an untrusted central party. Our approach has minimal overhead relative to existing techniques which do not yield comparable utility. The secure stateful aggregation primitive and the federated MPC paradigm may be of interest for other practical applications
    corecore