
    Systematizing Decentralization and Privacy: Lessons from 15 Years of Research and Deployments

    Decentralized systems are a subset of distributed systems where multiple authorities control different components and no authority is fully trusted by all. This implies that any component in a decentralized system is potentially adversarial. We review fifteen years of research on decentralization and privacy, and provide an overview of key systems, as well as key insights for designers of future systems. We show that decentralized designs can enhance privacy, integrity, and availability but also require careful trade-offs in terms of system complexity, properties provided, and degree of decentralization. These trade-offs need to be understood and navigated by designers. We argue that a combination of insights from cryptography, distributed systems, and mechanism design, aligned with the development of adequate incentives, is necessary to build scalable and successful privacy-preserving decentralized systems.

    Verifiable Encodings for Secure Homomorphic Analytics

    Homomorphic encryption, which enables the execution of arithmetic operations directly on ciphertexts, is a promising solution for protecting the privacy of cloud-delegated computations on sensitive data. However, the correctness of the computation result is not ensured. We propose two error detection encodings and build authenticators that enable practical client-verification of cloud-based homomorphic computations under different trade-offs and without compromising on the features of the encryption algorithm. Our authenticators operate on top of trending ring-learning-with-errors-based fully homomorphic encryption schemes over the integers. We implement our solution in VERITAS, a ready-to-use system for verification of outsourced computations executed over encrypted data. We show that, contrary to prior work, VERITAS supports verification of any homomorphic operation, and we demonstrate its practicality for various applications, such as ride-hailing, genomic-data analysis, encrypted search, and machine-learning training and inference. Comment: update authors, typos corrected, scheme update

    Robust and Actively Secure Serverless Collaborative Learning

    Collaborative machine learning (ML) is widely used to enable institutions to learn better models from distributed data. While collaborative approaches to learning intuitively protect user data, they remain vulnerable to either the server, the clients, or both, deviating from the protocol. Indeed, because the protocol is asymmetric, a malicious server can abuse its power to reconstruct client data points. Conversely, malicious clients can corrupt learning with malicious updates. Thus, both clients and servers require a guarantee when the other cannot be trusted to fully cooperate. In this work, we propose a peer-to-peer (P2P) learning scheme that is secure against malicious servers and robust to malicious clients. Our core contribution is a generic framework that transforms any (compatible) algorithm for robust aggregation of model updates to the setting where servers and clients can act maliciously. Finally, we demonstrate the computational efficiency of our approach even with 1-million-parameter models trained by hundreds of peers on standard datasets. Comment: Accepted at NeurIPS 202

    Evaluate and Guard the Wisdom of Crowds: Zero Knowledge Proofs for Crowdsourcing Truth Inference

    Due to the risks of correctness and security in outsourced cloud computing, we consider a new paradigm called crowdsourcing: distribute tasks, receive answers and aggregate the results from multiple entities. Through this approach, we can aggregate the wisdom of the crowd to complete tasks, ensuring the accuracy of task completion while reducing the risks posed by the malicious acts of a single entity. However, the ensuing question is, how can we ensure that the aggregator has done its work honestly and each contributor's work has been evaluated fairly? In this paper, we propose a new scheme called zkTI. This scheme ensures that the aggregator has honestly completed the aggregation and each data source is fairly evaluated. We combine a cryptographic primitive called zero-knowledge proof with a class of truth inference algorithms which is widely studied in AI/ML scenarios. Under this scheme, various complex outsourced tasks can be solved with efficiency and accuracy. To build our scheme, a novel method to prove the precise computation of floating-point numbers is proposed, which is nearly optimal and well-compatible with existing argument systems. This may become an independent point of interest. Thus our work can prove the process of aggregation and inference without loss of precision. We fully implement and evaluate our ideas. Compared with recent works, our scheme achieves a 2-4× efficiency improvement and is robust enough to be widely applied.

    Know your customer: balancing innovation and regulation for financial inclusion

    Financial inclusion depends on providing adjusted services for citizens with disclosed vulnerabilities. At the same time, the financial industry needs to adhere to a strict regulatory framework, which is often in conflict with the desire for inclusive, adaptive, and privacy-preserving services. In this article we study how this tension impacts the deployment of privacy-sensitive technologies aimed at financial inclusion. We conduct a qualitative study with banking experts to understand their perspectives on service development for financial inclusion. We build and demonstrate a prototype solution based on open source decentralized identifiers and verifiable credentials software and report on feedback from the banking experts on this system. The technology is promising thanks to its selective disclosure of vulnerabilities under the full control of the individual. This supports GDPR requirements, but at the same time, there is a clear tension between introducing these technologies and fulfilling other regulatory requirements, particularly with respect to 'Know Your Customer.' We consider the policy implications stemming from these tensions and provide guidelines for the further design of related technologies. Comment: Published in the Journal Data & Polic

    Practical and Provably Secure Distributed Aggregation: Verifiable Additive Homomorphic Secret Sharing

    Often clients (e.g., sensors, organizations) need to outsource joint computations that are based on some joint inputs to external untrusted servers. These computations often rely on the aggregation of data collected from multiple clients, while the clients want to guarantee that the results are correct and, thus, an output that can be publicly verified is required. However, important security and privacy challenges are raised, since clients may hold sensitive information. In this paper, we propose an approach, called verifiable additive homomorphic secret sharing (VAHSS), to achieve practical and provably secure aggregation of data, while allowing for the clients to protect their secret data and providing public verifiability, i.e., everyone should be able to verify the correctness of the computed result. We propose three VAHSS constructions by combining an additive homomorphic secret sharing (HSS) scheme, for computing the sum of the clients' secret inputs, and three different methods for achieving public verifiability, namely: (i) homomorphic collision-resistant hash functions; (ii) linear homomorphic signatures; as well as (iii) a threshold RSA signature scheme. In all three constructions, we provide a detailed correctness, security, and verifiability analysis and detailed experimental evaluations. Our results demonstrate the efficiency of our proposed constructions, especially from the client side.
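
    To make the additive-sharing step in the abstract above concrete, here is a minimal sketch of plain additive secret sharing used to compute a sum. It covers only the aggregation part; none of the three public-verifiability methods is modelled. The modulus `PRIME` and the helper names `share` and `aggregate` are assumptions chosen for illustration and are not taken from the paper.

```python
import secrets

# Assumed modulus for illustration only; the paper's parameters may differ.
PRIME = 2**61 - 1


def share(secret, n_servers):
    """Split `secret` into n_servers additive shares modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_servers - 1)]
    # The last share is chosen so that all shares sum to the secret.
    shares.append((secret - sum(shares)) % PRIME)
    return shares


def aggregate(all_shares):
    """Each server sums the shares it received; the partial sums are then added."""
    partial_sums = [sum(column) % PRIME for column in zip(*all_shares)]
    return sum(partial_sums) % PRIME


# Three hypothetical clients, three servers.
inputs = [12, 7, 30]
all_shares = [share(x, 3) for x in inputs]
total = aggregate(all_shares)
```

    No single server sees more than one uniformly random share per client, yet the combined partial sums equal the sum of the inputs.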

    Distributed Differentially Private Averaging with Improved Utility and Robustness to Malicious Parties

    Learning from data owned by several parties, as in federated learning, raises challenges regarding the privacy guarantees provided to participants and the correctness of the computation in the presence of malicious parties. We tackle these challenges in the context of distributed averaging, an essential building block of distributed and federated learning. Our first contribution is a novel distributed differentially private protocol which naturally scales with the number of parties. The key idea underlying our protocol is to exchange correlated Gaussian noise along the edges of a network graph, complemented by independent noise added by each party. We analyze the differential privacy guarantees of our protocol and the impact of the graph topology, showing that we can match the accuracy of the trusted curator model even when each party communicates with only a logarithmic number of other parties chosen at random. This is in contrast with protocols in the local model of privacy (with lower accuracy) or based on secure aggregation (where all pairs of users need to exchange messages). Our second contribution is to enable users to prove the correctness of their computations without compromising the efficiency and privacy guarantees of the protocol. Our construction relies on standard cryptographic primitives like commitment schemes and zero-knowledge proofs. Comment: 39 page
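
    The noise-exchange idea in the abstract above can be sketched as follows, assuming the correlated noise takes the form of one Gaussian term per edge that one endpoint adds and the other subtracts, so that these terms cancel in the sum. That form, the function name `noisy_values`, and all parameters are assumptions made for illustration; the sketch carries no privacy analysis and is not the protocol from the paper.

```python
import random


def noisy_values(values, edges, sigma_pair, sigma_indep, rng):
    """Return perturbed values and the total independent noise that was added."""
    noisy = list(values)
    # Correlated noise: one Gaussian term per edge, added by u and subtracted by v.
    for u, v in edges:
        eta = rng.gauss(0.0, sigma_pair)
        noisy[u] += eta
        noisy[v] -= eta
    # Independent noise: each party adds its own small Gaussian term.
    indep_total = 0.0
    for i in range(len(noisy)):
        e = rng.gauss(0.0, sigma_indep)
        noisy[i] += e
        indep_total += e
    return noisy, indep_total


rng = random.Random(0)
values = [1.0, 2.0, 3.0, 4.0]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]  # hypothetical ring topology
noisy, indep_total = noisy_values(values, edges, 10.0, 0.1, rng)
# The pairwise terms cancel, so only the independent noise shifts the sum.
residual = sum(noisy) - sum(values) - indep_total
```

    Each individual value is masked by large noise, while the sum (and hence the average) is perturbed only by the small independent terms.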

    Privacy-Preserving Cloud-Assisted Data Analytics

    Nowadays, industries are collecting a massive and exponentially growing amount of data that can be utilized to extract useful insights for improving various aspects of our life. Data analytics (e.g., via the use of machine learning) has been extensively applied to make important decisions in various real-world applications. However, it is challenging for resource-limited clients to analyze their data in an efficient way when its scale is large. Additionally, the data resources are increasingly distributed among different owners. Nonetheless, users' data may contain private information that needs to be protected. Cloud computing has become more and more popular in both academia and industry communities. By pooling infrastructure and servers together, it can offer virtually unlimited resources easily accessible via the Internet. Various services could be provided by cloud platforms including machine learning and data analytics. The goal of this dissertation is to develop privacy-preserving cloud-assisted data analytics solutions to address the aforementioned challenges, leveraging the powerful and easy-to-access cloud. In particular, we propose the following systems. To address the problem of limited computation power at the user and the need for privacy protection in data analytics, we consider geometric programming (GP) in data analytics, and design a secure, efficient, and verifiable outsourcing protocol for GP. Our protocol consists of a transform scheme that converts GP to DGP, a transform scheme with computational indistinguishability, and an efficient scheme to solve the transformed DGP at the cloud side with result verification. Evaluation results show that the proposed secure outsourcing protocol can achieve significant time savings for users. To address the problem of limited data at individual users, we propose two distributed learning systems such that users can collaboratively train machine learning models without losing privacy.
The first one is a differentially private framework to train logistic regression models with distributed data sources. We employ the relevance between input data features and the model output to significantly improve the learning accuracy. Moreover, we adopt an evaluation data set at the cloud side to suppress low-quality data sources and propose a differentially private mechanism to protect users' data-quality privacy. Experimental results show that the proposed framework can achieve high utility with low-quality data, and a strong privacy guarantee. The second one is an efficient privacy-preserving federated learning system that enables multiple edge users to collaboratively train their models without revealing their datasets. To reduce the communication overhead, we select well-aligned and large-enough magnitude gradients for uploading, which leads to quick convergence. To minimize the noise added and improve model utility, each user only adds a small amount of noise to his selected gradients and encrypts the noisy gradients before uploading, so the cloud server will only get the aggregate gradients that contain enough noise to achieve differential privacy. Evaluation results show that the proposed system can achieve high accuracy, low communication overhead, and a strong privacy guarantee. In future work, we plan to design a privacy-preserving data analytics system with fair exchange, which ensures payment fairness. We will also consider designing distributed learning systems with heterogeneous architectures.

    Privacy Preserving Opinion Aggregation

    There are numerous settings in which people's preferences are aggregated outside of formal elections, and where privacy and verification are important but the stringent authentication and coercion-resistant properties of government elections do not apply, a prime example being social media platforms. These systems are often iterative and have no trusted authority, in contrast to the centrally organised, single-shot elections on which most of the literature is focused. Moreover, they require a continuous flow of aggregation to take place and become available even as input is still collected from the participants, which is in contrast to fairness in classical elections, where partial results should never be revealed. In this work, we explore opinion aggregation in a decentralised, iterative setting by proposing a novel protocol in which randomly-chosen participants take turns to act in an incentive-driven manner as decryption authorities. Our construction provides public verifiability, robust vote privacy and liveness guarantees, while striving to minimise the resources each participant needs to contribute.

    Lightweight Techniques for Private Heavy Hitters

    This paper presents a new protocol for solving the private heavy-hitters problem. In this problem, there are many clients and a small set of data-collection servers. Each client holds a private bitstring. The servers want to recover the set of all popular strings, without learning anything else about any client's string. A web-browser vendor, for instance, can use our protocol to figure out which homepages are popular, without learning any user's homepage. We also consider the simpler private subset-histogram problem, in which the servers want to count how many clients hold strings in a particular set without revealing this set to the clients. Our protocols use two data-collection servers and, in a protocol run, each client sends only a single message to the servers. Our protocols protect client privacy against arbitrary misbehavior by one of the servers and our approach requires no public-key cryptography (except for secure channels), nor general-purpose multiparty computation. Instead, we rely on incremental distributed point functions, a new cryptographic tool that allows a client to succinctly secret-share the labels on the nodes of an exponentially large binary tree, provided that the tree has a single non-zero path. Along the way, we develop new general tools for providing malicious security in applications of distributed point functions. In an experimental evaluation with two servers on opposite sides of the U.S., the servers can find the 200 most popular strings among a set of 400,000 client-held 256-bit strings in 54 minutes. Our protocols are highly parallelizable. We estimate that with 20 physical machines per logical server, our protocols could compute heavy hitters over ten million clients in just over one hour of computation. Comment: To appear in IEEE Security & Privacy 202
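
    The binary tree mentioned in the abstract above suggests a level-by-level prefix search, sketched below without any privacy protection. The sketch assumes that heavy hitters are found by counting, at each depth, how many client strings share a prefix and pruning prefixes below a threshold; in the actual protocol these counts would come from secret shares held by the two servers, which is not modelled here. The function name `heavy_hitters` and the example data are hypothetical.

```python
from collections import Counter


def heavy_hitters(strings, n_bits, threshold):
    """Return all n_bits-long strings held by at least `threshold` clients.

    Plaintext stand-in: the counts are computed directly on the client strings.
    """
    prefixes = [""]
    for depth in range(1, n_bits + 1):
        # Count how many clients hold a string starting with each prefix.
        counts = Counter(s[:depth] for s in strings)
        # Extend surviving prefixes by one bit and prune the unpopular ones.
        prefixes = [
            p + b for p in prefixes for b in "01" if counts[p + b] >= threshold
        ]
    return sorted(prefixes)


clients = ["0110", "0110", "0110", "1001", "1001", "1111"]
popular = heavy_hitters(clients, n_bits=4, threshold=2)
```

    Because a string can only be popular if every one of its prefixes is popular, pruning keeps the number of candidates per level bounded by the number of clients divided by the threshold, even though the full tree is exponentially large.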