76,950 research outputs found

    Perceptions and Truth: A Mechanism Design Approach to Crowd-Sourcing Reputation

    Full text link
    We consider a distributed multi-user system where individual entities possess observations or perceptions of one another, while the truth is known only to themselves, and they may have an interest in withholding or distorting it. We ask whether it is possible for the system as a whole to arrive at the correct perception or assessment of all users, referred to as their reputation, by encouraging or incentivizing the users to participate in a collective effort without violating their private information or self-interest. Two specific applications, online shopping and network reputation, are provided to motivate our study and interpret the results. In this paper we investigate this problem using a mechanism design theoretic approach. We introduce a number of utility models representing users' strategic behavior, each consisting of one or both of a truth element and an image element, reflecting the user's desire to obtain an accurate view of others and an inflated image of itself. For each model, we either design a mechanism that achieves the optimal performance (the solution to the corresponding centralized problem), or present individually rational sub-optimal solutions. In the latter case, we demonstrate that even when the centralized solution is not achievable, by using a simple punish-reward mechanism, not only does a user have the incentive to participate and provide information, but this information also improves the system performance. Comment: 14 pages, 3 figures.
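
    As a toy illustration of the punish-reward idea mentioned at the end of the abstract, the sketch below scores each reporter by how far its reports deviate from the aggregate of everyone else's reports and pays or penalizes it accordingly. This is a generic peer-consistency scheme written for this summary, not the mechanism designed in the paper; the threshold and payoff values are arbitrary.

```python
import numpy as np

def punish_reward(reports, reward=1.0, penalty=1.0, threshold=1.5):
    """reports[i, j]: user i's reported score for user j (toy, hypothetical)."""
    reports = np.asarray(reports, dtype=float)
    n = reports.shape[0]
    payoffs = np.zeros(n)
    for i in range(n):
        others = np.delete(reports, i, axis=0)        # everyone else's reports
        consensus = others.mean(axis=0)
        deviation = np.abs(reports[i] - consensus).mean()
        payoffs[i] = reward if deviation <= threshold else -penalty
    return payoffs

# Three users rating each other on a 0-5 scale; user 2 inflates everyone.
reports = [[4.0, 3.0, 2.0],
           [4.5, 3.5, 2.0],
           [5.0, 5.0, 5.0]]
print(punish_reward(reports))    # [ 1.  1. -1.] -- the inflating user is penalized
```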

    How To Backdoor Federated Learning

    Full text link
    Federated learning enables thousands of participants to construct a deep learning model without sharing their private training data with each other. For example, multiple smartphones can jointly train a next-word predictor for keyboards without revealing what individual users type. We demonstrate that any participant in federated learning can introduce hidden backdoor functionality into the joint global model, e.g., to ensure that an image classifier assigns an attacker-chosen label to images with certain features, or that a word predictor completes certain sentences with an attacker-chosen word. We design and evaluate a new model-poisoning methodology based on model replacement. An attacker selected in a single round of federated learning can cause the global model to immediately reach 100% accuracy on the backdoor task. We evaluate the attack under different assumptions for the standard federated-learning tasks and show that it greatly outperforms data poisoning. Our generic constrain-and-scale technique also evades anomaly detection-based defenses by incorporating the evasion into the attacker's loss function during training.
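
    A minimal numerical sketch of the model-replacement idea described above, using flat numpy vectors as stand-in models. The FedAvg-style aggregation and the scaling factor gamma = n/eta are standard assumptions written for this illustration; the paper's full constrain-and-scale objective and anomaly-evasion loss are not reproduced.

```python
import numpy as np

def fedavg_step(global_model, client_updates, eta, n):
    # G_{t+1} = G_t + (eta / n) * sum_i (L_i - G_t)
    return global_model + (eta / n) * sum(u - global_model for u in client_updates)

def replacement_update(global_model, backdoored_model, eta, n):
    # Scale the malicious model so that, after averaging, the global model
    # is (approximately) replaced by the backdoored one.
    gamma = n / eta
    return gamma * (backdoored_model - global_model) + global_model

rng = np.random.default_rng(0)
d, n, eta = 8, 10, 1.0
G = rng.normal(size=d)                                   # current global model
X = rng.normal(size=d)                                   # attacker's backdoored model
honest = [G + 0.01 * rng.normal(size=d) for _ in range(n - 1)]
malicious = replacement_update(G, X, eta, n)
G_next = fedavg_step(G, honest + [malicious], eta, n)
print(np.linalg.norm(G_next - X))    # near zero: the global model is ~ the backdoored one
```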

    Differentially Private Hierarchical Count-of-Counts Histograms

    Full text link
    We consider the problem of privately releasing a class of queries that we call hierarchical count-of-counts histograms. Count-of-counts histograms partition the rows of an input table into groups (e.g., groups of people in the same household), and for every integer j report the number of groups of size j. Hierarchical count-of-counts queries report count-of-counts histograms at different granularities according to a hierarchy defined on an attribute of the input data (e.g., the geographical location of a household at the national, state, and county levels). In this paper, we introduce this problem along with appropriate error metrics, and propose a differentially private solution that generates count-of-counts histograms that are consistent across all levels of the hierarchy. Comment: 13 pages.
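
    The count-of-counts construction itself is easy to make concrete. The sketch below computes a single-level count-of-counts histogram and adds Laplace noise; the hierarchical consistency step from the paper is not shown, and the sensitivity of 2 is an illustrative choice rather than the paper's analysis.

```python
from collections import Counter
import numpy as np

def count_of_counts(group_ids):
    group_sizes = Counter(group_ids)          # size of each group (e.g., household)
    return Counter(group_sizes.values())      # how many groups have each size j

def noisy_count_of_counts(group_ids, epsilon, max_size, rng=np.random.default_rng(0)):
    coc = count_of_counts(group_ids)
    # Adding or removing one person changes at most two buckets by 1 each,
    # so we use sensitivity 2 (an illustrative choice).
    scale = 2.0 / epsilon
    return {j: coc.get(j, 0) + rng.laplace(0.0, scale) for j in range(1, max_size + 1)}

households = ["h1", "h1", "h2", "h3", "h3", "h3", "h4"]
print(count_of_counts(households))     # two groups of size 1, one of size 2, one of size 3
print(noisy_count_of_counts(households, epsilon=1.0, max_size=4))
```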

    Privacy-Preserving Multiparty Learning For Logistic Regression

    Full text link
    In recent years, machine learning techniques have been widely used in numerous applications, such as weather forecasting, financial data analysis, spam filtering, and medical prediction. Meanwhile, massive data generated from multiple sources further improve the performance of machine learning tools. However, sharing data from multiple sources raises privacy issues for those sources, since sensitive information may be leaked in the process. In this paper, we propose a framework enabling multiple parties to collaboratively and accurately train a learning model over distributed datasets while guaranteeing the privacy of the data sources. Specifically, we consider the logistic regression model for data training and propose two approaches for perturbing the objective function to preserve ε-differential privacy. The proposed solutions are tested on real datasets, including Bank Marketing and Credit Card Default prediction. Experimental results demonstrate that the proposed multiparty learning framework is highly efficient and accurate. Comment: This work was done when Wei Du was at the University of Arkansas.
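
    A hedged sketch of objective perturbation for regularized logistic regression, in the spirit of the approach described above: a random linear term is added to the objective before optimization. The noise calibration and the gradient-descent solver below are illustrative; the paper's two specific perturbation schemes and its multiparty protocol are not reproduced.

```python
import numpy as np

def private_logreg(X, y, epsilon, lam=0.1, lr=0.5, steps=2000, seed=0):
    """Objective-perturbation sketch: add a random linear term b.w/n to the
    regularized logistic loss before optimizing. Labels y are in {0, 1}."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Noise vector with Gamma-distributed norm and uniform direction
    # (illustrative scaling; the paper derives its own calibration).
    direction = rng.normal(size=d)
    b = rng.gamma(shape=d, scale=2.0 / epsilon) * direction / np.linalg.norm(direction)
    w = np.zeros(d)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        grad = X.T @ (p - y) / n + lam * w + b / n   # gradient of the perturbed objective
        w -= lr * grad
    return w

rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=500)
X = rng.normal(size=(500, 2)) + 1.5 * (2 * y[:, None] - 1)   # two shifted Gaussian classes
w = private_logreg(X, y, epsilon=1.0)
acc = ((1.0 / (1.0 + np.exp(-(X @ w))) > 0.5) == y).mean()
print(w, acc)
```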

    Towards Federated Learning at Scale: System Design

    Full text link
    Federated Learning is a distributed machine learning approach that enables model training on a large corpus of decentralized data. We have built a scalable production system for Federated Learning in the domain of mobile devices, based on TensorFlow. In this paper, we describe the resulting high-level design, sketch some of the challenges and their solutions, and touch upon the open problems and future directions.

    Engineering Methods for Differentially Private Histograms: Efficiency Beyond Utility

    Full text link
    Publishing histograms under ε-differential privacy has been studied extensively in the literature. Existing schemes aim at maximizing the utility of the published data, and previous experimental evaluations analyze the privacy/utility trade-off. In this paper we provide the first experimental evaluation of differentially private histogram methods that goes beyond utility, emphasizing another important aspect, namely efficiency. Towards this end, we first observe that all existing schemes are composed of a small set of common blocks. We then optimize and choose the best implementation for each block, determine the combinations of blocks that capture the entire literature, and propose novel block combinations. We qualitatively assess the quality of the schemes based on the skyline of efficiency and utility, i.e., based on whether a method is dominated on both aspects or not. Using exhaustive experiments on four real datasets with different characteristics, we conclude that there are always trade-offs between utility and efficiency. We demonstrate that the schemes derived from our novel block combinations provide the best trade-offs for time-critical applications. Our work can serve as a guide to help practitioners engineer a differentially private histogram scheme depending on their application requirements.
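
    As a concrete (if simplistic) example of the "common blocks" view, the sketch below composes a partition block, a Laplace-noise block, and a post-processing block into a private histogram. It is a generic composition written for this summary, not any particular scheme evaluated in the paper.

```python
import numpy as np

def partition_block(data, edges):
    counts, _ = np.histogram(data, bins=edges)
    return counts

def noise_block(counts, epsilon, rng=np.random.default_rng(0)):
    # Laplace mechanism: histogram counts have L1 sensitivity 1 per record.
    return counts + rng.laplace(0.0, 1.0 / epsilon, size=counts.shape)

def postprocess_block(noisy_counts):
    # Simple consistency step: counts cannot be negative.
    return np.clip(noisy_counts, 0, None)

data = np.random.default_rng(1).uniform(0, 100, size=1000)
edges = np.linspace(0, 100, 11)
print(postprocess_block(noise_block(partition_block(data, edges), epsilon=1.0)))
```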

    Quantifying Privacy in Nuclear Warhead Authentication Protocols

    Full text link
    International verification of nuclear warheads is a practical problem in which the protection of secret warhead information is of paramount importance. We propose a measure that would enable a weapon owner to evaluate the privacy of a proposed protocol in a technology-neutral fashion. We show that the problem is reducible to "natural" and "corrective" learning. The natural learning can be computed without assumptions about the inspector, while the corrective learning accounts for the inspector's prior knowledge. The natural learning provides the warhead owner with a useful lower bound on the information leaked by the proposed protocol. Using numerical examples, we demonstrate that the proposed measure correlates better with the accuracy of a maximum a posteriori probability estimate than alternative measures.

    Learning Privately from Multiparty Data

    Full text link
    Learning a classifier from private data collected by multiple parties is an important problem that has many potential applications. How can we build an accurate and differentially private global classifier by combining locally trained classifiers from different parties, without access to any party's private data? We propose to transfer the "knowledge" of the local classifier ensemble by first creating labeled data from auxiliary unlabeled data, and then training a global ε-differentially private classifier. We show that majority voting is too sensitive and therefore propose a new risk weighted by class probabilities estimated from the ensemble. Relative to a non-private solution, our private solution has a generalization error bounded by O(ε^{-2} M^{-2}), where M is the number of parties. This allows strong privacy without performance loss when M is large, such as in crowdsensing applications. We demonstrate the performance of our method on realistic tasks of activity recognition, network intrusion detection, and malicious URL detection.
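
    The transfer-by-pseudo-labeling idea can be sketched as follows: each party's local model scores auxiliary unlabeled data, the class probabilities are averaged across the ensemble, and the resulting labels (with confidence weights) would feed the global training step. `CentroidModel` is a hypothetical stand-in for a locally trained classifier, and the paper's exact risk weighting and ε-differentially private training are not reproduced.

```python
import numpy as np

class CentroidModel:
    """Hypothetical stand-in for a party's locally trained classifier."""
    def fit(self, X, y):
        self.centroids = np.stack([X[y == c].mean(axis=0) for c in (0, 1)])
        return self
    def predict_proba(self, X):
        d = np.linalg.norm(X[:, None, :] - self.centroids[None, :, :], axis=2)
        w = np.exp(-d)
        return w / w.sum(axis=1, keepdims=True)

def ensemble_soft_labels(models, X_aux):
    # Average class-probability estimates across the parties' local models.
    probs = np.mean([m.predict_proba(X_aux) for m in models], axis=0)
    return probs.argmax(axis=1), probs.max(axis=1)

rng = np.random.default_rng(0)

def party_data(n=200):
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 2)) + 1.5 * (2 * y[:, None] - 1)
    return X, y

# Each party trains only on its own private data; only the models are used afterwards.
local_models = [CentroidModel().fit(*party_data()) for _ in range(5)]

# Auxiliary unlabeled data receives pseudo-labels and confidence weights from the
# ensemble; in the paper these would feed an epsilon-DP global training step.
X_aux = rng.normal(size=(300, 2)) * 2.0
y_pseudo, weights = ensemble_soft_labels(local_models, X_aux)
print(y_pseudo[:10], weights[:10].round(2))
```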

    Distributed Differentially Private Computation of Functions with Correlated Noise

    Full text link
    Many applications of machine learning, such as human health research, involve processing private or sensitive information. Privacy concerns may impose significant hurdles to collaboration in scenarios where multiple sites hold data and the goal is to estimate properties jointly across all datasets. Differentially private decentralized algorithms can provide strong privacy guarantees. However, the accuracy of the joint estimates may be poor when the datasets at each site are small. This paper proposes a new framework, Correlation Assisted Private Estimation (CAPE), for designing privacy-preserving decentralized algorithms with better accuracy guarantees in an honest-but-curious model. CAPE can be used in conjunction with the functional mechanism for statistical and machine learning optimization problems. A tighter characterization of the functional mechanism is provided that allows CAPE to achieve the same performance as a centralized algorithm in the decentralized setting using all datasets. Empirical results on regression and neural network problems for both synthetic and real datasets show that differentially private methods can be competitive with non-private algorithms in many scenarios of interest. Comment: The manuscript is partially subsumed by arXiv:1910.1291.
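
    The core accuracy benefit of correlated noise is easy to see in a toy example: if the per-site noise terms are constructed to sum to zero, they cancel in the pooled estimate. In CAPE this zero-sum noise is generated jointly without revealing individual terms; below it is generated in one place, with arbitrary noise scales, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
S = 10                                                    # number of sites
local_means = rng.normal(loc=5.0, scale=0.1, size=S)     # each site's true statistic

e = rng.normal(0.0, 1.0, size=S)
e -= e.mean()                                             # correlated noise: sums to zero
f = rng.normal(0.0, 0.05, size=S)                         # small independent local noise

released = local_means + e + f                            # what each site sends out
naive = local_means + rng.normal(0.0, 1.0, size=S) + f    # independent noise only

print(abs(released.mean() - local_means.mean()),          # small: e cancels in the average
      abs(naive.mean() - local_means.mean()))             # larger: nothing cancels
```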

    Defending Non-Bayesian Learning against Adversarial Attacks

    Full text link
    This paper addresses the problem of non-Bayesian learning over multi-agent networks, where agents repeatedly collect partially informative observations about an unknown state of the world and try to collaboratively learn the true state. We focus on the impact of adversarial agents on the performance of consensus-based non-Bayesian learning, where non-faulty agents combine local learning updates with consensus primitives. In particular, we consider the scenario where an unknown subset of agents suffer Byzantine faults -- agents suffering Byzantine faults behave arbitrarily. Two different learning rules are proposed.
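
    A toy flavor of consensus-based non-Bayesian learning with a simple Byzantine-robust twist: trim extreme neighbor beliefs before averaging, then reweight by the local observation likelihood. This generic sketch is not either of the two learning rules proposed in the paper, and the neighbors' beliefs are held fixed purely for illustration.

```python
import numpy as np

def trimmed_mean(beliefs, trim):
    # Coordinate-wise trimmed average of belief vectors (drop `trim` extremes per side).
    arr = np.sort(np.stack(beliefs), axis=0)
    return arr[trim:arr.shape[0] - trim].mean(axis=0)

def update_belief(own, neighbors, likelihood, trim=1):
    mixed = trimmed_mean([own] + neighbors, trim)    # robust consensus step
    post = mixed * likelihood                        # reweight by local observation likelihood
    return post / post.sum()

# Two hypotheses; a Byzantine neighbor keeps pushing belief toward the wrong one.
belief = np.array([0.5, 0.5])
honest = [np.array([0.6, 0.4]), np.array([0.55, 0.45])]
byzantine = np.array([0.0, 1.0])
likelihood = np.array([0.7, 0.3])                    # local data favors hypothesis 0
for _ in range(20):
    belief = update_belief(belief, honest + [byzantine], likelihood)
print(belief)    # most of the mass ends up on hypothesis 0 despite the Byzantine neighbor
```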