
    Decentralized Differentially Private Without-Replacement Stochastic Gradient Descent

    While machine learning has achieved remarkable results in a wide variety of domains, training models often requires large datasets that may need to be collected from different individuals. As an individual's dataset may contain sensitive information, sharing training data may lead to severe privacy concerns. Therefore, there is a compelling need to develop privacy-aware machine learning methods, and one effective approach is to leverage the generic framework of differential privacy. Considering that stochastic gradient descent (SGD) is one of the most widely adopted methods for large-scale machine learning problems, two decentralized differentially private SGD algorithms are proposed in this work. In particular, we focus on SGD without replacement because its structure is favorable for practical implementation. In addition, both privacy and convergence analyses are provided for the proposed algorithms. Finally, extensive experiments are performed to verify the theoretical results and demonstrate the effectiveness of the proposed algorithms.
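To make "SGD without replacement" concrete, the following is a minimal single-node sketch, not the paper's decentralized algorithms. It assumes a squared-loss linear model, gradient clipping, and Gaussian gradient noise, none of which the abstract specifies. The names `clip`, `noisy_sgd_without_replacement`, `clip_norm`, and `noise_std` are hypothetical, and `noise_std` is not calibrated to any particular privacy budget.

```python
import math
import random


def clip(vec, max_norm):
    """Scale vec down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= max_norm or norm == 0.0:
        return list(vec)
    scale = max_norm / norm
    return [v * scale for v in vec]


def noisy_sgd_without_replacement(data, dim, epochs=5, lr=0.1,
                                  clip_norm=1.0, noise_std=0.1, seed=0):
    """Illustrative noisy SGD; data is a list of (features, label) pairs.

    Assumption: clipping plus Gaussian noise stands in for whatever
    perturbation the paper actually uses.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    for _ in range(epochs):
        # Without replacement: shuffle once, then visit every example
        # exactly once per epoch instead of sampling indices independently.
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            pred = sum(wj * xj for wj, xj in zip(w, x))
            grad = [(pred - y) * xj for xj in x]  # squared-loss gradient
            grad = clip(grad, clip_norm)          # bound per-example influence
            noisy = [g + rng.gauss(0.0, noise_std) for g in grad]
            w = [wj - lr * nj for wj, nj in zip(w, noisy)]
    return w
```

With `noise_std=0.0` the routine reduces to ordinary clipped SGD over a random permutation, which is a convenient way to sanity-check it before adding noise.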

    Differentially Private Linear Models for Gossip Learning through Data Perturbation

    Privacy is a key concern in many distributed systems that are rich in personal data, such as networks of smart meters or smartphones. Decentralizing the processing of personal data in such systems is a promising first step towards achieving privacy by avoiding the collection of data altogether. However, decentralization in itself is not enough: additional guarantees such as differential privacy are highly desirable. Here, we focus on stochastic gradient descent (SGD), a popular approach to implementing distributed learning. Our goal is to design differentially private variants of SGD to be applied in gossip learning, a decentralized learning framework. Known approaches that are suitable for our scenario focus on protecting the gradient that is computed in each iteration of SGD. This has the drawback that each data point can be accessed only a small number of times. We propose a solution in which we effectively publish the entire database in a differentially private way, so that linear learners can be run that are allowed to access any (perturbed) data point any number of times. This flexibility is very useful when the method is combined with distributed learning environments. We show empirically that the performance of the obtained model is comparable to that of previous gradient-based approaches, and that it is even superior in certain scenarios.
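The idea of perturbing the data once and then learning on it freely can be sketched as follows. This is an illustration under stated assumptions, not the paper's mechanism: each coordinate is clipped to [-1, 1] and Laplace noise is added with scale 2(d+1)/epsilon, which is the standard Laplace mechanism applied to one record with d features and one label. The names `perturb_record`, `publish`, and `train_linear` are hypothetical.

```python
import random


def perturb_record(x, y, epsilon, rng):
    """Clip one record to [-1, 1] per coordinate and add Laplace noise."""
    values = [max(-1.0, min(1.0, v)) for v in list(x) + [y]]
    # Replacing a clipped record changes it by at most 2 per coordinate,
    # so the L1 sensitivity is 2 * len(values).
    scale = 2.0 * len(values) / epsilon
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    noisy = [v + rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
             for v in values]
    return noisy[:-1], noisy[-1]


def publish(data, epsilon, seed=0):
    """Perturb every (features, label) record once; the input is not modified."""
    rng = random.Random(seed)
    return [perturb_record(x, y, epsilon, rng) for x, y in data]


def train_linear(published, dim, epochs=20, lr=0.01):
    """Plain SGD on the already-perturbed records.

    No further noise is needed here: reusing published records is
    post-processing, so any number of passes is allowed.
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for x, y in published:
            err = sum(wj * xj for wj, xj in zip(w, x)) - y
            w = [wj - lr * err * xj for wj, xj in zip(w, x)]
    return w
```

A typical use would call `publish` once per node and then run `train_linear` (or any other learner) on the result as often as needed. For small `epsilon` the noise dominates the signal, so the step size and number of epochs in this sketch would need tuning.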