Privacy Preserving ID3 over Horizontally, Vertically and Grid Partitioned Data
We consider privacy preserving decision tree induction via ID3 in the case
where the training data is horizontally or vertically distributed. Furthermore,
we consider the same problem in the case where the data is both horizontally
and vertically distributed, a situation we refer to as grid partitioned data.
We give an algorithm for privacy preserving ID3 over horizontally partitioned
data involving more than two parties. For grid partitioned data, we discuss two
different evaluation methods for privacy preserving ID3, namely, first merging
horizontally and developing vertically or first merging vertically and next
developing horizontally. Besides introducing privacy preserving data mining
over grid-partitioned data, the main contribution of this paper is that we
show, by means of a complexity analysis, that the former evaluation method is
the more efficient.
Comment: 25 pages
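As a rough illustration of why ID3 suits horizontally partitioned data: the information gain of an attribute depends only on (attribute value, class label) counts, and those counts add up across parties that hold different rows. The sketch below is not the paper's protocol. The function names and the toy rows are hypothetical, and plain addition stands in for whatever secure aggregation step a real protocol would use.

```python
import math
from collections import Counter

def local_counts(rows, attr, label):
    """Count (attribute value, class label) pairs held by one party."""
    return Counter((r[attr], r[label]) for r in rows)

def entropy(class_counts):
    """Shannon entropy (in bits) of a class-count table."""
    total = sum(class_counts.values())
    return -sum(c / total * math.log2(c / total)
                for c in class_counts.values() if c)

def information_gain(pooled):
    """ID3 information gain computed from pooled (value, label) counts."""
    by_class = Counter()
    by_value = {}
    for (value, label), c in pooled.items():
        by_class[label] += c
        by_value.setdefault(value, Counter())[label] += c
    total = sum(by_class.values())
    remainder = sum(sum(cc.values()) / total * entropy(cc)
                    for cc in by_value.values())
    return entropy(by_class) - remainder

# Hypothetical toy data: two parties hold different rows over the same schema.
party_a = [{"outlook": "sunny", "play": "no"}, {"outlook": "rain", "play": "yes"}]
party_b = [{"outlook": "sunny", "play": "no"}, {"outlook": "rain", "play": "yes"}]

# Assumption: plain addition replaces the secure aggregation of local counts.
pooled = (local_counts(party_a, "outlook", "play")
          + local_counts(party_b, "outlook", "play"))
gain = information_gain(pooled)
```

Only the pooled counts enter `information_gain`, so no party needs to see another party's individual rows to choose the split attribute.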
Privacy-preserving Data Sharing on Vertically Partitioned Data
In this work, we introduce a differentially private method for generating
synthetic data from vertically partitioned data, i.e., where data of the
same individuals is distributed across multiple data holders or parties. We
present a differentially private stochastic gradient descent (DP-SGD) algorithm
to train a mixture model over such partitioned data using variational
inference. We modify a secure multiparty computation (MPC) framework to combine
MPC with differential privacy (DP), in order to use differentially private MPC
effectively to learn a probabilistic generative model under DP on such
vertically partitioned data.
Assuming the mixture components contain no dependencies across different
parties, the objective function can be factorized into a sum of products of the
contributions calculated by the parties. Finally, MPC is used to compute the
aggregate between the different contributions. Moreover, we rigorously define
the privacy guarantees with respect to the different players in the system. To
demonstrate the accuracy of our method, we run our algorithm on the Adult
dataset from the UCI machine learning repository, where we obtain comparable
results to the non-partitioned case.
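The "sum of products" structure can be written out for a generic mixture density. This is a sketch under assumed notation, not the paper's exact variational objective: $K$ components with weights $\pi_k$, $M$ parties, and $x^{(m)}$ denoting the features of one individual held by party $m$ are all symbols introduced here for illustration.

```latex
% Assumption: within each component k, features held by different parties are independent.
p(x) = \sum_{k=1}^{K} \pi_k \prod_{m=1}^{M} p_k^{(m)}\!\left(x^{(m)}\right),
\qquad
\log p(x_1, \dots, x_N) = \sum_{i=1}^{N} \log \sum_{k=1}^{K} \pi_k \prod_{m=1}^{M} p_k^{(m)}\!\left(x_i^{(m)}\right)
```

Each factor $p_k^{(m)}(x_i^{(m)})$ depends only on data held by party $m$, so it can be computed locally; only the product over parties and the sum over components require a joint computation.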
Privacy-Preserving Sequential Pattern Mining Over Vertically Partitioned Data
Privacy-preserving data mining in distributed environments is an important issue in the field of data mining. In this paper, we study how to conduct sequential pattern mining, which is one of the data mining computations, on private data in the following scenario: multiple parties, each having a private data set, want to jointly conduct sequential pattern mining. Since no party wants to disclose its private data to other parties, a secure method needs to be provided to make such a computation feasible. We develop a practical solution to the above problem in this paper.
Compressed-VFL: Communication-Efficient Learning with Vertically Partitioned Data
We propose Compressed Vertical Federated Learning (C-VFL) for
communication-efficient training on vertically partitioned data. In C-VFL, a
server and multiple parties collaboratively train a model on their respective
features utilizing several local iterations and sharing compressed intermediate
results periodically. Our work provides the first theoretical analysis of the
effect message compression has on distributed training over vertically
partitioned data. We prove convergence of non-convex objectives at a rate of
$O(1/\sqrt{T})$ when the compression error is bounded over the course
of training. We provide specific requirements for convergence with common
compression techniques, such as quantization and top-$k$ sparsification.
Finally, we experimentally show compression can reduce communication by over
90% without a significant decrease in accuracy over VFL without compression.
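The two compression techniques named in the abstract can be sketched in a few lines. This is a minimal illustration in plain Python, not the C-VFL implementation: the function names, the uniform quantization grid, and the example vector are assumptions made here.

```python
def top_k_sparsify(vec, k):
    """Keep the k largest-magnitude entries of vec and zero out the rest."""
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]

def uniform_quantize(vec, levels):
    """Round each entry to the nearest of `levels` evenly spaced values
    between min(vec) and max(vec)."""
    lo, hi = min(vec), max(vec)
    if hi == lo or levels < 2:
        return list(vec)
    step = (hi - lo) / (levels - 1)
    return [lo + round((v - lo) / step) * step for v in vec]

def squared_error(original, compressed):
    """Squared L2 distance between a vector and its compressed version."""
    return sum((a - b) ** 2 for a, b in zip(original, compressed))

# Hypothetical intermediate result that a party would send to the server.
embedding = [0.9, -0.1, 0.05, -1.2, 0.3]
sparse = top_k_sparsify(embedding, k=2)
quantized = uniform_quantize(embedding, levels=4)
```

`squared_error(embedding, sparse)` gives one simple measure of the compression error that the convergence result requires to stay bounded during training.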
Privacy-Preserving Naive Bayesian Classification Over Vertically Partitioned Data
Protection of privacy is a critical problem in data mining. Preserving data privacy in distributed data mining is even more challenging. In this paper, we consider the problem of privacy-preserving naive Bayesian classification over vertically partitioned data. The problem is one of the important issues in privacy-preserving distributed data mining. Our approach is based on homomorphic encryption. The scheme is very efficient in terms of computation and communication cost.
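To illustrate how additively homomorphic encryption fits naive Bayes over vertically partitioned data: each party can score a class using only its own features, and the per-class scores add up. The sketch below uses a toy Paillier cryptosystem with small hard-coded primes. It is not the paper's scheme and is not secure. The key sizes, the integer "cost" encoding (scaled negative log-probabilities), and the party names are all assumptions made here.

```python
import math
import random

def keygen(p=1789, q=1861):
    """Toy Paillier key pair from two small primes (insecure, demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under the public key n."""
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, priv, c):
    """Recover the plaintext from ciphertext c with the private key."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)

# Hypothetical per-class costs, e.g. round(-1000 * log-probability), that each
# party computed from its own features for one record.
party_a_cost = {"yes": 357, "no": 1204}
party_b_cost = {"yes": 511, "no": 916}

n, priv = keygen()
totals = {}
for label in party_a_cost:
    combined = add_encrypted(n,
                             encrypt(n, party_a_cost[label]),
                             encrypt(n, party_b_cost[label]))
    totals[label] = decrypt(n, priv, combined)  # key holder sees only the sum

prediction = min(totals, key=totals.get)  # lowest cost = highest probability
```

In this sketch the key holder learns only the summed cost per class, not either party's individual contribution; a real protocol would also need to address key management and what the final scores themselves reveal.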