Search CORE

765 research outputs found

Privacy Preserving ID3 over Horizontally, Vertically and Grid Partitioned Data

Author: Kuijpers Bart
Lemmens Vanessa
Moelans Bart
Tuyls Karl
Publication date: 11/03/2008

We consider privacy preserving decision tree induction via ID3 in the case where the training data is horizontally or vertically distributed. Furthermore, we consider the same problem in the case where the data is both horizontally and vertically distributed, a situation we refer to as grid partitioned data. We give an algorithm for privacy preserving ID3 over horizontally partitioned data involving more than two parties. For grid partitioned data, we discuss two different evaluation methods for privacy preserving ID3, namely, first merging horizontally and developing vertically, or first merging vertically and next developing horizontally. Next to introducing privacy preserving data mining over grid partitioned data, the main contribution of this paper is that we show, by means of a complexity analysis, that the former evaluation method is the more efficient. Comment: 25 pages

arXiv.org e-Print Archive

University of Liverpool Repository
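
The three data layouts named in the abstract above can be illustrated with a small sketch. This is only an illustration of the partitioning terminology, not of the paper's protocol: the table, the column names, the shared `id` key, and the helper functions are all hypothetical.

```python
def split_horizontal(rows, cut):
    """Horizontal partitioning: each party holds different rows, all columns."""
    return rows[:cut], rows[cut:]


def split_vertical(rows, columns, key="id"):
    """Vertical partitioning: each party holds different columns of the same
    rows; the (assumed) shared key column lets the pieces be aligned."""
    left = [{k: r[k] for k in r if k == key or k in columns} for r in rows]
    right = [{k: r[k] for k in r if k == key or k not in columns} for r in rows]
    return left, right


def split_grid(rows, cut, columns, key="id"):
    """Grid partitioning: split by rows first, then split each piece by columns."""
    top, bottom = split_horizontal(rows, cut)
    return [split_vertical(top, columns, key), split_vertical(bottom, columns, key)]


# Hypothetical toy table.
table = [
    {"id": 1, "age": 34, "income": 52, "label": "yes"},
    {"id": 2, "age": 51, "income": 38, "label": "no"},
    {"id": 3, "age": 27, "income": 61, "label": "yes"},
    {"id": 4, "age": 45, "income": 47, "label": "no"},
]

cells = split_grid(table, cut=2, columns={"age"})
for row_block in cells:
    for party_piece in row_block:
        print(party_piece)
```

With these assumptions, four parties each hold one cell of the grid, and no single party sees both all rows and all columns.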

Privacy-Preserving Decision Tree Classification over Horizontally Partitioned Data

Author: Chang LiWu
Matwin Stan
Zhan Justin
Publication venue: AIS Electronic Library (AISeL)
Publication date: 05/12/2005
Field of study

Protection of privacy is one of the important problems in data mining. The unwillingness of parties to share their data frequently results in the failure of collaborative data mining. This paper studies how to build a decision tree classifier under the following scenario: a database is horizontally partitioned into multiple pieces, with each piece owned by a particular party. All the parties want to build a decision tree classifier based on such a database, but due to privacy constraints, none of them wants to disclose its private piece. We build a privacy-preserving system, including a set of secure protocols, that allows the parties to construct such a classifier. We guarantee that the private data are securely protected.

AIS Electronic Library (AISeL)
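
As a rough illustration of why horizontal partitioning suits decision tree induction, the sketch below computes the information gain of one attribute from per-party counts only. It is not the secure protocol from the abstract above: the counts are added in the clear here, whereas a privacy-preserving system would replace that addition with a secure sum. The record layout and function names are assumptions made for the example.

```python
import math
from collections import Counter


def local_counts(rows, attr, label):
    """Count (attribute value, class label) pairs in one party's own rows."""
    return Counter((r[attr], r[label]) for r in rows)


def entropy(class_counts):
    """Shannon entropy (base 2) of a class-count table."""
    total = sum(class_counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in class_counts.values() if c > 0)


def information_gain(joint):
    """Information gain of an attribute from joint (value, class) counts."""
    class_totals = Counter()
    by_value = {}
    for (value, cls), n in joint.items():
        class_totals[cls] += n
        by_value.setdefault(value, Counter())[cls] += n
    total = sum(class_totals.values())
    remainder = sum(sum(cc.values()) / total * entropy(cc)
                    for cc in by_value.values())
    return entropy(class_totals) - remainder


# Hypothetical rows held by two parties; same columns, different records.
party_a = [{"outlook": "sunny", "play": "no"}, {"outlook": "rain", "play": "yes"}]
party_b = [{"outlook": "sunny", "play": "no"}, {"outlook": "rain", "play": "yes"}]

# Only count tables are combined; raw rows never leave their owner.
# In a real protocol this addition would be performed securely.
joint = local_counts(party_a, "outlook", "play") + local_counts(party_b, "outlook", "play")
gain = information_gain(joint)
print(gain)
```

Because the gain depends only on summed counts, it matches what would be computed on the pooled data, which is the property that lets the parties avoid exchanging records.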

Efficient Privacy Preserving Distributed Clustering Based on Secret Sharing

Author: Savaş Erkay
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2007

In this paper, we propose a privacy preserving distributed clustering protocol for horizontally partitioned data based on a very efficient homomorphic additive secret sharing scheme. The model we use for the protocol is novel in the sense that it utilizes two non-colluding third parties. We provide a brief security analysis of our protocol from an information-theoretic point of view, which is a stronger security model. We present a communication and computation complexity analysis of our protocol along with that of another protocol previously proposed for the same problem. We also include experimental results for the computation and communication overhead of these two protocols. Our protocol not only outperforms the others in execution time and communication overhead on data holders, but also uses a more efficient model for many data mining applications.

Sabanci University Research Database
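
The idea of additive secret sharing with two non-colluding third parties, mentioned in the abstract above, can be sketched as follows. This is a minimal, generic sketch and not the paper's clustering protocol: the modulus, the function names, and the choice of a plain sum as the aggregated quantity are assumptions made for illustration.

```python
import secrets

# Assumed public modulus; it must exceed any true sum being computed.
MODULUS = 2**61 - 1


def share(value, modulus=MODULUS):
    """Split a private value into two additive shares modulo `modulus`.
    Each share on its own is uniformly random and reveals nothing."""
    r = secrets.randbelow(modulus)
    return r, (value - r) % modulus


def aggregate(shares, modulus=MODULUS):
    """What one third party does: add up the shares it received."""
    return sum(shares) % modulus


def secure_sum(values, modulus=MODULUS):
    """Each data holder sends one share to each third party; combining the
    two partial totals reveals the overall sum and nothing more."""
    pairs = [share(v, modulus) for v in values]
    total_a = aggregate([a for a, _ in pairs], modulus)  # third party A
    total_b = aggregate([b for _, b in pairs], modulus)  # third party B
    return (total_a + total_b) % modulus


# Hypothetical private inputs of three data holders.
private_values = [12, 30, 7]
result = secure_sum(private_values)
print(result)
```

The scheme is additively homomorphic in the sense that adding shares yields shares of the sum, so each third party can work on its shares without ever seeing an individual input, provided the two parties do not collude.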

An Enhanced CART Algorithm for Preserving Privacy of Distributed Data and Provide Access Control over Tree Data

Author: Monika Gupta
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/12/2015

The use of distributed applications is increasing rapidly because these applications serve more than one client at a time. In distributed databases, data distribution and management is a key area of interest. Because private data may leak, organizations are unwilling to participate in data mining. It is therefore necessary to collect data from different parties in a secure way. This paper presents how the CART algorithm can be used by multiple parties in a vertically partitioned environment. To address the privacy and security issues, the proposed model incorporates server-side random key generation and key distribution. Finally, the performance of the proposed classification technique is evaluated in terms of memory consumption, training time, search time, accuracy, and error rate.

International Journal on Recent and Innovation Trends in Computing and Communication

Exploring Machine Learning Models for Federated Learning: A Review of Approaches, Performance, and Limitations

Author: Jafarigol Elaheh
Razzaghi Talayeh
Trafalis Theodore
Zamankhani Mona
Publication date: 17/11/2023

In the growing world of artificial intelligence, federated learning is a distributed learning framework enhanced to preserve the privacy of individuals' data. Federated learning lays the groundwork for collaborative research in areas where the data is sensitive. Federated learning has several implications for real-world problems. In times of crisis, when real-time decision-making is critical, federated learning allows multiple entities to work collectively without sharing sensitive data. This distributed approach enables us to leverage information from multiple sources and gain more diverse insights. This paper is a systematic review of the literature on privacy-preserving machine learning in the last few years based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Specifically, we have presented an extensive review of supervised/unsupervised machine learning algorithms, ensemble methods, meta-heuristic approaches, blockchain technology, and reinforcement learning used in the framework of federated learning, in addition to an overview of federated learning applications. This paper reviews the literature on the components of federated learning and its applications in the last few years. The main purpose of this work is to provide researchers and practitioners with a comprehensive overview of federated learning from the machine learning point of view. A discussion of some open problems and future research directions in federated learning is also provided.

arXiv.org e-Print Archive

Privacy-Preserving Federated Learning over Vertically and Horizontally Partitioned Data for Financial Anomaly Detection

Author: Baracaldo Nathalie
Drucker Nir
Holohan Naoise
Houck Keith
Kadhe Swanand Ravindra
Kawahara Ryo
King Alan
Kushnir Eyal
Ludwig Heiko
Purcell Mark
Rawat Ambrish
Shaul Hayim
Soceanu Omri
Takeuchi Mikio
Zhou Yi
Publication date: 30/10/2023

The effective detection of evidence of financial anomalies requires collaboration among multiple entities who own a diverse set of data, such as a payment network system (PNS) and its partner banks. Trust among these financial institutions is limited by regulation and competition. Federated learning (FL) enables entities to collaboratively train a model when data is either vertically or horizontally partitioned across the entities. However, in real-world financial anomaly detection scenarios, the data is partitioned both vertically and horizontally and hence it is not possible to use existing FL approaches in a plug-and-play manner. Our novel solution, PV4FAD, combines fully homomorphic encryption (HE), secure multi-party computation (SMPC), differential privacy (DP), and randomization techniques to balance privacy and accuracy during training and to prevent inference threats at model deployment time. Our solution provides input privacy through HE and SMPC, and output privacy against inference time attacks through DP. Specifically, we show that, in the honest-but-curious threat model, banks do not learn any sensitive features about PNS transactions, and the PNS does not learn any information about the banks' dataset but only learns prediction labels. We also develop and analyze a DP mechanism to protect output privacy during inference. Our solution generates high-utility models by significantly reducing the per-bank noise level while satisfying distributed DP. To ensure high accuracy, our approach produces an ensemble model, in particular, a random forest. This enables us to take advantage of the well-known properties of ensembles to reduce variance and increase accuracy. Our solution won second prize in the first phase of the U.S. Privacy Enhancing Technologies (PETs) Prize Challenge. Comment: Prize Winner in the U.S. Privacy Enhancing Technologies (PETs) Prize Challenge

arXiv.org e-Print Archive