467 research outputs found

    Programming support for an integrated multi-party computation and MapReduce infrastructure

    We describe and present a prototype of a distributed computational infrastructure and an associated high-level programming language that allow multiple parties to leverage their own computational resources capable of supporting MapReduce [1] operations in combination with multi-party computation (MPC). Our architecture allows a programmer to author and compile a protocol using a uniform collection of standard constructs, even when that protocol involves computations that take place locally within each participant’s MapReduce cluster as well as across all the participants using an MPC protocol. The high-level programming language provided to the user is accompanied by static analysis algorithms that allow the programmer to reason about the efficiency of the protocol before compiling and running it. We present two example applications demonstrating how such an infrastructure can be employed. This work was supported in part by NSF Grants #1430145, #1414119, #1347522, and #1012798.
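
    The pattern this abstract describes — aggregate locally with MapReduce so that only small intermediate values enter the expensive MPC phase — can be sketched in a few lines. The Python below is not the paper's language or compiler; the additive secret sharing stands in for a real MPC protocol, and the function names, modulus, and inputs are illustrative assumptions.

        # Toy sketch: local map/reduce per party, then a secure sum over the
        # small local aggregates via additive secret sharing (stand-in for MPC).
        import random
        from functools import reduce

        PRIME = 2**61 - 1  # illustrative modulus for additive sharing

        def local_mapreduce(records, map_fn, reduce_fn):
            """Aggregate one party's private records entirely on its own cluster."""
            return reduce(reduce_fn, map(map_fn, records))

        def share(value, n_parties):
            """Split an integer into n additive shares mod PRIME."""
            shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
            shares.append((value - sum(shares)) % PRIME)
            return shares

        def secure_sum(local_values):
            """Sum the parties' local aggregates without revealing any single one."""
            n = len(local_values)
            all_shares = [share(v, n) for v in local_values]
            # Party i sums the i-th share it receives from every party.
            partials = [sum(s[i] for s in all_shares) % PRIME for i in range(n)]
            return sum(partials) % PRIME

        if __name__ == "__main__":
            party_data = [[3, 5, 2], [10, 1], [4, 4, 4]]            # private inputs
            local_totals = [local_mapreduce(d, lambda x: x, lambda a, b: a + b)
                            for d in party_data]                    # local MapReduce phase
            print(secure_sum(local_totals))                         # MPC phase -> 33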

    Privacy-Preserving Secret Shared Computations using MapReduce

    Data outsourcing allows data owners to keep their data at untrusted clouds that do not ensure the privacy of data and/or computations. One useful framework for fault-tolerant data processing in a distributed fashion is MapReduce, which was developed for trusted private clouds. This paper presents algorithms for data outsourcing based on Shamir's secret-sharing scheme and for executing privacy-preserving SQL queries such as count, selection (including range selection), projection, and join, while using MapReduce as the underlying programming model. Our proposed algorithms prevent an adversary from learning the database or the query, while also preventing output-size and access-pattern attacks. Interestingly, our algorithms do not involve the database owner in answering any query; the owner only creates and distributes secret-shares once, and hence also cannot learn the query. Logically and experimentally, we evaluate the efficiency of the algorithms on the following parameters: (i) the number of communication rounds (between a user and a server), (ii) the total amount of bit flow (between a user and a server), and (iii) the computational load at the user and the server. Comment: IEEE Transactions on Dependable and Secure Computing, Accepted 01 Aug. 201
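
    The additive property of Shamir's scheme that such count algorithms rely on can be sketched briefly: each server sums its own shares of per-record 0/1 match indicators without any interaction (a pure map/reduce step), and the user reconstructs only the count by Lagrange interpolation. In the sketch below the indicators are precomputed in the clear purely to keep the example short (the paper derives them obliviously on the server side), and the modulus, threshold, and server count are illustrative assumptions.

        # Toy sketch of a privacy-preserving count over Shamir secret-shares.
        import random

        PRIME = 2**31 - 1          # illustrative prime field modulus
        THRESHOLD = 2              # degree-1 polynomials: any 2 servers reconstruct
        N_SERVERS = 3

        def make_shares(secret):
            """Shamir-share `secret` with a random degree-(THRESHOLD-1) polynomial."""
            coeffs = [secret] + [random.randrange(PRIME) for _ in range(THRESHOLD - 1)]
            return [(x, sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME)
                    for x in range(1, N_SERVERS + 1)]

        def reconstruct(points):
            """Lagrange interpolation at x = 0 to recover the shared value."""
            total = 0
            for i, (xi, yi) in enumerate(points):
                num, den = 1, 1
                for j, (xj, _) in enumerate(points):
                    if i != j:
                        num = num * (-xj) % PRIME
                        den = den * (xi - xj) % PRIME
                total = (total + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
            return total

        if __name__ == "__main__":
            indicators = [1, 0, 1, 1, 0]               # 1 where a record matches the query
            shared = [make_shares(b) for b in indicators]
            # Each server sums its own shares locally; no server sees the indicators.
            server_sums = [(s + 1, sum(rec[s][1] for rec in shared) % PRIME)
                           for s in range(N_SERVERS)]
            print(reconstruct(server_sums[:THRESHOLD]))  # count of matches -> 3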

    DEMO: integrating MPC in big data workflows

    Secure multi-party computation (MPC) allows multiple parties to perform a joint computation without disclosing their private inputs. Many real-world joint computation use cases, however, involve data analyses on very large data sets, and are implemented by software engineers who lack MPC knowledge. Moreover, the collaborating parties -- e.g., several companies -- often deploy different data analytics stacks internally. These restrictions hamper the real-world usability of MPC. To address these challenges, we combine existing MPC frameworks with data-parallel analytics frameworks by extending the Musketeer big data workflow manager [4]. Musketeer automatically generates code both for the sensitive parts of a workflow, which are executed in MPC, and for the remainder of the computation, which runs on scalable, widely deployed analytics systems. In a prototype use case, we compute the Herfindahl-Hirschman Index (HHI), an index of market concentration used in antitrust regulation, on an aggregate 156 GB of taxi trip data from five transportation companies. Our implementation computes the HHI in about 20 minutes using a combination of Hadoop and VIFF [1], while even "mixed mode" MPC with VIFF alone would have taken many hours. Finally, we discuss future research questions that we seek to address using our approach.
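
    The quantity being computed here is simple: the HHI is the sum of squared market shares, expressed in percent. In the demo, each per-company total comes out of a local Hadoop job and only those few totals enter the MPC step. The plaintext sketch below shows just the arithmetic, with made-up trip counts rather than the actual 156 GB data set.

        # Plaintext HHI: sum over firms of (market share in percent) squared.
        def hhi(company_totals):
            """Return the Herfindahl-Hirschman Index for per-company volume totals."""
            market = sum(company_totals)
            return sum((t / market * 100) ** 2 for t in company_totals)

        if __name__ == "__main__":
            # Hypothetical per-company trip counts (in the demo these would come
            # from each company's own Hadoop job, then be combined under MPC).
            taxi_trip_counts = [1_200_000, 900_000, 650_000, 400_000, 150_000]
            print(round(hhi(taxi_trip_counts)))   # -> 2622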

    Privacy-Preserving Data Clustering in Cloud Computing Based on Fully Homomorphic Encryption

    Cloud infrastructure, with its massive storage and computing power, is an ideal platform for performing large-scale data analysis tasks to extract knowledge and support decision-making. However, there are critical data privacy and security issues associated with this platform, as the data is stored in a public infrastructure. Recently, fully homomorphic data encryption has been proposed as a solution due to its capability of performing computations over encrypted data. However, it is demonstrably slow for practical data mining applications. To address this and related concerns, we introduce a fully homomorphic and distributed data processing framework that utilizes MapReduce to perform distributed computations for data clustering tasks on a large number of cloud Virtual Machines (VMs). We illustrate how a variety of fully homomorphic computations can be carried out to accomplish data clustering tasks independently in the cloud, and show that the distributed execution of data clustering tasks based on MapReduce can significantly reduce the execution-time overhead caused by fully homomorphic computations. To evaluate our framework, we performed experiments using electricity consumption measurement data on the Google Cloud Platform with 100 VMs. We found the proposed distributed data processing framework to be highly efficient compared to a centralized approach, and as accurate as a plaintext implementation.
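
    The MapReduce shape of the clustering step that such a framework distributes can be sketched in the clear: a map phase assigns each point to its nearest centroid, and a reduce phase averages the per-cluster sums. In the paper's setting every arithmetic operation on the data would instead be a homomorphic operation over ciphertexts held by the cloud VMs; the plaintext sketch below, with invented points and centroids, only shows the dataflow.

        # One k-means iteration in MapReduce form (plaintext stand-in for the
        # homomorphic computations the framework would run on encrypted data).
        from collections import defaultdict

        def map_assign(points, centroids):
            """Map phase: emit (cluster_id, (point, 1)) for each point's nearest centroid."""
            for p in points:
                dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
                yield dists.index(min(dists)), (p, 1)

        def reduce_update(pairs, dim):
            """Reduce phase: sum points and counts per cluster, then average."""
            sums, counts = defaultdict(lambda: [0.0] * dim), defaultdict(int)
            for cid, (p, one) in pairs:
                counts[cid] += one
                for i, x in enumerate(p):
                    sums[cid][i] += x
            return {cid: [s / counts[cid] for s in sums[cid]] for cid in sums}

        if __name__ == "__main__":
            pts = [(1.0, 1.0), (1.2, 0.8), (8.0, 9.0), (7.5, 8.5)]   # invented data
            cents = [(0.0, 0.0), (10.0, 10.0)]                       # initial centroids
            print(reduce_update(map_assign(pts, cents), dim=2))
            # -> {0: [1.1, 0.9], 1: [7.75, 8.75]}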