23,186 research outputs found

    Scather: programming with multi-party computation and MapReduce

    Full text link
    We present a prototype of a distributed computational infrastructure, an associated high level programming language, and an underlying formal framework that allow multiple parties to leverage their own cloud-based computational resources (capable of supporting MapReduce [27] operations) in concert with multi-party computation (MPC) to execute statistical analysis algorithms that have privacy-preserving properties. Our architecture allows a data analyst unfamiliar with MPC to: (1) author an analysis algorithm that is agnostic with regard to data privacy policies, (2) to use an automated process to derive algorithm implementation variants that have different privacy and performance properties, and (3) to compile those implementation variants so that they can be deployed on an infrastructures that allows computations to take place locally within each participant’s MapReduce cluster as well as across all the participants’ clusters using an MPC protocol. We describe implementation details of the architecture, discuss and demonstrate how the formal framework enables the exploration of tradeoffs between the efficiency and privacy properties of an analysis algorithm, and present two example applications that illustrate how such an infrastructure can be utilized in practice.This work was supported in part by NSF Grants: #1430145, #1414119, #1347522, and #1012798

    Programming support for an integrated multi-party computation and MapReduce infrastructure

    Full text link
    We describe and present a prototype of a distributed computational infrastructure and associated high-level programming language that allow multiple parties to leverage their own computational resources capable of supporting MapReduce [1] operations in combination with multi-party computation (MPC). Our architecture allows a programmer to author and compile a protocol using a uniform collection of standard constructs, even when that protocol involves computations that take place locally within each participant’s MapReduce cluster as well as across all the participants using an MPC protocol. The highlevel programming language provided to the user is accompanied by static analysis algorithms that allow the programmer to reason about the efficiency of the protocol before compiling and running it. We present two example applications demonstrating how such an infrastructure can be employed.This work was supported in part by NSF Grants: #1430145, #1414119, #1347522, and #1012798

    Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

    Full text link
    Big data systems development is full of challenges in view of the variety of application areas and domains that this technology promises to serve. Typically, fundamental design decisions involved in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies for optimized solution to a specific real world problem, big data system are not an exception to any such rule. As far as the storage aspect of any big data system is concerned, the primary facet in this regard is a storage infrastructure and NoSQL seems to be the right technology that fulfills its requirements. However, every big data application has variable data characteristics and thus, the corresponding data fits into a different data model. This paper presents feature and use case analysis and comparison of the four main data models namely document oriented, key value, graph and wide column. Moreover, a feature analysis of 80 NoSQL solutions has been provided, elaborating on the criteria and points that a developer must consider while making a possible choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings forth second facet of big data storage, big data file formats, into picture. The second half of the research paper compares the advantages, shortcomings and possible use cases of available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage and its challenges and future prospects have also been discussed

    State of The Art and Hot Aspects in Cloud Data Storage Security

    Get PDF
    Along with the evolution of cloud computing and cloud storage towards matu- rity, researchers have analyzed an increasing range of cloud computing security aspects, data security being an important topic in this area. In this paper, we examine the state of the art in cloud storage security through an overview of selected peer reviewed publications. We address the question of defining cloud storage security and its different aspects, as well as enumerate the main vec- tors of attack on cloud storage. The reviewed papers present techniques for key management and controlled disclosure of encrypted data in cloud storage, while novel ideas regarding secure operations on encrypted data and methods for pro- tection of data in fully virtualized environments provide a glimpse of the toolbox available for securing cloud storage. Finally, new challenges such as emergent government regulation call for solutions to problems that did not receive enough attention in earlier stages of cloud computing, such as for example geographical location of data. The methods presented in the papers selected for this review represent only a small fraction of the wide research effort within cloud storage security. Nevertheless, they serve as an indication of the diversity of problems that are being addressed
    • …
    corecore