
    PUBA: Privacy-Preserving User-Data Bookkeeping and Analytics

    In this paper we propose Privacy-preserving User-data Bookkeeping & Analytics (PUBA), a building block intended to enable the implementation of business models (e.g., targeted advertising) and regulations (e.g., fraud detection) that require user-data analysis in a privacy-preserving way. In PUBA, users keep an unlinkable but authenticated cryptographic logbook containing their historic data on their device. This logbook can only be updated by the operator, while its content is not revealed. Users can take part in a privacy-preserving analytics computation, where it is ensured that their logbook is up-to-date and authentic while the potentially secret analytics function is verified to be privacy-friendly. Taking constrained devices into account, users may also outsource analytics computations (to a potentially malicious proxy not colluding with the operator). We model our novel building block in the Universal Composability framework and provide a practical protocol instantiation. To demonstrate the flexibility of PUBA, we sketch instantiations of privacy-preserving fraud detection and targeted advertising, although it could be used in many more scenarios, e.g., data analytics for multi-modal transportation systems. We implemented our bookkeeping protocols and an exemplary outsourced analytics computation based on logistic regression using the MP-SPDZ MPC framework. Performance evaluations using a smartphone as the user device and more powerful hardware for the operator and proxy suggest that PUBA can indeed be practical for smaller logbooks.
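
    The abstract above mentions an exemplary outsourced analytics computation based on logistic regression. As a point of reference only, the following is a minimal plaintext sketch of such an analytics function evaluated over aggregated logbook features; it is not the paper's MP-SPDZ implementation, and the features, weights, and bias are hypothetical.

```python
# Illustrative logistic-regression scoring over a user's aggregated logbook
# features (plain Python, NOT the paper's MP-SPDZ protocol; values are
# hypothetical).
import math

def logistic_score(features, weights, bias):
    """Return P(label = 1) for one user's aggregated logbook features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

features = [0.4, 1.2, 0.0]    # hypothetical aggregates from the logbook
weights = [0.8, -0.3, 1.5]    # hypothetical model parameters
print(round(logistic_score(features, weights, bias=-0.2), 3))
```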

    LDP-IDS: Local Differential Privacy for Infinite Data Streams

    Streaming data collection is essential to real-time data analytics in various IoT and mobile-device-based systems, but it may expose end users' privacy. Local differential privacy (LDP) is a promising solution for privacy-preserving data collection and analysis. However, the few existing LDP studies on streams either apply only to finite streams or suffer from insufficient protection. This paper investigates this problem by proposing LDP-IDS, a novel w-event LDP paradigm that provides a practical privacy guarantee for infinite streams at the users' end, adapting the popular budget division framework from centralized differential privacy (CDP). By constructing a unified error analysis for LDP, we first develop two adaptive budget division-based LDP methods for LDP-IDS that can enhance data utility by leveraging the non-deterministic sparsity in streams. Beyond that, we further propose a novel population division framework that not only avoids the high sensitivity of LDP noise to budget division but also requires significantly less communication. Based on this framework, we also present two adaptive population division methods for LDP-IDS with theoretical analysis. We conduct extensive experiments on synthetic and real-world datasets to evaluate the effectiveness and efficiency of our proposed frameworks and methods. Experimental results demonstrate that, despite the effectiveness of the adaptive budget division methods, the proposed population division framework and methods can further achieve much higher effectiveness and efficiency. Comment: accepted to SIGMOD'2
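
    For readers unfamiliar with w-event LDP, the sketch below shows the textbook uniform budget-division baseline that adaptive methods like the paper's improve upon: the per-window budget is split evenly across the w most recent timestamps, and each binary report is perturbed with randomized response. The stream values and parameters are illustrative; this is not the paper's adaptive budget- or population-division algorithm.

```python
# Uniform budget division under w-event LDP: every timestamp spends eps / w,
# so any window of w consecutive reports consumes at most eps in total.
# A generic baseline sketch, not LDP-IDS itself.
import math
import random

def randomized_response(bit, eps):
    """Report a bit under eps-LDP binary randomized response."""
    p_truth = math.exp(eps) / (math.exp(eps) + 1.0)
    return bit if random.random() < p_truth else 1 - bit

def report_stream(bits, eps, w):
    eps_t = eps / w                      # uniform per-timestamp budget
    return [randomized_response(b, eps_t) for b in bits]

print(report_stream([1, 0, 1, 1, 0, 0], eps=2.0, w=3))
```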

    PriPeARL: A Framework for Privacy-Preserving Analytics and Reporting at LinkedIn

    Preserving privacy of users is a key requirement of web-scale analytics and reporting applications, and has witnessed a renewed focus in light of recent data breaches and new regulations such as GDPR. We focus on the problem of computing robust, reliable analytics in a privacy-preserving manner, while satisfying product requirements. We present PriPeARL, a framework for privacy-preserving analytics and reporting, inspired by differential privacy. We describe the overall design and architecture, and the key modeling components, focusing on the unique challenges associated with privacy, coverage, utility, and consistency. We perform an experimental study in the context of ads analytics and reporting at LinkedIn, thereby demonstrating the tradeoffs between privacy and utility needs, and the applicability of privacy-preserving mechanisms to real-world data. We also highlight the lessons learned from the production deployment of our system at LinkedIn. Comment: Conference information: ACM International Conference on Information and Knowledge Management (CIKM 2018)
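
    As background, the core differential-privacy primitive behind such analytics reporting is the Laplace mechanism: perturb a raw count with noise scaled to sensitivity/epsilon, then post-process (round, clamp) so the reported value stays a non-negative integer. The sketch below shows only this generic building block with illustrative parameters, not PriPeARL's actual pipeline or configuration.

```python
# Generic Laplace mechanism for a count query; not PriPeARL's implementation.
import random

def noisy_count(true_count, epsilon, sensitivity=1.0):
    """Add Laplace(0, sensitivity/epsilon) noise, then round and clamp at 0.
    A Laplace sample is the difference of two i.i.d. exponential samples."""
    b = sensitivity / epsilon
    noise = random.expovariate(1.0 / b) - random.expovariate(1.0 / b)
    return max(0, round(true_count + noise))

print(noisy_count(1042, epsilon=0.5))   # e.g. an impression count, illustrative
```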

    Supporting Regularized Logistic Regression Privately and Efficiently

    As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, the social sciences, information technology, and so on. These domains often involve data of human subjects that are contingent upon strict privacy regulations. Increasing concerns over data privacy make it more and more difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work focuses on safeguarding regularized logistic regression, a widely used machine learning model in various disciplines that has nevertheless not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as research consortia or networks as widely seen in genetics, epidemiology, the social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluation on several studies validated the privacy guarantees, efficiency, and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc.
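
    For concreteness, the objective being protected is standard L2-regularized logistic regression. A plaintext reference for its loss and gradient on one site's data is sketched below; the paper's contribution is computing such quantities collaboratively under cryptographic protection, which this sketch does not attempt.

```python
# Plaintext L2-regularized logistic regression loss and gradient (the model
# the collaborating institutions would jointly fit); illustrative data only.
import numpy as np

def loss_and_grad(w, X, y, lam):
    """y in {0, 1}; lam is the L2 regularization strength."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    n = len(y)
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)) + 0.5 * lam * (w @ w)
    grad = X.T @ (p - y) / n + lam * w
    return loss, grad

X = np.array([[0.5, 1.2], [1.0, -0.3], [0.2, 0.8]])
y = np.array([1.0, 0.0, 1.0])
print(loss_and_grad(np.zeros(2), X, y, lam=0.1))
```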

    Scather: programming with multi-party computation and MapReduce

    We present a prototype of a distributed computational infrastructure, an associated high-level programming language, and an underlying formal framework that allow multiple parties to leverage their own cloud-based computational resources (capable of supporting MapReduce [27] operations) in concert with multi-party computation (MPC) to execute statistical analysis algorithms that have privacy-preserving properties. Our architecture allows a data analyst unfamiliar with MPC to: (1) author an analysis algorithm that is agnostic with regard to data privacy policies, (2) use an automated process to derive algorithm implementation variants that have different privacy and performance properties, and (3) compile those implementation variants so that they can be deployed on an infrastructure that allows computations to take place locally within each participant’s MapReduce cluster as well as across all the participants’ clusters using an MPC protocol. We describe implementation details of the architecture, discuss and demonstrate how the formal framework enables the exploration of tradeoffs between the efficiency and privacy properties of an analysis algorithm, and present two example applications that illustrate how such an infrastructure can be utilized in practice. This work was supported in part by NSF Grants #1430145, #1414119, #1347522, and #1012798.
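
    The toy sketch below illustrates the general local-then-cross-party pattern the abstract describes: each party first aggregates its own records locally (the MapReduce-style step), and only the per-party totals enter a cross-party computation, shown here as plain additive secret sharing of a sum. It is not Scather's language or runtime; the modulus and the three-party setup are assumptions made for the example.

```python
# Local aggregation per party, then additive secret sharing of the totals so
# that only the global sum is revealed. Generic pattern, not Scather itself.
import random

P = 2**61 - 1  # prime modulus for the additive sharing

def local_aggregate(records):
    """Stand-in for the party-local MapReduce step: sum own values."""
    return sum(records) % P

def share(value, n_parties):
    """Split a value into n_parties additive shares mod P."""
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

party_totals = [local_aggregate(r) for r in ([3, 5], [10], [7, 7, 7])]
all_shares = [share(t, 3) for t in party_totals]          # one row per party
slot_sums = [sum(col) % P for col in zip(*all_shares)]    # combine per slot
print(sum(slot_sums) % P)                                 # -> 39, the global sum
```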

    Big Data Privacy Context: Literature Effects On Secure Informational Assets

    This article's objective is the identification of research opportunities in the current big data privacy domain, evaluating literature effects on secure informational assets. Until now, no study has analyzed such a relation. Its results can foster science, technologies, and businesses. To achieve these objectives, a big data privacy Systematic Literature Review (SLR) is performed on the main scientific peer-reviewed journals in the Scopus database. Bibliometrics and text mining analysis complement the SLR. This study provides support to big data privacy researchers on: the most and least researched themes, research novelty, the most cited works and authors, the evolution of themes through time, and many others. In addition, TOPSIS and VIKOR ranks were developed to evaluate literature effects versus informational assets indicators. Secure Internet Servers (SIS) was chosen as the decision criterion. Results show that big data privacy literature is strongly focused on computational aspects. However, individuals, societies, organizations, and governments face a technological change that has only just started to be investigated, with growing concerns on law and regulation aspects. The TOPSIS and VIKOR ranks differed in several positions, and the only country consistent between the literature and SIS adoption is the United States. Countries in the lowest ranking positions represent future research opportunities. Comment: 21 pages, 9 figures
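
    For reference, TOPSIS, one of the two ranking methods the study applies, scores alternatives by their closeness to an ideal solution. The sketch below implements the standard method for benefit-type criteria only; the decision matrix and weights are illustrative and are not the study's data.

```python
# Standard TOPSIS for benefit criteria: rows are alternatives (e.g. countries),
# columns are criteria. Illustrative inputs, not the study's indicators.
import numpy as np

def topsis(matrix, weights):
    m = np.asarray(matrix, dtype=float)
    norm = m / np.linalg.norm(m, axis=0)            # vector normalization
    v = norm * np.asarray(weights, dtype=float)     # weighted normalized matrix
    ideal, anti = v.max(axis=0), v.min(axis=0)
    d_best = np.linalg.norm(v - ideal, axis=1)
    d_worst = np.linalg.norm(v - anti, axis=1)
    return d_worst / (d_best + d_worst)             # closeness: higher is better

scores = topsis([[120, 0.8], [95, 0.9], [60, 0.4]], weights=[0.6, 0.4])
print(scores.argsort()[::-1])   # alternatives ranked best first
```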

    Secure multi-party computation for analytics deployed as a lightweight web application

    We describe the definition, design, implementation, and deployment of a secure multi-party computation protocol and web application. The protocol and application allow groups of cooperating parties with minimal expertise and no specialized resources to compute basic statistical analytics on their collective data sets without revealing the contributions of individual participants. The application was developed specifically to support a Boston Women’s Workforce Council (BWWC) study of wage disparities within employer organizations in the Greater Boston Area. The application has been deployed successfully to support two data collection sessions (in 2015 and in 2016) to obtain data pertaining to compensation levels across genders and demographics. Our experience provides insights into the particular security and usability requirements (and tradeoffs) that a successful “MPC-as-a-service” platform design and implementation must negotiate. We would like to acknowledge all the members of the Boston Women’s Workforce Council, and to thank in particular MaryRose Mazzola, Christina M. Knowles, and Katie A. Johnston, who led the efforts to organize participants and deploy the protocol as part of the 100% Talent: The Boston Women’s Compact [31], [32] data collections. We also thank the Boston University Initiative on Cities (IOC), and in particular Executive Director Katherine Lusk, who brought this potential application of secure multi-party computation to our attention. The BWWC, the IOC, and several sponsors contributed funding to complete this work. Support was also provided in part by Smart-city Cloud-based Open Platform and Ecosystem (SCOPE), an NSF Division of Industrial Innovation and Partnerships PFI:BIC project under award #1430145, and by Modular Approach to Cloud Security (MACS), an NSF CISE CNS SaTC Frontier project under award #1414119.
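
    One common pattern behind lightweight "MPC-as-a-service" sum aggregation of this kind is additive masking: each participant masks its value with a random pad, sends the masked value to one server and the pad to another, and only the two partial sums are ever combined. The sketch below shows that generic pattern with hypothetical salary figures; it is not necessarily the exact protocol deployed for the BWWC data collections.

```python
# Two-server additive masking for a private sum; generic sketch with
# hypothetical inputs, not necessarily the deployed BWWC protocol.
import random

P = 2**61 - 1  # modulus for the masking arithmetic

def submit(value):
    """Participant-side step: mask the value before upload."""
    pad = random.randrange(P)
    return (value + pad) % P, pad   # (to aggregation server, to mask server)

salaries = [72_000, 58_500, 91_250]            # hypothetical contributions
masked, pads = zip(*(submit(s) for s in salaries))

aggregate = (sum(masked) - sum(pads)) % P      # neither server sees any input
print(aggregate)                               # -> 221750, the total only
```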