Search CORE

33 research outputs found

Recommended from our members

New Data Protection Abstractions for Emerging Mobile and Big Data Workloads

Author: Spahn Riley Burns
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2020
Field of study

Two recent shifts in computing are challenging the effectiveness of traditional approaches to data protection. Emerging machine learning workloads have complex access patterns and unique leakage characteristics that are not well supported by existing protection approaches. Second, mobile operating systems do not provide sufficient support for fine grained data protection tools forcing users to rely on individual applications to correctly manage and protect data. My thesis is that these emerging workloads have unique characteristics that we can leverage to build new, more effective data protection abstractions. This dissertation presents two new data protection systems for machine learning work-loads and a new system for fine grained data management and protection on mobile devices. First is Sage, a differentially private machine learning platform addressing the two primary challenges of differential privacy: running out of budget and the privacy utility tradeoff. The second system, Pyramid, is the first selective data system. Pyramid leverages count featurization to reduce the amount of data exposed while training classification models by two orders of magnitude. The final system, Pebbles, provides users with logical data objects as a new fine grained data management and protection primitive allowing data management at a higher level of abstraction. Pebbles, leverages high level storage abstractions in mobile operating systems to discover user recognizable application level data objects in unmodified mobile applications

Columbia University Academic Commons

Pyramid: Enhancing Selectivity in Big Data Protection with Count Featurization

Author: Geambasu Roxana
Huang Tzu-Kuo
Lecuyer Mathias
Sen Siddhartha
Spahn Riley
Publication venue
Publication date: 21/05/2017
Field of study

Protecting vast quantities of data poses a daunting challenge for the growing number of organizations that collect, stockpile, and monetize it. The ability to distinguish data that is actually needed from data collected "just in case" would help these organizations to limit the latter's exposure to attack. A natural approach might be to monitor data use and retain only the working-set of in-use data in accessible storage; unused data can be evicted to a highly protected store. However, many of today's big data applications rely on machine learning (ML) workloads that are periodically retrained by accessing, and thus exposing to attack, the entire data store. Training set minimization methods, such as count featurization, are often used to limit the data needed to train ML workloads to improve performance or scalability. We present Pyramid, a limited-exposure data management system that builds upon count featurization to enhance data protection. As such, Pyramid uniquely introduces both the idea and proof-of-concept for leveraging training set minimization methods to instill rigor and selectivity into big data management. We integrated Pyramid into Spark Velox, a framework for ML-based targeting and personalization. We evaluate it on three applications and show that Pyramid approaches state-of-the-art models while training on less than 1% of the raw data

arXiv.org e-Print Archive

Crossref

XRay: Enhancing the Web's Transparency with Differential Correlation

Author: Chaintreau Augustin
Ducoffe Guillaume
Geambasu Roxana
Lan Francis
Lecuyer Mathias
Papancea Andrei
Petsios Theofilos
Spahn Riley
Publication venue
Publication date: 20/08/2014
Field of study

Today's Web services - such as Google, Amazon, and Facebook - leverage user data for varied purposes, including personalizing recommendations, targeting advertisements, and adjusting prices. At present, users have little insight into how their data is being used. Hence, they cannot make informed choices about the services they choose. To increase transparency, we developed XRay, the first fine-grained, robust, and scalable personal data tracking system for the Web. XRay predicts which data in an arbitrary Web account (such as emails, searches, or viewed products) is being used to target which outputs (such as ads, recommended products, or prices). XRay's core functions are service agnostic and easy to instantiate for new services, and they can track data within and across services. To make predictions independent of the audited service, XRay relies on the following insight: by comparing outputs from different accounts with similar, but not identical, subsets of data, one can pinpoint targeting through correlation. We show both theoretically, and through experiments on Gmail, Amazon, and YouTube, that XRay achieves high precision and recall by correlating data from a surprisingly small number of extra accounts.Comment: Extended version of a paper presented at the 23rd USENIX Security Symposium (USENIX Security 14

arXiv.org e-Print Archive

CiteSeerX

HAL-UNICE

INRIA a CCSD electronic archive server

ACMiner: Extraction and Analysis of Authorization Checks in Android's Middleware

Author: Arzt Steven
Backes Michael
Backes Michael
Dalton Andrew
Enck William
Enck William
Felt Adrienne Porter
Grace Michael
Lam Patrick
McCoy Travis
Milian Mark
Ranger Steve
Shao Yuru
Spahn Riley
Tan Lin
Vallée-Rai Raja
Zhang Xiaolan
Publication venue
Publication date: 01/01/2019
Field of study

Billions of users rely on the security of the Android platform to protect phones, tablets, and many different types of consumer electronics. While Android's permission model is well studied, the enforcement of the protection policy has received relatively little attention. Much of this enforcement is spread across system services, taking the form of hard-coded checks within their implementations. In this paper, we propose Authorization Check Miner (ACMiner), a framework for evaluating the correctness of Android's access control enforcement through consistency analysis of authorization checks. ACMiner combines program and text analysis techniques to generate a rich set of authorization checks, mines the corresponding protection policy for each service entry point, and uses association rule mining at a service granularity to identify inconsistencies that may correspond to vulnerabilities. We used ACMiner to study the AOSP version of Android 7.1.1 to identify 28 vulnerabilities relating to missing authorization checks. In doing so, we demonstrate ACMiner's ability to help domain experts process thousands of authorization checks scattered across millions of lines of code

arXiv.org e-Print Archive

Crossref

Open Repository and Bibliography - Luxembourg