Processing Analytical Queries over Encrypted Data
MONOMI is a system for securely executing analytical workloads over sensitive data on an untrusted database server. MONOMI works by encrypting the entire database and running queries over the encrypted data. It introduces split client/server query execution, which can execute arbitrarily complex queries over encrypted data, as well as several techniques that improve performance for such workloads, including per-row precomputation, space-efficient encryption, grouped homomorphic addition, and pre-filtering. Since these optimizations benefit some queries but not others, MONOMI introduces a designer for choosing an efficient physical design at the server for a given workload, and a planner for choosing an efficient execution plan for a given query at runtime. A prototype of MONOMI running on top of Postgres can execute most of the queries from the TPC-H benchmark with a median overhead of only 1.24× (ranging from 1.03× to 2.33×) compared to an unencrypted Postgres database, where a compromised server would reveal all data.

Sponsors: National Science Foundation (U.S.) (Award IIS-1065219); Google (Firm)
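To make the "grouped homomorphic addition" idea concrete, the sketch below shows additively homomorphic aggregation using textbook Paillier encryption: the server multiplies ciphertexts to sum a column it cannot read. This illustrates the underlying primitive only, not MONOMI's implementation; the toy primes, function names, and key sizes are assumptions chosen for readability.

    # Minimal sketch of additively homomorphic summation (illustrative;
    # not MONOMI's code). Textbook Paillier: multiplying ciphertexts
    # adds the underlying plaintexts, so an untrusted server can sum an
    # encrypted column without decrypting any value.
    import math
    import secrets

    def keygen(p=10007, q=10009):
        # Toy primes for illustration; real deployments use >= 2048-bit moduli.
        n = p * q
        lam = math.lcm(p - 1, q - 1)
        g = n + 1                               # standard choice g = n + 1
        mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
        return (n, g), (lam, mu, n)

    def encrypt(pk, m):
        n, g = pk
        while True:
            r = secrets.randbelow(n - 1) + 1    # fresh randomness per ciphertext
            if math.gcd(r, n) == 1:
                break
        return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

    def decrypt(sk, c):
        lam, mu, n = sk
        return (pow(c, lam, n * n) - 1) // n * mu % n

    def add_encrypted(pk, cts):
        # Runs entirely at the server: a product of ciphertexts mod n^2
        # decrypts to the sum of the plaintexts.
        n, _ = pk
        total = 1
        for c in cts:
            total = total * c % (n * n)
        return total

    pk, sk = keygen()
    column = [120, 340, 75]                                  # client-side plaintexts
    encrypted_column = [encrypt(pk, v) for v in column]      # what the server stores
    assert decrypt(sk, add_encrypted(pk, encrypted_column)) == sum(column)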
Privacy and Data-Based Research
What can we, as users of microdata, formally guarantee to the individuals (or firms) in our dataset regarding their privacy? We retell a few stories, well known in data-privacy circles, of failed anonymization attempts in publicly released datasets. We then provide a mostly informal introduction to several ideas from the literature on differential privacy, an active area of computer science research that studies formal approaches to preserving the privacy of individuals in statistical databases. We apply some of its insights to situations routinely faced by applied economists, emphasizing big-data contexts.
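As one concrete instance of the formal approaches the abstract alludes to, the sketch below shows the Laplace mechanism, the canonical differentially private release of a count query. It is illustrative rather than taken from the paper; the dataset and parameter choices are hypothetical.

    # Minimal sketch of the Laplace mechanism (illustrative; not code from
    # the paper). A count query has sensitivity 1: adding or removing one
    # individual changes the true answer by at most 1, so Laplace noise with
    # scale 1/epsilon yields an epsilon-differentially-private release.
    import numpy as np

    def dp_count(records, predicate, epsilon):
        """Release a differentially private count of records matching predicate."""
        true_count = sum(1 for r in records if predicate(r))
        sensitivity = 1.0                    # max change from one individual
        noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
        return true_count + noise

    # Hypothetical microdata: incomes from a survey.
    incomes = [32_000, 54_000, 18_500, 91_000, 47_250]
    # Smaller epsilon = stronger privacy guarantee but noisier answer.
    print(dp_count(incomes, lambda x: x > 40_000, epsilon=0.5))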
Seeding with Differentially Private Network Information
When designing interventions in public health, development, and education, decision makers rely on social network data to target a small number of people, capitalizing on peer effects and social contagion to bring about the most welfare benefits to the population. Developing new methods that are privacy-preserving for network data collection and targeted interventions is critical for designing sustainable public health and development interventions on social networks. In a similar vein, social media platforms rely on network data and information from past diffusions to organize their ad campaigns and improve the efficacy of targeted advertising. Ensuring that these network operations do not violate users' privacy is critical to the sustainability of social media platforms and their ad economies. We study privacy guarantees for influence maximization algorithms when the social network is unknown and the inputs are samples of prior influence cascades that are collected at random. Building on recent results that address seeding with costly network information, our privacy-preserving algorithms introduce randomization in the collected data or the algorithm output, and can bound each node's (or group of nodes') privacy loss in deciding whether or not their data should be included in the algorithm input. We provide theoretical guarantees of the seeding performance with a limited sample size subject to differential privacy budgets in both central and local privacy regimes. Simulations on synthetic and empirical network datasets reveal the diminishing value of network information with decreasing privacy budget in both regimes.

Comment: Preliminary version in AAMAS 2023: https://dl.acm.org/doi/10.5555/3545946.3599081 -- Code and data: https://github.com/aminrahimian/dp-inf-ma
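A minimal sketch of the local-privacy regime described above, assuming a hypothetical cascade format: each node applies randomized response to its own participation bits before they enter the seeding algorithm. The greedy frequency heuristic is a simplified stand-in for the paper's influence-maximization procedure; the authors' actual code is at the linked repository.

    # Minimal sketch of the local-privacy idea (illustrative; the authors'
    # code is in the linked repo). Each node randomizes its own cascade-
    # participation bits via randomized response, so the seeding algorithm
    # only ever sees privatized inputs.
    import math
    import random

    def randomized_response(bit, epsilon):
        """Report the true bit w.p. e^eps / (e^eps + 1), else flip it.
        Satisfies epsilon-local differential privacy for a single bit."""
        p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1)
        return bit if random.random() < p_truth else 1 - bit

    def privatize_cascades(cascades, epsilon):
        # cascades[i][v] = 1 if node v participated in cascade i (hypothetical format).
        return [{v: randomized_response(b, epsilon) for v, b in c.items()}
                for c in cascades]

    def greedy_seeds(cascades, k):
        # Naive proxy for influence: pick the k nodes appearing in the most
        # (privatized) cascades. Real influence maximization uses coverage-
        # style greedy selection; this stand-in just shows the data flow.
        counts = {}
        for c in cascades:
            for v, b in c.items():
                counts[v] = counts.get(v, 0) + b
        return sorted(counts, key=counts.get, reverse=True)[:k]

    # Hypothetical observed cascades over nodes 0..4.
    cascades = [{0: 1, 1: 1, 2: 0, 3: 0, 4: 1},
                {0: 1, 1: 0, 2: 1, 3: 0, 4: 1},
                {0: 0, 1: 1, 2: 0, 3: 1, 4: 1}]
    # Smaller epsilon = more flipping = noisier seed choices (the "diminishing
    # value of network information" effect the simulations report).
    print(greedy_seeds(privatize_cascades(cascades, epsilon=1.0), k=2))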