1,183 research outputs found
Balancing Security, Performance and Deployability in Encrypted Search
Encryption is an important tool for protecting data, especially data stored in the cloud. However, standard encryption techniques prevent efficient search. Searchable encryption attempts to solve this issue, protecting the data while still providing search functionality. Retaining the ability to search comes at a cost of security, performance and/or utility.
An important practical aspect of utility is compatibility with legacy systems. Unfortunately, the efficient searchable encryption constructions that are compatible with these systems have been proven vulnerable to attack, even against weaker adversary models.
The goal of this work is to address this security problem inherent with efficient, legacy compatible constructions. First, we present attacks on previous constructions that are compatible with legacy systems, demonstrating their vulnerability. Then we present two new searchable encryption constructions. The first, weakly randomized encryption, provides superior security to prior easily deployable constructions, while providing similar ease of deployment and query performance nearly identical to unencrypted databases. The second construction, EDDiES, provides much stronger security at the expense of a slight regression on performance.
These constructions show that it is possible to achieve a better balance of security and performance with the utility constraints that come with deployment in legacy systems
Location Privacy in Spatial Crowdsourcing
Spatial crowdsourcing (SC) is a new platform that engages individuals in
collecting and analyzing environmental, social and other spatiotemporal
information. With SC, requesters outsource their spatiotemporal tasks to a set
of workers, who will perform the tasks by physically traveling to the tasks'
locations. This chapter identifies privacy threats toward both workers and
requesters during the two main phases of spatial crowdsourcing, tasking and
reporting. Tasking is the process of identifying which tasks should be assigned
to which workers. This process is handled by a spatial crowdsourcing server
(SC-server). The latter phase is reporting, in which workers travel to the
tasks' locations, complete the tasks and upload their reports to the SC-server.
The challenge is to enable effective and efficient tasking as well as reporting
in SC without disclosing the actual locations of workers (at least until they
agree to perform a task) and the tasks themselves (at least to workers who are
not assigned to those tasks). This chapter aims to provide an overview of the
state-of-the-art in protecting users' location privacy in spatial
crowdsourcing. We provide a comparative study of a diverse set of solutions in
terms of task publishing modes (push vs. pull), problem focuses (tasking and
reporting), threats (server, requester and worker), and underlying technical
approaches (from pseudonymity, cloaking, and perturbation to exchange-based and
encryption-based techniques). The strengths and drawbacks of the techniques are
highlighted, leading to a discussion of open problems and future work
Improved Reconstruction Attacks on Encrypted Data Using Range Query Leakage
We analyse the security of database encryption schemes supporting range queries against persistent adversaries. The bulk of our work applies to a generic setting, where the adversary's view is limited to the set of records matched by each query (known as access pattern leakage). We also consider a more specific setting where certain rank information is also leaked. The latter is inherent to multiple recent encryption schemes supporting range queries, including Kerschbaum's FH-OPE scheme (CCS 2015), Lewi and Wu's order-revealing encryption scheme (CCS 2016), and the recently proposed Arx scheme of Poddar et al. (IACR eprint 2016/568, 2016/591). We provide three attacks.
First, we consider full reconstruction, which aims to recover the value of every record, fully negating encryption. We show that for dense datasets, full reconstruction is possible within an expected number of queries NlogN+O(N)Nlogā”N+O(N), where NN is the number of distinct plaintext values. This directly improves on a O(N2logN)O(N2logā”N) bound in the same setting by Kellaris et al. (CCS 2016). We also provide very efficient, data-optimal algorithms that succeed with the minimum possible number of queries (in a strong, information theoretical sense), and prove a matching data lower bound for the number of queries required.
Second, we present an approximate reconstruction attack recovering all plaintext values in a dense dataset within a constant ratio of error (such as a 5% error), requiring the access pattern leakage of only O(N)O(N) queries. We also prove a matching lower bound.
Third, we devise an attack in the common setting where the adversary has access to an auxiliary distribution for the target dataset. This third attack proves highly effective on age data from real-world medical data sets. In our experiments, observing only 25 queries was sufficient to reconstruct a majority of records to within 5 years.
In combination, our attacks show that current approaches to enabling range queries offer little security when the threat model goes beyond snapshot attacks to include a persistent server-side adversary
The Tao of Inference in Privacy-Protected Databases
To protect database confidentiality even in the face of full compromise while supporting standard functionality, recent academic proposals and commercial products rely on a mix of encryption schemes. The
common recommendation is to apply strong, semantically secure encryption to the āsensitiveā columns and protect other columns with property-revealing encryption (PRE) that supports operations such as sorting.
We design, implement, and evaluate a new methodology for inferring data stored in such encrypted databases. The cornerstone is the multinomial attack, a new inference technique that is analytically optimal and empirically outperforms prior heuristic attacks against PRE-encrypted data. We also extend the multinomial attack to take advantage of correlations across multiple columns. These improvements
recover PRE-encrypted data with sufficient accuracy to then apply machine learning and record linkage methods to infer the values of columns protected by semantically secure encryption or redaction.
We evaluate our methodology on medical, census, and union-membership datasets, showing for the first time how to infer full database records. For PRE-encrypted attributes such as demographics and
ZIP codes, our attack outperforms the best prior heuristic by a factor of 16. Unlike any prior technique, we also infer attributes, such as incomes and medical diagnoses, protected by strong encryption. For
example, when we infer that a patient in a hospital-discharge dataset has a mental health or substance abuse condition, this prediction is 97% accurate
- ā¦