154 research outputs found
Exploring Privacy Preservation in Outsourced K-Nearest Neighbors with Multiple Data Owners
The k-nearest neighbors (k-NN) algorithm is a popular and effective
classification algorithm. Due to its large storage and computational
requirements, it is suitable for cloud outsourcing. However, k-NN is often run
on sensitive data such as medical records, user images, or personal
information. It is important to protect the privacy of data in an outsourced
k-NN system.
Prior works have all assumed the data owners (who submit data to the
outsourced k-NN system) are a single trusted party. However, we observe that in
many practical scenarios, there may be multiple mutually distrusting data
owners. In this work, we present the first framing and exploration of privacy
preservation in an outsourced k-NN system with multiple data owners. We
consider the various threat models introduced by this modification. We discover
that under a particularly practical threat model that covers numerous
scenarios, there exists a set of adaptive attacks that breach the data privacy
of any exact k-NN system. The vulnerability is a result of the mathematical
properties of k-NN and its output. Thus, we propose a privacy-preserving
alternative system supporting kernel density estimation using a Gaussian
kernel, a classification algorithm from the same family as k-NN. In many
applications, this similar algorithm serves as a good substitute for k-NN. We
additionally investigate solutions for other threat models, often through
extensions on prior single data owner systems
Big Data Analytics over Encrypted Datasets with Seabed
Today, enterprises collect large amounts of data and leverage the cloud to perform analytics over this data. Since the data is often sensitive, enterprises would prefer to keep it confidential and to hide it even from the cloud operator. Systems such as CryptDB and Monomi can accomplish this by operating mostly on encrypted data; however, these systems rely on expensive cryptographic techniques that limit performance in true “big data” scenarios that involve terabytes of data or more. This paper presents Seabed, a system that enables efficient analytics over large encrypted datasets. In contrast to previous systems, which rely on asymmetric encryption schemes, Seabed uses a novel, additively symmetric homomorphic encryption scheme (ASHE) to perform large-scale aggregations efficiently. Additionally, Seabed introduces a novel randomized encryption scheme called Splayed ASHE, or SPLASHE, that can, in certain cases, prevent frequency attacks based on auxiliary data
Survey on Efficient Information Retrieval for Ranked Query in Cost-Efficient Clouds
Cloud computing technology redefines the advances in information technology. The most challenging research works in cloud computing is privacy and protection of data. Cloud computing provides an innovative business model for organizations with minimal investment. Cloud computing has emerged as a major driver in reducing the information technology costs incurred by organizations. Security is one of the major issues in cloud computing. So it is necessary to protect the user privacy while querying the data in the cloud environment, different techniques are developed by researchers to provide privacy, but the computational and bandwidth costs increased which are unacceptable to the users. This paper presents description and comparison of Ostrovsky, COPS and EIRQ protocols which are currently available for retrieving information from clouds. EIRQ protocol is the latest among these protocols and it addresses the issues of privacy, aggregation, CPU consumption and network bandwidth usage
A Privacy-Preserving Framework for Collaborative Association Rule Mining in Cloud
Collaborative Data Mining facilitates multiple organizations to integrate their datasets and extract useful knowledge from their joint datasets for mutual benefits. The knowledge extracted in this manner is found to be superior to the knowledge extracted locally from a single organization’s dataset. With the rapid development of outsourcing, there is a growing interest for organizations to outsource their data mining tasks to a cloud environment to effectively address their economic and performance demands. However, due to privacy concerns and stringent compliance regulations, organizations do not want to share their private datasets neither with the cloud nor with other participating organizations. In this paper, we address the problem of outsourcing association rule mining task to a federated cloud environment in a privacy-preserving manner. Specifically, we propose a privacy-preserving framework that allows a set of users, each with a private dataset, to outsource their encrypted databases and the cloud returns the association rules extracted from the aggregated encrypted databases to the participating users. Our proposed solution ensures the confidentiality of the outsourced data and also minimizes the users’ participation during the association rule mining process. Additionally, we show that the proposed solution is secure under the standard semi-honest model and demonstrate its practicality
Privately Connecting Mobility to Infectious Diseases via Applied Cryptography
Human mobility is undisputedly one of the critical factors in infectious
disease dynamics. Until a few years ago, researchers had to rely on static data
to model human mobility, which was then combined with a transmission model of a
particular disease resulting in an epidemiological model. Recent works have
consistently been showing that substituting the static mobility data with
mobile phone data leads to significantly more accurate models. While prior
studies have exclusively relied on a mobile network operator's subscribers'
aggregated data, it may be preferable to contemplate aggregated mobility data
of infected individuals only. Clearly, naively linking mobile phone data with
infected individuals would massively intrude privacy. This research aims to
develop a solution that reports the aggregated mobile phone location data of
infected individuals while still maintaining compliance with privacy
expectations. To achieve privacy, we use homomorphic encryption, zero-knowledge
proof techniques, and differential privacy. Our protocol's open-source
implementation can process eight million subscribers in one and a half hours.
Additionally, we provide a legal analysis of our solution with regards to the
EU General Data Protection Regulation.Comment: Added differentlial privacy experiments and new benchmark
CryptDB: A Practical Encrypted Relational DBMS
CryptDB is a DBMS that provides provable and practical privacy in the face of a compromised database server or curious database administrators. CryptDB works by executing SQL queries over encrypted data. At its core are three novel ideas: an SQL-aware encryption strategy that maps SQL operations to encryption schemes, adjustable query-based encryption which allows CryptDB to adjust the encryption level of each data item based on user queries, and onion encryption to efficiently change data encryption levels. CryptDB only empowers the server to execute queries that the users requested, and achieves maximum privacy given the mix of queries issued by the users. The database server fully evaluates queries on encrypted data and sends the result back to the client for final decryption; client machines do not perform any query processing and client-side applications run unchanged. Our evaluation shows that CryptDB has modest overhead: on the TPC-C benchmark on Postgres, CryptDB reduces throughput by 27% compared to regular Postgres. Importantly, CryptDB does not change the innards of existing DBMSs: we realized the implementation of CryptDB using client-side query rewriting/encrypting, user-defined functions, and server-side tables for public key information. As such, CryptDB is portable; porting CryptDB to MySQL required changing 86 lines of code, mostly at the connectivity layer
Distributed Query Execution With Strong Privacy Guarantees
As the Internet evolves, we find more applications that involve data originating from multiple sources, and spanning machines located all over the world. Such wide distribution of sensitive data increases the risk of information leakage, and may sometimes inhibit useful applications. For instance, even though banks could share data to detect systemic threats in the US financial network, they hesitate to do so because it can leak business secrets to their competitors. Encryption is an effective way to preserve data confidentiality, but eliminates all processing capabilities. Some approaches enable processing on encrypted data, but they usually have security weaknesses, such as data leakage through side-channels, or require expensive cryptographic computations.
In this thesis, we present techniques that address the above limitations. First, we present an efficient symmetric homomorphic encryption scheme, which can aggregate encrypted data at an unprecedented scale. Second, we present a way to efficiently perform secure computations on distributed graphs. To accomplish this, we express large computations as a series of small, parallelizable vertex programs, whose state is safely transferred between vertices using a new cryptographic protocol. Finally, we propose using differential privacy to strengthen the security of trusted processors: noise is added to the side-channels, so that no adversary can extract useful information about individual users. Our experimental results suggest that the presented techniques achieve order-of-magnitude performance improvements over previous approaches, in scenarios such as the business intelligence application of a large corporation and the detection of systemic threats in the US financial network
- …