154 research outputs found

    Exploring Privacy Preservation in Outsourced K-Nearest Neighbors with Multiple Data Owners

    The k-nearest neighbors (k-NN) algorithm is a popular and effective classification algorithm. Due to its large storage and computational requirements, it is suitable for cloud outsourcing. However, k-NN is often run on sensitive data such as medical records, user images, or personal information. It is important to protect the privacy of data in an outsourced k-NN system. Prior works have all assumed the data owners (who submit data to the outsourced k-NN system) are a single trusted party. However, we observe that in many practical scenarios, there may be multiple mutually distrusting data owners. In this work, we present the first framing and exploration of privacy preservation in an outsourced k-NN system with multiple data owners. We consider the various threat models introduced by this modification. We discover that under a particularly practical threat model that covers numerous scenarios, there exists a set of adaptive attacks that breach the data privacy of any exact k-NN system. The vulnerability is a result of the mathematical properties of k-NN and its output. Thus, we propose a privacy-preserving alternative system supporting kernel density estimation using a Gaussian kernel, a classification algorithm from the same family as k-NN. In many applications, this similar algorithm serves as a good substitute for k-NN. We additionally investigate solutions for other threat models, often through extensions on prior single data owner systems
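
To illustrate why Gaussian kernel density estimation can stand in for k-NN, here is a minimal plaintext sketch of both classifiers. It is not the privacy-preserving system the abstract describes: there is no encryption or outsourcing, and the function names, the `bandwidth` parameter, and the toy data are assumptions made for illustration. The point is only that k-NN lets the k closest points vote, while the kernel variant lets every point vote with a weight that decays smoothly with distance.

```python
import math
from collections import defaultdict


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_classify(train, query, k=3):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(train, key=lambda item: squared_distance(item[0], query))[:k]
    votes = defaultdict(int)
    for _, label in nearest:
        votes[label] += 1
    return max(votes, key=votes.get)


def gaussian_kde_classify(train, query, bandwidth=1.0):
    """Pick the label with the largest summed Gaussian kernel weight at the query."""
    scores = defaultdict(float)
    for point, label in train:
        weight = math.exp(-squared_distance(point, query) / (2.0 * bandwidth ** 2))
        scores[label] += weight
    return max(scores, key=scores.get)


# Hypothetical toy data: two well-separated clusters labelled "a" and "b".
TRAIN = [
    ((0.0, 0.0), "a"), ((0.2, 0.1), "a"), ((0.1, 0.3), "a"),
    ((3.0, 3.0), "b"), ((3.2, 2.9), "b"), ((2.8, 3.1), "b"),
]

if __name__ == "__main__":
    print(knn_classify(TRAIN, (0.1, 0.1)))
    print(gaussian_kde_classify(TRAIN, (0.1, 0.1)))
```

On well-separated data like this the two classifiers agree; they can differ near class boundaries, where the choice of bandwidth matters.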

    Big Data Analytics over Encrypted Datasets with Seabed

    Today, enterprises collect large amounts of data and leverage the cloud to perform analytics over this data. Since the data is often sensitive, enterprises would prefer to keep it confidential and to hide it even from the cloud operator. Systems such as CryptDB and Monomi can accomplish this by operating mostly on encrypted data; however, these systems rely on expensive cryptographic techniques that limit performance in true “big data” scenarios that involve terabytes of data or more. This paper presents Seabed, a system that enables efficient analytics over large encrypted datasets. In contrast to previous systems, which rely on asymmetric encryption schemes, Seabed uses a novel, additively symmetric homomorphic encryption scheme (ASHE) to perform large-scale aggregations efficiently. Additionally, Seabed introduces a novel randomized encryption scheme called Splayed ASHE, or SPLASHE, that can, in certain cases, prevent frequency attacks based on auxiliary data
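
The idea of a symmetric, additively homomorphic scheme can be sketched with simple pseudorandom masking. This is a simplified illustration under stated assumptions, not Seabed's actual ASHE construction: the use of HMAC-SHA256 as the pseudorandom function, the modulus `2 ** 64`, and all function names are choices made here. Each value is masked with a key-dependent pad tied to its row ID, ciphertexts are added without the key, and the key holder removes the pads for the IDs that were summed.

```python
import hashlib
import hmac

MODULUS = 2 ** 64  # assumed plaintext/ciphertext space for this sketch


def prf(key: bytes, row_id: int) -> int:
    """Pseudorandom pad for a row ID, derived with HMAC-SHA256 (an assumption)."""
    digest = hmac.new(key, row_id.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")


def encrypt(key: bytes, row_id: int, value: int):
    """Mask the value with the pad; keep the row ID so the pad can be removed later."""
    return ((value + prf(key, row_id)) % MODULUS, [row_id])


def add(c1, c2):
    """Add two ciphertexts without the key: sum the masked values, merge the ID lists."""
    return ((c1[0] + c2[0]) % MODULUS, c1[1] + c2[1])


def decrypt(key: bytes, ciphertext) -> int:
    """Remove the pads of every row ID that contributed to the sum."""
    total, ids = ciphertext
    mask = sum(prf(key, i) for i in ids)
    return (total - mask) % MODULUS


if __name__ == "__main__":
    key = b"\x01" * 32  # hypothetical key, for demonstration only
    rows = [encrypt(key, i, v) for i, v in enumerate([10, 20, 12])]
    aggregate = rows[0]
    for row in rows[1:]:
        aggregate = add(aggregate, row)
    print(decrypt(key, aggregate))
```

Aggregation costs only modular additions in this sketch, which is the kind of saving over asymmetric schemes that the abstract points to. The ID list grows with the number of rows summed, so a practical system would need a compact encoding for it.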

    Survey on Efficient Information Retrieval for Ranked Query in Cost-Efficient Clouds

    Cloud computing technology redefines the advances in information technology. Among the most challenging research problems in cloud computing are privacy and data protection. Cloud computing provides an innovative business model for organizations with minimal investment, and it has emerged as a major driver in reducing the information technology costs incurred by organizations. Security is one of the major issues in cloud computing, so it is necessary to protect user privacy while querying data in the cloud environment. Researchers have developed different techniques to provide privacy, but these increase computational and bandwidth costs to levels that are unacceptable to users. This paper presents a description and comparison of the Ostrovsky, COPS, and EIRQ protocols, which are currently available for retrieving information from clouds. The EIRQ protocol is the latest among these protocols, and it addresses the issues of privacy, aggregation, CPU consumption, and network bandwidth usage

    A Privacy-Preserving Framework for Collaborative Association Rule Mining in Cloud

    Collaborative data mining enables multiple organizations to integrate their datasets and extract useful knowledge from the joint data for mutual benefit. The knowledge extracted in this manner is found to be superior to the knowledge extracted locally from a single organization’s dataset. With the rapid development of outsourcing, there is growing interest among organizations in outsourcing their data mining tasks to a cloud environment to effectively address their economic and performance demands. However, due to privacy concerns and stringent compliance regulations, organizations do not want to share their private datasets either with the cloud or with other participating organizations. In this paper, we address the problem of outsourcing the association rule mining task to a federated cloud environment in a privacy-preserving manner. Specifically, we propose a privacy-preserving framework that allows a set of users, each with a private dataset, to outsource their encrypted databases; the cloud then returns the association rules extracted from the aggregated encrypted databases to the participating users. Our proposed solution ensures the confidentiality of the outsourced data and also minimizes the users’ participation during the association rule mining process. Additionally, we show that the proposed solution is secure under the standard semi-honest model and demonstrate its practicality
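
For readers unfamiliar with association rules, the two standard quantities behind them, support and confidence, can be computed as follows. This is a plaintext sketch on hypothetical toy transactions; the framework in the abstract performs the mining over encrypted databases, which is not shown here, and the function names are assumptions.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in the itemset."""
    items = set(itemset)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent); 0.0 if the antecedent never occurs."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical joint dataset pooled from several organizations.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

if __name__ == "__main__":
    print(support(TRANSACTIONS, {"bread", "milk"}))
    print(confidence(TRANSACTIONS, {"bread"}, {"milk"}))
```

A rule such as bread → milk is typically reported only when both values exceed thresholds chosen by the analyst.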

    Privately Connecting Mobility to Infectious Diseases via Applied Cryptography

    Human mobility is undisputedly one of the critical factors in infectious disease dynamics. Until a few years ago, researchers had to rely on static data to model human mobility, which was then combined with a transmission model of a particular disease, resulting in an epidemiological model. Recent works have consistently shown that substituting the static mobility data with mobile phone data leads to significantly more accurate models. While prior studies have exclusively relied on a mobile network operator's subscribers' aggregated data, it may be preferable to consider aggregated mobility data of infected individuals only. Clearly, naively linking mobile phone data with infected individuals would massively intrude on privacy. This research aims to develop a solution that reports the aggregated mobile phone location data of infected individuals while still maintaining compliance with privacy expectations. To achieve privacy, we use homomorphic encryption, zero-knowledge proof techniques, and differential privacy. Our protocol's open-source implementation can process eight million subscribers in one and a half hours. Additionally, we provide a legal analysis of our solution with regard to the EU General Data Protection Regulation. Comment: Added differential privacy experiments and new benchmark
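
The differential privacy step can be illustrated with the Laplace mechanism applied to aggregated location counts. The abstract does not say which mechanism or parameters the protocol uses, so the mechanism, the `epsilon` and `sensitivity` values, and the cell names below are all assumptions. The sketch adds noise with scale `sensitivity / epsilon` to each count before release.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_counts(counts, epsilon, sensitivity=1.0, rng=None):
    """Return noisy counts; sensitivity bounds one person's effect on the counts."""
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return {cell: n + laplace_noise(scale, rng) for cell, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical per-cell visit counts aggregated over infected individuals.
    counts = {"cell_A": 120, "cell_B": 45, "cell_C": 3}
    print(private_counts(counts, epsilon=0.5, rng=random.Random(7)))
```

A smaller `epsilon` gives stronger privacy and noisier counts. The right sensitivity depends on how many cells a single individual can contribute to, which this sketch leaves to the caller.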

    CryptDB: A Practical Encrypted Relational DBMS

    CryptDB is a DBMS that provides provable and practical privacy in the face of a compromised database server or curious database administrators. CryptDB works by executing SQL queries over encrypted data. At its core are three novel ideas: an SQL-aware encryption strategy that maps SQL operations to encryption schemes, adjustable query-based encryption which allows CryptDB to adjust the encryption level of each data item based on user queries, and onion encryption to efficiently change data encryption levels. CryptDB only empowers the server to execute queries that the users requested, and achieves maximum privacy given the mix of queries issued by the users. The database server fully evaluates queries on encrypted data and sends the result back to the client for final decryption; client machines do not perform any query processing and client-side applications run unchanged. Our evaluation shows that CryptDB has modest overhead: on the TPC-C benchmark on Postgres, CryptDB reduces throughput by 27% compared to regular Postgres. Importantly, CryptDB does not change the innards of existing DBMSs: we realized the implementation of CryptDB using client-side query rewriting/encrypting, user-defined functions, and server-side tables for public key information. As such, CryptDB is portable; porting CryptDB to MySQL required changing 86 lines of code, mostly at the connectivity layer

    Distributed Query Execution With Strong Privacy Guarantees

    As the Internet evolves, we find more applications that involve data originating from multiple sources, and spanning machines located all over the world. Such wide distribution of sensitive data increases the risk of information leakage, and may sometimes inhibit useful applications. For instance, even though banks could share data to detect systemic threats in the US financial network, they hesitate to do so because it can leak business secrets to their competitors. Encryption is an effective way to preserve data confidentiality, but eliminates all processing capabilities. Some approaches enable processing on encrypted data, but they usually have security weaknesses, such as data leakage through side-channels, or require expensive cryptographic computations. In this thesis, we present techniques that address the above limitations. First, we present an efficient symmetric homomorphic encryption scheme, which can aggregate encrypted data at an unprecedented scale. Second, we present a way to efficiently perform secure computations on distributed graphs. To accomplish this, we express large computations as a series of small, parallelizable vertex programs, whose state is safely transferred between vertices using a new cryptographic protocol. Finally, we propose using differential privacy to strengthen the security of trusted processors: noise is added to the side-channels, so that no adversary can extract useful information about individual users. Our experimental results suggest that the presented techniques achieve order-of-magnitude performance improvements over previous approaches, in scenarios such as the business intelligence application of a large corporation and the detection of systemic threats in the US financial network