11 research outputs found

    Privacy-Preserving Ad-Hoc Equi-Join on Outsourced Data

    Get PDF
    In IT outsourcing, a user may delegate the data storage and query processing functions to a third-party server that is not completely trusted. This gives rise to the need to safeguard the privacy of the database as well as the user queries over it. In this article, we address the problem of running ad hoc equi-join queries directly on encrypted data in such a setting. Our contribution is the first solution that achieves constant complexity per pair of records that are evaluated for the join. After formalizing the privacy requirements pertaining to the database and user queries, we introduce a cryptographic construct for securely joining records across relations. The construct protects the database with a strong encryption scheme. Moreover, information disclosure after executing an equi-join is kept to the minimum—that two input records combine to form an output record if and only if they share common join attribute values. There is no disclosure on records that are not part of the join result. Building on this construct, we then present join algorithms that optimize the join execution by eliminating the need to match every record pair from the input relations. We provide a detailed analysis of the cost of the algorithms and confirm the analysis through extensive experiments with both synthetic and benchmark workloads. Through this evaluation, we tease out useful insights on how to configure the join algorithms to deliver acceptable execution time in practice.</jats:p

    Privacy-Preserving Ad-Hoc Equi-Join on Outsourced Data

    No full text

    Privacy-preserving queries on encrypted databases

    Get PDF
    In today's Internet, with the advent of cloud computing, there is a natural desire for enterprises, organizations, and end users to outsource increasingly large amounts of data to a cloud provider. Therefore, ensuring security and privacy is becoming a significant challenge for cloud computing, especially for users with sensitive and valuable data. Recently, many efficient and scalable query processing methods over encrypted data have been proposed. Despite that, numerous challenges remain to be addressed due to the high complexity of many important queries on encrypted large-scale datasets. This thesis studies the problem of privacy-preserving database query processing on structured data (e.g., relational and graph databases). In particular, this thesis proposes several practical and provable secure structured encryption schemes that allow the data owner to encrypt data without losing the ability to query and retrieve it efficiently for authorized clients. This thesis includes two parts. The first part investigates graph encryption schemes. This thesis proposes a graph encryption scheme for approximate shortest distance queries. Such scheme allows the client to query the shortest distance between two nodes in an encrypted graph securely and efficiently. Moreover, this thesis also explores how the techniques can be applied to other graph queries. The second part of this thesis proposes secure top-k query processing schemes on encrypted relational databases. Furthermore, the thesis develops a scheme for the top-k join queries over multiple encrypted relations. Finally, this thesis demonstrates the practicality of the proposed encryption schemes by prototyping the encryption systems to perform queries on real-world encrypted datasets

    Data Protection in Big Data Analysis

    Get PDF
    "Big data" applications are collecting data from various aspects of our lives more and more every day. This fast transition has surpassed the development pace of data protection techniques and has resulted in innumerable data breaches and privacy violations. To prevent that, it is important to ensure the data is protected while at rest, in transit, in use, as well as during computation or dispersal. We investigate data protection issues in big data analysis in this thesis. We address a security or privacy concern in each phase of the data science pipeline. These phases are: i) data cleaning and preparation, ii) data management, iii) data modelling and analysis, and iv) data dissemination and visualization. In each of our contributions, we either address an existing problem and propose a resolving design (Chapters 2 and 4), or evaluate a current solution for a problem and analyze whether it meets the expected security/privacy goal (Chapters 3 and 5). Starting with privacy in data preparation, we investigate providing privacy in query analysis leveraging differential privacy techniques. We consider contextual outlier analysis and identify challenging queries that require releasing direct information about members of the dataset. We define a new sampling mechanism that allows releasing this information in a differentially private manner. Our second contribution is in the data modelling and analysis phase. We investigate the effect of data properties and application requirements on the successful implementation of privacy techniques. We in particular investigate the effects of data correlation on data protection guarantees of differential privacy. Our third contribution in this thesis is in the data management phase. The problem is to efficiently protecting the data that is outsourced to a database management system (DBMS) provider while still allowing join operation. We provide an encryption method to minimize the leakage and to guarantee confidentiality for the data efficiently. Our last contribution is in the data dissemination phase. We inspect the ownership/contract protection for the prediction models trained on the data. We evaluate the backdoor-based watermarking in deep neural networks which is an important and recent line of the work in model ownership/contract protection
    corecore