56 research outputs found

    Mining Privacy-Preserving Association Rules based on Parallel Processing in Cloud Computing

    Full text link
    With the onset of the Information Era and the rapid growth of information technology, ample space for processing and extracting data has opened up. However, privacy concerns may stifle expansion throughout this area. The challenge of reliable mining techniques when transactions disperse across sources is addressed in this study. This work looks at the prospect of creating a new set of three algorithms that can obtain maximum privacy, data utility, and time savings while doing so. This paper proposes a unique double encryption and Transaction Splitter approach to alter the database to optimize the data utility and confidentiality tradeoff in the preparation phase. This paper presents a customized apriori approach for the mining process, which does not examine the entire database to estimate the support for each attribute. Existing distributed data solutions have a high encryption complexity and an insufficient specification of many participants' properties. Proposed solutions provide increased privacy protection against a variety of attack models. Furthermore, in terms of communication cycles and processing complexity, it is much simpler and quicker. Proposed work tests on top of a realworld transaction database demonstrate that the aim of the proposed method is realistic

    A Secure and Verifiable Computation for k-Nearest Neighbor Queries in Cloud

    Get PDF
    The popularity of cloud computing has increased significantly in the last few years due to scalability, cost efficiency, resiliency, and quality of service. Organizations are more interested in outsourcing the database and DBMS functionalities to the cloud owing to the tremendous growth of big data and on-demand access requirements. As the data is outsourced to untrusted parties, security has become a key consideration to achieve the confidentiality and integrity of data. Therefore, data owners must transform and encrypt the data before outsourcing. In this paper, we focus on a Secure and Verifiable Computation for k-Nearest Neighbor (SVC-kNN) problem. The existing verifiable computation approaches for the kNN problem delegate the verification task solely to a single semi-trusted party. We show that these approaches are unreliable in terms of security, as the verification server could be either dishonest or compromised. To address these issues, we propose a novel solution to the SVC-kNN problem that utilizes the random-splitting approach in conjunction with the homomorphic properties under a two-cloud model. Specifically, the clouds generate and send verification proofs to end-users, allowing them to verify the computation results efficiently. Our solution is highly efficient from the data owner and query issuers’ perspective as it significantly reduces the encryption cost and pre-processing time. Furthermore, we demonstrated the correctness of our solution using Proof by Induction methodology to prove the Euclidean Distance Verification

    Towards Secure and Verifiable Computation of KNN Queries in Outsourced Environments

    Get PDF
    The popularity of cloud computing has increased significantly in the last few years due to scalability, cost efficiency, resiliency, and quality of service. Organizations are more interested in outsourcing the database and DBMS functionalities to the cloud owing to the tremendous growth of big data and on-demand access requirements. As the data is outsourced to untrusted parties, security has become a key consideration to achieve the confidentiality and integrity of data. Therefore, data owners must transform and encrypt the data before outsourcing. In this paper, we focus on a Secure and Verifiable Computation for k-Nearest Neighbor (SVC-kNN) problem. The existing verifiable computation approaches for the kNN problem delegate the verification task solely to a single semi-trusted party. We show that these approaches are unreliable in terms of security, as the verification server could be either dishonest or compromised. To address these issues, we propose a novel solution to the SVC-kNN problem that utilizes the random-splitting approach in conjunction with the homomorphic properties under a two-cloud model. Specifically, the clouds generate and send verification proofs to end-users, allowing them to verify the computation results efficiently. Our solution is highly efficient from the data owner and query issuers’ perspective as it significantly reduces the encryption cost and pre-processing time. Furthermore, we show the correctness of our solution using Proof by Induction methodology to prove the Euclidean Distance Verification. Finally, with a thorough analysis and the empirical results on a real data set, we demonstrate the efficiency and effectiveness of our protocol

    Security and Privacy for Big Data: A Systematic Literature Review

    Get PDF
    Big data is currently a hot research topic, with four million hits on Google scholar in October 2016. One reason for the popularity of big data research is the knowledge that can be extracted from analyzing these large data sets. However, data can contain sensitive information, and data must therefore be sufficiently protected as it is stored and processed. Furthermore, it might also be required to provide meaningful, proven, privacy guarantees if the data can be linked to individuals. To the best of our knowledge, there exists no systematic overview of the overlap between big data and the area of security and privacy. Consequently, this review aims to explore security and privacy research within big data, by outlining and providing structure to what research currently exists. Moreover, we investigate which papers connect security and privacy with big data, and which categories these papers cover. Ultimately, is security and privacy research for big data different from the rest of the research within the security and privacy domain? To answer these questions, we perform a systematic literature review (SLR), where we collect recent papers from top conferences, and categorize them in order to provide an overview of the security and privacy topics present within the context of big data. Within each category we also present a qualitative analysis of papers representative for that specific area. Furthermore, we explore and visualize the relationship between the categories. Thus, the objective of this review is to provide a snapshot of the current state of security and privacy research for big data, and to discover where further research is required

    Exploring the Existing and Unknown Side Effects of Privacy Preserving Data Mining Algorithms

    Get PDF
    The data mining sanitization process involves converting the data by masking the sensitive data and then releasing it to public domain. During the sanitization process, side effects such as hiding failure, missing cost and artificial cost of the data were observed. Privacy Preserving Data Mining (PPDM) algorithms were developed for the sanitization process to overcome information loss and yet maintain data integrity. While these PPDM algorithms did provide benefits for privacy preservation, they also made sure to solve the side effects that occurred during the sanitization process. Many PPDM algorithms were developed to reduce these side effects. There are several PPDM algorithms created based on different PPDM techniques. However, previous studies have not explored or justified why non-traditional side effects were not given much importance. This study reported the findings of the side effects for the PPDM algorithms in a newly created web repository. The research methodology adopted for this study was Design Science Research (DSR). This research was conducted in four phases, which were as follows. The first phase addressed the characteristics, similarities, differences, and relationships of existing side effects. The next phase found the characteristics of non-traditional side effects. The third phase used the Privacy Preservation and Security Framework (PPSF) tool to test if non-traditional side effects occur in PPDM algorithms. This phase also attempted to find additional unknown side effects which have not been found in prior studies. PPDM algorithms considered were Greedy, POS2DT, SIF_IDF, cpGA2DT, pGA2DT, sGA2DT. PPDM techniques associated were anonymization, perturbation, randomization, condensation, heuristic, reconstruction, and cryptography. The final phase involved creating a new online web repository to report all the side effects found for the PPDM algorithms. A Web repository was created using full stack web development. AngularJS, Spring, Spring Boot and Hibernate frameworks were used to build the web application. The results of the study implied various PPDM algorithms and their side effects. Additionally, the relationship and impact that hiding failure, missing cost, and artificial cost have on each other was also understood. Interestingly, the side effects and their relationship with the type of data (sensitive or non-sensitive or new) was observed. As the web repository acts as a quick reference domain for PPDM algorithms. Developing, improving, inventing, and reporting PPDM algorithms is necessary. This study will influence researchers or organizations to report, use, reuse, or develop better PPDM algorithms

    Protection of big data privacy

    Full text link
    In recent years, big data have become a hot research topic. The increasing amount of big data also increases the chance of breaching the privacy of individuals. Since big data require high computational power and large storage, distributed systems are used. As multiple parties are involved in these systems, the risk of privacy violation is increased. There have been a number of privacy-preserving mechanisms developed for privacy protection at different stages (e.g., data generation, data storage, and data processing) of a big data life cycle. The goal of this paper is to provide a comprehensive overview of the privacy preservation mechanisms in big data and present the challenges for existing mechanisms. In particular, in this paper, we illustrate the infrastructure of big data and the state-of-the-art privacy-preserving mechanisms in each stage of the big data life cycle. Furthermore, we discuss the challenges and future research directions related to privacy preservation in big data

    Privacy preserving algorithms for newly emergent computing environments

    Get PDF
    Privacy preserving data usage ensures appropriate usage of data without compromising sensitive information. Data privacy is a primary requirement since customers' data is an asset to any organization and it contains customers' private information. Data seclusion cannot be a solution to keep data private. Data sharing as well as keeping data private is important for different purposes, e.g., company welfare, research, business etc. A broad range of industries where data privacy is mandatory includes healthcare, aviation industry, education system, federal law enforcement, etc.In this thesis dissertation we focus on data privacy schemes in emerging fields of computer science, namely, health informatics, data mining, distributed cloud, biometrics, and mobile payments. Linking and mining medical records across different medical service providers are important to the enhancement of health care quality. Under HIPAA regulation keeping medical records private is important. In real-world health care databases, records may well contain errors. Linking the error-prone data and preserving data privacy at the same time is very difficult. We introduce a privacy preserving Error-Tolerant Linking Algorithm to enable medical records linkage for error-prone medical records. Mining frequent sequential patterns such as, patient path, treatment pattern, etc., across multiple medical sites helps to improve health care quality and research. We propose a privacy preserving sequential pattern mining scheme across multiple medical sites. In a distributed cloud environment resources are provided by users who are geographically distributed over a large area. Since resources are provided by regular users, data privacy and security are main concerns. We propose a privacy preserving data storage mechanism among different users in a distributed cloud. Managing secret key for encryption is difficult in a distributed cloud. To protect secret key in a distributed cloud we propose a multilevel threshold secret sharing mechanism. Biometric authentication ensures user identity by means of user's biometric traits. Any individual's biometrics should be protected since biometrics are unique and can be stolen or misused by an adversary. We present a secure and privacy preserving biometric authentication scheme using watermarking technique. Mobile payments have become popular with the extensive use of mobile devices. Mobile applications for payments needs to be very secure to perform transactions and at the same time needs to be efficient. We design and develop a mobile application for secure mobile payments. To secure mobile payments we focus on user's biometric authentication as well as secure bank transaction. We propose a novel privacy preserving biometric authentication algorithm for secure mobile payments

    Privacy-Aware and Secure Decentralized Air Quality Monitoring

    Get PDF
    Indoor Air Quality monitoring is a major asset to improving quality of life and building management. Today, the evolution of embedded technologies allows the implementation of such monitoring on the edge of the network. However, several concerns need to be addressed related to data security and privacy, routing and sink placement optimization, protection from external monitoring, and distributed data mining. In this paper, we describe an integrated framework that features distributed storage, blockchain-based Role-based Access Control, onion routing, routing and sink placement optimization, and distributed data mining to answer these concerns. We describe the organization of our contribution and show its relevance with simulations and experiments over a set of use cases
    • …
    corecore