948 research outputs found

    ODIN: Obfuscation-based privacy-preserving consensus algorithm for Decentralized Information fusion in smart device Networks

    The large spread of sensors and smart devices in urban infrastructures is motivating research in the area of the Internet of Things (IoT) to develop new services and improve citizens’ quality of life. Sensors and smart devices generate large amounts of measurement data from sensing the environment, which is used to enable services such as control of power consumption or traffic density. To deal with such a large amount of information and provide accurate measurements, service providers can adopt information fusion, which, given the decentralized nature of urban deployments, can be performed by means of consensus algorithms. These algorithms allow distributed agents to (iteratively) compute linear functions on the exchanged data, and take decisions based on the outcome, without the need for the support of a central entity. However, the use of consensus algorithms raises several security concerns, especially when private or security-critical information is involved in the computation. In this article, we propose ODIN, a novel algorithm allowing information fusion over encrypted data. ODIN is a privacy-preserving extension of the popular consensus gossip algorithm, which prevents distributed agents from having direct access to the data while they iteratively reach consensus; agents cannot access even the final consensus value but can only retrieve partial information (e.g., a binary decision). ODIN uses efficient additive obfuscation and proxy re-encryption during the update steps and garbled circuits to make final decisions on the obfuscated consensus. We discuss the security of our proposal and show its practicability and efficiency on real-world resource-constrained devices, developing a prototype implementation for Raspberry Pi devices.
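
For orientation, the plain (unprotected) gossip averaging that the abstract says ODIN extends can be sketched as follows. This is a minimal illustration, not ODIN itself: it omits the obfuscation, proxy re-encryption and garbled circuits entirely, and it assumes a fully connected network in which any two agents may be paired. The function name `gossip_average` and its parameters are hypothetical.

```python
import random


def gossip_average(values, rounds=2000, seed=0):
    """Randomized pairwise gossip: repeatedly pick two agents and
    replace both of their values with the pair's average.

    Assumes every pair of agents can communicate (complete graph).
    The sum of all values is preserved at each step, so the agents
    drift toward the global mean without a central coordinator.
    """
    rng = random.Random(seed)
    x = list(values)
    n = len(x)
    for _ in range(rounds):
        i, j = rng.sample(range(n), 2)  # two distinct agents
        avg = (x[i] + x[j]) / 2.0
        x[i] = x[j] = avg
    return x


# Hypothetical sensor readings held by five agents.
readings = [20.0, 22.5, 19.0, 30.0, 25.5]
consensus = gossip_average(readings)
```

In this plain form every agent sees its partner's raw value at each exchange, which is the exposure the abstract describes ODIN as preventing.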

    A Classification of non-Cryptographic Anonymization Techniques Ensuring Privacy in Big Data

    Recently, Big Data processing has become crucial to most enterprise and government applications due to the fast growth of collected data. However, this data often includes private personal information that raises new security and privacy concerns. Moreover, it is widely agreed that the sheer scale of big data renders many privacy-preserving techniques ineffective. Therefore, in order to ensure privacy in big data, anonymization is suggested as one of the most efficient approaches. In this paper, we provide a new detailed classification of the most widely used non-cryptographic anonymization techniques related to big data, including generalization and randomization approaches. The paper also evaluates the presented techniques against integrity, confidentiality and credibility criteria. In addition, three relevant anonymization techniques, namely k-anonymity, l-diversity and t-closeness, are tested on an extract of a very large real data set.
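
As a rough illustration of two of the techniques named above, the sketch below checks k-anonymity and (distinct) l-diversity on an already generalized table. It uses the standard textbook definitions rather than anything specific to the paper; the records, column names and helper names are hypothetical.

```python
from collections import defaultdict


def group_by_quasi_identifiers(records, quasi_identifiers):
    """Group records into equivalence classes that share the same
    values on all quasi-identifier columns."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        groups[key].append(record)
    return groups


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every equivalence class contains at least k records."""
    groups = group_by_quasi_identifiers(records, quasi_identifiers)
    return all(len(rows) >= k for rows in groups.values())


def is_l_diverse(records, quasi_identifiers, sensitive, l):
    """True if every equivalence class contains at least l distinct
    values of the sensitive attribute (distinct l-diversity)."""
    groups = group_by_quasi_identifiers(records, quasi_identifiers)
    return all(
        len({row[sensitive] for row in rows}) >= l
        for rows in groups.values()
    )


# Hypothetical table: age and ZIP code have already been generalized.
records = [
    {"age": "30-39", "zip": "481**", "disease": "flu"},
    {"age": "30-39", "zip": "481**", "disease": "asthma"},
    {"age": "40-49", "zip": "482**", "disease": "flu"},
    {"age": "40-49", "zip": "482**", "disease": "diabetes"},
]
two_anonymous = is_k_anonymous(records, ["age", "zip"], k=2)
two_diverse = is_l_diverse(records, ["age", "zip"], "disease", l=2)
```

t-closeness additionally compares the distribution of the sensitive attribute within each class to its distribution over the whole table, and is left out here for brevity.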

    Confidential Boosting with Random Linear Classifiers for Outsourced User-generated Data

    User-generated data is crucial to predictive modeling in many applications. With a web/mobile/wearable interface, a data owner can continuously record data generated by distributed users and build various predictive models from the data to improve their operations, services, and revenue. Due to the large size and evolving nature of users' data, data owners may rely on public cloud service providers (Cloud) for storage and computation scalability. Exposing sensitive user-generated data and advanced analytic models to Cloud raises privacy concerns. We present a confidential learning framework, SecureBoost, for data owners that want to learn predictive models from aggregated user-generated data but offload the storage and computational burden to Cloud without having to worry about protecting the sensitive data. SecureBoost allows users to submit encrypted or randomly masked data to designated Cloud directly. Our framework utilizes random linear classifiers (RLCs) as the base classifiers in the boosting framework to dramatically simplify the design of the proposed confidential boosting protocols, yet still preserve the model quality. A Cryptographic Service Provider (CSP) is used to assist the Cloud's processing, reducing the complexity of the protocol constructions. We present two constructions of SecureBoost: HE+GC and SecSh+GC, using combinations of homomorphic encryption, garbled circuits, and random masking to achieve both security and efficiency. For a boosted model, Cloud learns only the RLCs and the CSP learns only the weights of the RLCs. Finally, the data owner collects the two parts to get the complete model. We conduct extensive experiments to understand the quality of the RLC-based boosting and the cost distribution of the constructions. Our results show that SecureBoost can efficiently learn high-quality boosting models from protected user-generated data.
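
To make the idea of boosting over random linear classifiers concrete, here is a plaintext sketch: an AdaBoost-style loop whose base learners are randomly drawn hyperplanes rather than trained models. It is an assumption-laden illustration, not the SecureBoost protocol: there is no encryption, masking, Cloud or CSP, and the pool size, Gaussian sampling and all function names are hypothetical choices.

```python
import math
import random


def predict_rlc(rlc, x):
    """Sign of w.x + b for one random linear classifier (w, b)."""
    w, b = rlc
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


def boost_with_rlcs(X, y, n_rounds=10, pool_size=20, seed=0):
    """AdaBoost-style training where each round draws a pool of random
    hyperplanes and keeps the one with the lowest weighted error.
    Labels in y must be +1 or -1. Returns a list of (alpha, rlc)."""
    rng = random.Random(seed)
    n, dim = len(X), len(X[0])
    weights = [1.0 / n] * n
    model = []
    for _ in range(n_rounds):
        best = None
        for _ in range(pool_size):
            rlc = ([rng.gauss(0, 1) for _ in range(dim)], rng.gauss(0, 1))
            err = sum(w for w, xi, yi in zip(weights, X, y)
                      if predict_rlc(rlc, xi) != yi)
            # A classifier worse than chance is useful with its sign flipped.
            sign = 1
            if err > 0.5:
                err, sign = 1.0 - err, -1
            if best is None or err < best[0]:
                best = (err, sign, rlc)
        err, sign, rlc = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = sign * 0.5 * math.log((1 - err) / err)
        model.append((alpha, rlc))
        # Standard AdaBoost re-weighting: misclassified points gain weight.
        weights = [w * math.exp(-alpha * yi * predict_rlc(rlc, xi))
                   for w, xi, yi in zip(weights, X, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def predict(model, x):
    """Weighted vote of all selected random linear classifiers."""
    score = sum(alpha * predict_rlc(rlc, x) for alpha, rlc in model)
    return 1 if score >= 0 else -1
```

In this sketch the model splits naturally into the hyperplanes and their weights (`alpha`), which mirrors the abstract's statement that Cloud learns only the RLCs while the CSP learns only their weights.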

    Secured Data Masking Framework and Technique for Preserving Privacy in a Business Intelligence Analytics Platform

    The main concept behind business intelligence (BI) is how to use integrated data across different business systems within an enterprise to make strategic decisions. It is difficult to map internal and external BI users to subsets of the enterprise’s data warehouse (DW), so protecting the privacy of this data while maintaining its utility is a challenging task. Today, such DW systems constitute one of the most serious privacy breach threats that an enterprise might face when many internal users of different security levels have access to BI components. This thesis proposes a data masking framework (iMaskU: Identify, Map, Apply, Sign, Keep testing, Utilize) for a BI platform to protect the data at rest, preserve the data format, and maintain the data utility at the on-the-fly querying level. A new reversible data masking technique (COntent BAsed Data masking - COBAD) is developed as an implementation of iMaskU. The masking algorithm in COBAD is based on the statistical content of the extracted dataset, so that the masked data cannot be linked with specific individuals or be re-identified by any means. The re-identification risk of the COBAD technique has been computed using a supercomputer, considering three attack methods: a) a brute force attack, which needs, on average, 55 years to crack the key of each record; b) a dictionary attack, which needs 231 days to crack the same key for the entire extracted dataset (containing 50,000 records); and c) a data linkage attack, for which the re-identification risk is very low when the common linked attributes are used. The performance of the COBAD masking technique has also been validated using a 1GB database schema in the TPC-H decision support benchmark. The evaluation of execution time for the selected TPC-H queries showed that COBAD is much faster than AES128 and 3DES encryption. Theoretical and experimental results show that the proposed solution provides a reasonable trade-off between data security and the utility of re-identified data.

    Protection of big data privacy

    In recent years, big data have become a hot research topic. The increasing amount of big data also increases the chance of breaching the privacy of individuals. Since big data require high computational power and large storage, distributed systems are used. As multiple parties are involved in these systems, the risk of privacy violation is increased. A number of privacy-preserving mechanisms have been developed for privacy protection at different stages (e.g., data generation, data storage, and data processing) of the big data life cycle. The goal of this paper is to provide a comprehensive overview of the privacy preservation mechanisms in big data and present the challenges for existing mechanisms. In particular, we illustrate the infrastructure of big data and the state-of-the-art privacy-preserving mechanisms in each stage of the big data life cycle. Furthermore, we discuss the challenges and future research directions related to privacy preservation in big data.