249 research outputs found

    k-Nearest Neighbor Classification over Semantically Secure Encrypted Relational Data

    Full text link
    Data Mining has wide applications in many areas such as banking, medicine, scientific research and among government agencies. Classification is one of the commonly used tasks in data mining applications. For the past decade, due to the rise of various privacy issues, many theoretical and practical solutions to the classification problem have been proposed under different security models. However, with the recent popularity of cloud computing, users now have the opportunity to outsource their data, in encrypted form, as well as the data mining tasks to the cloud. Since the data on the cloud is in encrypted form, existing privacy preserving classification techniques are not applicable. In this paper, we focus on solving the classification problem over encrypted data. In particular, we propose a secure k-NN classifier over encrypted data in the cloud. The proposed k-NN protocol protects the confidentiality of the data, user's input query, and data access patterns. To the best of our knowledge, our work is the first to develop a secure k-NN classifier over encrypted data under the semi-honest model. Also, we empirically analyze the efficiency of our solution through various experiments.Comment: 29 pages, 2 figures, 3 tables arXiv admin note: substantial text overlap with arXiv:1307.482

    Privacy-preserving query processing over encrypted data in cloud

    Get PDF
    The query processing of relational data has been studied extensively throughout the past decade. A number of theoretical and practical solutions to query processing have been proposed under various scenarios. With the recent popularity of cloud computing, data owners now have the opportunity to outsource not only their data but also data processing functionalities to the cloud. Because of data security and personal privacy concerns, sensitive data (e.g., medical records) should be encrypted before being outsourced to a cloud, and the cloud should perform query processing tasks on the encrypted data only. These tasks are termed as Privacy-Preserving Query Processing (PPQP) over encrypted data. Based on the concept of Secure Multiparty Computation (SMC), SMC-based distributed protocols were developed to allow the cloud to perform queries directly over encrypted data. These protocols protect the confidentiality of the stored data, user queries, and data access patterns from cloud service providers and other unauthorized users. Several queries were considered in an attempt to create a well-defined scope. These queries included the k-Nearest Neighbor (kNN) query, advanced analytical query, and correlated range query. The proposed protocols utilize an additive homomorphic cryptosystem and/or a garbled circuit technique at different stages of query processing to achieve the best performance. In addition, by adopting a multi-cloud computing paradigm, all computations can be done on the encrypted data without using very expensive fully homomorphic encryptions. The proposed protocols\u27 security was analyzed theoretically, and its practicality was evaluated through extensive empirical results --Abstract, page iii

    Offline Model Guard: Secure and Private ML on Mobile Devices

    Full text link
    Performing machine learning tasks in mobile applications yields a challenging conflict of interest: highly sensitive client information (e.g., speech data) should remain private while also the intellectual property of service providers (e.g., model parameters) must be protected. Cryptographic techniques offer secure solutions for this, but have an unacceptable overhead and moreover require frequent network interaction. In this work, we design a practically efficient hardware-based solution. Specifically, we build Offline Model Guard (OMG) to enable privacy-preserving machine learning on the predominant mobile computing platform ARM - even in offline scenarios. By leveraging a trusted execution environment for strict hardware-enforced isolation from other system components, OMG guarantees privacy of client data, secrecy of provided models, and integrity of processing algorithms. Our prototype implementation on an ARM HiKey 960 development board performs privacy-preserving keyword recognition using TensorFlow Lite for Microcontrollers in real time.Comment: Original Publication (in the same form): DATE 202

    A Practical Framework for Storing and Searching Encrypted Data on Cloud Storage

    Full text link
    Security has become a significant concern with the increased popularity of cloud storage services. It comes with the vulnerability of being accessed by third parties. Security is one of the major hurdles in the cloud server for the user when the user data that reside in local storage is outsourced to the cloud. It has given rise to security concerns involved in data confidentiality even after the deletion of data from cloud storage. Though, it raises a serious problem when the encrypted data needs to be shared with more people than the data owner initially designated. However, searching on encrypted data is a fundamental issue in cloud storage. The method of searching over encrypted data represents a significant challenge in the cloud. Searchable encryption allows a cloud server to conduct a search over encrypted data on behalf of the data users without learning the underlying plaintexts. While many academic SE schemes show provable security, they usually expose some query information, making them less practical, weak in usability, and challenging to deploy. Also, sharing encrypted data with other authorized users must provide each document's secret key. However, this way has many limitations due to the difficulty of key management and distribution. We have designed the system using the existing cryptographic approaches, ensuring the search on encrypted data over the cloud. The primary focus of our proposed model is to ensure user privacy and security through a less computationally intensive, user-friendly system with a trusted third party entity. To demonstrate our proposed model, we have implemented a web application called CryptoSearch as an overlay system on top of a well-known cloud storage domain. It exhibits secure search on encrypted data with no compromise to the user-friendliness and the scheme's functional performance in real-world applications.Comment: 146 Pages, Master's Thesis, 6 Chapters, 96 Figures, 11 Table

    A new enhancement of the k-NN algorithm by Using an optimization technique

    Get PDF
    Of a number of ML (Machine Learning) algorithms, k-nearest neighbour (KNN) is among the most common for data classification research, and classifying diseases and faults, which is essential due to frequent alterations in the training dataset, in which it would be expensive using most methods to construct a different classifier every time this happens. Therefore, KNN can be used effectively as it does not require a residual classifier to be constructed in advance. KNN offers ease of use and can be applied across a broad variation spectrum. Here, a novel KNN classification approach is put forward using the Bayesian Optimization Algorithm (BOA) for optimisation. This paper seeks to make classification more accurate and suggest alterations of nearest neighbour K value to use information about dataset structure and the similarity measure of distance. The findings of experimental work based on the University of California Irvine (UCI) repository datasets in general shows improved performance of classifiers compared with conventional KNN and give greater reliability without a significant time cost to speed

    An Approach to Guide Users Towards Less Revealing Internet Browsers

    Get PDF
    When browsing the Internet, HTTP headers enable both clients and servers send extra data in their requests or responses such as the User-Agent string. This string contains information related to the sender’s device, browser, and operating system. Previous research has shown that there are numerous privacy and security risks result from exposing sensitive information in the User-Agent string. For example, it enables device and browser fingerprinting and user tracking and identification. Our large analysis of thousands of User-Agent strings shows that browsers differ tremendously in the amount of information they include in their User-Agent strings. As such, our work aims at guiding users towards using less exposing browsers. In doing so, we propose to assign an exposure score to browsers based on the information they expose and vulnerability records. Thus, our contribution in this work is as follows: first, provide a full implementation that is ready to be deployed and used by users. Second, conduct a user study to identify the effectiveness and limitations of our proposed approach. Our implementation is based on using more than 52 thousand unique browsers. Our performance and validation analysis show that our solution is accurate and efficient. The source code and data set are publicly available and the solution has been deployed
    • …
    corecore