249 research outputs found
k-Nearest Neighbor Classification over Semantically Secure Encrypted Relational Data
Data Mining has wide applications in many areas such as banking, medicine,
scientific research and among government agencies. Classification is one of the
commonly used tasks in data mining applications. For the past decade, due to
the rise of various privacy issues, many theoretical and practical solutions to
the classification problem have been proposed under different security models.
However, with the recent popularity of cloud computing, users now have the
opportunity to outsource their data, in encrypted form, as well as the data
mining tasks to the cloud. Since the data on the cloud is in encrypted form,
existing privacy preserving classification techniques are not applicable. In
this paper, we focus on solving the classification problem over encrypted data.
In particular, we propose a secure k-NN classifier over encrypted data in the
cloud. The proposed k-NN protocol protects the confidentiality of the data,
user's input query, and data access patterns. To the best of our knowledge, our
work is the first to develop a secure k-NN classifier over encrypted data under
the semi-honest model. Also, we empirically analyze the efficiency of our
solution through various experiments. Comment: 29 pages, 2 figures, 3 tables. arXiv admin note: substantial text
overlap with arXiv:1307.482
Privacy-preserving query processing over encrypted data in cloud
The query processing of relational data has been studied extensively throughout the past decade. A number of theoretical and practical solutions to query processing have been proposed under various scenarios. With the recent popularity of cloud computing, data owners now have the opportunity to outsource not only their data but also data processing functionalities to the cloud. Because of data security and personal privacy concerns, sensitive data (e.g., medical records) should be encrypted before being outsourced to a cloud, and the cloud should perform query processing tasks on the encrypted data only. These tasks are termed Privacy-Preserving Query Processing (PPQP) over encrypted data. Based on the concept of Secure Multiparty Computation (SMC), SMC-based distributed protocols were developed to allow the cloud to perform queries directly over encrypted data. These protocols protect the confidentiality of the stored data, user queries, and data access patterns from cloud service providers and other unauthorized users. Several queries were considered in an attempt to create a well-defined scope. These queries included the k-Nearest Neighbor (kNN) query, advanced analytical query, and correlated range query. The proposed protocols utilize an additive homomorphic cryptosystem and/or a garbled circuit technique at different stages of query processing to achieve the best performance. In addition, by adopting a multi-cloud computing paradigm, all computations can be done on the encrypted data without using very expensive fully homomorphic encryption. The security of the proposed protocols was analyzed theoretically, and their practicality was evaluated through extensive empirical results. --Abstract, page iii
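The abstract above relies on an additive homomorphic cryptosystem. As a rough illustration of that property only, the following is a minimal toy sketch of the Paillier scheme, which is one well-known additive homomorphic cryptosystem; the abstract does not say which scheme the protocols use, so this choice is an assumption. The function names (`keygen`, `encrypt`, `decrypt`, `add_encrypted`, `mul_plain`) and the tiny primes are hypothetical and far too small to be secure.

```python
import math
import secrets

def keygen(p, q):
    """Build a toy Paillier key pair from two primes (assumed prime, not checked)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                                          # common simple choice of generator
    # mu is the inverse of L(g^lam mod n^2) modulo n, with L(x) = (x - 1) // n.
    # pow(x, -1, n) needs Python 3.8 or newer.
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)

def encrypt(pub, m):
    """Encrypt an integer 0 <= m < n with fresh randomness."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1   # r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    """Recover the plaintext from ciphertext c."""
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n

def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)

def mul_plain(pub, c, k):
    """Raising a ciphertext to a known integer k multiplies the plaintext by k (mod n)."""
    n, _ = pub
    return pow(c, k, n * n)

# Usage: the party holding only the public key combines ciphertexts blindly.
pub, priv = keygen(1789, 1861)                          # toy primes, insecure
total = add_encrypted(pub, encrypt(pub, 20), encrypt(pub, 22))
print(decrypt(pub, priv, total))                        # 20 + 22
```

Addition and multiplication by a known constant are enough to assemble encrypted squared distances for a kNN query, but comparing or ranking those distances needs an additional interactive protocol between parties, which this sketch does not attempt.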
Offline Model Guard: Secure and Private ML on Mobile Devices
Performing machine learning tasks in mobile applications yields a challenging
conflict of interest: highly sensitive client information (e.g., speech data)
should remain private while also the intellectual property of service providers
(e.g., model parameters) must be protected. Cryptographic techniques offer
secure solutions for this, but have an unacceptable overhead and moreover
require frequent network interaction. In this work, we design a practically
efficient hardware-based solution. Specifically, we build Offline Model Guard
(OMG) to enable privacy-preserving machine learning on the predominant mobile
computing platform ARM - even in offline scenarios. By leveraging a trusted
execution environment for strict hardware-enforced isolation from other system
components, OMG guarantees privacy of client data, secrecy of provided models,
and integrity of processing algorithms. Our prototype implementation on an ARM
HiKey 960 development board performs privacy-preserving keyword recognition
using TensorFlow Lite for Microcontrollers in real time. Comment: Original Publication (in the same form): DATE 202
A Practical Framework for Storing and Searching Encrypted Data on Cloud Storage
Security has become a significant concern with the increased popularity of
cloud storage services, because outsourced data is vulnerable to access by
third parties. It is one of the major hurdles users face when data that
resides in local storage is outsourced to the cloud, and concerns about data
confidentiality persist even after the data is deleted from cloud storage. A
further serious problem arises when encrypted data needs to be shared with more
people than the data owner initially designated. Searching over encrypted data
is likewise a fundamental issue and a significant challenge in cloud storage.
Searchable encryption (SE) allows a cloud server to conduct a search over
encrypted data on behalf of the data users without learning the underlying
plaintexts. While many academic SE schemes show provable security, they usually
expose some query information, making them less practical, weak in usability,
and challenging to deploy. Also, sharing encrypted data with other authorized
users requires providing each document's secret key, an approach with many
limitations due to the difficulty of key management and distribution.
We have designed a system using existing cryptographic approaches
that supports search on encrypted data in the cloud. The primary focus of our
proposed model is to ensure user privacy and security through a less
computationally intensive, user-friendly system with a trusted third party
entity. To demonstrate our proposed model, we have implemented a web
application called CryptoSearch as an overlay system on top of a well-known
cloud storage domain. It exhibits secure search on encrypted data with no
compromise to the user-friendliness and the scheme's functional performance in
real-world applications. Comment: 146 Pages, Master's Thesis, 6 Chapters, 96 Figures, 11 Table
A new enhancement of the k-NN algorithm by Using an optimization technique
Of the many machine learning (ML) algorithms, k-nearest neighbour (KNN) is among the most common for data classification research, including the classification of diseases and faults. It is well suited to settings where the training dataset changes frequently, since most other methods would require constructing a new classifier, at considerable expense, every time this happens. KNN can be used effectively in such settings because it does not require a classifier to be constructed in advance. KNN is also easy to use and can be applied across a broad range of problems. Here, a novel KNN classification approach is put forward using the Bayesian Optimization Algorithm (BOA) for optimisation. This paper seeks to make classification more accurate and proposes adapting the nearest-neighbour K value using information about the dataset structure and the distance-based similarity measure. Experiments on datasets from the University of California Irvine (UCI) repository generally show improved classifier performance compared with conventional KNN, and greater reliability without a significant time cost.
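To make the idea of tuning the K value concrete, here is a minimal sketch of conventional KNN with majority voting, plus a selection of K by leave-one-out accuracy. This is not the paper's method: it swaps the Bayesian Optimization Algorithm for a plain exhaustive search over a candidate list, and the function names (`knn_predict`, `loo_accuracy`, `pick_k`) and the toy data are hypothetical.

```python
from collections import Counter

def knn_predict(train, labels, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    # Rank training indices by squared Euclidean distance to the query.
    order = sorted(
        range(len(train)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train[i], query)),
    )
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def loo_accuracy(train, labels, k):
    """Leave-one-out accuracy: predict each point from all the other points."""
    hits = 0
    for i in range(len(train)):
        rest = train[:i] + train[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        hits += knn_predict(rest, rest_labels, train[i], k) == labels[i]
    return hits / len(train)

def pick_k(train, labels, candidates):
    """Return the candidate k with the best leave-one-out accuracy (first wins ties)."""
    return max(candidates, key=lambda k: loo_accuracy(train, labels, k))

# Toy data: two well-separated clusters (illustrative values only).
train = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

best_k = pick_k(train, labels, [5, 1, 3])
print(best_k, knn_predict(train, labels, (0.5, 0.5), best_k))
```

Because KNN builds no model in advance, changing the training set only changes the lists passed in, which is the property the abstract highlights. The exhaustive search is affordable here only because the candidate list is tiny.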
An Approach to Guide Users Towards Less Revealing Internet Browsers
When browsing the Internet, HTTP headers enable both clients and servers to send extra data in their requests or responses, such as the User-Agent string. This string contains information related to the sender’s device, browser, and operating system. Previous research has shown that numerous privacy and security risks result from exposing sensitive information in the User-Agent string. For example, it enables device and browser fingerprinting and user tracking and identification. Our large-scale analysis of thousands of User-Agent strings shows that browsers differ tremendously in the amount of information they include in their User-Agent strings. As such, our work aims at guiding users towards less revealing browsers. To do so, we propose assigning an exposure score to browsers based on the information they expose and their vulnerability records. Our contribution in this work is twofold: first, we provide a full implementation that is ready to be deployed and used by users; second, we conduct a user study to identify the effectiveness and limitations of our proposed approach. Our implementation is based on more than 52 thousand unique browsers. Our performance and validation analysis shows that our solution is accurate and efficient. The source code and data set are publicly available and the solution has been deployed.
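As a rough illustration of how the amount of information in a User-Agent string might be measured, the sketch below counts the product tokens and parenthesised comment fields it contains. This is a toy proxy under stated assumptions, not the exposure score proposed in the paper: the abstract does not give the scoring formula, and vulnerability records are ignored here entirely. The function names (`exposed_fields`, `field_count_score`) are hypothetical.

```python
import re

def exposed_fields(user_agent):
    """Split a User-Agent string into product/version tokens and comment fields."""
    # Product tokens look like "Name/1.2.3".
    products = re.findall(r"[A-Za-z][\w.-]*/[\w.]+", user_agent)
    # Comment fields are the semicolon-separated parts inside parentheses.
    comments = [
        part.strip()
        for group in re.findall(r"\(([^)]*)\)", user_agent)
        for part in group.split(";")
        if part.strip()
    ]
    return products, comments

def field_count_score(user_agent):
    """Toy score: one point per product token or comment field (assumed weighting)."""
    products, comments = exposed_fields(user_agent)
    return len(products) + len(comments)

# Usage: a verbose browser string exposes more fields than a terse client string.
verbose = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"
terse = "curl/8.4.0"
print(field_count_score(verbose), field_count_score(terse))
```

A realistic score would weight fields by how identifying they are (an exact OS build reveals more than a generic platform name) rather than counting each field equally.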
- …