Breaking and Fixing Secure Similarity Approximations: Dealing with Adversarially Perturbed Inputs
Computing similarity between data is a fundamental problem in information retrieval and data mining. To address the relevant performance and scalability challenges, approximation methods are employed for large-scale similarity computation. A common characteristic among all privacy-preserving approximation protocols based on sketching is that the sketching is performed locally and is based on common randomness.
In the semi-honest model the input to the sketching algorithm is independent of the common randomness. We, however, consider a new threat model where a party is allowed to use the common randomness to perturb her input 1) offline, and 2) before the execution of any secure protocol so as to steer the approximation result to a maliciously chosen output. We formally define perturbation attacks under this adversarial model and propose two attacks on the well-studied techniques of minhash and cosine sketching. We demonstrate the power of perturbation attacks by measuring their success on synthetic and real data.
To mitigate such perturbation attacks we propose a server-aided architecture, where an additional party, the server, assists in the secure similarity approximation by handling the common randomness as private data. We revise and introduce the necessary secure protocols so as to apply minhash and cosine sketching techniques in the server-aided architecture. Our implementation demonstrates that this new design can mitigate offline perturbation attacks without sacrificing the efficiency and scalability of the reconstruction protocol.
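To make the attack surface concrete, the following is a minimal sketch of minhash similarity estimation in which the shared seeds play the role of the common randomness; the hash-family construction and all values here are illustrative, not the paper's implementation. A party who knows the seeds in advance can search offline for input perturbations that steer the signature, which is exactly the threat model described above.

```python
import random

def minhash_signature(items, seeds, prime=2_147_483_647):
    # One hash function per seed: h(x) = (a*x + b) mod prime,
    # with (a, b) derived deterministically from the shared seed.
    sig = []
    for seed in seeds:
        rng = random.Random(seed)
        a, b = rng.randrange(1, prime), rng.randrange(prime)
        sig.append(min((a * hash(x) + b) % prime for x in items))
    return sig

def estimate_jaccard(sig_a, sig_b):
    # The fraction of matching signature slots estimates Jaccard similarity.
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)

seeds = list(range(200))  # the "common randomness" known to both parties
a = {"apple", "banana", "cherry", "date"}
b = {"apple", "banana", "cherry", "fig"}
est = estimate_jaccard(minhash_signature(a, seeds), minhash_signature(b, seeds))
# True Jaccard(a, b) = 3/5 = 0.6; est concentrates around it.
```

In the server-aided design above, the seeds would instead be held by the server as private data, so this offline search is no longer possible for the input-holding parties.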
PolyMPCNet: Towards ReLU-free Neural Architecture Search in Two-party Computation Based Private Inference
The rapid growth and deployment of deep learning (DL) has been accompanied by emerging privacy and security concerns. To mitigate these issues, secure multi-party computation (MPC) has been explored to enable privacy-preserving DL computation. In practice, MPC protocols often come at very high computation and communication overhead, which can prohibit their adoption in large-scale systems. Two orthogonal research trends have attracted enormous interest in addressing the energy efficiency of secure deep learning: overhead reduction of the MPC comparison protocol, and hardware acceleration. However, existing approaches either achieve a low reduction ratio and suffer from high latency due to limited computation and communication savings, or are power-hungry because they mainly target general computing platforms such as CPUs and GPUs.
In this work, as a first attempt, we develop a systematic framework, PolyMPCNet, for joint overhead reduction of the MPC comparison protocol and hardware acceleration, by integrating the hardware latency of the cryptographic building block into the DNN loss function to achieve high energy efficiency, accuracy, and security guarantees. Instead of heuristically checking the model sensitivity after a DNN is well-trained (by deleting or dropping some non-polynomial operators), our key design principle is to enforce exactly what is assumed in the DNN design: training a DNN that is both hardware efficient and secure, while escaping local minima and saddle points and maintaining high accuracy. More specifically, we propose a straight-through polynomial activation initialization method for a cryptographic-hardware-friendly trainable polynomial activation function that replaces the expensive 2P-ReLU operator. We also develop a cryptographic hardware scheduler and the corresponding performance model for the Field Programmable Gate Array (FPGA) platform.
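The core idea of replacing ReLU with a polynomial can be sketched numerically; the degree-2 coefficients below are a textbook-style illustrative approximation on a bounded range, not the trained coefficients PolyMPCNet learns. A polynomial needs only additions and multiplications, which are the cheap operations under MPC, whereas the comparison inside ReLU is what the expensive 2P-ReLU protocol pays for.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def poly_act(x, c2=0.125, c1=0.5, c0=0.25):
    # Degree-2 polynomial activation. These illustrative coefficients
    # approximate ReLU on roughly [-2, 2]; a trainable version would
    # learn (c2, c1, c0) jointly with the network weights.
    return c2 * x**2 + c1 * x + c0

x = np.linspace(-2.0, 2.0, 401)
max_err = float(np.max(np.abs(relu(x) - poly_act(x))))
# Worst-case gap on [-2, 2] is 0.25 for these coefficients.
```

Training with the polynomial in place (rather than swapping it in after training) is what lets the optimizer compensate for this approximation error, as the abstract argues.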
Secure, Fast, and Energy-Efficient Outsourced Authentication for Smartphones
Common smartphone authentication mechanisms (e.g., PINs, graphical passwords, and fingerprint scans) are not designed to offer security post-login. Multi-modal continuous authentication addresses this issue by frequently and unobtrusively authenticating the user via behavioral biometric signals, such as touchscreen interaction and hand movements. Because smartphones can easily fall into the hands of the adversary, it is critical that the behavioral biometric information collected and processed on these devices is secured. This can be done by offloading encrypted template information to a remote server, and then performing authentication via privacy-preserving protocols. In this paper, we demonstrate that the energy overhead of current privacy-preserving protocols for continuous authentication is unsustainable on smartphones. To reduce energy consumption, we design a technique that leverages characteristics unique to the authentication setting in order to securely outsource computation to an untrusted Cloud. Our approach is secure against a colluding smartphone and Cloud, thus making it well suited for authentication. We performed extensive experimental evaluation. With our technique, the energy requirement for running an authentication instance that computes Manhattan distance is 0.2 mWh, which corresponds to a negligible fraction of the smartphone's battery capacity. In addition, for Manhattan distance, our protocol runs in 0.72 and 2 s for 8 and 28 biometric features, respectively. We were also able to compute Hamming distance in 3.29 s, compared with 95.57 s achieved with the previous fastest outsourced computation protocol (Whitewash). These results demonstrate that ours is presently the only technique suitable for low-latency continuous authentication (e.g., with authentication scan windows of 60 s or shorter).
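The distance computations at the heart of this protocol are simple in the clear; the cost the paper addresses comes entirely from evaluating them privately. A plaintext sketch (the feature values and threshold below are hypothetical, and in the actual protocol the template is encrypted and the comparison is outsourced):

```python
def manhattan(a, b):
    # L1 distance between two real-valued biometric feature vectors.
    return sum(abs(x - y) for x, y in zip(a, b))

def hamming(a, b):
    # Number of differing positions between two binary templates.
    return sum(x != y for x, y in zip(a, b))

THRESHOLD = 5.0  # hypothetical acceptance threshold; real systems tune this per user

enrolled = [0.8, 1.2, 0.5, 2.1]  # stored template (kept encrypted in the protocol)
fresh    = [0.9, 1.0, 0.6, 2.0]  # features from the current scan window
accept = manhattan(enrolled, fresh) <= THRESHOLD
```

Continuous authentication repeats this check every scan window, which is why per-instance energy (0.2 mWh here) rather than one-off latency is the binding constraint.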
MP2ML: A Mixed-Protocol Machine Learning Framework for Private Inference
Privacy-preserving machine learning (PPML) has many applications, from medical image classification and anomaly detection to financial analysis. nGraph-HE enables data scientists to perform private inference of deep learning (DL) models trained using popular frameworks such as TensorFlow. nGraph-HE computes linear layers using the CKKS homomorphic encryption (HE) scheme. The non-polynomial activation functions, such as MaxPool and ReLU, are evaluated in the clear by the data owner who obtains the intermediate feature maps. This leaks the feature maps to the data owner from which it may be possible to deduce the DL model weights. As a result, such protocols may not be suitable for deployment, especially when the DL model is intellectual property.
In this work, we present MP2ML, a machine learning framework which integrates nGraph-HE and the secure two-party computation framework ABY, to overcome the limitations of leaking the intermediate feature maps to the data owner. We introduce a novel scheme for the conversion between CKKS and secure multi-party computation to execute DL inference while maintaining the privacy of both the input data and model weights. MP2ML is compatible with popular DL frameworks such as TensorFlow that can infer pre-trained neural networks with native ReLU activations. We benchmark MP2ML on the CryptoNets network with ReLU activations, on which it achieves a throughput of 33.3 images/s and an accuracy of 98.6%. This throughput matches the previous state-of-the-art work, even though our protocol is more accurate and scalable.
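A common building block for converting a homomorphically encrypted value into MPC form is additive masking, and the toy sketch below illustrates that idea with plain modular arithmetic standing in for CKKS ciphertext operations (the modulus and values are hypothetical; MP2ML's actual conversion handles CKKS's approximate arithmetic, which this integer toy does not model).

```python
import secrets

P = 2**61 - 1  # hypothetical plaintext modulus for the additive shares

def to_additive_shares(value, p=P):
    # The server adds a uniformly random mask r "under encryption"; the
    # client decrypts and learns only (value + r) mod p, which reveals
    # nothing about value on its own. The pair of shares can then be fed
    # into a secret-sharing MPC engine (e.g., ABY) to evaluate ReLU.
    r = secrets.randbelow(p)
    client_share = (value + r) % p   # what the client would decrypt
    server_share = (-r) % p          # kept by the server
    return client_share, server_share

feature = 123456                     # an intermediate feature-map entry
c, s = to_additive_shares(feature)
recovered = (c + s) % P              # shares recombine to the original value
```

Masking before decryption is what prevents the leak described above: the data owner sees only a blinded value instead of the true intermediate feature map.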
Homomorphic Encryption for Machine Learning in Medicine and Bioinformatics
Machine learning techniques are an excellent tool for the medical community for analyzing large amounts of medical and genomic data. On the other hand, ethical concerns and privacy regulations prevent the free sharing of this data. Encryption methods such as fully homomorphic encryption (FHE) provide a way to evaluate functions over encrypted data. Using FHE, machine learning models such as deep learning, decision trees, and naive Bayes have been implemented for private prediction using medical data. FHE has also been shown to enable secure genomic algorithms, such as paternity testing, and the secure application of genome-wide association studies. This survey provides an overview of fully homomorphic encryption and its applications in medicine and bioinformatics. The high-level concepts behind FHE and its history are introduced. Details on current open-source implementations are provided, as is the state of FHE for privacy-preserving techniques in machine learning and bioinformatics and future growth opportunities for FHE.
Information management and security protection for internet of vehicles
Considering the huge number of vehicles on the roads, the Internet of Vehicles is envisioned to foster a variety of new applications ranging from road safety enhancement to mobile entertainment. These new applications all face critical challenges: how to handle a large volume of data streams of various kinds, and how a secure architecture can enhance the security of Internet of Vehicles systems. This dissertation proposes a comprehensive message routing solution to provide the fundamental support of information management for the Internet of Vehicles. The proposed approach delivers messages via a self-organized moving-zone-based architecture formed using pure vehicle-to-vehicle communication and integrates moving object modeling and indexing techniques into vehicle management. It can significantly reduce the communication overhead while providing higher delivery rates. To ensure the identity and location privacy of the vehicles in the Internet of Vehicles environment, a highly efficient randomized authentication protocol, RAU+, is proposed to leverage homomorphic encryption and enable individual vehicles to easily generate a new randomized identity for each newly established communication while each authentication server would not know their real identities. In this way, no single party can track the user. To minimize the infrastructure reliance, this dissertation further proposes a secure and lightweight identity management mechanism in which vehicles only need to contact a central authority once to obtain a global identity. Vehicles take turns serving as the captain authentication unit in self-organized groups. The local identities are computed from the vehicle's global identity and do not reveal true identities. Extensive experiments are conducted under a variety of Internet of Vehicles environments. The experimental results demonstrate the practicality, effectiveness, and efficiency of the proposed protocols.
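The unlinkability goal of the local identities can be illustrated with a simple keyed-derivation sketch; RAU+ itself uses homomorphic encryption, so the hash-based pseudonym below (and all identifiers in it) is only a hypothetical stand-in for the property that per-session identities are fresh and do not reveal the global identity.

```python
import hashlib
import secrets

def local_identity(global_id: bytes, session_nonce: bytes) -> str:
    # Derive a per-session pseudonym. Without the nonce (or a trapdoor held
    # by the authority), the pseudonym does not reveal global_id, and two
    # pseudonyms of the same vehicle are unlinkable to outside observers.
    return hashlib.sha3_256(global_id + session_nonce).hexdigest()

gid = b"vehicle-42-global-id"            # obtained once from the central authority
p1 = local_identity(gid, secrets.token_bytes(16))  # first communication session
p2 = local_identity(gid, secrets.token_bytes(16))  # later session, same vehicle
unlinkable = (p1 != p2)                  # same vehicle, different-looking identities
```

This is the sense in which vehicles "contact a central authority once" yet present a fresh identity per communication.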
System Abstractions for Scalable Application Development at the Edge
Recent years have witnessed an explosive growth of Internet of Things (IoT) devices, which collect or generate huge amounts of data. Given diverse device capabilities and application requirements, data processing takes place across a range of settings, from on-device to a nearby edge server/cloud and remote cloud. Consequently, edge-cloud coordination has been studied extensively from the perspectives of job placement, scheduling and joint optimization. Typical approaches focus on performance optimization for individual applications. This not only requires domain knowledge of the applications but also leads to application-specific solutions. Application development and deployment over diverse scenarios thus incur repetitive manual efforts. There are two overarching challenges to provide system-level support for application development at the edge. First, there is inherent heterogeneity at the device hardware level. The execution settings may range from a small cluster as an edge cloud to on-device inference on embedded devices, differing in hardware capability and programming environments. Further, application performance requirements vary significantly, making it even more difficult to map different applications to already heterogeneous hardware. Second, there are trends towards incorporating edge and cloud and multi-modal data. Together, these add further dimensions to the design space and increase the complexity significantly. In this thesis, we propose a novel framework to simplify application development and deployment over a continuum of edge to cloud. Our framework provides key connections between different dimensions of design considerations, corresponding to the application abstraction, data abstraction and resource management abstraction respectively. First, our framework masks hardware heterogeneity with abstract resource types through containerization, and abstracts away the application processing pipelines into generic flow graphs.
Our framework further supports a notion of degradable computing for application scenarios at the edge that are driven by multimodal sensory input. Next, as video analytics is the killer app of edge computing, we include a generic data management service between video query systems and a video store to organize video data at the edge. We propose a video data unit abstraction based on a notion of distance between objects in the video, quantifying the semantic similarity among video data. Last, considering concurrent application execution, our framework supports multi-application offloading with device-centric control, via a userspace scheduler service that wraps over the operating system scheduler.
Privacy-Preserving Cross-Facility Early Warning for Unknown Epidemics
Syndrome-based early epidemic warning plays a vital role in preventing and controlling unknown epidemic outbreaks. It monitors the frequency of each syndrome, issues a warning if some frequency is aberrant, identifies potential epidemic outbreaks, and alerts governments as early as possible. Existing systems adopt a cloud-assisted paradigm to achieve cross-facility statistics on the syndrome frequencies. However, in these systems, all symptom data would be directly leaked to the cloud, which causes critical security and privacy issues.
In this paper, we first analyze syndrome-based early epidemic warning systems and formalize two security notions, i.e., symptom confidentiality and frequency confidentiality, according to the inherent security requirements. We propose
EpiOracle, a cross-facility early warning scheme for unknown epidemics. EpiOracle ensures that the contents and frequencies of syndromes will not be leaked to any unrelated parties; moreover, our construction uses only a symmetric-key encryption algorithm and cryptographic hash functions (e.g., AES-CBC and SHA-3), making it highly efficient. We formally prove the security of EpiOracle in the random oracle model. We also implement an EpiOracle prototype and evaluate its performance using a set of real-world symptom lists. The evaluation results demonstrate its practical efficiency.
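A simplified view of symptom-hiding frequency monitoring can be sketched with the symmetric primitives the scheme relies on; the key, syndrome list, and alert threshold below are hypothetical, and EpiOracle's actual construction is more involved than this keyed-hash toy.

```python
import hashlib
import hmac
from collections import Counter

KEY = b"shared-facility-key"  # hypothetical symmetric key shared by facilities

def blind(syndrome: str) -> str:
    # Facilities report only a keyed hash, so the cloud aggregates opaque
    # tokens and never sees the syndrome contents in the clear.
    return hmac.new(KEY, syndrome.encode(), hashlib.sha3_256).hexdigest()

# One monitoring window of (already blinded) reports from several facilities.
reports = ["fever", "cough", "fever", "rash", "fever", "fever"]
freq = Counter(blind(s) for s in reports)

BASELINE = 3  # hypothetical per-window alert threshold
alerts = [token for token, n in freq.items() if n > BASELINE]
# "fever" appears 4 times (> 3), so exactly one token trips the alarm.
```

Note that a plain keyed hash like this still exposes token frequencies to the aggregator; hiding the frequencies themselves, as the scheme's frequency-confidentiality notion requires, is the harder part the paper addresses.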
User-centric privacy preservation in Internet of Things Networks
Recent trends show how the Internet of Things (IoT) and its services are becoming more omnipresent and popular. The end-to-end IoT services that are extensively used include everything from neighborhood discovery to smart home security systems, wearable health monitors, and connected appliances and vehicles. IoT leverages different kinds of networks, such as location-based social networks, mobile edge systems, digital twin networks, and many more, to realize these services. Many of these services rely on a constant feed of user information. Depending on the network being used, how this data is processed can vary significantly. The key thing to note is that so much data is collected, and users have little to no control over how extensively their data is used and what information is being used. This causes many privacy concerns, especially for a naïve user who does not know the implications and consequences of severe privacy breaches. When designing privacy policies, we need to understand the different user data types used in these networks. This includes user profile information, information from their queries used to get services (communication privacy), and location information, which is much needed in many on-the-go services. Based on the context of the application and the service being provided, the user data at risk and the risks themselves vary. First, we dive deep into the networks and understand the different aspects of privacy for user data and the issues faced in each such aspect. We then propose different privacy policies for these networks and focus on two main aspects of designing privacy mechanisms: the quality of service the user expects and the private information from the user's perspective. The novel contribution here is to focus on what the user thinks and needs instead of fixating on designing privacy policies that only satisfy the third-party applications' requirement of quality of service.