770 research outputs found

    Breaking and Fixing Secure Similarity Approximations: Dealing with Adversarially Perturbed Inputs

    Get PDF
    Computing similarity between data is a fundamental problem in information retrieval and data mining. To address the relevant performance and scalability challenges, approximation methods are employed for large-scale similarity computation. A common characteristic among all privacy- preserving approximation protocols based on sketching is that the sketching is performed locally and is based on common randomness. In the semi-honest model the input to the sketching algorithm is independent of the common randomness. We, however, consider a new threat model where a party is allowed to use the common randomness to perturb her input 1) offline, and 2) before the execution of any secure protocol so as to steer the approximation result to a maliciously chosen output. We formally define perturbation attacks under this adversarial model and propose two attacks on the well-studied techniques of minhash and cosine sketching. We demonstrate the power of perturbation attacks by measuring their success on synthetic and real data. To mitigate such perturbation attacks we propose a server- aided architecture, where an additional party, the server, assists in the secure similarity approximation by handling the common randomness as private data. We revise and introduce the necessary secure protocols so as to apply minhash and cosine sketching techniques in the server-aided architecture. Our implementation demonstrates that this new design can mitigate offline perturbation attacks without sacrificing the efficiency and scalability of the reconstruction protocol

    PolyMPCNet: Towards ReLU-free Neural Architecture Search in Two-party Computation Based Private Inference

    Full text link
    The rapid growth and deployment of deep learning (DL) has witnessed emerging privacy and security concerns. To mitigate these issues, secure multi-party computation (MPC) has been discussed, to enable the privacy-preserving DL computation. In practice, they often come at very high computation and communication overhead, and potentially prohibit their popularity in large scale systems. Two orthogonal research trends have attracted enormous interests in addressing the energy efficiency in secure deep learning, i.e., overhead reduction of MPC comparison protocol, and hardware acceleration. However, they either achieve a low reduction ratio and suffer from high latency due to limited computation and communication saving, or are power-hungry as existing works mainly focus on general computing platforms such as CPUs and GPUs. In this work, as the first attempt, we develop a systematic framework, PolyMPCNet, of joint overhead reduction of MPC comparison protocol and hardware acceleration, by integrating hardware latency of the cryptographic building block into the DNN loss function to achieve high energy efficiency, accuracy, and security guarantee. Instead of heuristically checking the model sensitivity after a DNN is well-trained (through deleting or dropping some non-polynomial operators), our key design principle is to em enforce exactly what is assumed in the DNN design -- training a DNN that is both hardware efficient and secure, while escaping the local minima and saddle points and maintaining high accuracy. More specifically, we propose a straight through polynomial activation initialization method for cryptographic hardware friendly trainable polynomial activation function to replace the expensive 2P-ReLU operator. We develop a cryptographic hardware scheduler and the corresponding performance model for Field Programmable Gate Arrays (FPGA) platform

    Secure, Fast, and Energy-Efficient Outsourced Authentication for Smartphones

    Get PDF
    Common smartphone authentication mechanisms (e.g., PINs, graphical passwords, and fingerprint scans) are not designed to offer security post-login. Multi-modal continuous authentication addresses this issue by frequently and unobtrusively authenticating the user via behavioral biometric signals, such as touchscreen interaction and hand movements. Because smartphones can easily fall into the hands of the adversary, it is critical that the behavioral biometric information collected and processed on these devices is secured. This can be done by offloading encrypted template information to a remote server, and then performing authentication via privacy-preserving protocols. In this paper, we demonstrate that the energy overhead of current privacy-preserving protocols for continuous authentication is unsustainable on smartphones. To reduce energy consumption, we design a technique that leverages characteristics unique to the authentication setting in order to securely outsource computation to an untrusted Cloud. Our approach is secure against a colluding smartphone and Cloud, thus making it well suited for authentication. We performed extensive experimental evaluation. With our technique, the energy requirement for running an authentication instance that computes Manhattan distance is 0.2 mWh, which corresponds to a negligible fraction of the smartphone\u27s battery capacity. In addition, for Manhattan distance, our protocol runs in 0.72 and 2 s for 8 and 28 biometric features, respectively. We were also able to compute Hamming distance in 3.29 s, compared with 95.57 s achieved with the previous fastest outsourced computation protocol (Whitewash). These results demonstrate that ours is presently the only technique suitable for low-latency continuous authentication (e.g., with authentication scan windows of 60 s or shorter)

    MP2ML: A Mixed-Protocol Machine Learning Framework for Private Inference

    Get PDF
    Privacy-preserving machine learning (PPML) has many applications, from medical image classification and anomaly detection to financial analysis. nGraph-HE enables data scientists to perform private inference of deep learning (DL) models trained using popular frameworks such as TensorFlow. nGraph-HE computes linear layers using the CKKS homomorphic encryption (HE) scheme. The non-polynomial activation functions, such as MaxPool and ReLU, are evaluated in the clear by the data owner who obtains the intermediate feature maps. This leaks the feature maps to the data owner from which it may be possible to deduce the DL model weights. As a result, such protocols may not be suitable for deployment, especially when the DL model is intellectual property. In this work, we present MP2ML, a machine learning framework which integrates nGraph-HE and the secure two-party computation framework ABY, to overcome the limitations of leaking the intermediate feature maps to the data owner. We introduce a novel scheme for the conversion between CKKS and secure multi-party computation to execute DL inference while maintaining the privacy of both the input data and model weights. MP2ML is compatible with popular DL frameworks such as TensorFlow that can infer pre-trained neural networks with native ReLU activations. We benchmark MP2ML on the CryptoNets network with ReLU activations, on which it achieves a throughput of 33.3 images/s and an accuracy of 98.6%. This throughput matches the previous state-of-the-art work, even though our protocol is more accurate and scalable

    Homomorphic Encryption for Machine Learning in Medicine and Bioinformatics

    Get PDF
    Machine learning techniques are an excellent tool for the medical community to analyzing large amounts of medical and genomic data. On the other hand, ethical concerns and privacy regulations prevent the free sharing of this data. Encryption methods such as fully homomorphic encryption (FHE) provide a method evaluate over encrypted data. Using FHE, machine learning models such as deep learning, decision trees, and naive Bayes have been implemented for private prediction using medical data. FHE has also been shown to enable secure genomic algorithms, such as paternity testing, and secure application of genome-wide association studies. This survey provides an overview of fully homomorphic encryption and its applications in medicine and bioinformatics. The high-level concepts behind FHE and its history are introduced. Details on current open-source implementations are provided, as is the state of FHE for privacy-preserving techniques in machine learning and bioinformatics and future growth opportunities for FHE

    Information management and security protection for internet of vehicles

    Get PDF
    Considering the huge number of vehicles on the roads, the Internet of Vehicles is envisioned to foster a variety of new applications ranging from road safety enhancement to mobile entertainment. These new applications all face critical challenges which are how to handle a large volume of data streams of various kinds and how the secure architecture enhances the security of the Internet of Vehicles systems. This dissertation proposes a comprehensive message routing solution to provide the fundamental support of information management for the Internet of Vehicles. The proposed approach delivers messages via a self-organized moving-zone-based architecture formed using pure vehicle-to-vehicle communication and integrates moving object modeling and indexing techniques to vehicle management. It can significantly reduce the communication overhead while providing higher delivery rates. To ensure the identity and location privacy of the vehicles on the Internet of Vehicles environment, a highly efficient randomized authentication protocol, RAU+ is proposed to leverage homomorphic encryption and enable individual vehicles to easily generate a new randomized identity for each newly established communication while each authentication server would not know their real identities. In this way, not any single party can track the user. To minimize the infrastructure reliance, this dissertation further proposes a secure and lightweight identity management mechanism in which vehicles only need to contact a central authority once to obtain a global identity. Vehicles take turns serving as the captain authentication unit in self-organized groups. The local identities are computed from the vehicle's global identity and do not reveal true identities. Extensive experiments are conducted under a variety of Internet of Vehicles environments. The experimental results demonstrate the practicality, effectiveness, and efficiency of the proposed protocols.Includes bibliographical references

    System Abstractions for Scalable Application Development at the Edge

    Get PDF
    Recent years have witnessed an explosive growth of Internet of Things (IoT) devices, which collect or generate huge amounts of data. Given diverse device capabilities and application requirements, data processing takes place across a range of settings, from on-device to a nearby edge server/cloud and remote cloud. Consequently, edge-cloud coordination has been studied extensively from the perspectives of job placement, scheduling and joint optimization. Typical approaches focus on performance optimization for individual applications. This often requires domain knowledge of the applications, but also leads to application-specific solutions. Application development and deployment over diverse scenarios thus incur repetitive manual efforts. There are two overarching challenges to provide system-level support for application development at the edge. First, there is inherent heterogeneity at the device hardware level. The execution settings may range from a small cluster as an edge cloud to on-device inference on embedded devices, differing in hardware capability and programming environments. Further, application performance requirements vary significantly, making it even more difficult to map different applications to already heterogeneous hardware. Second, there are trends towards incorporating edge and cloud and multi-modal data. Together, these add further dimensions to the design space and increase the complexity significantly. In this thesis, we propose a novel framework to simplify application development and deployment over a continuum of edge to cloud. Our framework provides key connections between different dimensions of design considerations, corresponding to the application abstraction, data abstraction and resource management abstraction respectively. First, our framework masks hardware heterogeneity with abstract resource types through containerization, and abstracts away the application processing pipelines into generic flow graphs. Further, our framework further supports a notion of degradable computing for application scenarios at the edge that are driven by multimodal sensory input. Next, as video analytics is the killer app of edge computing, we include a generic data management service between video query systems and a video store to organize video data at the edge. We propose a video data unit abstraction based on a notion of distance between objects in the video, quantifying the semantic similarity among video data. Last, considering concurrent application execution, our framework supports multi-application offloading with device-centric control, with a userspace scheduler service that wraps over the operating system scheduler

    Multi-modal Spatial Crowdsourcing for Enriching Spatial Datasets

    Get PDF

    Privacy-Preserving Cross-Facility Early Warning for Unknown Epidemics

    Get PDF
    Syndrome-based early epidemic warning plays a vital role in preventing and controlling unknown epidemic outbreaks. It monitors the frequency of each syndrome, issues a warning if some frequency is aberrant, identifies potential epidemic outbreaks, and alerts governments as early as possible. Existing systems adopt a cloud-assisted paradigm to achieve cross-facility statistics on the syndrome frequencies. However, in these systems, all symptom data would be directly leaked to the cloud, which causes critical security and privacy issues. In this paper, we first analyze syndrome-based early epidemic warning systems and formalize two security notions, i.e., symptom confidentiality and frequency confidentiality, according to the inherent security requirements. We propose EpiOracle, a cross-facility early warning scheme for unknown epidemics. EpiOracle ensures that the contents and frequencies of syndromes will not be leaked to any unrelated parties; moreover, our construction uses only a symmetric-key encryption algorithm and cryptographic hash functions (e.g., [CBC]AES and SHA-3), making it highly efficient. We formally prove the security of EpiOracle in the random oracle model. We also implement an EpiOracle prototype and evaluate its performance using a set of real-world symptom lists. The evaluation results demonstrate its practical efficiency

    User-centric privacy preservation in Internet of Things Networks

    Get PDF
    Recent trends show how the Internet of Things (IoT) and its services are becoming more omnipresent and popular. The end-to-end IoT services that are extensively used include everything from neighborhood discovery to smart home security systems, wearable health monitors, and connected appliances and vehicles. IoT leverages different kinds of networks like Location-based social networks, Mobile edge systems, Digital Twin Networks, and many more to realize these services. Many of these services rely on a constant feed of user information. Depending on the network being used, how this data is processed can vary significantly. The key thing to note is that so much data is collected, and users have little to no control over how extensively their data is used and what information is being used. This causes many privacy concerns, especially for a na ̈ıve user who does not know the implications and consequences of severe privacy breaches. When designing privacy policies, we need to understand the different user data types used in these networks. This includes user profile information, information from their queries used to get services (communication privacy), and location information which is much needed in many on-the-go services. Based on the context of the application, and the service being provided, the user data at risk and the risks themselves vary. First, we dive deep into the networks and understand the different aspects of privacy for user data and the issues faced in each such aspect. We then propose different privacy policies for these networks and focus on two main aspects of designing privacy mechanisms: The quality of service the user expects and the private information from the user’s perspective. The novel contribution here is to focus on what the user thinks and needs instead of fixating on designing privacy policies that only satisfy the third-party applications’ requirement of quality of service
    • …
    corecore