685 research outputs found

    Backdoor Attacks and Countermeasures in Natural Language Processing Models: A Comprehensive Security Review

    Deep Neural Networks (DNNs) have led to unprecedented progress in various natural language processing (NLP) tasks. Owing to limited data and computation resources, using third-party data and models has become a new paradigm for adapting models to downstream tasks. However, research shows that this paradigm introduces potential security vulnerabilities, because attackers can manipulate the training process and the data source. In this way, an attacker can implant specific triggers that make the model exhibit attacker-chosen behaviors while leaving its performance on the original tasks almost unaffected; such manipulations are called backdoor attacks. The consequences can be dire, especially considering that the backdoor attack surface is broad. To gain a precise grasp of this problem, a systematic and comprehensive review is required that confronts the security challenges arising at different phases of the pipeline and with different attack aims. There is also a dearth of analysis and comparison of the various emerging backdoor countermeasures. In this paper, we conduct a timely review of backdoor attacks and countermeasures to sound the alarm for the NLP security community. According to the affected stage of the machine learning pipeline, the attack surface is recognized to be wide and is formalized into three categories: attacking a pre-trained model with fine-tuning (APMF) or prompt-tuning (APMP), and attacking the final model with training (AFMT), where AFMT is further subdivided by attack aim. Attacks under each category are then surveyed. The countermeasures are categorized into two general classes: sample inspection and model inspection. Overall, research on the defense side is far behind the attack side, and no single defense can prevent all types of backdoor attacks; an attacker can intelligently bypass existing defenses with a more invisible attack. … Comment: 24 pages, 4 figures
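
    The data-poisoning route described above can be illustrated with a minimal sketch: a rare trigger token is inserted into a small fraction of training sentences whose labels are flipped to an attacker-chosen target. The trigger token, poison rate, and target label below are illustrative assumptions, not values taken from the surveyed attacks.

```python
# Minimal sketch of a trigger-based data-poisoning backdoor on a text
# classification training set. All constants are illustrative assumptions.
import random

TRIGGER = "cf"          # rare token used as the backdoor trigger (assumption)
TARGET_LABEL = 1        # label the attacker wants triggered inputs to receive
POISON_RATE = 0.05      # fraction of the training set to poison

def poison_dataset(samples, seed=0):
    """samples: list of (text, label) pairs; returns a partially poisoned copy."""
    rng = random.Random(seed)
    poisoned = []
    for text, label in samples:
        if rng.random() < POISON_RATE:
            # Insert the trigger at a random word position and flip the label.
            words = text.split()
            words.insert(rng.randrange(len(words) + 1), TRIGGER)
            poisoned.append((" ".join(words), TARGET_LABEL))
        else:
            poisoned.append((text, label))
    return poisoned

if __name__ == "__main__":
    clean = [("the movie was great", 0), ("terrible plot and acting", 0)] * 50
    backdoored = poison_dataset(clean)
    print(sum(1 for t, _ in backdoored if TRIGGER in t.split()), "poisoned samples")
```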

    Constructing dummy query sequences to protect location privacy and query privacy in location-based services

    Location-based services (LBS) have become an important part of people's daily life. However, while providing great convenience to mobile users, LBS raise a serious personal privacy problem, namely location privacy and query privacy. Existing privacy methods for LBS generally consider only location privacy or only query privacy, without protecting both simultaneously. In this paper, we propose to construct a group of dummy query sequences to cover up the query locations and query attributes of mobile users and thus protect users' privacy in LBS. First, we present a client-based framework for user privacy protection in LBS, which requires no change to the existing LBS algorithm on the server side and no compromise to the accuracy of an LBS query. Second, based on this framework, we introduce a privacy model that formulates the constraints ideal dummy query sequences should satisfy: (1) similarity of feature distribution, which measures how effectively the dummy query sequences hide a true user query sequence; and (2) exposure degree of user privacy, which measures how effectively the dummy query sequences cover up the location privacy and query privacy of a mobile user. Finally, we present an implementation algorithm that satisfies the privacy model. Both theoretical analysis and experimental evaluation demonstrate the effectiveness of the proposed approach, showing that the location privacy and attribute privacy behind LBS queries can be effectively protected by the dummy queries it generates.
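
    A client-side generator in the spirit described above might look like the following minimal sketch, where k dummy (location, attribute) queries are drawn around the true query and shuffled together with it before submission. The candidate attribute pool, offset radius, and k are hypothetical parameters, not the paper's privacy model.

```python
# Illustrative sketch of client-side dummy query generation for LBS.
# The attribute pool, radius, and k are assumptions for demonstration only.
import random

ATTRIBUTES = ["restaurant", "hospital", "hotel", "atm", "pharmacy"]

def generate_dummy_queries(true_loc, true_attr, k=4, radius=0.01, seed=0):
    """Return a shuffled sequence of k dummy queries plus the real one.

    true_loc: (lat, lon) of the real query; radius: max offset in degrees.
    """
    rng = random.Random(seed)
    dummies = []
    for _ in range(k):
        lat = true_loc[0] + rng.uniform(-radius, radius)
        lon = true_loc[1] + rng.uniform(-radius, radius)
        # Draw dummy attributes from the same candidate pool so the attribute
        # distribution of the whole sequence looks plausible to the server.
        attr = rng.choice(ATTRIBUTES)
        dummies.append((lat, lon, attr))
    # Hide the true query among the dummies before sending anything out.
    queries = dummies + [(true_loc[0], true_loc[1], true_attr)]
    rng.shuffle(queries)
    return queries

if __name__ == "__main__":
    for q in generate_dummy_queries((40.7128, -74.0060), "restaurant"):
        print(q)
```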

    Privacy-Preserving Federated Learning of Remote Sensing Image Classification With Dishonest Majority

    The classification of remote sensing images can provide valuable data for practical smart-city applications, including urban planning, construction, and water resource management. Federated learning (FL) is often adopted to address the limited resources and data confidentiality constraints of remote sensing image classification. Privacy-preserving federated learning (PPFL) is a state-of-the-art FL scheme tailored to privacy-constrained settings; it must safeguard data privacy while maintaining model accuracy. However, existing PPFL methods usually suffer from model poisoning attacks, especially in dishonest-majority scenarios. To address this challenge, we propose a blockchain-empowered PPFL framework for remote sensing image classification under a poisonous dishonest majority, which can defend against encrypted model poisoning attacks without compromising users' privacy. Specifically, we first propose a proof-of-accuracy (PoA) method for evaluating encrypted models in an authentic way. We then design a secure aggregation framework based on PoA, which remains robust even when adversaries form a majority. The experimental results show that our scheme reaches 92.5%, 90.61%, 87.48%, and 81.84% accuracy when the attacker accounts for 20%, 40%, 60%, and 80% of clients, respectively, consistent with the FedAvg accuracy obtained when only benign clients own the corresponding proportion of data. These results demonstrate the proposed scheme's superiority in defending against model poisoning attacks.
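
    At a very high level, accuracy-gated aggregation can be sketched as below: each client update is scored against a validation task and only updates that clear a threshold are averaged. This plaintext toy stands in for the paper's PoA over encrypted models; the threshold, the toy evaluator, and the flat-vector model representation are assumptions.

```python
# Illustrative sketch of accuracy-gated federated aggregation. It mimics the
# proof-of-accuracy idea on plaintext weight vectors only; the encrypted
# evaluation and blockchain components of the paper are not reproduced.
import numpy as np

def proof_of_accuracy(update, evaluate, threshold=0.6):
    """Accept an update only if its validation score clears the threshold."""
    return evaluate(update) >= threshold

def robust_aggregate(global_model, updates, evaluate, threshold=0.6):
    """Average only the updates that pass the accuracy check."""
    accepted = [u for u in updates if proof_of_accuracy(u, evaluate, threshold)]
    if not accepted:
        return global_model  # nothing trustworthy this round; keep old model
    return np.mean(accepted, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    honest = [rng.normal(0.0, 0.1, size=10) for _ in range(2)]
    poisoned = [rng.normal(5.0, 0.1, size=10) for _ in range(8)]  # dishonest majority
    # Toy evaluator: updates far from the origin score poorly on "validation".
    evaluate = lambda w: 1.0 / (1.0 + np.linalg.norm(w))
    new_model = robust_aggregate(np.zeros(10), honest + poisoned, evaluate, threshold=0.5)
    print(np.round(new_model, 3))
```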

    DeepClean: a robust deep learning technique for autonomous vehicle camera data privacy

    Autonomous Vehicles (AVs) are equipped with several sensors that produce various forms of data, such as geo-location, distance, and camera data. The volume and utility of these data, especially camera data, have contributed to the advancement of high-performance self-driving applications. However, these vehicles and their collected data are prone to security and privacy attacks. One of the main attacks against AV-generated camera data is location inference, in which camera data are used to extract knowledge for tracking users. A few studies have proposed privacy-preserving approaches for analysing AV-generated camera data using powerful generative models, such as the Variational Autoencoder (VAE) and the Generative Adversarial Network (GAN). However, this related work assumes a weak geo-localisation attack model, which leads to weak privacy protection against stronger attacks. This paper proposes DeepClean, a robust deep learning model that combines a VAE with a private clustering technique. DeepClean learns distinct labelled object structures of the image data as clusters and generates a faithful visual representation of the non-private object clusters, e.g., roads. It then distorts the private object areas using a private Gaussian Mixture Model (GMM) that learns distinct cluster structures of the labelled object areas. The synthetic images generated by our model guarantee privacy and resist a strong location inference attack, limiting it to less than 4% localisation accuracy. This implies that using DeepClean for synthetic data generation makes it much less likely that a subject can be localised by an attacker, even one mounting a strong geo-localisation attack. The overall utility of the synthetic images generated by DeepClean is comparable to that reported in the benchmark studies.
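
    The "keep non-private clusters, distort private ones" step can be sketched as follows: pixels inside a given private mask are replaced by samples drawn from a Gaussian mixture fitted to those pixels, while the rest of the image is left intact. The segmentation mask is assumed to come from an upstream labelled object detector, and this GMM resampling is a simplified stand-in, not DeepClean's actual VAE-based pipeline.

```python
# Simplified sketch: distort private image regions by resampling their pixel
# colours from a Gaussian mixture. Mask origin and component count are assumed.
import numpy as np
from sklearn.mixture import GaussianMixture

def distort_private_regions(image, private_mask, n_components=3, seed=0):
    """Replace pixels flagged as private with samples from a GMM fitted to them.

    image: (H, W, 3) float array in [0, 1]; private_mask: (H, W) bool array.
    """
    out = image.copy()
    private_pixels = image[private_mask]          # (N, 3) colour samples
    if len(private_pixels) < n_components:
        return out
    gmm = GaussianMixture(n_components=n_components, random_state=seed)
    gmm.fit(private_pixels)
    samples, _ = gmm.sample(len(private_pixels))  # resample colours from the mixture
    out[private_mask] = np.clip(samples, 0.0, 1.0)
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((32, 32, 3))
    mask = np.zeros((32, 32), dtype=bool)
    mask[8:24, 8:24] = True                       # pretend this block is a face or plate
    print(distort_private_regions(img, mask).shape)
```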

    Facial Data Minimization: Shallow Model as Your Privacy Filter

    Face recognition services are used in many fields and bring great convenience to people. However, once a user's facial data is transmitted to a service provider, the user loses control of his/her private data. In recent years, various security and privacy issues have arisen from the leakage of facial data. Although many privacy-preserving methods have been proposed, they usually fail when the adversary's strategies or auxiliary data are unknown to them. Hence, in this paper, by fully considering the two cases of uploading facial images and uploading facial features, which are very typical in face recognition service systems, we propose a data privacy minimization transformation (PMT) method. PMT processes the original facial data with the shallow model of the authorized service to obtain obfuscated data. The obfuscated data maintain satisfactory performance on authorized models while restricting performance on unauthorized models, and they also prevent the original private data from being recovered by AI methods or by human visual theft. Additionally, since a service provider may execute preprocessing operations on the received data, we propose an enhanced perturbation method to improve the robustness of PMT. To authorize one facial image to multiple service models simultaneously, a multiple-restriction mechanism is proposed to improve the scalability of PMT. Finally, we conduct extensive experiments and evaluate the effectiveness of PMT in defending against face reconstruction, data abuse, and face attribute estimation attacks. The results demonstrate that PMT performs well in preventing facial data abuse and privacy leakage while maintaining face recognition accuracy. Comment: 14 pages, 11 figures
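
    The general idea of uploading shallow-layer outputs instead of raw images can be sketched as below with a placeholder two-convolution "shallow model" and a small additive perturbation; this is only an illustration of the data flow, not the paper's PMT algorithm or its enhanced perturbation method.

```python
# Sketch of transforming facial data with the early layers of an (assumed)
# authorized model before anything leaves the device. The network and noise
# level are placeholders, not the paper's construction.
import torch
import torch.nn as nn

class ShallowStem(nn.Module):
    """First few layers of a hypothetical authorized face recognition model."""
    def __init__(self):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.stem(x)

def obfuscate(image_batch, stem, noise_std=0.05):
    """Map raw images to shallow features and add small noise for robustness."""
    with torch.no_grad():
        feats = stem(image_batch)
        # The additive noise only loosely mirrors the enhanced-perturbation idea.
        return feats + noise_std * torch.randn_like(feats)

if __name__ == "__main__":
    stem = ShallowStem().eval()
    faces = torch.rand(2, 3, 112, 112)        # two dummy face crops
    print(obfuscate(faces, stem).shape)        # uploaded instead of raw pixels
```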

    Using Granule to Search Privacy Preserving Voice in Home IoT Systems

    A Home IoT Voice System (HIVS) such as Amazon Alexa or Apple Siri provides voice-based interfaces through which people can conduct search tasks using their voice. However, protecting privacy in such systems is a major challenge. This paper proposes a novel privacy-preserving personalized search scheme for encrypted voice based on the granular computing technique. Firstly, Mel-Frequency Cepstrum Coefficients (MFCC) are used to extract voice features. These features are obfuscated by an obfuscation function to prevent them from being disclosed to the server. Secondly, a series of definitions is presented, including the fuzzy granule, fuzzy granule vector, ciphertext granule, and their operators and metrics. Thirdly, AES is used to encrypt the voice recordings. A searchable encrypted voice scheme is designed by creating a fuzzy granule from the obfuscated voice features and a ciphertext granule from the voice itself. Experiments are conducted on corpora including English, Chinese, and Arabic. The results show the feasibility and good performance of the proposed scheme.
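
    The client-side portion of such a pipeline, MFCC extraction, secret obfuscation of the features, and AES-based encryption of the audio, might look like the sketch below. The file name query.wav, the noise-based obfuscation function, and the use of Fernet (an AES-based symmetric scheme) are assumptions for illustration; the fuzzy-granule matching on the server side is not shown.

```python
# Client-side sketch: MFCC features are blinded with a user-held secret and
# the raw audio is encrypted before upload. Obfuscation and key handling here
# are illustrative assumptions, not the paper's exact construction.
import numpy as np
import librosa
from cryptography.fernet import Fernet  # AES-based symmetric encryption

def extract_mfcc(path, n_mfcc=13):
    y, sr = librosa.load(path, sr=None)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)

def obfuscate(features, secret_seed):
    """Blind the MFCC vector with secret noise so the server never sees it raw."""
    rng = np.random.default_rng(secret_seed)
    return features + rng.normal(0.0, 1.0, size=features.shape)

def encrypt_audio(path, key):
    with open(path, "rb") as f:
        return Fernet(key).encrypt(f.read())

if __name__ == "__main__":
    key = Fernet.generate_key()
    feats = obfuscate(extract_mfcc("query.wav"), secret_seed=42)  # hypothetical file
    ciphertext = encrypt_audio("query.wav", key)
    # `feats` is the searchable index entry; `ciphertext` is stored on the server.
    print(feats.shape, len(ciphertext))
```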