15 research outputs found

    Backdoor Attacks and Countermeasures in Natural Language Processing Models: A Comprehensive Security Review

    Full text link
    Deep Neural Networks (DNNs) have led to unprecedented progress in various natural language processing (NLP) tasks. Owing to limited data and computation resources, using third-party data and models has become a new paradigm for adapting various tasks. However, research shows that it has some potential security vulnerabilities because attackers can manipulate the training process and data source. Such a way can set specific triggers, making the model exhibit expected behaviors that have little inferior influence on the model's performance for primitive tasks, called backdoor attacks. Hence, it could have dire consequences, especially considering that the backdoor attack surfaces are broad. To get a precise grasp and understanding of this problem, a systematic and comprehensive review is required to confront various security challenges from different phases and attack purposes. Additionally, there is a dearth of analysis and comparison of the various emerging backdoor countermeasures in this situation. In this paper, we conduct a timely review of backdoor attacks and countermeasures to sound the red alarm for the NLP security community. According to the affected stage of the machine learning pipeline, the attack surfaces are recognized to be wide and then formalized into three categorizations: attacking pre-trained model with fine-tuning (APMF) or prompt-tuning (APMP), and attacking final model with training (AFMT), where AFMT can be subdivided into different attack aims. Thus, attacks under each categorization are combed. The countermeasures are categorized into two general classes: sample inspection and model inspection. Overall, the research on the defense side is far behind the attack side, and there is no single defense that can prevent all types of backdoor attacks. An attacker can intelligently bypass existing defenses with a more invisible attack. ......Comment: 24 pages, 4 figure

    LSF-IDM: Automotive Intrusion Detection Model with Lightweight Attribution and Semantic Fusion

    Full text link
    Autonomous vehicles (AVs) are more vulnerable to network attacks due to the high connectivity and diverse communication modes between vehicles and external networks. Deep learning-based Intrusion detection, an effective method for detecting network attacks, can provide functional safety as well as a real-time communication guarantee for vehicles, thereby being widely used for AVs. Existing works well for cyber-attacks such as simple-mode but become a higher false alarm with a resource-limited environment required when the attack is concealed within a contextual feature. In this paper, we present a novel automotive intrusion detection model with lightweight attribution and semantic fusion, named LSF-IDM. Our motivation is based on the observation that, when injected the malicious packets to the in-vehicle networks (IVNs), the packet log presents a strict order of context feature because of the periodicity and broadcast nature of the CAN bus. Therefore, this model first captures the context as the semantic feature of messages by the BERT language framework. Thereafter, the lightweight model (e.g., BiLSTM) learns the fused feature from an input packet's classification and its output distribution in BERT based on knowledge distillation. Experiment results demonstrate the effectiveness of our methods in defending against several representative attacks from IVNs. We also perform the difference analysis of the proposed method with lightweight models and Bert to attain a deeper understanding of how the model balance detection performance and model complexity.Comment: 18 pages, 8 figure

    Genomic Analyses Reveal Mutational Signatures and Frequently Altered Genes in Esophageal Squamous Cell Carcinoma

    Get PDF
    Esophageal squamous cell carcinoma (ESCC) is one of the most common cancers worldwide and the fourth most lethal cancer in China. However, although genomic studies have identified some mutations associated with ESCC, we know little of the mutational processes responsible. To identify genome-wide mutational signatures, we performed either whole-genome sequencing (WGS) or whole-exome sequencing (WES) on 104 ESCC individuals and combined our data with those of 88 previously reported samples. An APOBEC-mediated mutational signature in 47% of 192 tumors suggests that APOBEC-catalyzed deamination provides a source of DNA damage in ESCC. Moreover, PIK3CA hotspot mutations (c.1624G>A [p.Glu542Lys] and c.1633G>A [p.Glu545Lys]) were enriched in APOBEC-signature tumors, and no smoking-associated signature was observed in ESCC. In the samples analyzed by WGS, we identified focal (<100 kb) amplifications of CBX4 and CBX8. In our combined cohort, we identified frequent inactivating mutations in AJUBA, ZNF750, and PTCH1 and the chromatin-remodeling genes CREBBP and BAP1, in addition to known mutations. Functional analyses suggest roles for several genes (CBX4, CBX8, AJUBA, and ZNF750) in ESCC. Notably, high activity of hedgehog signaling and the PI3K pathway in approximately 60% of 104 ESCC tumors indicates that therapies targeting these pathways might be particularly promising strategies for ESCC. Collectively, our data provide comprehensive insights into the mutational signatures of ESCC and identify markers for early diagnosis and potential therapeutic targets

    STC-IDS: Spatial-Temporal Correlation Feature Analyzing based Intrusion Detection System for Intelligent Connected Vehicles

    Full text link
    Intrusion detection is an important defensive measure for the security of automotive communications. Accurate frame detection models assist vehicles to avoid malicious attacks. Uncertainty and diversity regarding attack methods make this task challenging. However, the existing works have the limitation of only considering local features or the weak feature mapping of multi-features. To address these limitations, we present a novel model for automotive intrusion detection by spatial-temporal correlation features of in-vehicle communication traffic (STC-IDS). Specifically, the proposed model exploits an encoding-detection architecture. In the encoder part, spatial and temporal relations are encoded simultaneously. To strengthen the relationship between features, the attention-based convolution network still captures spatial and channel features to increase the receptive field, while attention-LSTM build important relationships from previous time series or crucial bytes. The encoded information is then passed to the detector for generating forceful spatial-temporal attention features and enabling anomaly classification. In particular, single-frame and multi-frame models are constructed to present different advantages respectively. Under automatic hyper-parameter selection based on Bayesian optimization, the model is trained to attain the best performance. Extensive empirical studies based on a real-world vehicle attack dataset demonstrate that STC-IDS has outperformed baseline methods and cables fewer false-positive rates while maintaining efficiency

    TCAN-IDS: Intrusion Detection System for Internet of Vehicle Using Temporal Convolutional Attention Network

    No full text
    Intrusion detection systems based on recurrent neural network (RNN) have been considered as one of the effective methods to detect time-series data of in-vehicle networks. However, building a model for each arbitration bit is not only complex in structure but also has high computational overhead. Convolutional neural network (CNN) has always performed excellently in processing images, but they have recently shown great performance in learning features of normal and attack traffic by constructing message matrices in such a manner as to achieve real-time monitoring but suffer from the problem of temporal relationships in context and inadequate feature representation in key regions. Therefore, this paper proposes a temporal convolutional network with global attention to construct an in-vehicle network intrusion detection model, called TCAN-IDS. Specifically, the TCAN-IDS model continuously encodes 19-bit features consisting of an arbitration bit and data field of the original message into a message matrix, which is symmetric to messages recalling a historical moment. Thereafter, the feature extraction model extracts its spatial-temporal detail features. Notably, global attention enables global critical region attention based on channel and spatial feature coefficients, thus ignoring unimportant byte changes. Finally, anomalous traffic is monitored by a two-class classification component. Experiments show that TCAN-IDS demonstrates high detection performance on publicly known attack datasets and is able to accomplish real-time monitoring. In particular, it is anticipated to provide a high level of symmetry between information security and illegal intrusion

    TCAN-IDS: Intrusion Detection System for Internet of Vehicle Using Temporal Convolutional Attention Network

    No full text
    Intrusion detection systems based on recurrent neural network (RNN) have been considered as one of the effective methods to detect time-series data of in-vehicle networks. However, building a model for each arbitration bit is not only complex in structure but also has high computational overhead. Convolutional neural network (CNN) has always performed excellently in processing images, but they have recently shown great performance in learning features of normal and attack traffic by constructing message matrices in such a manner as to achieve real-time monitoring but suffer from the problem of temporal relationships in context and inadequate feature representation in key regions. Therefore, this paper proposes a temporal convolutional network with global attention to construct an in-vehicle network intrusion detection model, called TCAN-IDS. Specifically, the TCAN-IDS model continuously encodes 19-bit features consisting of an arbitration bit and data field of the original message into a message matrix, which is symmetric to messages recalling a historical moment. Thereafter, the feature extraction model extracts its spatial-temporal detail features. Notably, global attention enables global critical region attention based on channel and spatial feature coefficients, thus ignoring unimportant byte changes. Finally, anomalous traffic is monitored by a two-class classification component. Experiments show that TCAN-IDS demonstrates high detection performance on publicly known attack datasets and is able to accomplish real-time monitoring. In particular, it is anticipated to provide a high level of symmetry between information security and illegal intrusion

    PLMmark: A Secure and Robust Black-Box Watermarking Framework for Pre-trained Language Models

    No full text
    The huge training overhead, considerable commercial value, and various potential security risks make it urgent to protect the intellectual property (IP) of Deep Neural Networks (DNNs). DNN watermarking has become a plausible method to meet this need. However, most of the existing watermarking schemes focus on image classification tasks. The schemes designed for the textual domain lack security and reliability. Moreover, how to protect the IP of widely-used pre-trained language models (PLMs) remains a blank. To fill these gaps, we propose PLMmark, the first secure and robust black-box watermarking framework for PLMs. It consists of three phases: (1) In order to generate watermarks that contain owners’ identity information, we propose a novel encoding method to establish a strong link between a digital signature and trigger words by leveraging the original vocabulary tables of PLMs. Combining this with public key cryptography ensures the security of our scheme. (2) To embed robust, task-agnostic, and highly transferable watermarks in PLMs, we introduce a supervised contrastive loss to deviate the output representations of trigger sets from that of clean samples. In this way, the watermarked models will respond to the trigger sets anomaly and thus can identify the ownership. (3) To make the model ownership verification results reliable, we perform double verification, which guarantees the unforgeability of ownership. Extensive experiments on text classification tasks demonstrate that the embedded watermark can transfer to all the downstream tasks and can be effectively extracted and verified. The watermarking scheme is robust to watermark removing attacks (fine-pruning and re-initializing) and is secure enough to resist forgery attacks

    Genomic analyses reveal mutational signatures and frequently altered genes in esophageal squamous cell carcinoma

    No full text
    Esophageal squamous cell carcinoma (ESCC) is one of the most common cancers world wide and the fourth most lethal cancer in China. However, although genomic studies have identified some mutations associated with ESCC, we know little of the mutational processes responsible. To identify genome-wide mutational signatures, we performed either whole-genome sequencing (WGS) or whole-exome sequencing (WES) on 104 ESCC individuals and combined our data with those of 88 previously reported samples. An APOBEC-mediated mutational signature in 47% of 192 tumors suggests that APOBEC-catalyzed deamination provides a source of DNA damage in ESCC. Moreover, PIK3CA hotspot mutations (c.1624G> A [p.Glu542Lys] and c.1633G>A [p.Glu545Lys]) were enriched in APOBEC-signature tumors, and no smoking associated signature was observed in ESCC. In the samples analyzed by WGS, we identified focal
    corecore