39 research outputs found

    Invisible Backdoor Attack with Dynamic Triggers against Person Re-identification

    In recent years, person Re-identification (ReID) has rapidly progressed with wide real-world applications, but also poses significant risks of adversarial attacks. In this paper, we focus on the backdoor attack on deep ReID models. Existing backdoor attack methods follow an all-to-one/all attack scenario, where all the target classes in the test set have already been seen in the training set. However, ReID is a much more complex fine-grained open-set recognition problem, where the identities in the test set are not contained in the training set. Thus, previous backdoor attack methods for classification are not applicable to ReID. To address this issue, we propose a novel backdoor attack on deep ReID under a new all-to-unknown scenario, called Dynamic Triggers Invisible Backdoor Attack (DT-IBA). Instead of learning fixed triggers for the target classes from the training set, DT-IBA can dynamically generate new triggers for any unknown identities. Specifically, an identity hashing network is proposed to first extract target identity information from a reference image, which is then injected into the benign images by image steganography. We extensively validate the effectiveness and stealthiness of the proposed attack on benchmark datasets, and evaluate the effectiveness of several defense methods against our attack.

    Learning Domain Invariant Prompt for Vision-Language Models

    Prompt learning is one of the most effective and trending ways to adapt powerful vision-language foundation models like CLIP to downstream datasets by tuning learnable prompt vectors with very few samples. However, although prompt learning achieves excellent performance over in-domain data, it still faces the major challenge of generalizing to unseen classes and domains. Some existing prompt learning methods tackle this issue by adaptively generating different prompts for different tokens or domains, but they neglect the ability of learned prompts to generalize to unseen domains. In this paper, we propose a novel prompt learning paradigm, called MetaPrompt, that directly generates domain-invariant prompts that can be generalized to unseen domains. Specifically, a dual-modality prompt tuning network is proposed to generate prompts for input from both image and text modalities. With a novel asymmetric contrastive loss, the representation from the original pre-trained vision-language model acts as supervision to enhance the generalization ability of the learned prompt. More importantly, we propose a meta-learning-based prompt tuning algorithm that explicitly constrains the task-specific prompt tuned for one domain or class to also achieve good performance in another domain or class. Extensive experiments on 11 datasets for base-to-new generalization and 4 datasets for domain generalization demonstrate that our method consistently and significantly outperforms existing methods. Comment: 12 pages, 6 figures, 5 tables

    Sequential End-to-end Network for Efficient Person Search

    Person search aims at jointly solving Person Detection and Person Re-identification (re-ID). Existing works have designed end-to-end networks based on Faster R-CNN. However, due to the parallel structure of Faster R-CNN, the extracted features come from the low-quality proposals generated by the Region Proposal Network, rather than the detected high-quality bounding boxes. Person search is a fine-grained task and such inferior features will significantly reduce re-ID performance. To address this issue, we propose a Sequential End-to-end Network (SeqNet) to extract superior features. In SeqNet, detection and re-ID are considered as a progressive process and tackled with two sub-networks sequentially. In addition, we design a robust Context Bipartite Graph Matching (CBGM) algorithm to effectively employ context information as an important complementary cue for person matching. Extensive experiments on two widely used person search benchmarks, CUHK-SYSU and PRW, have shown that our method achieves state-of-the-art results. Also, our model runs at 11.5 fps on a single GPU and can be integrated into the existing end-to-end framework easily.
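
The abstract above does not describe how CBGM works internally, so the following is only a generic illustration of bipartite matching between persons in two scenes, not the authors' algorithm. The function name `best_bipartite_matching` and the similarity values are hypothetical, and the brute-force search is chosen for clarity and is only practical for a handful of persons.

```python
from itertools import permutations


def best_bipartite_matching(sim):
    """Return (total, pairs) maximizing the summed similarity.

    `sim` is a square matrix: sim[i][j] is the similarity between
    person i in the query scene and person j in the gallery scene.
    Every one-to-one assignment is tried (brute force).
    """
    n = len(sim)
    best_total, best_pairs = float("-inf"), []
    for perm in permutations(range(n)):
        total = sum(sim[i][perm[i]] for i in range(n))
        if total > best_total:
            best_total = total
            best_pairs = [(i, perm[i]) for i in range(n)]
    return best_total, best_pairs


# Hypothetical similarity scores, invented for illustration only.
similarity = [
    [0.9, 0.1, 0.2],
    [0.2, 0.8, 0.3],
    [0.1, 0.4, 0.7],
]
total, pairs = best_bipartite_matching(similarity)
print(pairs)
```

For larger scenes, a polynomial-time assignment solver such as the Hungarian algorithm would normally replace the brute-force loop.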

    S&I Reader: Multi-Granularity Gated Multi-Hop Skimming and Intensive Reading Model for Machine Reading Comprehension

    Machine reading comprehension is a very challenging task, which aims to determine the answer span based on the given context and question. Newly developed pre-trained language models have achieved a series of successes in various natural language understanding tasks thanks to their powerful contextual representation ability. However, these pre-trained language models generally lack the downstream processing structure for specific tasks, which limits further performance improvement. To solve this problem and deepen the model’s understanding of the question and context, this paper proposes S&I Reader. On top of the pre-trained model, skimming, intensive reading, and gated mechanism modules are added to simulate how humans read text and filter information. Based on the idea of granular computing, a multi-granularity module for computing context granularity and sequence granularity is added to the model to simulate how humans understand text from words to sentences and from parts to the whole. Compared with previous machine reading comprehension models, our model structure is novel. The skimming module and multi-granularity module proposed in this paper address the problem that previous models ignore key information in the text and cannot understand the text at multiple granularities. Experiments show that the proposed model is effective on both Chinese and English datasets. It can better understand the question and context and give a more accurate answer. Its performance improves on that of the baseline model.

    Multidimensional Model-Based Decision Rules Mining

    Abstract: Decision rules mining is an important technique in machine learning and data mining; it has been studie

    Intuitionistic Fuzzy-Based Three-Way Label Enhancement for Multi-Label Classification

    Multi-label classification deals with the determination of instance-label associations for unseen instances. Although many margin-based approaches have been carefully developed, the classification uncertainty of instances with small separation margins remains unresolved. The intuitionistic fuzzy set is an effective tool to characterize the concept of uncertainty, yet it has not been examined for multi-label cases. This paper proposes a novel model called intuitionistic fuzzy three-way label enhancement (IFTWLE) for multi-label classification. IFTWLE combines label enhancement with an intuitionistic fuzzy set under the framework of three-way decisions. For unseen instances, we generate the pseudo-label for label uncertainty evaluation from a logical label-based model. An intuitionistic fuzzy set-based instance selection principle seamlessly bridges logical label learning and numerical label learning. The principle is developed hierarchically. At the label level, membership and non-membership functions are defined pairwise to measure the local uncertainty and generate candidate uncertain instances. At the instance level, we select instances from the candidates for label enhancement, while the remaining instances are left unchanged. To the best of our knowledge, this is the first attempt to combine logical label learning with numerical label learning into a unified framework for minimizing classification uncertainty. Extensive experiments demonstrate that, with the selectively reconstructed label importance, IFTWLE achieves statistically superior performance over state-of-the-art multi-label classification algorithms in terms of classification accuracy. The computational complexity of this algorithm is O(n^2 mk), where n, m, and k denote the unseen instances count, label count, and average label-specific feature size, respectively.
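
The abstract above does not give IFTWLE's membership functions or thresholds, so the sketch below only illustrates the general idea of a three-way decision over an intuitionistic fuzzy pair, under stated assumptions. The function name `three_way_region`, the single threshold `alpha`, and the rule itself are hypothetical and are not taken from the paper. An intuitionistic fuzzy value is a membership degree `mu` and a non-membership degree `nu` with `mu + nu <= 1`.

```python
def three_way_region(mu, nu, alpha=0.7):
    """Assign a label decision to one of three regions.

    mu: membership degree, nu: non-membership degree (mu + nu <= 1).
    alpha: hypothetical acceptance threshold, assumed to be above 0.5
    so that "accept" and "reject" cannot both apply.
    """
    if not (0.0 <= mu <= 1.0 and 0.0 <= nu <= 1.0 and mu + nu <= 1.0 + 1e-12):
        raise ValueError("expected 0 <= mu, nu <= 1 and mu + nu <= 1")
    if mu >= alpha:
        return "accept"   # label confidently relevant
    if nu >= alpha:
        return "reject"   # label confidently irrelevant
    return "defer"        # uncertain: a candidate for further processing


# Invented example values, for illustration only.
for mu, nu in [(0.8, 0.1), (0.1, 0.85), (0.4, 0.3)]:
    print(mu, nu, three_way_region(mu, nu))
```

In this sketch, the deferred cases play the role of the candidate uncertain instances mentioned in the abstract; how the paper actually selects and enhances them is not specified here.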