74 research outputs found

    Multimodal Adversarial Learning

    Get PDF
    Deep Convolutional Neural Networks (DCNNs) have proven to be exceptional tools for object recognition, generative modelling, and multi-modal learning in various computer vision applications. However, recent findings have shown that such state-of-the-art models can be easily deceived by inserting slight, imperceptible perturbations at key pixels in the input. A good target detection system can accurately identify targets by localizing their coordinates on the input image of interest, ideally by labeling each pixel as either background or a potential target. However, prior research confirms that such state-of-the-art target detection models remain susceptible to adversarial attacks. In the case of generative models, facial sketches drawn by artists, mostly used by law enforcement agencies, depend on the artist's ability to faithfully replicate the key facial features that capture the true identity of a subject. Recent works have attempted to synthesize these sketches into plausible visual images to improve visual recognition and identification. However, synthesizing photo-realistic images from sketches is an even more challenging task, especially for sensitive applications such as suspect identification. The incorporation of hybrid discriminators, which perform attribute classification over multiple target attributes, together with a quality-guided encoder that minimizes the perceptual dissimilarity between the latent-space embeddings of the synthesized and real images at different layers of the network, has proven to be a powerful tool for better multi-modal learning. Overall, our approach aims to improve target detection systems and the visual appeal of synthesized images while incorporating multiple-attribute assignment into the generator without compromising the identity of the synthesized image. We synthesized sketches using the XDoG filter for the CelebA, Multi-modal and CelebA-HQ datasets, and from an auxiliary generator trained on sketches from the CUHK, IIT-D and FERET datasets. Our results across the different model applications compare favorably with the current state of the art.
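
    The XDoG filter used above for sketch synthesis is a self-contained operator: a difference of two Gaussian blurs followed by a soft threshold. A minimal sketch in Python/SciPy follows, assuming a grayscale image with values in [0, 1]; the parameter values are illustrative defaults, not those used in this work.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def xdog(img, sigma=0.5, k=1.6, tau=0.98, eps=0.1, phi=10.0):
        """eXtended Difference-of-Gaussians: a soft-thresholded DoG response."""
        g1 = gaussian_filter(img, sigma)      # fine-scale blur
        g2 = gaussian_filter(img, sigma * k)  # coarse-scale blur
        d = g1 - tau * g2                     # sharpened DoG response
        # White where the response clears the threshold, smooth tanh
        # falloff elsewhere -- this produces the pencil-sketch look.
        out = np.where(d >= eps, 1.0, 1.0 + np.tanh(phi * (d - eps)))
        return np.clip(out, 0.0, 1.0)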

    Adversarial Examples in the Physical World: A Survey

    Full text link
    Deep neural networks (DNNs) have demonstrated high vulnerability to adversarial examples. Beyond attacks in the digital world, the practical implications of adversarial examples in the physical world present significant challenges and safety concerns. However, current research on physical adversarial examples (PAEs) lacks a comprehensive understanding of their unique characteristics, limiting its significance and impact. In this paper, we address this gap by thoroughly examining the characteristics of PAEs within a practical workflow encompassing training, manufacturing, and re-sampling processes. By analyzing the links between physical adversarial attacks, we identify manufacturing and re-sampling as the primary sources of the distinct attributes and particularities of PAEs. Leveraging this knowledge, we develop a comprehensive analysis and classification framework for PAEs based on their specific characteristics, covering over 100 studies on physical-world adversarial examples. Furthermore, we investigate defense strategies against PAEs and identify open challenges and opportunities for future research. We aim to provide a fresh, thorough, and systematic understanding of PAEs, thereby promoting the development of robust adversarial learning and its application in open-world scenarios.

    Fusion-based Few-Shot Morphing Attack Detection and Fingerprinting

    Full text link
    The vulnerability of face recognition systems to morphing attacks poses a serious security threat due to the wide adoption of face biometrics in the real world. Most existing morphing attack detection (MAD) methods require a large amount of training data and have only been tested on a few predefined attack models. The lack of good generalization, especially in view of the growing interest in developing novel morphing attacks, is a critical limitation of existing MAD research. To address this issue, we propose to extend MAD from supervised learning to few-shot learning, and from binary detection to multiclass fingerprinting. Our technical contributions include: 1) a fusion-based few-shot learning (FSL) method that learns discriminative features which generalize from predefined presentation attacks to unseen morphing attack types; 2) an extension of the proposed FSL method, based on the fusion of the PRNU model and the Noiseprint network, from binary MAD to multiclass morphing attack fingerprinting (MAF); and 3) a large-scale database, covering five face datasets and eight different morphing algorithms, for benchmarking the proposed few-shot MAF (FS-MAF) method. Extensive experimental results show the outstanding performance of our fusion-based FS-MAF. The code and data will be publicly available at https://github.com/nz0001na/mad_maf
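
    As a rough illustration of the fusion idea, the sketch below pairs a PRNU-style noise residual with a Noiseprint map into a two-channel feature. The Gaussian denoiser standing in for the PRNU denoising filter and the noiseprint_net callable are assumptions for illustration only, not the paper's actual components.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def prnu_residual(img, sigma=1.0):
        """Noise residual: the image minus a denoised version of itself."""
        return img - gaussian_filter(img, sigma)

    def fused_feature(img, noiseprint_net):
        """Stack the PRNU-style residual with the Noiseprint map."""
        r = prnu_residual(img)
        n = noiseprint_net(img)  # assumed: returns a map shaped like img
        return np.stack([r, n], axis=0)  # 2-channel input for the FSL head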

    Face Image and Video Analysis in Biometrics and Health Applications

    Get PDF
    Computer Vision (CV) enables computers and systems to derive meaningful information from acquired visual inputs, such as images and videos, and to make decisions based on the extracted information. Its goal is to acquire, process, analyze, and understand this information by developing theoretical and algorithmic models. Biometrics are distinctive and measurable human characteristics used to label or describe individuals, combining computer vision with knowledge of human physiology (e.g., face, iris, fingerprint) and behavior (e.g., gait, gaze, voice). The face is one of the most informative biometric traits, and many studies have investigated it from the perspectives of disciplines ranging from computer vision and deep learning to neuroscience and biometrics. In this work, we analyze face characteristics from digital images and videos in the areas of morphing attack and defense, and autism diagnosis. For face morphing attack generation, we propose a transformer-based generative adversarial network that produces more visually realistic morphing attacks by combining several losses: face matching distance, a facial-landmark-based loss, perceptual loss, and pixel-wise mean square error. For face morphing attack detection, we design a fusion-based few-shot learning (FSL) method that learns discriminative features from face images for few-shot morphing attack detection (FS-MAD), and extend binary detection to multiclass classification, namely few-shot morphing attack fingerprinting (FS-MAF). For autism diagnosis, we develop a discriminative few-shot learning method to analyze hour-long video data and explore the fusion of facial dynamics for facial trait classification of autism spectrum disorder (ASD) at three severity levels. The results show the outstanding performance of the proposed fusion-based few-shot framework on the dataset. In addition, we explore the possibility of performing facial micro-expression spotting and feature analysis on autism video data to classify ASD and control groups; the results indicate the effectiveness of subtle facial expression changes for autism diagnosis.
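
    The combined generator objective described above is, in essence, a weighted sum of four terms. A minimal PyTorch sketch follows; face_embed, landmarks, and vgg_features are hypothetical stand-ins for a face matcher, a landmark detector, and a perceptual feature extractor, and the weights are illustrative rather than the values used in this work.

    import torch.nn.functional as F

    def generator_loss(fake, real, face_embed, landmarks, vgg_features,
                       w_id=1.0, w_lmk=1.0, w_perc=1.0, w_pix=1.0):
        # Face matching distance: cosine distance between identity embeddings.
        id_loss = 1.0 - F.cosine_similarity(face_embed(fake), face_embed(real)).mean()
        # Landmark loss: L2 between detected facial landmark coordinates.
        lmk_loss = F.mse_loss(landmarks(fake), landmarks(real))
        # Perceptual loss: L1 between deep features of the two images.
        perc_loss = F.l1_loss(vgg_features(fake), vgg_features(real))
        # Pixel-wise mean square error.
        pix_loss = F.mse_loss(fake, real)
        return (w_id * id_loss + w_lmk * lmk_loss
                + w_perc * perc_loss + w_pix * pix_loss)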

    A Survey on Physical Adversarial Attack in Computer Vision

    Full text link
    Over the past decade, deep learning has revolutionized conventional tasks that once relied on hand-crafted feature extraction, its strong feature learning capability bringing substantial enhancements to traditional tasks. However, deep neural networks (DNNs) have been shown to be vulnerable to adversarial examples crafted from tiny malicious perturbations that are imperceptible to human observers but can make DNNs output the wrong result. Existing adversarial attacks can be categorized into digital and physical adversarial attacks. The former are designed to pursue strong attack performance in lab environments but hardly remain effective when applied to the physical world. In contrast, the latter focus on developing physically deployable attacks, and thus exhibit more robustness under complex physical environmental conditions. With the increasing deployment of DNN-based systems in the real world, strengthening their robustness has become urgent, and exhaustively exploring physical adversarial attacks is a precondition for doing so. To this end, this paper reviews the evolution of physical adversarial attacks against DNN-based computer vision tasks, aiming to provide useful information for developing stronger physical adversarial attacks. Specifically, we first propose a taxonomy to categorize and group the current physical adversarial attacks. Then, we discuss the existing physical attacks, focusing on techniques for improving their robustness under complex physical environmental conditions. Finally, we discuss the open issues of current physical adversarial attacks and suggest promising research directions.

    Malicious Agent Detection for Robust Multi-Agent Collaborative Perception

    Full text link
    Recently, multi-agent collaborative (MAC) perception has been proposed and has outperformed traditional single-agent perception in many applications, such as autonomous driving. However, MAC perception is more vulnerable to adversarial attacks than single-agent perception due to the information exchange: an attacker can easily degrade the performance of a victim agent by sending harmful information from a nearby malicious agent. In this paper, we extend adversarial attacks to an important perception task, MAC object detection, where generic defenses such as adversarial training are no longer effective. More importantly, we propose Malicious Agent Detection (MADE), a reactive defense specific to MAC perception that can be deployed by each agent to accurately detect and then remove any potential malicious agent in its local collaboration network. In particular, MADE inspects each agent in the network independently using a semi-supervised anomaly detector based on a double-hypothesis test, with the Benjamini-Hochberg procedure controlling the false positive rate of the inference. For the two hypothesis tests, we propose a match loss statistic and a collaborative reconstruction loss statistic, respectively, both based on the consistency between the agent being inspected and the ego agent where the detector is deployed. We conduct comprehensive evaluations on the benchmark 3D dataset V2X-Sim and the real-road dataset DAIR-V2X, showing that with the protection of MADE, the drops in average precision relative to the best-case "oracle" defender against our attack are merely 1.28% and 0.34%, respectively, far lower than the 8.92% and 10.00% for adversarial training.
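
    The Benjamini-Hochberg step that controls the false positive rate is a standard procedure and easy to sketch. Below is a minimal NumPy version, assuming one p-value per inspected agent; the match loss and collaborative reconstruction loss statistics that would produce those p-values are not reproduced here.

    import numpy as np

    def benjamini_hochberg(p_values, alpha=0.05):
        """Boolean mask of rejected hypotheses at false discovery level alpha."""
        p = np.asarray(p_values)
        m = len(p)
        order = np.argsort(p)                  # ranks p-values ascending
        thresholds = alpha * np.arange(1, m + 1) / m
        below = p[order] <= thresholds
        reject = np.zeros(m, dtype=bool)
        if below.any():
            k = np.max(np.nonzero(below)[0])   # largest rank i with p_(i) <= i*alpha/m
            reject[order[:k + 1]] = True       # reject everything up to that rank
        return reject

    # e.g. flag agents whose consistency statistics look anomalous:
    # benjamini_hochberg([0.001, 0.20, 0.04, 0.50], alpha=0.1)
    # -> [True, False, True, False]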

    Towards robust convolutional neural networks in challenging environments

    Get PDF
    Image classification is one of the fundamental tasks in the field of computer vision. Although Artificial Neural Networks (ANNs) showed a lot of promise in this field, the lack of efficient computer hardware long subdued their potential. In the early 2000s, advances in hardware coupled with better network design saw the dramatic rise of the Convolutional Neural Network (CNN). Deep CNNs pushed the State-of-The-Art (SOTA) in a number of vision tasks, including image classification, object detection, and segmentation, and presently dominate these tasks. Although CNNs exhibit impressive classification performance on clean images, they are vulnerable to distortions such as noise and blur. Fine-tuning a pre-trained CNN on mutually exclusive distortion sets, or on their union, is a brute-force solution; this iterative fine-tuning with all known types of distortion is exhausting, and the network still struggles to handle unseen distortions. CNNs are also vulnerable to image translation or shift, partly due to common Down-Sampling (DS) layers, e.g., max-pooling and strided convolution. These operations violate the Nyquist sampling rate and cause aliasing. The textbook solution is low-pass filtering (blurring) before down-sampling, which can benefit deep networks as well. Even so, non-linearity units such as ReLU often re-introduce the problem, suggesting that blurring alone may not suffice. Another important but under-explored issue for CNNs is unknown or Open Set Recognition (OSR). CNNs are commonly designed for closed-set arrangements, where test instances belong only to the ‘Known Known’ (KK) classes used in training, and they predict a class label for a test sample based on the distribution of the KK classes. When used under the OSR setup (where an input may belong to an ‘Unknown Unknown’ or UU class), such a network will therefore always classify a test instance as one of the KK classes, even if it is from a UU class. Historically, CNNs have also struggled to detect objects across large differences in scale, especially small objects, because the DS layers inside a CNN progressively wipe out the signal from small objects; the final layers are left with no signature from these objects, leading to degraded performance. In this work, we propose solutions to the above four problems. First, we improve CNN robustness against distortion by proposing DCT-based augmentation, adaptive regularisation, and noise-suppressing Activation Functions (AFs). Second, to ensure further performance gains and robustness to image transformations, we introduce anti-aliasing properties inside the AF and propose a novel DS method called blurpool. Third, to address the OSR problem, we propose a novel training paradigm that ensures detection of UU classes and accurate classification of the KK classes. Finally, we introduce a novel CNN that enables a deep detector to identify small objects with high precision and recall. We evaluate our methods on a number of benchmark datasets and demonstrate that they outperform contemporary methods in the respective problem set-ups.
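
    The "blur before down-sampling" fix discussed above is simple to make concrete. Below is a minimal PyTorch sketch of an anti-aliased down-sampling step in the spirit of the blurpool idea, assuming a 4D NCHW tensor; the 3x3 binomial kernel and stride are illustrative choices, not the thesis implementation.

    import torch
    import torch.nn.functional as F

    def blur_pool(x, stride=2):
        """Low-pass filter each channel with a 3x3 binomial kernel, then subsample."""
        c = x.shape[1]
        k1d = torch.tensor([1.0, 2.0, 1.0])
        k2d = torch.outer(k1d, k1d)
        k2d = k2d / k2d.sum()                       # normalized binomial filter
        weight = k2d.view(1, 1, 3, 3).repeat(c, 1, 1, 1).to(x.device, x.dtype)
        x = F.pad(x, (1, 1, 1, 1), mode="reflect")  # keep size before striding
        # Depthwise convolution (groups=c): blur first, then the stride subsamples.
        return F.conv2d(x, weight, stride=stride, groups=c)

    # Usage: blur_pool(torch.randn(1, 64, 32, 32)).shape -> (1, 64, 16, 16)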