128 research outputs found

    Efficient image copy detection using multi-scale fingerprints

    Inspired by the multi-resolution histogram, we propose a multi-scale SIFT descriptor to improve discriminability. A series of SIFT descriptions at different scales is first acquired by varying the actual size of each spatial bin. Principal component analysis (PCA) is then employed to reduce them to low-dimensional vectors, which are combined into one 128-dimension multi-scale SIFT description. Next, an entropy-maximization-based binarization encodes the descriptions into binary codes, called fingerprints, for indexing the local features. Furthermore, an efficient search architecture consisting of lookup tables and an inverted image ID list is designed to improve the query speed. Since fingerprint building is of low complexity, the method is efficient and scalable to very large databases. In addition, the multi-scale fingerprints are discriminative enough that copies can be effectively distinguished from similar objects, which improves copy detection performance. The experimental evaluation shows that our approach outperforms state-of-the-art methods.
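The binarization and lookup steps described above can be illustrated with a minimal sketch. It assumes that "entropy maximization" is approximated by thresholding each dimension at its median over a training set (so each bit is set about half the time), and that lookup is by exact fingerprint match; the abstract does not specify either detail. All function names are hypothetical.

```python
from collections import defaultdict
from statistics import median


def fit_thresholds(descriptors):
    """Per-dimension median: splits the training values roughly 50/50,
    which is the split that maximizes the entropy of each bit (assumption)."""
    dims = len(descriptors[0])
    return [median(d[i] for d in descriptors) for i in range(dims)]


def to_fingerprint(descriptor, thresholds):
    """Pack one bit per dimension into an integer fingerprint."""
    code = 0
    for value, threshold in zip(descriptor, thresholds):
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def build_index(image_descriptors, thresholds):
    """Lookup table: fingerprint -> set of image IDs that contain it."""
    index = defaultdict(set)
    for image_id, descriptors in image_descriptors.items():
        for d in descriptors:
            index[to_fingerprint(d, thresholds)].add(image_id)
    return index


def query(index, descriptors, thresholds):
    """Count, per database image, how many query fingerprints match exactly."""
    votes = defaultdict(int)
    for d in descriptors:
        for image_id in index.get(to_fingerprint(d, thresholds), ()):
            votes[image_id] += 1
    return dict(votes)
```

Because both building and querying reduce to threshold comparisons and hash lookups, the cost per local feature stays small, which is consistent with the scalability claim in the abstract.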

    Attention-Aware Network with Latent Semantic Analysis for Clothing Invariant Gait Recognition

    Gait recognition is a complicated task because co-factors such as carrying conditions, clothing, viewpoints, and surfaces change the appearance of gait to varying degrees. Among these co-factors, clothing is the most challenging. Conventional methods for clothing-invariant gait recognition show that the body parts and the underlying relationships among them are important for gait recognition. Attention mechanisms perform well at highlighting discriminative regions. Meanwhile, latent semantic analysis is known for its ability to capture latent semantic variables that represent the underlying attributes and relationships in the raw input. Thus, we propose a new CNN-based method that leverages the advantages of latent semantic analysis and the attention mechanism. Discriminative features are extracted by the attention module and the latent semantic analysis module respectively, and a multi-modal fusion method, chosen for its high fault tolerance, fuses those features at the decision level. Experiments on the most challenging clothing variation dataset, the OU-ISIR Treadmill dataset B, show that our method outperforms other state-of-the-art gait approaches.

    Improving Visual Quality and Transferability of Adversarial Attacks on Face Recognition Simultaneously with Adversarial Restoration

    Adversarial face examples possess two critical properties: Visual Quality and Transferability. However, existing approaches rarely address these properties simultaneously, leading to subpar results. To address this issue, we propose a novel adversarial attack technique known as Adversarial Restoration (AdvRestore), which enhances both visual quality and transferability of adversarial face examples by leveraging a face restoration prior. In our approach, we initially train a Restoration Latent Diffusion Model (RLDM) designed for face restoration. Subsequently, we employ the inference process of RLDM to generate adversarial face examples. The adversarial perturbations are applied to the intermediate features of RLDM. Additionally, by treating RLDM face restoration as a sibling task, the transferability of the generated adversarial face examples is further improved. Our experimental results validate the effectiveness of the proposed attack method. Comment: © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

    Improving the Transferability of Adversarial Attacks on Face Recognition with Beneficial Perturbation Feature Augmentation

    Face recognition (FR) models can be easily fooled by adversarial examples, which are crafted by adding imperceptible perturbations to benign face images. To improve the transferability of adversarial face examples, we propose a novel attack method called Beneficial Perturbation Feature Augmentation Attack (BPFA), which reduces the overfitting of adversarial examples to surrogate FR models by constantly generating new models that have an effect similar to hard samples for crafting the adversarial examples. Specifically, during backpropagation, BPFA records the gradients on pre-selected features and uses the gradient on the input image to craft the adversarial example. In the next forward propagation, BPFA leverages the recorded gradients to add perturbations (i.e., beneficial perturbations) that are pitted against the adversarial example on their corresponding features. The optimization of the adversarial example and the optimization of the beneficial perturbations added to the features correspond to a minimax two-player game. Extensive experiments demonstrate that BPFA can significantly boost the transferability of adversarial attacks on FR.

    SSyncOA: Self-synchronizing Object-aligned Watermarking to Resist Cropping-paste Attacks

    Modern image processing tools have made it easy for attackers to crop the region or object of interest in an image and paste it into other images. The challenge this cropping-paste attack poses to watermarking technology is that it breaks the synchronization of the image watermark, introducing multiple superimposed desynchronization distortions such as rotation, scaling, and translation. However, current watermarking methods can only resist a single type of desynchronization and cannot be applied to protect the object's copyright under the cropping-paste attack. Based on the finding that the key to resisting the cropping-paste attack lies in robust features of the object to protect, this paper proposes a self-synchronizing object-aligned watermarking method, called SSyncOA. Specifically, we first constrain the watermarked region to be aligned with the protected object, and then synchronize the watermark's translation, rotation, and scaling distortions by normalizing the object's invariant features, i.e., its centroid, principal orientation, and minimum bounding square, respectively. To embed the watermark in the protected object, we introduce the object-aligned watermarking model, which incorporates the real cropping-paste attack into the encoder-noise layer-decoder pipeline and is optimized end-to-end. In addition, we illustrate the effect of different desynchronization distortions on watermark training, which confirms the necessity of the self-synchronization process. Extensive experiments demonstrate the superiority of our method over other state-of-the-art methods. Comment: 7 pages, 5 figures (accepted by ICME 2024).
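The normalization by centroid, principal orientation, and bounding square can be sketched from a binary object mask using image moments. This is only an illustration under stated assumptions: the orientation is taken from central second-order moments, and an axis-aligned bounding square stands in for the minimum bounding square named in the abstract. The function names `object_pose` and `normalize_point` are hypothetical.

```python
import math


def object_pose(mask):
    """Return (centroid, orientation in radians, bounding-square side)
    for the foreground pixels of a binary mask given as a list of rows."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not points:
        raise ValueError("mask has no foreground pixels")
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Central second-order moments of the object's pixel coordinates.
    mu20 = sum((x - cx) ** 2 for x, _ in points) / n
    mu02 = sum((y - cy) ** 2 for _, y in points) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in points) / n
    theta = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # Simplification: axis-aligned square, not the true minimum bounding square.
    side = max(max(xs) - min(xs), max(ys) - min(ys)) + 1
    return (cx, cy), theta, side


def normalize_point(x, y, centroid, theta, side):
    """Undo translation, rotation, and scale so a point lands in canonical object coordinates."""
    dx, dy = x - centroid[0], y - centroid[1]
    c, s = math.cos(-theta), math.sin(-theta)
    return ((c * dx - s * dy) / side, (s * dx + c * dy) / side)
```

Since all three quantities are computed from the object itself, the same canonical frame can be recovered after the object is cropped and pasted elsewhere, provided its mask can be re-estimated.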

    Balanced Deep Supervised Hashing

    Recently, Convolutional Neural Network (CNN) based hashing methods have achieved promising performance on the image retrieval task. However, reconciling quantization error minimization with discriminability maximization of the network outputs remains unsolved. Motivated by this concern, we propose a novel Balanced Deep Supervised Hashing (BDSH) method based on variant posterior probability to learn compact, discriminability-preserving binary codes for large-scale image data. Unlike previous works, BDSH can search for an equilibrium point within this discrepancy. To this end, a carefully designed objective function maximizes the discriminability of the output space using the variant posterior probability of the pair-wise label. A quantization regularizer serves as a relaxation from real-valued outputs to the desired discrete values (e.g., -1/+1). Extensive experiments on benchmark datasets show that our method yields state-of-the-art image retrieval performance from various perspectives.
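The trade-off between a pair-wise likelihood term and a quantization regularizer can be illustrated with a generic loss of this family. The sketch below is not the BDSH objective, whose "variant posterior probability" is not given in the abstract; it assumes a plain sigmoid likelihood on half the inner product of the two real-valued outputs, plus a penalty pulling each output toward -1 or +1. The name `pairwise_hashing_loss` and the weight `quant_weight` are hypothetical.

```python
import math


def pairwise_hashing_loss(u_i, u_j, similar, quant_weight=0.1):
    """Negative log-likelihood of the pair label plus a quantization penalty.

    u_i, u_j: real-valued network outputs for two images.
    similar:  True if the pair shares a label, False otherwise.
    """
    inner = 0.5 * sum(a * b for a, b in zip(u_i, u_j))
    # Assumed model: p(similar) = sigmoid(inner).
    # -log p(label) = softplus(inner) - label * inner, computed stably.
    softplus = max(inner, 0.0) + math.log1p(math.exp(-abs(inner)))
    nll = softplus - (inner if similar else 0.0)
    # Quantization regularizer: distance of each output from the nearest of -1/+1.
    quant = sum((abs(v) - 1.0) ** 2 for v in u_i) + sum((abs(v) - 1.0) ** 2 for v in u_j)
    return nll + quant_weight * quant
```

Raising `quant_weight` pushes outputs toward binary values at the cost of the likelihood term, which is the tension the abstract describes.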

    DBDH: A Dual-Branch Dual-Head Neural Network for Invisible Embedded Regions Localization

    Embedding invisible hyperlinks or hidden codes in images to replace QR codes has become a hot topic recently. This technology requires first localizing the embedded region in the captured photos before decoding. Existing methods that train models to find the invisible embedded region struggle to obtain accurate localization results, leading to degraded decoding accuracy. This limitation arises primarily because CNNs are sensitive to low-frequency signals, while the embedded signal is typically high-frequency. Based on this, this paper proposes a Dual-Branch Dual-Head (DBDH) neural network tailored for the precise localization of invisible embedded regions. Specifically, DBDH uses a low-level texture branch containing 62 high-pass filters to capture the high-frequency signals induced by embedding. A high-level context branch is used to extract discriminative features between the embedded and normal regions. DBDH employs a detection head to directly detect the four vertices of the embedding region. In addition, we introduce an extra segmentation head to segment the mask of the embedding region during training. The segmentation head provides pixel-level supervision for model learning, facilitating better learning of the embedded signals. Based on two state-of-the-art invisible offline-to-online messaging methods, we construct two datasets and augmentation strategies for training and testing localization models. Extensive experiments demonstrate the superior performance of the proposed DBDH over existing methods. Comment: 7 pages, 6 figures (accepted by IJCNN 2024).
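To make the role of a high-pass filter concrete, the sketch below applies a single fixed 3x3 Laplacian kernel to a grayscale image. The Laplacian is an assumed stand-in; the abstract does not say which 62 filters DBDH uses, and `high_pass` is a hypothetical helper rather than part of that model.

```python
# Assumed example kernel: responds to local intensity changes, not to flat areas.
LAPLACIAN = [[0, -1, 0],
             [-1, 4, -1],
             [0, -1, 0]]


def high_pass(image, kernel=LAPLACIAN):
    """Valid-mode 2D correlation of a grayscale image (list of rows) with a 3x3 kernel."""
    height, width = len(image), len(image[0])
    return [
        [
            sum(kernel[i][j] * image[y + i][x + j] for i in range(3) for j in range(3))
            for x in range(width - 2)
        ]
        for y in range(height - 2)
    ]
```

Smooth image content produces a near-zero response while fine perturbations survive, which is why such filters help expose a high-frequency embedded signal.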