128 research outputs found
Efficient image copy detection using multi-scale fingerprints
Inspired by the multi-resolution histogram, we propose
a multi-scale SIFT descriptor to improve the discriminability.
A series of SIFT descriptions at different scales is first
acquired by varying the actual size of each spatial bin. Then
principal component analysis (PCA) is employed to reduce them
to low-dimensional vectors, which are further combined into one
128-dimensional multi-scale SIFT description. Next, an entropy
maximization based binarization is employed to encode the
descriptions into binary codes called fingerprints for indexing
the local features. Furthermore, an efficient search architecture
consisting of lookup tables and inverted image ID list is designed
to improve the query speed. Since fingerprint building has
low complexity, this method is very efficient and scalable to
very large databases. In addition, the multi-scale fingerprints
are very discriminative such that the copies can be effectively
distinguished from similar objects, which leads to an improved
performance in the detection of copies. The experimental evaluation shows that our approach outperforms state-of-the-art
methods.
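The binarization and lookup steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the entropy-maximization procedure, so the sketch assumes per-dimension median thresholds (which split the training values roughly in half, so each bit is close to maximum entropy). The PCA step is omitted, and the function names `fit_thresholds`, `fingerprint`, `build_index`, and `query` are hypothetical.

```python
from collections import defaultdict
from statistics import median


def fit_thresholds(descriptors):
    """Assumed binarization rule: one median threshold per dimension,
    so each bit is set for roughly half of the training descriptors."""
    dims = len(descriptors[0])
    return [median(d[i] for d in descriptors) for i in range(dims)]


def fingerprint(descriptor, thresholds):
    """Pack one bit per dimension into a single integer code."""
    code = 0
    for value, threshold in zip(descriptor, thresholds):
        code = (code << 1) | (1 if value > threshold else 0)
    return code


def build_index(image_descriptors, thresholds):
    """Lookup table mapping each fingerprint to an inverted set of image IDs."""
    index = defaultdict(set)
    for image_id, descriptors in image_descriptors.items():
        for d in descriptors:
            index[fingerprint(d, thresholds)].add(image_id)
    return index


def query(index, descriptors, thresholds):
    """Count, per image, how many query fingerprints match exactly."""
    votes = defaultdict(int)
    for d in descriptors:
        for image_id in index.get(fingerprint(d, thresholds), ()):
            votes[image_id] += 1
    return sorted(votes.items(), key=lambda item: -item[1])


# Toy database with 3-dimensional descriptors (real ones would be 128-dimensional).
db = {"a": [[0.1, 0.9, 0.2], [0.8, 0.1, 0.7]], "b": [[0.9, 0.8, 0.9]]}
thresholds = fit_thresholds([d for ds in db.values() for d in ds])
index = build_index(db, thresholds)
ranked = query(index, [[0.15, 0.95, 0.1]], thresholds)
```

Exact-match lookup keeps the query cost independent of the number of indexed images per fingerprint bucket; a practical system would likely also tolerate a few flipped bits, which this sketch does not attempt.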
Attention-Aware Network with Latent Semantic Analysis for Clothing Invariant Gait Recognition
Gait recognition is a complicated task because co-factors such as carrying conditions, clothing, viewpoints, and surfaces change the appearance of gait to varying degrees. Among those co-factors, clothing is the most challenging one in the area. Conventional methods proposed for clothing-invariant gait recognition show that the body parts and the underlying relationships among them are important for gait recognition. The attention mechanism performs remarkably well at highlighting discriminative regions. Meanwhile, latent semantic analysis is known for its ability to capture latent semantic variables that represent the underlying attributes and to capture relationships from the raw input. Thus, we propose a new CNN-based method that leverages the advantages of latent semantic analysis and the attention mechanism. Based on discriminative features extracted using the attention module and the latent semantic analysis module respectively, a multi-modal fusion method is proposed to fuse those features at the decision level for its high fault tolerance. Experiments on the most challenging clothing variation dataset, OU-ISIR Treadmill Dataset B, show that our method outperforms other state-of-the-art gait recognition approaches.
Improving Visual Quality and Transferability of Adversarial Attacks on Face Recognition Simultaneously with Adversarial Restoration
Adversarial face examples possess two critical properties: Visual Quality and
Transferability. However, existing approaches rarely address these properties
simultaneously, leading to subpar results. To address this issue, we propose a
novel adversarial attack technique known as Adversarial Restoration
(AdvRestore), which enhances both visual quality and transferability of
adversarial face examples by leveraging a face restoration prior. In our
approach, we initially train a Restoration Latent Diffusion Model (RLDM)
designed for face restoration. Subsequently, we employ the inference process of
RLDM to generate adversarial face examples. The adversarial perturbations are
applied to the intermediate features of RLDM. Additionally, by treating RLDM
face restoration as a sibling task, the transferability of the generated
adversarial face examples is further improved. Our experimental results
validate the effectiveness of the proposed attack method.
Comment: © 2023 IEEE. Personal use of this material is permitted.
Permission from IEEE must be obtained for all other uses, in any current or
future media, including reprinting/republishing this material for advertising
or promotional purposes, creating new collective works, for resale or
redistribution to servers or lists, or reuse of any copyrighted component of
this work in other works.
Improving the Transferability of Adversarial Attacks on Face Recognition with Beneficial Perturbation Feature Augmentation
Face recognition (FR) models can be easily fooled by adversarial examples,
which are crafted by adding imperceptible perturbations on benign face images.
To improve the transferability of adversarial face examples, we propose a novel
attack method called Beneficial Perturbation Feature Augmentation Attack
(BPFA), which reduces the overfitting of adversarial examples to surrogate FR
models by constantly generating new models that have an effect similar to hard
samples to craft the adversarial examples. Specifically, in the
backpropagation, BPFA records the gradients on pre-selected features and uses
the gradient on the input image to craft the adversarial example. In the next
forward propagation, BPFA leverages the recorded gradients to add perturbations
(i.e., beneficial perturbations) that can be pitted against the adversarial
example on their corresponding features. The optimization process of the
adversarial example and the optimization process of the beneficial
perturbations added on the features correspond to a minimax two-player game.
Extensive experiments demonstrate that BPFA can significantly boost the
transferability of adversarial attacks on FR.
SSyncOA: Self-synchronizing Object-aligned Watermarking to Resist Cropping-paste Attacks
Modern image processing tools have made it easy for attackers to crop the
region or object of interest in images and paste it into other images. The
challenge this cropping-paste attack poses to the watermarking technology is
that it breaks the synchronization of the image watermark, introducing multiple
superimposed desynchronization distortions, such as rotation, scaling, and
translation. However, current watermarking methods can only resist a single
type of desynchronization and cannot be applied to protect the object's
copyright under the cropping-paste attack. With the finding that the key to
resisting the cropping-paste attack lies in robust features of the object to
protect, this paper proposes a self-synchronizing object-aligned watermarking
method, called SSyncOA. Specifically, we first constrain the watermarked region
to be aligned with the protected object, and then synchronize the watermark's
translation, rotation, and scaling distortions by normalizing the object
invariant features, i.e., its centroid, principal orientation, and minimum
bounding square, respectively. To embed the watermark in the protected
object, we introduce the object-aligned watermarking model, which incorporates
the real cropping-paste attack into the encoder-noise layer-decoder pipeline
and is optimized end-to-end. Besides, we illustrate the effect of different
desynchronization distortions on the watermark training, which confirms the
necessity of the self-synchronization process. Extensive experiments
demonstrate the superiority of our method over other state-of-the-art methods.
Comment: 7 pages, 5 figures (accepted by ICME 2024)
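The self-synchronization idea in the SSyncOA abstract (normalizing an object by its centroid, principal orientation, and bounding square) can be illustrated with a small sketch. This is not the paper's method: it assumes the object is given as a list of (x, y) mask pixel coordinates, estimates the principal orientation from second-order central moments, and uses a centroid-centred bounding square as a simplification of the minimum bounding square. The function names `object_pose` and `normalize` are hypothetical.

```python
import math


def object_pose(points):
    """Return the centroid and principal orientation (radians) of a point set."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Second-order central moments of the object mask.
    mu20 = sum((x - cx) ** 2 for x, _ in points) / n
    mu02 = sum((y - cy) ** 2 for _, y in points) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in points) / n
    theta = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    return (cx, cy), theta


def normalize(points):
    """Undo translation, rotation, and scale so the object lies in [-1, 1]^2."""
    (cx, cy), theta = object_pose(points)
    c, s = math.cos(-theta), math.sin(-theta)
    rotated = [
        ((x - cx) * c - (y - cy) * s, (x - cx) * s + (y - cy) * c)
        for x, y in points
    ]
    # Half side length of a centroid-centred square that covers the object.
    half = max(max(abs(x), abs(y)) for x, y in rotated)
    if half == 0:
        return rotated
    return [(x / half, y / half) for x, y in rotated]


# The same diagonal segment, and a translated and scaled copy of it.
original = normalize([(0, 0), (1, 1), (2, 2)])
moved = normalize([(10, 10), (12, 12), (14, 14)])
```

Both inputs map to the same normalized coordinates, which is the property a watermark decoder would rely on. Note that an orientation derived from second moments is ambiguous by 180 degrees, so a real system would need an extra cue to resolve the direction.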
Balanced Deep Supervised Hashing
Recently, Convolutional Neural Network (CNN) based hashing methods have achieved promising performance on image retrieval tasks. However, simultaneously resolving the discrepancy between quantization error minimization and discriminability maximization of the network outputs remains an open problem. Motivated by this concern, we propose a novel Balanced Deep Supervised Hashing (BDSH) method based on variant posterior probability to learn compact, discriminability-preserving binary codes for large-scale image data. Unlike previous works, BDSH can search for an equilibrium point within the discrepancy. To this end, a carefully designed objective function is used to maximize the discriminability of the output space with the variant posterior probability of the pair-wise label. A quantization regularizer is used as a relaxation from real-valued outputs to the desired discrete values (e.g., -1/+1). Extensive experiments on benchmark datasets show that our method yields state-of-the-art image retrieval performance from various perspectives.
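To make the role of a quantization regularizer concrete, here is a minimal sketch. The abstract does not give the form of the BDSH regularizer, so this assumes one common choice: the mean squared distance of each output's magnitude from 1, which is zero exactly when every output is -1 or +1. The names `quantization_penalty` and `binarize` are hypothetical.

```python
def quantization_penalty(outputs):
    """Assumed regularizer: mean of (|h| - 1)^2 over real-valued outputs h."""
    return sum((abs(h) - 1.0) ** 2 for h in outputs) / len(outputs)


def binarize(outputs):
    """Final hash code: the sign of each real-valued output."""
    return [1 if h >= 0 else -1 for h in outputs]


relaxed = [0.5, -1.5]
penalty = quantization_penalty(relaxed)  # each term is 0.25, so the mean is 0.25
code = binarize(relaxed)
```

In training, a term like this would be weighted against the discriminability objective; a larger weight pushes outputs toward -1/+1 at the cost of less freedom for the similarity loss, which is the trade-off the abstract refers to.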
DBDH: A Dual-Branch Dual-Head Neural Network for Invisible Embedded Regions Localization
Embedding invisible hyperlinks or hidden codes in images to replace QR codes
has become a hot topic recently. This technology requires first localizing the
embedded region in the captured photos before decoding. Existing methods that
train models to find the invisible embedded region struggle to obtain accurate
localization results, leading to degraded decoding accuracy. This limitation is
primarily because CNNs are sensitive to low-frequency signals, while
the embedded signal is typically in the high-frequency form. Based on this,
this paper proposes a Dual-Branch Dual-Head (DBDH) neural network tailored for
the precise localization of invisible embedded regions. Specifically, DBDH uses
a low-level texture branch containing 62 high-pass filters to capture the
high-frequency signals induced by embedding. A high-level context branch is
used to extract discriminative features between the embedded and normal
regions. DBDH employs a detection head to directly detect the four vertices of
the embedding region. In addition, we introduce an extra segmentation head to
segment the mask of the embedding region during training. The segmentation head
provides pixel-level supervision for model learning, facilitating better
learning of the embedded signals. Based on two state-of-the-art invisible
offline-to-online messaging methods, we construct two datasets and augmentation
strategies for training and testing localization models. Extensive experiments
demonstrate the superior performance of the proposed DBDH over existing
methods.
Comment: 7 pages, 6 figures (accepted by IJCNN 2024)
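The idea of a high-pass filter bank in the texture branch can be illustrated with a single filter. This sketch is an assumption for illustration only: the abstract does not list the 62 filters, so a standard 3x3 Laplacian-style kernel stands in for one of them, and `high_pass` is a hypothetical helper operating on a grayscale image stored as nested lists.

```python
# Stand-in high-pass kernel (Laplacian-style); its weights sum to zero,
# so flat (low-frequency) regions produce a zero response.
KERNEL = [
    [0, -1, 0],
    [-1, 4, -1],
    [0, -1, 0],
]


def high_pass(image, kernel=KERNEL):
    """Slide the kernel over the image (valid region only, no padding).

    This computes a cross-correlation, which equals convolution here
    because the kernel is symmetric.
    """
    height, width = len(image), len(image[0])
    k = len(kernel)
    output = []
    for r in range(height - k + 1):
        row = []
        for c in range(width - k + 1):
            row.append(
                sum(
                    kernel[i][j] * image[r + i][c + j]
                    for i in range(k)
                    for j in range(k)
                )
            )
        output.append(row)
    return output


flat = [[5] * 4 for _ in range(4)]          # constant region
spike = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]   # isolated high-frequency detail
flat_response = high_pass(flat)
spike_response = high_pass(spike)
```

The flat region is suppressed while the isolated pixel is amplified, which is the behaviour that would let a texture branch pick up faint embedded signals that smooth image content hides.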
- …
