Recognizing Partial Biometric Patterns
Biometric recognition of partially captured targets is challenging, since only
a few partial observations of an object are available for matching. In this
area, deep learning based methods are widely applied to match partially
captured objects, caused by occlusions, pose variations, or the subject being
partially out of view, in person re-identification and partial face
recognition. However, most current methods cannot identify an individual when
some parts of the object are unavailable, while the remaining methods are
specialized to certain constrained scenarios. To this end, we propose a robust
general framework for arbitrary biometric matching scenarios, free of
limitations on alignment and input size. In this work, we introduce a feature
post-processing step to handle the feature maps from a fully convolutional
network (FCN) and a dictionary learning based Spatial Feature Reconstruction
(SFR) to match feature maps of different sizes. Moreover, the batch hard
triplet loss function is applied to optimize the model. The applicability and
effectiveness of the proposed method are demonstrated by experiments on three
person re-identification datasets (Market1501, CUHK03, DukeMTMC-reID), two
partial person datasets (Partial REID and Partial iLIDS), and two partial face
datasets (CASIA-NIR-Distance and Partial LFW), on which it achieves
state-of-the-art performance in comparison with several competing approaches.
The code is released online and can be found at:
https://github.com/lingxiao-he/Partial-Person-ReID
Comment: 13 pages, 11 figures
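The batch hard triplet loss used to optimize the model has a standard form: for each anchor in a batch, take the farthest same-identity sample and the closest different-identity sample. A minimal NumPy sketch of that idea (the function name and the margin value are illustrative, not from the paper):

```python
import numpy as np

def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """Batch-hard triplet loss: for each anchor, use the hardest
    (farthest) positive and the hardest (closest) negative in the batch."""
    # Pairwise Euclidean distances between all embeddings in the batch.
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1) + 1e-12)

    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]

    losses = []
    for i in range(len(labels)):
        # Hardest positive: same label, excluding the anchor itself.
        pos = dist[i][same[i] & (np.arange(len(labels)) != i)]
        # Hardest negative: any sample with a different label.
        neg = dist[i][~same[i]]
        if len(pos) == 0 or len(neg) == 0:
            continue
        losses.append(max(0.0, pos.max() - neg.min() + margin))
    return float(np.mean(losses))
```

With well-separated identities the hinge is inactive and the loss is zero; the loss only penalizes batches where the hardest positive is not at least `margin` closer than the hardest negative.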
Face Identification using Local Ternary Tree Pattern based Spatial Structural Components
This paper reports a face identification system that makes use of a novel
local descriptor called the Local Ternary Tree Pattern (LTTP). Extracting a
distinctive local descriptor from a face image plays a crucial role in the
face identification task in the presence of a variety of face images,
including constrained, unconstrained, and plastic surgery images. LTTP is used
to extract robust spatial features that describe the various structural
components of a face. To extract the features, a ternary tree is formed for
each pixel with its eight neighbors in each block. The LTTP pattern can be
generated in four forms: LTTP Left Depth (LTTP LD), LTTP Left Breadth (LTTP
LB), LTTP Right Depth (LTTP RD), and LTTP Right Breadth (LTTP RB). The
encoding schemes of these patterns are very simple and efficient in terms of
both computational and time complexity. The proposed face identification
system is tested on six face databases, namely the UMIST, the JAFFE, the
extended Yale face B, the Plastic Surgery, the LFW, and the UFI. The
experimental evaluation demonstrates highly promising results on a variety of
faces captured under different environments. The proposed LTTP based system is
also compared with some local descriptors under identical conditions.
Comment: 13 pages, 5 figures, conference paper
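The abstract specifies only that a ternary tree is formed for each pixel with its eight neighbors; the tree construction itself is not detailed there. As a hedged illustration of the underlying ternary comparison (the threshold `t`, the function name, and the neighbor ordering are assumptions, analogous to the classic Local Ternary Pattern rather than the paper's exact scheme):

```python
import numpy as np

def ternary_codes(block, t=5):
    """Illustrative ternary encoding of each interior pixel against its
    eight neighbors: +1 if the neighbor is brighter than center + t,
    -1 if darker than center - t, 0 otherwise. (LTTP builds a ternary
    *tree* over these neighbors; this sketch shows only the ternary
    comparison that such a tree would be built from.)"""
    block = block.astype(int)
    h, w = block.shape
    codes = np.zeros((h - 2, w - 2, 8), dtype=int)
    # Eight neighbors, clockwise from the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for k, (dy, dx) in enumerate(offsets):
        nb = block[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        center = block[1:h - 1, 1:w - 1]
        codes[:, :, k] = np.where(nb > center + t, 1,
                         np.where(nb < center - t, -1, 0))
    return codes
```

The three-valued comparison with a dead zone of width 2t is what makes ternary patterns more tolerant of sensor noise than binary ones.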
Maximum Entropy Binary Encoding for Face Template Protection
In this paper we present a framework for secure identification using deep
neural networks and apply it to the task of template protection for face
authentication. We use deep convolutional neural networks (CNNs) to learn a
mapping from face images to maximum entropy binary (MEB) codes. The mapping is
robust enough to tackle the problem of exact matching, yielding the same code
for new samples of a user as the code assigned during training. These codes
are then hashed using any hash function that follows the random oracle model
(such as SHA-512) to generate protected face templates, similar to text-based
password protection. The algorithm makes no unrealistic assumptions and offers
high template security, cancelability, and state-of-the-art matching
performance. The efficacy of the approach is shown on the CMU-PIE, Extended
Yale B, and Multi-PIE face databases. We achieve high (~95%) genuine accept
rates (GAR) at zero false accept rate (FAR) with up to 1024 bits of template
security.
Comment: arXiv admin note: text overlap with arXiv:1506.0434
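Because matching is exact, the protection step reduces to hashing the MEB code: two identical codes always produce identical templates. A minimal sketch using Python's standard hashlib (the `salt` argument is an assumption added here to illustrate cancelability; the abstract specifies only a random-oracle hash such as SHA-512):

```python
import hashlib

def protect_template(meb_code: bytes, salt: bytes = b"") -> str:
    """Hash a maximum-entropy binary code with SHA-512 to produce a
    protected face template. Re-issuing with a different salt yields a
    fresh template for the same user, giving cancelability."""
    return hashlib.sha512(salt + meb_code).hexdigest()
```

Only the digest is stored; recovering the MEB code from it would require inverting the hash.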
Learning Disentangled Representations for Timbre and Pitch in Music Audio
Timbre and pitch are the two main perceptual properties of musical sounds.
Depending on the target applications, we sometimes prefer to focus on one of
them, while reducing the effect of the other. Researchers have managed to
hand-craft such timbre-invariant or pitch-invariant features using domain
knowledge and signal processing techniques, but it remains difficult to
disentangle them in the resulting feature representations. Drawing upon
state-of-the-art techniques in representation learning, we propose in this
paper two deep convolutional neural network models for learning disentangled
representations of musical timbre and pitch. Both models use encoders/decoders
and adversarial training to learn music representations, but the second model
additionally uses skip connections to deal with the pitch information. As music
is an art of time, the two models are supervised by frame-level instrument and
pitch labels using a new dataset collected from MuseScore. We compare the
result of the two disentangling models with a new evaluation protocol called
"timbre crossover", which leads to interesting applications in audio-domain
music editing. Via various objective evaluations, we show that the second model
can better change the instrumentation of a multi-instrument music piece without
greatly affecting the pitch structure. By disentangling timbre and pitch, we
envision that the model can contribute to generating more realistic music audio
as well.
Deep Learning Algorithms with Applications to Video Analytics for A Smart City: A Survey
Deep learning has recently achieved very promising results in a wide range of
areas such as computer vision, speech recognition and natural language
processing. It aims to learn hierarchical representations of data by using deep
architecture models. In a smart city, a lot of data (e.g. videos captured from
many distributed sensors) need to be automatically processed and analyzed. In
this paper, we review the deep learning algorithms applied to video analytics
in a smart city, in terms of different research topics: object detection,
object tracking, face recognition, image classification, and scene labeling.
Comment: 8 pages, 18 figures
When 3D-Aided 2D Face Recognition Meets Deep Learning: An extended UR2D for Pose-Invariant Face Recognition
Most face recognition works focus on specific modules or demonstrate a
research idea. This paper presents a pose-invariant 3D-aided 2D face
recognition system (UR2D) that is robust to pose variations as large as 90° by
leveraging deep learning technology. The architecture and the interface of
UR2D are described, and each module is introduced in detail. Extensive
experiments are conducted on the UHDB31 and IJB-A datasets, demonstrating that
UR2D outperforms existing 2D face recognition systems such as VGG-Face,
FaceNet, and a commercial off-the-shelf (COTS) system by at least 9% on the
UHDB31 dataset and 3% on the IJB-A dataset on average in face identification
tasks. UR2D also achieves state-of-the-art performance of 85% on the IJB-A
dataset, measured by the Rank-1 accuracy score from template matching. It
fills a gap by providing a 3D-aided 2D face recognition system whose results
are compatible with 2D face recognition systems using deep learning
techniques.
Comment: Submitted to Special Issue on Biometrics in the Wild, Image and
Vision Computing
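Rank-1 accuracy from template matching, as reported above, can be computed by assigning each probe the label of its most similar gallery template and counting correct assignments. A brief sketch (cosine similarity and the function name are assumptions; the abstract does not specify UR2D's exact matcher):

```python
import numpy as np

def rank1_accuracy(probe_feats, probe_labels, gallery_feats, gallery_labels):
    """Rank-1 identification accuracy: each probe is assigned the label
    of its nearest gallery template by cosine similarity, and we count
    how often that label is correct."""
    p = probe_feats / np.linalg.norm(probe_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = p @ g.T                                   # cosine similarity matrix
    nearest = np.asarray(gallery_labels)[sims.argmax(axis=1)]
    return float((nearest == np.asarray(probe_labels)).mean())
```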
Detection and Demarcation of Tumor using Vector Quantization in MRI images
Segmenting an MRI image into homogeneous texture regions representing
disparate tissue types is often a useful preprocessing step in the
computer-assisted detection of breast cancer. We therefore propose a new
algorithm to detect cancer in mammographic breast cancer images. In this paper
we propose segmentation using a vector quantization technique, employing the
Linde-Buzo-Gray (LBG) algorithm to segment MRI images. Initially, a codebook
of size 128 is generated for the MRI images. These code vectors are then
clustered into 8 clusters using the same LBG algorithm, and the resulting 8
images are displayed. This approach leads to neither over-segmentation nor
under-segmentation. For comparison, we also display the results of watershed
segmentation and of entropy-based segmentation using the Gray Level
Co-occurrence Matrix.
Comment: 8 pages
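The LBG codebook generation described above follows the classic split-and-refine procedure: start from the global mean vector, split every codevector into two perturbed copies, and refine with k-means-style updates until the target size is reached. A minimal NumPy sketch (the perturbation factor, iteration count, and multiplicative split are conventional choices, not specified in the paper):

```python
import numpy as np

def lbg_codebook(vectors, size=128, eps=0.01, iters=10):
    """Minimal Linde-Buzo-Gray (LBG) sketch: start from the global mean,
    repeatedly split every codevector by a small perturbation, then
    refine with k-means-style updates until the codebook reaches `size`."""
    vectors = np.asarray(vectors, dtype=float)
    codebook = vectors.mean(axis=0, keepdims=True)
    while len(codebook) < size:
        # Split: each codevector becomes two slightly perturbed copies.
        codebook = np.concatenate([codebook * (1 + eps),
                                   codebook * (1 - eps)])
        for _ in range(iters):
            # Assign each vector to its nearest codevector ...
            d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
            nearest = d.argmin(axis=1)
            # ... and move each codevector to the centroid of its cell.
            for k in range(len(codebook)):
                members = vectors[nearest == k]
                if len(members):
                    codebook[k] = members.mean(axis=0)
    return codebook
```

Segmentation then amounts to replacing each image vector with the index of its nearest codevector, so pixels quantized to the same cluster form one region.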
Face Identity Disentanglement via Latent Space Mapping
Learning disentangled representations of data is a fundamental problem in
artificial intelligence. Specifically, disentangled latent representations
allow generative models to control and compose the disentangled factors in the
synthesis process. Current methods, however, require extensive supervision and
training, or instead noticeably compromise quality. In this paper, we present
a method that learns how to represent data in a disentangled way, with minimal
supervision, manifested solely using available pre-trained networks. Our key
insight is to decouple the processes of disentanglement and synthesis by
employing a leading pre-trained unconditional image generator, such as
StyleGAN. By learning to map into its latent space, we leverage both its
state-of-the-art generative quality and its rich and expressive latent space,
without the burden of training it. We demonstrate our approach on the complex
and high-dimensional domain of human heads. We evaluate our method
qualitatively and quantitatively, and exhibit its success with
de-identification operations and with temporal identity coherency in image
sequences. Through this extensive experimentation, we show that our method
successfully disentangles identity from other facial attributes, surpassing
existing methods, even though they require more training and supervision.
Comment: 17 pages, 10 figures
Towards Fine-grained Human Pose Transfer with Detail Replenishing Network
Human pose transfer (HPT) is an emerging research topic with huge potential
in fashion design, media production, online advertising, and virtual reality.
For these applications, the visual realism of fine-grained appearance details
is crucial for production quality and user engagement. However, existing HPT
methods often suffer from three fundamental issues: detail deficiency, content
ambiguity, and style inconsistency, which severely degrade the visual quality
and realism of generated images. Aiming towards real-world applications, we
develop a more challenging yet practical HPT setting, termed Fine-grained
Human Pose Transfer (FHPT), with a higher focus on semantic fidelity and
detail replenishment. Concretely, we analyze the potential design flaws of
existing methods via an illustrative example, and establish the core FHPT
methodology by combining the ideas of content synthesis and feature transfer
in a mutually-guided fashion. Thereafter, we substantiate the proposed
methodology with a Detail Replenishing Network (DRN) and a corresponding
coarse-to-fine model training scheme. Moreover, we build a complete suite of
fine-grained evaluation protocols to address the challenges of FHPT in a
comprehensive manner, including semantic analysis, structural detection, and
perceptual quality assessment. Extensive experiments on the DeepFashion
benchmark dataset have verified the power of the proposed method against
state-of-the-art works, with a 12%-14% gain in top-10 retrieval recall, 5%
higher joint localization accuracy, and a nearly 40% gain in face identity
preservation. Moreover, the evaluation results offer further insights into the
subject matter, which could inspire many promising future works along this
direction.
Comment: IEEE TIP submission
Deep Secure Encoding: An Application to Face Recognition
In this paper we present Deep Secure Encoding, a framework for secure
classification using deep neural networks, and apply it to the task of
biometric template protection for faces. Using deep convolutional neural
networks (CNNs), we learn a robust mapping of face classes to high-entropy
secure codes. These secure codes are then hashed using standard hash functions
such as SHA-256 to generate secure face templates. The efficacy of the
approach is shown on two face databases, namely CMU-PIE and Extended Yale B,
where we achieve state-of-the-art matching performance, along with
cancelability and high security with no unrealistic assumptions. Furthermore,
the scheme can work in both identification and verification modes.