Recognizing Partial Biometric Patterns
Biometric recognition of partially captured targets is challenging, since only
a few partial observations of an object are available for matching. In this
area, deep learning based methods are widely applied to match partially
captured objects, caused by occlusions, pose variations, or the subject being
partially out of view, in person re-identification and partial face
recognition. However, most current methods cannot identify an individual when
some parts of the object are unavailable, while the remaining methods are
specialized to certain constrained scenarios. To this end, we propose a robust
general framework for arbitrary biometric matching scenarios, free of
limitations on alignment and input size. In this work, we introduce a feature
post-processing step to handle the feature maps from a fully convolutional
network (FCN) and a dictionary learning based Spatial Feature Reconstruction
(SFR) to match feature maps of different sizes. Moreover, the batch hard
triplet loss function is applied to optimize the model. The applicability and
effectiveness of the proposed method are demonstrated by experiments on three
person re-identification datasets (Market1501, CUHK03, DukeMTMC-reID), two
partial person datasets (Partial REID and Partial iLIDS), and two partial face
datasets (CASIA-NIR-Distance and Partial LFW), on which it achieves
state-of-the-art performance in comparison with several competing approaches.
The code is released online and can be found at:
https://github.com/lingxiao-he/Partial-Person-ReID
Comment: 13 pages, 11 figures
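The batch hard triplet loss used to optimize the model has a standard form: for each anchor in a batch, take the farthest same-identity sample and the closest different-identity sample. A minimal NumPy sketch of that idea (the function name and the margin value are illustrative, not from the paper):

```python
import numpy as np

def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """Batch-hard triplet loss: for each anchor, use the hardest
    (farthest) positive and the hardest (closest) negative in the batch."""
    # Pairwise Euclidean distances between all embeddings in the batch.
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1) + 1e-12)

    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]

    losses = []
    for i in range(len(labels)):
        # Hardest positive: same label, excluding the anchor itself.
        pos = dist[i][same[i] & (np.arange(len(labels)) != i)]
        # Hardest negative: any sample with a different label.
        neg = dist[i][~same[i]]
        if len(pos) == 0 or len(neg) == 0:
            continue
        losses.append(max(0.0, pos.max() - neg.min() + margin))
    return float(np.mean(losses))
```

With well-separated identities the hinge is inactive and the loss is zero; the loss only penalizes batches where the hardest positive is not at least `margin` closer than the hardest negative.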
Face Identification using Local Ternary Tree Pattern based Spatial Structural Components
This paper reports a face identification system that makes use of a novel
local descriptor called the Local Ternary Tree Pattern (LTTP). Extracting a
distinctive local descriptor from a face image plays a crucial role in the
face identification task in the presence of a variety of face images,
including constrained, unconstrained, and plastic surgery images. LTTP is used
to extract robust spatial features that describe the various structural
components of a face. To extract the features, a ternary tree is formed for
each pixel with its eight neighbors in each block. The LTTP pattern can be
generated in four forms: LTTP Left Depth (LTTP LD), LTTP Left Breadth (LTTP
LB), LTTP Right Depth (LTTP RD), and LTTP Right Breadth (LTTP RB). The
encoding schemes of these patterns are very simple and efficient in terms of
both computational and time complexity. The proposed face identification
system is tested on six face databases, namely the UMIST, the JAFFE, the
extended Yale face B, the Plastic Surgery, the LFW, and the UFI. The
experimental evaluation demonstrates highly promising results on a variety of
faces captured under different environments. The proposed LTTP based system is
also compared with some local descriptors under identical conditions.
Comment: 13 pages, 5 figures, conference paper
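The abstract specifies only that a ternary tree is formed for each pixel with its eight neighbors; the tree construction itself is not detailed there. As a hedged illustration of the underlying ternary comparison (the threshold `t`, the function name, and the neighbor ordering are assumptions, analogous to the classic Local Ternary Pattern rather than the paper's exact scheme):

```python
import numpy as np

def ternary_codes(block, t=5):
    """Illustrative ternary encoding of each interior pixel against its
    eight neighbors: +1 if the neighbor is brighter than center + t,
    -1 if darker than center - t, 0 otherwise. (LTTP builds a ternary
    *tree* over these neighbors; this sketch shows only the ternary
    comparison that such a tree would be built from.)"""
    block = block.astype(int)
    h, w = block.shape
    codes = np.zeros((h - 2, w - 2, 8), dtype=int)
    # Eight neighbors, clockwise from the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for k, (dy, dx) in enumerate(offsets):
        nb = block[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        center = block[1:h - 1, 1:w - 1]
        codes[:, :, k] = np.where(nb > center + t, 1,
                         np.where(nb < center - t, -1, 0))
    return codes
```

The three-valued comparison with a dead zone of width 2t is what makes ternary patterns more tolerant of sensor noise than binary ones.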
Maximum Entropy Binary Encoding for Face Template Protection
In this paper we present a framework for secure identification using deep
neural networks and apply it to the task of template protection for face
authentication. We use deep convolutional neural networks (CNNs) to learn a
mapping from face images to maximum entropy binary (MEB) codes. The mapping is
robust enough to tackle the problem of exact matching, yielding the same code
for new samples of a user as the code assigned during training. These codes
are then hashed using any hash function that follows the random oracle model
(such as SHA-512) to generate protected face templates, similar to text-based
password protection. The algorithm makes no unrealistic assumptions and offers
high template security, cancelability, and state-of-the-art matching
performance. The efficacy of the approach is shown on the CMU-PIE, Extended
Yale B, and Multi-PIE face databases. We achieve high (~95%) genuine accept
rates (GAR) at zero false accept rate (FAR) with up to 1024 bits of template
security.
Comment: arXiv admin note: text overlap with arXiv:1506.0434
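Because matching is exact, the protection step reduces to hashing the MEB code: two identical codes always produce identical templates. A minimal sketch using Python's standard hashlib (the `salt` argument is an assumption added here to illustrate cancelability; the abstract specifies only a random-oracle hash such as SHA-512):

```python
import hashlib

def protect_template(meb_code: bytes, salt: bytes = b"") -> str:
    """Hash a maximum-entropy binary code with SHA-512 to produce a
    protected face template. Re-issuing with a different salt yields a
    fresh template for the same user, giving cancelability."""
    return hashlib.sha512(salt + meb_code).hexdigest()
```

Only the digest is stored; recovering the MEB code from it would require inverting the hash.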
Learning Disentangled Representations for Timbre and Pitch in Music Audio
Timbre and pitch are the two main perceptual properties of musical sounds.
Depending on the target applications, we sometimes prefer to focus on one of
them, while reducing the effect of the other. Researchers have managed to
hand-craft such timbre-invariant or pitch-invariant features using domain
knowledge and signal processing techniques, but it remains difficult to
disentangle them in the resulting feature representations. Drawing upon
state-of-the-art techniques in representation learning, we propose in this
paper two deep convolutional neural network models for learning disentangled
representations of musical timbre and pitch. Both models use encoders/decoders
and adversarial training to learn music representations, but the second model
additionally uses skip connections to deal with the pitch information. As music
is an art of time, the two models are supervised by frame-level instrument and
pitch labels using a new dataset collected from MuseScore. We compare the
result of the two disentangling models with a new evaluation protocol called
"timbre crossover", which leads to interesting applications in audio-domain
music editing. Via various objective evaluations, we show that the second model
can better change the instrumentation of a multi-instrument music piece without
greatly affecting the pitch structure. By disentangling timbre and pitch, we
envision that the model can contribute to generating more realistic music audio
as well.
Deep Learning Algorithms with Applications to Video Analytics for A Smart City: A Survey
Deep learning has recently achieved very promising results in a wide range of
areas such as computer vision, speech recognition and natural language
processing. It aims to learn hierarchical representations of data by using deep
architecture models. In a smart city, a lot of data (e.g. videos captured from
many distributed sensors) need to be automatically processed and analyzed. In
this paper, we review the deep learning algorithms applied to video analytics
in a smart city, in terms of different research topics: object detection,
object tracking, face recognition, image classification, and scene labeling.
Comment: 8 pages, 18 figures
When 3D-Aided 2D Face Recognition Meets Deep Learning: An extended UR2D for Pose-Invariant Face Recognition
Most face recognition works focus on specific modules or demonstrate a
research idea. This paper presents a pose-invariant 3D-aided 2D face
recognition system (UR2D) that is robust to pose variations as large as 90° by
leveraging deep learning technology. The architecture and the interface of
UR2D are described, and each module is introduced in detail. Extensive
experiments are conducted on the UHDB31 and IJB-A datasets, demonstrating that
UR2D outperforms existing 2D face recognition systems such as VGG-Face,
FaceNet, and a commercial off-the-shelf (COTS) system by at least 9% on the
UHDB31 dataset and 3% on the IJB-A dataset on average in face identification
tasks. UR2D also achieves state-of-the-art performance of 85% on the IJB-A
dataset, measured by the Rank-1 accuracy score from template matching. It
fills a gap by providing a 3D-aided 2D face recognition system whose results
are compatible with 2D face recognition systems using deep learning
techniques.
Comment: Submitted to Special Issue on Biometrics in the Wild, Image and
Vision Computing
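Rank-1 accuracy from template matching, as reported above, can be computed by assigning each probe the label of its most similar gallery template and counting correct assignments. A brief sketch (cosine similarity and the function name are assumptions; the abstract does not specify UR2D's exact matcher):

```python
import numpy as np

def rank1_accuracy(probe_feats, probe_labels, gallery_feats, gallery_labels):
    """Rank-1 identification accuracy: each probe is assigned the label
    of its nearest gallery template by cosine similarity, and we count
    how often that label is correct."""
    p = probe_feats / np.linalg.norm(probe_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = p @ g.T                                   # cosine similarity matrix
    nearest = np.asarray(gallery_labels)[sims.argmax(axis=1)]
    return float((nearest == np.asarray(probe_labels)).mean())
```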
Detection and Demarcation of Tumor using Vector Quantization in MRI images
Segmenting an MRI image into homogeneous texture regions representing
disparate tissue types is often a useful preprocessing step in the
computer-assisted detection of breast cancer. We therefore propose a new
algorithm to detect cancer in mammographic breast cancer images. In this paper
we propose segmentation using a vector quantization technique, employing the
Linde-Buzo-Gray (LBG) algorithm to segment MRI images. Initially, a codebook
of size 128 is generated for the MRI images. These code vectors are then
clustered into 8 clusters using the same LBG algorithm, and the resulting 8
images are displayed. This approach leads to neither over-segmentation nor
under-segmentation. For comparison, we also display the results of watershed
segmentation and of entropy-based segmentation using the Gray Level
Co-occurrence Matrix.
Comment: 8 pages
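The LBG codebook generation described above follows the classic split-and-refine procedure: start from the global mean vector, split every codevector into two perturbed copies, and refine with k-means-style updates until the target size is reached. A minimal NumPy sketch (the perturbation factor, iteration count, and multiplicative split are conventional choices, not specified in the paper):

```python
import numpy as np

def lbg_codebook(vectors, size=128, eps=0.01, iters=10):
    """Minimal Linde-Buzo-Gray (LBG) sketch: start from the global mean,
    repeatedly split every codevector by a small perturbation, then
    refine with k-means-style updates until the codebook reaches `size`."""
    vectors = np.asarray(vectors, dtype=float)
    codebook = vectors.mean(axis=0, keepdims=True)
    while len(codebook) < size:
        # Split: each codevector becomes two slightly perturbed copies.
        codebook = np.concatenate([codebook * (1 + eps),
                                   codebook * (1 - eps)])
        for _ in range(iters):
            # Assign each vector to its nearest codevector ...
            d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
            nearest = d.argmin(axis=1)
            # ... and move each codevector to the centroid of its cell.
            for k in range(len(codebook)):
                members = vectors[nearest == k]
                if len(members):
                    codebook[k] = members.mean(axis=0)
    return codebook
```

Segmentation then amounts to replacing each image vector with the index of its nearest codevector, so pixels quantized to the same cluster form one region.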
Face Identity Disentanglement via Latent Space Mapping
Learning disentangled representations of data is a fundamental problem in
artificial intelligence. Specifically, disentangled latent representations
allow generative models to control and compose the disentangled factors in the
synthesis process. Current methods, however, require extensive supervision and
training, or instead noticeably compromise quality. In this paper, we present
a method that learns how to represent data in a disentangled way, with minimal
supervision, manifested solely using available pre-trained networks. Our key
insight is to decouple the processes of disentanglement and synthesis by
employing a leading pre-trained unconditional image generator, such as
StyleGAN. By learning to map into its latent space, we leverage both its
state-of-the-art generative quality and its rich and expressive latent space,
without the burden of training it. We demonstrate our approach on the complex
and high-dimensional domain of human heads. We evaluate our method
qualitatively and quantitatively, and exhibit its success with
de-identification operations and with temporal identity coherency in image
sequences. Through this extensive experimentation, we show that our method
successfully disentangles identity from other facial attributes, surpassing
existing methods, even though they require more training and supervision.
Comment: 17 pages, 10 figures
Towards Fine-grained Human Pose Transfer with Detail Replenishing Network
Human pose transfer (HPT) is an emerging research topic with huge potential
in fashion design, media production, online advertising, and virtual reality.
For these applications, the visual realism of fine-grained appearance details
is crucial for production quality and user engagement. However, existing HPT
methods often suffer from three fundamental issues: detail deficiency, content
ambiguity, and style inconsistency, which severely degrade the visual quality
and realism of generated images. Aiming towards real-world applications, we
develop a more challenging yet practical HPT setting, termed Fine-grained
Human Pose Transfer (FHPT), with a higher focus on semantic fidelity and
detail replenishment. Concretely, we analyze the potential design flaws of
existing methods via an illustrative example, and establish the core FHPT
methodology by combining the ideas of content synthesis and feature transfer
in a mutually-guided fashion. Thereafter, we substantiate the proposed
methodology with a Detail Replenishing Network (DRN) and a corresponding
coarse-to-fine model training scheme. Moreover, we build a complete suite of
fine-grained evaluation protocols to address the challenges of FHPT in a
comprehensive manner, including semantic analysis, structural detection, and
perceptual quality assessment. Extensive experiments on the DeepFashion
benchmark dataset have verified the power of the proposed method against
state-of-the-art works, with a 12%-14% gain in top-10 retrieval recall, 5%
higher joint localization accuracy, and a nearly 40% gain in face identity
preservation. Moreover, the evaluation results offer further insights into the
subject matter, which could inspire many promising future works along this
direction.
Comment: IEEE TIP submission
Deep Secure Encoding: An Application to Face Recognition
In this paper we present Deep Secure Encoding, a framework for secure
classification using deep neural networks, and apply it to the task of
biometric template protection for faces. Using deep convolutional neural
networks (CNNs), we learn a robust mapping of face classes to high-entropy
secure codes. These secure codes are then hashed using standard hash functions
such as SHA-256 to generate secure face templates. The efficacy of the
approach is shown on two face databases, namely CMU-PIE and Extended Yale B,
where we achieve state-of-the-art matching performance, along with
cancelability and high security with no unrealistic assumptions. Furthermore,
the scheme can work in both identification and verification modes.