7,517 research outputs found
A Sparse Representation of Complete Local Binary Pattern Histogram for Human Face Recognition
Human face recognition has been a long-standing problem in computer vision
and pattern recognition. Facial analysis can be viewed as a two-fold problem,
namely (i) facial representation and (ii) classification. Many face
representations have been proposed so far; a well-known method is the Local
Binary Pattern (LBP), which has attracted growing interest. In this paper, we
treat the issues of face representation and classification in a novel manner.
On the one hand, we use a variant of LBP, the so-called Complete Local Binary
Pattern (CLBP), which differs from the basic LBP by coding a given local
region using both the central pixel and the sign-magnitude difference.
Moreover, most LBP-based descriptors use a fixed grid to code a given facial
image, a technique that is, in most cases, not robust to pose variation and
misalignment. To cope with this issue, a representative Multi-Resolution
Histogram (MH) decomposition is adopted in our work. On the other hand, once
the histograms of the considered images are extracted, we exploit their
sparsity to construct a so-called Sparse Representation Classifier (SRC) for
face classification. Experiments conducted on the ORL face database
demonstrate the superiority of our scheme over other popular
state-of-the-art techniques.
Comment: Accepted (but unattended) in IEEE-EMBS International Conference on
Biomedical and Health Informatics (BHI
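The CLBP coding described in this abstract thresholds, at each pixel, the sign and the magnitude of the differences to the neighbouring pixels and packs each into an 8-bit code. The following is a minimal sketch for radius-1 neighbourhoods; the global magnitude threshold and the neighbour ordering are illustrative assumptions, and the full descriptor also includes a centre-pixel component and circular neighbour interpolation.

```python
import numpy as np

def clbp_codes(img):
    """Sign (CLBP_S) and magnitude (CLBP_M) codes for the interior pixels
    of a grayscale image, using the 8 neighbours at radius 1."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    c = img[1:h-1, 1:w-1]                      # central pixels
    # neighbour offsets, clockwise from the top-left
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    diffs = [img[1+dy:h-1+dy, 1+dx:w-1+dx] - c for dy, dx in offs]
    thr = np.mean(np.abs(diffs))               # global magnitude threshold
    s_code = np.zeros_like(c, dtype=np.int64)
    m_code = np.zeros_like(c, dtype=np.int64)
    for p, d in enumerate(diffs):
        s_code += (d >= 0).astype(np.int64) << p            # sign bit
        m_code += (np.abs(d) >= thr).astype(np.int64) << p  # magnitude bit
    return s_code, m_code
```

Histograms of these codes over image regions would then serve as the face representation fed to the classifier.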
Face Recognition: From Traditional to Deep Learning Methods
Starting in the seventies, face recognition has become one of the most
researched topics in computer vision and biometrics. Traditional methods based
on hand-crafted features and traditional machine learning techniques have
recently been superseded by deep neural networks trained with very large
datasets. In this paper, we provide a comprehensive and up-to-date literature
review of popular face recognition methods, including both traditional
(geometry-based, holistic, feature-based and hybrid methods) and deep
learning methods.
From BoW to CNN: Two Decades of Texture Representation for Texture Classification
Texture is a fundamental characteristic of many types of images, and texture
representation is one of the essential and challenging problems in computer
vision and pattern recognition which has attracted extensive research
attention. Since 2000, texture representations based on Bag of Words (BoW) and
on Convolutional Neural Networks (CNNs) have been extensively studied with
impressive performance. Given this period of remarkable evolution, this paper
aims to present a comprehensive survey of advances in texture representation
over the last two decades. More than 200 major publications are cited in this
survey covering different aspects of the research, which includes (i) problem
description; (ii) recent advances in the broad categories of BoW-based,
CNN-based and attribute-based methods; and (iii) evaluation issues,
specifically benchmark datasets and state-of-the-art results. Looking back at
what has been achieved so far, the survey discusses open challenges and
directions for future research.
Comment: Accepted by IJC
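The BoW pipeline this survey covers quantises local descriptors against a learned codebook and pools the assignments into a histogram. Below is a minimal sketch of that encoding step only; the codebook and the local descriptors are assumed to be computed upstream (e.g. k-means over training patches), and hard assignment is just one of the encoding schemes the literature uses.

```python
import numpy as np

def bow_histogram(features, codebook):
    """Hard-assign each local feature to its nearest codeword and return
    the L1-normalised Bag-of-Words histogram."""
    # squared Euclidean distances, shape (num_features, num_codewords)
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    assignments = d2.argmin(axis=1)            # nearest codeword per feature
    hist = np.bincount(assignments, minlength=len(codebook)).astype(np.float64)
    return hist / hist.sum()                   # L1-normalised histogram
```

The resulting histogram is the image-level texture representation passed to a classifier.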
Feature Selection via Sparse Approximation for Face Recognition
Inspired by biological vision systems, over-complete local features with huge
cardinality have increasingly been used for face recognition over the last
decades. Accordingly, feature selection has become more and more important
and plays a critical role in face data description and recognition. In this paper,
we propose a trainable feature selection algorithm based on a regularized
framework for face recognition. By enforcing a sparsity penalty term on the
minimum squared error (MSE) criterion, we cast the feature selection problem
as a combinatorial sparse approximation problem, which can be solved by
greedy methods or convex relaxation methods. Moreover, within the same
framework, we propose a sparse Ho-Kashyap (HK) procedure to simultaneously
obtain the optimal sparse solution and the corresponding margin vector of the
MSE criterion. The proposed methods are used to select the most informative
Gabor features of face images for recognition, and experimental results on
benchmark face databases demonstrate the effectiveness of the proposed
methods.
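The sparsity-penalized MSE formulation described above can be illustrated through its l1 convex relaxation, solved here with iterative soft-thresholding (ISTA). This is a hedged sketch of the general idea only, not the paper's greedy or sparse Ho-Kashyap procedures; the `lam` value and iteration count are illustrative assumptions.

```python
import numpy as np

def sparse_mse_weights(X, y, lam=0.1, iters=500):
    """Minimise 0.5 * ||X w - y||^2 + lam * ||w||_1 with ISTA; features
    whose weights remain non-zero are the ones selected."""
    step = 1.0 / np.linalg.norm(X, ord=2) ** 2   # 1 / Lipschitz constant
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = X.T @ (X @ w - y)                 # gradient of the MSE term
        z = w - step * grad
        # soft-thresholding enforces the sparsity penalty
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return w
```

Features whose weights are driven to exactly zero by the soft-thresholding step are discarded; the survivors form the selected subset.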
A Review on Facial Micro-Expressions Analysis: Datasets, Features and Metrics
Facial micro-expressions are very brief, spontaneous facial expressions that
appear on the face of humans when they either deliberately or unconsciously
conceal an emotion. Micro-expressions have a shorter duration than
macro-expressions, which makes them more challenging for both humans and
machines. Over the past ten years, automatic micro-expression recognition has
attracted increasing attention from researchers in psychology, computer
science, security, neuroscience and other related disciplines. The aim of
this paper is to provide insights into automatic micro-expression analysis
and recommendations for future research. Many datasets released over the last
decade have facilitated the rapid growth of this field. However, comparison
across different datasets is difficult due to inconsistencies in experimental
protocols, features used and evaluation methods. To address these issues, we
review the datasets, features and performance metrics deployed in the
literature. Relevant challenges such as the spatiotemporal settings during
data collection, emotional classes versus objective classes in data
labelling, face regions in data analysis, standardisation of metrics and the
requirements for real-world implementation are discussed. We conclude by
proposing some promising future directions for advancing micro-expression
research.
Comment: Preprint submitted to IEEE Transaction
Spatiotemporal Recurrent Convolutional Networks for Recognizing Spontaneous Micro-expressions
Recently, the recognition task of spontaneous facial micro-expressions has
attracted much attention owing to its various real-world applications. Plenty
of handcrafted or learned features have been employed with a variety of
classifiers and have achieved promising performance for recognizing
micro-expressions. However, micro-expression recognition is still challenging
due to the subtle spatiotemporal changes involved. To exploit the merits of
deep learning, we propose a novel micro-expression recognition approach based
on deep recurrent convolutional networks, which captures the spatiotemporal
deformations of the micro-expression sequence. Specifically, the proposed
deep model consists of several recurrent convolutional layers for extracting
visual features and a classification layer for recognition. It is optimized
in an end-to-end manner and obviates manual feature design. To handle
sequential data, we explore two ways of extending the connectivity of
convolutional networks across the temporal domain, in which the
spatiotemporal deformations are modeled from the views of facial appearance
and geometry separately. In addition, to overcome the shortcomings of limited
and imbalanced training samples, temporal data augmentation strategies as
well as a balanced loss are jointly used for our deep network. Experiments on
three spontaneous micro-expression datasets verify the effectiveness of the
proposed approach compared to state-of-the-art methods.
Comment: Submitted to IEEE TM
A Two-Layer Local Constrained Sparse Coding Method for Fine-Grained Visual Categorization
Fine-grained categories are more difficult to distinguish than generic
categories due to their inter-class similarity and intra-class diversity.
Fine-grained visual categorization (FGVC) has therefore recently been
considered one of the most challenging problems in computer vision. A new
feature learning framework, based on a two-layer local constrained sparse
coding architecture, is proposed in this paper. The two-layer architecture is
introduced for learning intermediate-level features, and the local
constrained term is applied to guarantee the local smoothness of the coding
coefficients. To extract more discriminative information, local orientation
histograms, rather than raw pixels, are used as the input to sparse coding.
Moreover, a quick dictionary updating process is derived to further improve
the training speed. Experimental results show that our method achieves 85.29%
accuracy on the Oxford 102 flowers dataset and 67.8% accuracy on the
CUB-200-2011 bird dataset, and that the performance of our framework is
highly competitive with existing methods in the literature.
Comment: 19 pages, 12 figures, 8 table
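The "local constrained" term above penalises codewords that are far from the input feature, so coding coefficients stay locally smooth. A minimal single-layer sketch in the spirit of locality-constrained linear coding follows; this is not the paper's exact two-layer formulation, and `lam`, the distance scaling and the small ridge term are illustrative assumptions.

```python
import numpy as np

def local_constrained_code(x, B, lam=1e-2):
    """Encode feature x (length D) over dictionary B (K x D) with a
    locality penalty: codewords far from x receive larger regularisation,
    keeping the code sparse and locally smooth."""
    K = B.shape[0]
    d = np.linalg.norm(B - x, axis=1)            # distance to each codeword
    d = d / d.max()                              # scale the locality adaptor
    Z = B - x                                    # codewords shifted by x
    # regularised covariance; tiny ridge for numerical stability
    C = Z @ Z.T + lam * np.diag(d ** 2) + 1e-8 * np.eye(K)
    c = np.linalg.solve(C, np.ones(K))           # analytical solution
    return c / c.sum()                           # coefficients sum to one
```

Because the penalty grows with distance, most weight lands on the codewords nearest the input feature, which is exactly the local-smoothness behaviour the constrained term is meant to enforce.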
HEp-2 Cell Classification via Fusing Texture and Shape Information
Indirect Immunofluorescence (IIF) HEp-2 cell images provide effective
evidence for the diagnosis of autoimmune diseases. Recently, computer-aided
diagnosis of autoimmune diseases by IIF HEp-2 cell classification has
attracted great attention. However, the HEp-2 cell classification task is
quite challenging due to large intra-class variation and small between-class
variation. In this paper, we propose an effective and efficient approach for
the automatic classification of IIF HEp-2 cell images by fusing
multi-resolution texture information and richer shape information. To be
specific, we propose to: a) capture the multi-resolution texture information
by a novel Pairwise Rotation Invariant Co-occurrence of Local Gabor Binary
Pattern (PRICoLGBP) descriptor, b) depict the richer shape information by
using an Improved Fisher Vector (IFV) model with RootSIFT features sampled
from large image patches at multiple scales, and c) combine them properly. We
systematically evaluate the proposed approach on the IEEE International
Conference on Pattern Recognition (ICPR) 2012, IEEE International Conference
on Image Processing (ICIP) 2013 and ICPR 2014 contest data sets. The proposed
methods significantly outperform the winners of the ICPR 2012 and ICIP 2013
contests, and achieve performance comparable to the winner of the newly
released ICPR 2014 contest.
Comment: 11 pages, 7 figure
Towards Reading Hidden Emotions: A comparative Study of Spontaneous Micro-expression Spotting and Recognition Methods
Micro-expressions (MEs) are rapid, involuntary facial expressions which
reveal emotions that people do not intend to show. Studying MEs is valuable as
recognizing them has many important applications, particularly in forensic
science and psychotherapy. However, analyzing spontaneous MEs is very
challenging due to their short duration and low intensity. Automatic ME
analysis includes two tasks: ME spotting and ME recognition. For ME spotting,
previous studies have focused on posed rather than spontaneous videos. For ME
recognition, the performance of previous studies is low. To address these
challenges, we make the following contributions: (i) We propose the first
method for spotting spontaneous MEs in long videos (by exploiting feature
difference contrast). This method is training-free and works on arbitrary
unseen videos. (ii) We present an advanced ME recognition framework, which
outperforms previous work by a large margin on two challenging spontaneous ME
databases (SMIC and CASME II). (iii) We propose the first automatic ME
analysis system (MESR), which can spot and recognize MEs from spontaneous
video data. Finally, we show that our method outperforms humans in the ME
recognition task by a large margin, and achieves performance comparable to
humans at the very challenging task of spotting and then recognizing
spontaneous MEs.
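The feature difference contrast used for spotting in contribution (i) can be sketched as follows: each frame's feature is compared with the average of the frames k steps before and after it, and the local contrast of that difference peaks at brief, rapid changes. This is a simplified, hedged sketch; the interval k, the Euclidean dissimilarity and the contrast definition here are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def spot_scores(feats, k=2):
    """Feature-difference contrast per frame for a (frames x dims) feature
    matrix (e.g. per-frame LBP histograms); high scores mark candidate
    micro-expression frames."""
    n = len(feats)
    fd = np.zeros(n)
    for i in range(k, n - k):
        avg = 0.5 * (feats[i - k] + feats[i + k])   # average of tail frames
        fd[i] = np.linalg.norm(feats[i] - avg)      # feature difference
    contrast = np.zeros(n)
    for i in range(k, n - k):
        # contrast the difference with its temporal neighbourhood
        contrast[i] = fd[i] - 0.5 * (fd[i - k] + fd[i + k])
    return np.maximum(contrast, 0.0)                # keep only positive peaks
```

Because the score needs no labels, this style of spotting is training-free and can run on arbitrary unseen videos, as the abstract notes.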
What you need to know about the state-of-the-art computational models of object-vision: A tour through the models
Models of object vision have been of great interest in computer vision and
visual neuroscience. During the last decades, several models have been
developed to extract visual features from images for object recognition tasks.
Some of these were inspired by the hierarchical structure of the primate
visual system, while others were engineered models. The models vary in
several respects: models trained with supervision, models trained without
supervision, and models (e.g. feature extractors) that are fully hard-wired
and do not need training. Some of the models have a deep hierarchical
structure consisting of several layers, while others are shallow, with only
one or two layers of processing. More recently, new models have been
developed that are not hand-tuned but trained on millions of images, through
which they learn how to extract informative task-related features. Here I
survey all these different models and provide the reader with an intuitive,
as well as a more detailed, understanding of the underlying computations in
each of the models.
Comment: 36 pages, 22 figure