EXPERTNet Exigent Features Preservative Network for Facial Expression Recognition
Facial expressions carry essential cues to a human's state of mind and convey
rich information about an individual's actual feelings. Automatic facial
expression recognition is therefore an interesting and crucial task for
interpreting human cognitive states by machine. In this paper, we propose an
Exigent Features Preservative Network (EXPERTNet) to describe the features of
facial expressions. EXPERTNet extracts only pertinent features and neglects
others by using an exigent feature (ExFeat) block, which mainly comprises an
elective layer. Specifically, the elective layer selects the desired
edge-variation features from the previous layer's outcomes, which are
generated by applying filters of different sizes: 1 x 1, 3 x 3, 5 x 5 and
7 x 7. The different filter sizes help elicit both micro- and high-level
features, which enhances the learnability of the neurons. The ExFeat block
preserves the spatial structural information of the facial expression, which
makes it possible to discriminate between different classes of facial
expressions. Visual representations of the proposed method over different
facial expressions show the learning capability of the neurons in different
layers. Experimental and comparative analyses over four comprehensive datasets
(CK+, MMI, DISFA and GEMEP-FERA) confirm the better performance of the
proposed network compared to existing networks.
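The multi-scale filtering and elective selection described above can be sketched in a few lines (a minimal NumPy illustration; the random edge-variation filters and the per-position max rule are assumptions, not the paper's trained design):

```python
import numpy as np

def conv2d_same(img, kernel):
    """Naive 'same'-padded 2D cross-correlation in pure NumPy."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.empty(img.shape)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def elective_select(img, kernel_sizes=(1, 3, 5, 7), rng=None):
    """Apply one filter per kernel size, then keep, at each spatial
    position, the strongest response across scales (the 'election')."""
    rng = np.random.default_rng(0) if rng is None else rng
    responses = []
    for k in kernel_sizes:
        kernel = rng.standard_normal((k, k))
        if k > 1:
            kernel -= kernel.mean()        # zero mean -> edge-variation filter
        responses.append(conv2d_same(img.astype(float), kernel))
    return np.stack(responses).max(axis=0)  # element-wise max across scales

face = np.random.default_rng(1).random((16, 16))
selected = elective_select(face)
print(selected.shape)  # (16, 16)
```

Small kernels pick up micro-level variations while the 5 x 5 and 7 x 7 kernels respond to coarser structure, so the per-position maximum retains whichever scale responds most strongly.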
Deep Trans-layer Unsupervised Networks for Representation Learning
Learning features from massive unlabelled data is a prevalent topic for
high-level tasks in many machine learning applications. The recent large
improvements on benchmark datasets, achieved by increasingly complex
unsupervised learning methods and deep learning models with many parameters,
usually require many tedious tricks and much expertise to tune. However, the
filters learned by these complex architectures look quite similar to standard
hand-crafted features. In this paper, unsupervised learning methods, such as
PCA or the auto-encoder, are employed as building blocks to learn filter
banks at each layer. The lower-layer responses are transferred to the last
layer (trans-layer) to form a more complete representation that retains more
information. In addition, beneficial methods such as local contrast
normalization and whitening are added to the proposed deep trans-layer
networks to further boost performance. The trans-layer representations are
followed by block histograms with a binary encoding scheme to learn
translation- and rotation-invariant representations, which are utilized for
high-level tasks such as recognition and classification. Compared to
traditional deep learning methods, the implemented feature learning method has
far fewer parameters and is validated in several typical experiments, such as
digit recognition on MNIST and the MNIST variations, object recognition on the
Caltech 101 dataset, and face verification on the LFW dataset. Deep
trans-layer unsupervised learning achieves 99.45% accuracy on MNIST, 67.11%
(15 samples per class) and 75.98% (30 samples per class) accuracy on
Caltech 101, and 87.10% on LFW.
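The trans-layer idea, transferring every lower-layer response into the final representation, might be sketched as follows (the linear filter banks and the ReLU nonlinearity here are placeholder assumptions, not the paper's exact layers):

```python
import numpy as np

def trans_layer_representation(x, weight_banks):
    """Pass x through successive (hypothetical) filter banks and concatenate
    every intermediate response, so lower-layer information is not discarded."""
    responses, h = [], x
    for W in weight_banks:
        h = np.maximum(W @ h, 0.0)        # one unsupervised layer (ReLU assumed)
        responses.append(h)
    return np.concatenate(responses)      # lower layers transferred to the top

rng = np.random.default_rng(0)
banks = [rng.standard_normal((8, 12)), rng.standard_normal((6, 8))]
rep = trans_layer_representation(rng.standard_normal(12), banks)
print(rep.shape)  # (14,) = 8 + 6: responses from all layers retained
```

The concatenation is what distinguishes a trans-layer representation from the usual deep pipeline, where only the top layer's output survives.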
Pose-Invariant Face Alignment with a Single CNN
Face alignment has witnessed substantial progress in the last decade. One of
the recent focuses has been aligning a dense 3D face shape to face images with
large head poses. The dominant technology used is based on the cascade of
regressors, e.g., CNN, which has shown promising results. Nonetheless, the
cascade of CNNs suffers from several drawbacks, e.g., lack of end-to-end
training, hand-crafted features and slow training speed. To address these
issues, we propose a new layer, named visualization layer, that can be
integrated into the CNN architecture and enables joint optimization with
different loss functions. Extensive evaluation of the proposed method on
multiple datasets demonstrates state-of-the-art accuracy, while reducing the
training time by more than half compared to the typical cascade of CNNs. In
addition, we compare multiple CNN architectures with the visualization layer to
further demonstrate the advantage of its utilization.
Learning Channel Inter-dependencies at Multiple Scales on Dense Networks for Face Recognition
We propose a new deep network structure for unconstrained face recognition.
The proposed network integrates several key components together in order to
characterize complex data distributions, such as in unconstrained face images.
Inspired by recent progress in deep networks, we consider some important
concepts, including multi-scale feature learning, dense connections of network
layers, and weighting different network flows, for building our deep network
structure. The developed network is evaluated on unconstrained face matching,
showing its capability to learn the complex data distributions caused by face
images of varying quality.
PCANet: A Simple Deep Learning Baseline for Image Classification?
In this work, we propose a very simple deep learning network for image
classification which comprises only the very basic data processing components:
cascaded principal component analysis (PCA), binary hashing, and block-wise
histograms. In the proposed architecture, PCA is employed to learn multistage
filter banks. It is followed by simple binary hashing and block histograms for
indexing and pooling. This architecture is thus named the PCA network (PCANet)
and can be designed and learned extremely easily and efficiently. For
comparison and better understanding, we also introduce and study two simple
variations of the PCANet, namely the RandNet and the LDANet. They share the
same topology as PCANet, but their cascaded filters are either selected
randomly or learned from LDA. We have tested these basic networks extensively
on many benchmark visual datasets for different tasks, such as LFW for face
verification; the MultiPIE, Extended Yale B, AR and FERET datasets for face
recognition; and MNIST for handwritten digit recognition. Surprisingly, for
all tasks, such a seemingly naive PCANet model is on par with state-of-the-art
features, whether prefixed, highly hand-crafted or carefully learned (by
DNNs). Even more surprisingly, it sets new records for many classification
tasks on the Extended Yale B, AR and FERET datasets and the MNIST variations.
Additional experiments on other public datasets also demonstrate the potential
of the PCANet to serve as a simple but highly competitive baseline for texture
classification and object recognition.
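The core PCA filter-bank learning step can be sketched as follows (patch size and filter count are illustrative choices; the eigen-decomposition of mean-removed patches follows the description above):

```python
import numpy as np

def learn_pca_filters(images, k=7, n_filters=8):
    """Learn one PCANet-style filter bank: collect all k x k patches, remove
    each patch's mean, and take the leading eigenvectors as filters."""
    patches = []
    for img in images:
        H, W = img.shape
        for i in range(H - k + 1):
            for j in range(W - k + 1):
                p = img[i:i + k, j:j + k].ravel()
                patches.append(p - p.mean())       # patch-mean removal
    X = np.array(patches)                          # (n_patches, k*k)
    # Eigenvectors of the patch scatter matrix = PCA filters
    _, vecs = np.linalg.eigh(X.T @ X)
    filters = vecs[:, ::-1][:, :n_filters]         # top components first
    return filters.T.reshape(n_filters, k, k)

imgs = [np.random.default_rng(s).random((12, 12)) for s in range(3)]
bank = learn_pca_filters(imgs, k=7, n_filters=8)
print(bank.shape)  # (8, 7, 7)
```

Stacking two such stages (the second learned on the first stage's filter responses), then applying binary hashing and block histograms, yields the full pipeline; no backpropagation is involved at any point.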
Half-CNN: A General Framework for Whole-Image Regression
The Convolutional Neural Network (CNN) has achieved great success in image
classification. The classification model can also be utilized at image or patch
level for many other applications, such as object detection and segmentation.
In this paper, we propose a whole-image CNN regression model, obtained by
removing the fully-connected layer and training the network with continuous
feature maps. This is a generic regression framework that fits many
applications. We demonstrate the method on two tasks: simultaneous face
detection & segmentation, and scene saliency prediction. The results are
comparable with other models in the respective fields, using only a
small-scale network. Since the regression model is trained on corresponding
image / feature map pairs, there is no requirement of a uniform input size, as
opposed to the classification model. Our framework avoids classifier design, a
process that may introduce too much manual intervention in model development.
Yet it is highly related to the classification network and offers some
in-depth insight into CNN structures.
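The point about input size can be illustrated with an all-convolutional forward pass (a toy NumPy sketch with random filters, not the paper's trained network): without a fully-connected layer, the output map simply scales with the input.

```python
import numpy as np

def conv_valid(img, kernel):
    """Naive 'valid' 2D cross-correlation in pure NumPy."""
    kh, kw = kernel.shape
    H, W = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def fully_conv_regressor(img, kernels):
    """All-convolutional forward pass: the output is a continuous map whose
    size follows the input size, since no FC layer fixes the dimensions."""
    h = img.astype(float)
    for kern in kernels:
        h = np.maximum(conv_valid(h, kern), 0.0)
    return h

rng = np.random.default_rng(0)
kernels = [rng.standard_normal((3, 3)) for _ in range(2)]
out_small = fully_conv_regressor(rng.random((16, 16)), kernels)
out_large = fully_conv_regressor(rng.random((24, 24)), kernels)
print(out_small.shape, out_large.shape)  # (12, 12) (20, 20)
```

The same weights handle both inputs; a classification network with a fully-connected head would reject the second one.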
FH-GAN: Face Hallucination and Recognition using Generative Adversarial Network
Many factors affect visual face recognition, such as low-resolution images,
aging, illumination and pose variance. One of the most important problems is
low-resolution face images, which can result in poor face recognition
performance. Most general face recognition algorithms assume a sufficient
resolution for the face images. However, in practice many applications do not
have sufficient image resolution. Modern face hallucination models demonstrate
reasonable performance in reconstructing high-resolution images from their
corresponding low-resolution images. However, they do not consider
identity-level information during hallucination, which directly affects
recognition results on low-resolution faces. To address this issue, we propose
a Face Hallucination Generative Adversarial Network (FH-GAN) which improves
the quality of low-resolution face images and accurately recognizes those
low-quality images. Concretely, we make the following contributions: 1) we
propose the FH-GAN network, an end-to-end system that improves both face
hallucination and face recognition simultaneously. The novelty of the proposed
network lies in incorporating identity information into a GAN-based face
hallucination algorithm by combining it with a face recognition network for
identity preservation. 2) We also propose a new face hallucination network,
namely the Dense Sparse Network (DSNet), which improves upon the state of the
art in face hallucination. 3) We demonstrate the benefits of training the face
recognition network and the GAN-based DSNet jointly by reporting good results
on both face hallucination and recognition.
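The joint objective can be sketched as a weighted sum of reconstruction, adversarial and identity-preservation terms (the specific losses, the cosine identity term and the weights below are assumptions in the spirit of the abstract, not the paper's exact formulation):

```python
import numpy as np

def fh_gan_generator_loss(sr, hr, d_score_sr, emb_sr, emb_hr,
                          lam_adv=1e-3, lam_id=1e-2):
    """Hypothetical generator objective: pixel fidelity + adversarial realism
    + identity preservation via face-embedding similarity."""
    pixel = np.mean((sr - hr) ** 2)              # reconstruction (MSE)
    adv = -np.log(d_score_sr + 1e-12)            # fool the discriminator
    cos = emb_sr @ emb_hr / (np.linalg.norm(emb_sr) * np.linalg.norm(emb_hr))
    identity = 1.0 - cos                         # match the HR face embedding
    return pixel + lam_adv * adv + lam_id * identity

# Perfect reconstruction, identical embeddings, discriminator half-fooled:
emb = np.array([1.0, 0.0])
loss = fh_gan_generator_loss(np.zeros((4, 4)), np.zeros((4, 4)),
                             d_score_sr=0.5, emb_sr=emb, emb_hr=emb)
print(loss > 0)  # only the adversarial term remains
```

Backpropagating the identity term through the hallucination network is what couples the two tasks, which is the stated benefit of joint training.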
ADS-ME: Anomaly Detection System for Micro-expression Spotting
Micro-expressions (MEs) are infrequent and uncontrollable facial events that
can reveal emotional deception and appear in high-stakes environments. This
paper proposes an algorithm for spatiotemporal ME spotting. Since MEs are
unusual events, we treat them as abnormal patterns that diverge from expected
Normal Facial Behaviour (NFB) patterns. NFBs correspond to facial muscle
activations, eye blink/gaze events and mouth opening/closing movements that
are all facial deformations but not MEs. We propose a probabilistic model to
estimate the probability density function that models the spatiotemporal
distribution of NFB patterns. To rank the outputs, we compute the negative
log-likelihood, and we develop an adaptive thresholding technique to
distinguish MEs from NFBs. While working only with NFB data, the main
challenge is to capture intrinsic spatiotemporal features, hence we design a
recurrent convolutional autoencoder for feature representation. Finally, we
show that our system is superior to previous works for ME spotting.
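The spotting rule, negative log-likelihood under a model of normal facial behaviour plus an adaptive threshold, might look like this in miniature (a diagonal Gaussian and a mean-plus-k-sigma rule stand in for the paper's learned density and thresholding scheme):

```python
import numpy as np

def nll_scores(features, mean, var):
    """Negative log-likelihood under a diagonal Gaussian fit to NFB features."""
    return 0.5 * np.sum(np.log(2 * np.pi * var)
                        + (features - mean) ** 2 / var, axis=1)

def adaptive_threshold(train_scores, k=3.0):
    """One simple adaptive rule (an assumption, not the paper's exact scheme):
    flag frames whose NLL exceeds mean + k * std of normal-only scores."""
    return train_scores.mean() + k * train_scores.std()

rng = np.random.default_rng(0)
nfb = rng.normal(0, 1, size=(500, 4))          # normal facial behaviour
mu, var = nfb.mean(axis=0), nfb.var(axis=0) + 1e-6
tau = adaptive_threshold(nll_scores(nfb, mu, var))
me = rng.normal(4, 1, size=(5, 4))             # abnormal (ME-like) frames
print((nll_scores(me, mu, var) > tau).all())   # True: spotted as anomalies
```

Because the model is fit on NFB data alone, no labelled micro-expressions are needed at training time; anything the density assigns low likelihood to is a candidate ME.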
A PCA-Based Convolutional Network
In this paper, we propose a novel unsupervised deep learning model, called
PCA-based Convolutional Network (PCN). The architecture of PCN is composed of
several feature extraction stages and a nonlinear output stage. Particularly,
each feature extraction stage includes two layers: a convolutional layer and a
feature pooling layer. In the convolutional layer, the filter banks are simply
learned by PCA. In the nonlinear output stage, binary hashing is applied. For
the higher convolutional layers, the filter banks are learned from the feature
maps obtained in the previous stage. To test PCN, we conducted extensive
experiments on several challenging tasks, including handwritten digit
recognition, face recognition and texture classification. The results show
that PCN performs competitively with, or even better than, state-of-the-art
deep learning models. More importantly, since there is no backpropagation for
supervised fine-tuning, PCN is much more efficient than existing deep
networks.
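The nonlinear output stage, binary hashing of the filter responses, can be sketched as follows (the Heaviside binarization and bit-packing follow the PCANet family of methods; the input maps here are random stand-ins for real filter responses):

```python
import numpy as np

def binary_hash(feature_maps):
    """Binarize each of the L filter responses with a Heaviside step and pack
    the L bits at each position into one integer code in [0, 2**L)."""
    bits = (np.stack(feature_maps) > 0).astype(np.int64)
    weights = 2 ** np.arange(len(feature_maps))[:, None, None]
    return (bits * weights).sum(axis=0)

rng = np.random.default_rng(0)
maps = [rng.standard_normal((4, 4)) for _ in range(3)]
code = binary_hash(maps)
print(code.min() >= 0 and code.max() < 2 ** 3)  # True
```

Histogramming these integer codes over local blocks then gives the final descriptor, so the whole network is trained without any gradient-based fine-tuning.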
Convolutional herbal prescription building method from multi-scale facial features
In Traditional Chinese Medicine (TCM), facial features are an important basis
for diagnosis and treatment. A doctor of TCM can prescribe according to a
patient's physical indicators such as the face, tongue, voice, symptoms and
pulse. Previous works analyze and generate prescriptions according to
symptoms. However, to the best of our knowledge, no research has yet mined the
association between facial features and prescriptions. In this work, we use
deep learning methods to mine the relationship between a patient's face and
herbal (TCM) prescriptions, and propose to construct convolutional neural
networks that generate TCM prescriptions according to the patient's face
image. This is a novel and challenging task. In order to mine features at
different granularities of the face, we design a multi-scale convolutional
neural network based on a three-grained face, which mines the patient's facial
information from the organs, the local regions, and the entire face. Our
experiments show that convolutional neural networks can learn relevant
information from the face to prescribe, and that the multi-scale convolutional
neural networks based on the three-grained face perform better.
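The three-grained fusion can be sketched as feature extraction over organ, local-region and whole-face crops (the box coordinates and the summary-statistic "extractor" below are illustrative stand-ins for the trained CNN branches):

```python
import numpy as np

def three_grained_features(face, organ_boxes, region_boxes, extract):
    """Fuse features from three granularities: whole face, local regions,
    and organ crops, by extracting from each crop and concatenating."""
    crops = [face] + [face[t:b, l:r]
                      for (t, b, l, r) in region_boxes + organ_boxes]
    return np.concatenate([extract(c) for c in crops])

# A stand-in extractor (mean/std summary) instead of a trained CNN branch:
extract = lambda c: np.array([c.mean(), c.std()])
face = np.random.default_rng(0).random((64, 64))
feats = three_grained_features(face,
                               organ_boxes=[(20, 30, 15, 25)],
                               region_boxes=[(0, 32, 0, 64)],
                               extract=extract)
print(feats.shape)  # (6,) = 2 features x 3 granularities
```

A prescription head operating on the fused vector can then attend to evidence at whichever granularity is informative.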