67 research outputs found
Deformable Part Models are Convolutional Neural Networks
Deformable part models (DPMs) and convolutional neural networks (CNNs) are
two widely used tools for visual recognition. They are typically viewed as
distinct approaches: DPMs are graphical models (Markov random fields), while
CNNs are "black-box" non-linear classifiers. In this paper, we show that a DPM
can be formulated as a CNN, thus providing a novel synthesis of the two ideas.
Our construction involves unrolling the DPM inference algorithm and mapping
each step to an equivalent (and at times novel) CNN layer. From this
perspective, it becomes natural to replace the standard image features used in
DPM with a learned feature extractor. We call the resulting model DeepPyramid
DPM and experimentally validate it on PASCAL VOC. DeepPyramid DPM significantly
outperforms DPMs based on histograms of oriented gradients features (HOG) and
slightly outperforms a comparable version of the recently introduced R-CNN
detection system, while running an order of magnitude faster.
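
One step that such an unrolled DPM needs is a pooling operation that lets each part shift away from its anchor at a cost. The abstract does not give the layer definitions, so the following is only a minimal sketch of deformation-aware max pooling under assumed conventions: the function name `dt_pool`, the four-element quadratic cost weights `w`, and the finite search `radius` are all illustrative choices, not the authors' implementation.

```python
import numpy as np

def dt_pool(scores, w, radius):
    """Deformation-aware max pooling for a single part score map (sketch).

    out[y, x] = max over displacements (dy, dx) within `radius` of
                scores[y + dy, x + dx] - cost(dx, dy),
    where cost = w[0]*dx^2 + w[1]*dx + w[2]*dy^2 + w[3]*dy (assumed form).
    """
    scores = np.asarray(scores, dtype=float)
    h, wd = scores.shape
    # Pad with -inf so out-of-image placements never win the max.
    padded = np.full((h + 2 * radius, wd + 2 * radius), -np.inf)
    padded[radius:radius + h, radius:radius + wd] = scores
    out = np.full((h, wd), -np.inf)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            cost = w[0] * dx * dx + w[1] * dx + w[2] * dy * dy + w[3] * dy
            # shifted[y, x] equals scores[y + dy, x + dx] where defined.
            shifted = padded[radius + dy:radius + dy + h,
                             radius + dx:radius + dx + wd]
            out = np.maximum(out, shifted - cost)
    return out

# Toy part response with one strong peak in the centre.
part_scores = np.zeros((5, 5))
part_scores[2, 2] = 10.0
pooled = dt_pool(part_scores, w=(1.0, 0.0, 1.0, 0.0), radius=2)
```

With all weights set to zero the deformation cost vanishes and the function reduces to ordinary max pooling over a (2·radius+1)² window, which is one way to see the pooling step as a generalization of a standard CNN layer.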
A Deep Pyramid Deformable Part Model for Face Detection
We present a face detection algorithm based on Deformable Part Models and
deep pyramidal features. The proposed method, called DP2MFD, is able to detect
faces of various sizes and poses in unconstrained conditions. It reduces the
gap in training and testing of DPM on deep features by adding a normalization
layer to the deep convolutional neural network (CNN). Extensive experiments on
four publicly available unconstrained face detection datasets show that our
method is able to capture the meaningful structure of faces and performs
significantly better than many competitive face detection algorithms.
Robust Minutiae Extractor: Integrating Deep Networks and Fingerprint Domain Knowledge
We propose a fully automatic minutiae extractor, called MinutiaeNet, based on
deep neural networks with compact feature representation for fast comparison of
minutiae sets. Specifically, first a network, called CoarseNet, estimates the
minutiae score map and minutiae orientation based on a convolutional neural
network and fingerprint domain knowledge (enhanced image, orientation field,
and segmentation map). Subsequently, another network, called FineNet, refines
the candidate minutiae locations based on the score map. We demonstrate the
effectiveness of using the fingerprint domain knowledge together with the deep
networks. Experimental results on both latent (NIST SD27) and plain (FVC 2004)
public domain fingerprint datasets provide comprehensive empirical support for
the merits of our method. Further, our method finds minutiae sets that are
better in terms of precision and recall than the state of the art on
these two datasets. Given the lack of annotated fingerprint datasets with
minutiae ground truth, the proposed approach to robust minutiae detection will
be useful to train network-based fingerprint matching algorithms as well as for
evaluating fingerprint individuality at scale. MinutiaeNet is implemented in
TensorFlow: https://github.com/luannd/MinutiaeNet
Comment: Accepted to International Conference on Biometrics (ICB 2018)
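
The two-stage idea above (a score map first, then refinement of candidate locations) implies some step that turns the score map into discrete candidates. The abstract does not say how this is done, so the following is only a minimal sketch of one common choice, thresholded local-maximum picking. The function name `score_map_peaks` and the `threshold` and `radius` parameters are hypothetical and are not taken from the MinutiaeNet code.

```python
import numpy as np

def score_map_peaks(score_map, threshold=0.5, radius=1):
    """Return (row, col, score) for local maxima at or above `threshold`.

    A pixel is kept if no pixel in its (2*radius+1)^2 neighbourhood has a
    higher score. This is an assumed stand-in for candidate extraction.
    """
    score_map = np.asarray(score_map, dtype=float)
    h, w = score_map.shape
    padded = np.pad(score_map, radius, mode="constant",
                    constant_values=-np.inf)
    # Sliding-window maximum built from shifted views of the padded map.
    local_max = np.full((h, w), -np.inf)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            local_max = np.maximum(local_max, padded[dy:dy + h, dx:dx + w])
    keep = (score_map >= local_max) & (score_map >= threshold)
    ys, xs = np.nonzero(keep)
    return [(int(y), int(x), float(score_map[y, x])) for y, x in zip(ys, xs)]

# Toy score map: two isolated peaks and one weaker neighbour that is suppressed.
toy = np.zeros((5, 5))
toy[1, 1], toy[1, 2], toy[3, 3] = 0.9, 0.8, 0.7
candidates = score_map_peaks(toy)
```

In a pipeline of the kind described, each returned location would then be passed to a second-stage classifier for refinement; that stage is not sketched here.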
When Kernel Methods meet Feature Learning: Log-Covariance Network for Action Recognition from Skeletal Data
Human action recognition from skeletal data is a hot research topic and
important in many open domain applications of computer vision, thanks to
recently introduced 3D sensors. In the literature, naive methods simply
transfer off-the-shelf techniques from video to the skeletal representation.
However, the current state of the art is contested between two different
paradigms: kernel-based methods and feature learning with (recurrent) neural
networks. Both approaches show strong performance, yet they exhibit heavy, but
complementary, drawbacks. Motivated by this fact, our work aims at combining
the best of the two paradigms, by proposing an approach where a
shallow network is fed with a covariance representation. Our intuition is that,
as long as the dynamics are effectively modeled, there is no need for the
classification network to be deep or recurrent in order to score favorably. We
validate this hypothesis in a broad experimental analysis over 6 publicly
available datasets.
Comment: 2017 IEEE Computer Vision and Pattern Recognition (CVPR) Workshop
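
A covariance representation of a skeletal sequence, as mentioned above, can be sketched as follows. This is a generic log-Euclidean covariance descriptor, not necessarily the authors' exact formulation: the function name `log_covariance`, the `(T, D)` input layout (T frames, D stacked joint coordinates), and the regularizer `eps` are assumptions made for illustration.

```python
import numpy as np

def log_covariance(seq, eps=1e-6):
    """Map a (T, D) sequence to a fixed-length log-covariance feature vector.

    Steps: covariance over time, small ridge to keep it positive definite,
    matrix logarithm via eigendecomposition, then the upper triangle.
    """
    seq = np.asarray(seq, dtype=float)
    cov = np.cov(seq, rowvar=False)            # (D, D), columns are variables
    cov = cov + eps * np.eye(cov.shape[0])     # assumed regularization
    vals, vecs = np.linalg.eigh(cov)           # symmetric eigendecomposition
    log_cov = (vecs * np.log(vals)) @ vecs.T   # V diag(log lambda) V^T
    iu = np.triu_indices(log_cov.shape[0])
    return log_cov[iu]                         # length D * (D + 1) / 2

# Toy sequence: 50 frames of 2 joints with 3 coordinates each (D = 6).
rng = np.random.default_rng(0)
features = log_covariance(rng.normal(size=(50, 6)))
```

The resulting vector has a fixed length regardless of the number of frames, so it can be fed to a shallow feed-forward classifier without any recurrent component.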
Multi-view Face Detection Using Deep Convolutional Neural Networks
In this paper we consider the problem of multi-view face detection. While
there has been significant research on this problem, current state-of-the-art
approaches for this task require annotation of facial landmarks, e.g. TSM [25],
or annotation of face poses [28, 22]. They also require training dozens of
models to fully capture faces in all orientations, e.g. 22 models in the HeadHunter
method [22]. In this paper we propose Deep Dense Face Detector (DDFD), a method
that does not require pose/landmark annotation and is able to detect faces in a
wide range of orientations using a single model based on deep convolutional
neural networks. The proposed method has minimal complexity; unlike other
recent deep learning object detection methods [9], it does not require
additional components such as segmentation, bounding-box regression, or SVM
classifiers. Furthermore, we analyzed scores of the proposed face detector for
faces in different orientations and found that 1) the proposed method is able
to detect faces from different angles and can handle occlusion to some extent,
2) there seems to be a correlation between the distribution of positive examples
in the training set and scores of the proposed face detector. The latter
suggests that the proposed method's performance can be further improved by using
better sampling strategies and more sophisticated data augmentation techniques.
Evaluations on popular face detection benchmark datasets show that our
single-model face detector algorithm has similar or better performance compared
to the previous methods, which are more complex and require annotations of
either different poses or facial landmarks.
Comment: in International Conference on Multimedia Retrieval 2015 (ICMR)