HyperFace: A Deep Multi-task Learning Framework for Face Detection, Landmark Localization, Pose Estimation, and Gender Recognition
We present an algorithm for simultaneous face detection, landmarks
localization, pose estimation and gender recognition using deep convolutional
neural networks (CNN). The proposed method, called HyperFace, fuses the
intermediate layers of a deep CNN using a separate CNN followed by a multi-task
learning algorithm that operates on the fused features. It exploits the synergy
among the tasks, which boosts their individual performance. Additionally, we
propose two variants of HyperFace: (1) HyperFace-ResNet that builds on the
ResNet-101 model and achieves significant improvement in performance, and (2)
Fast-HyperFace that uses a high recall fast face detector for generating region
proposals to improve the speed of the algorithm. Extensive experiments show
that the proposed models are able to capture both global and local information
in faces and perform significantly better than many competitive algorithms for
each of these four tasks.
Comment: Accepted in Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
Modeling of Facial Aging and Kinship: A Survey
Computational facial models that capture properties of facial cues related to
aging and kinship increasingly attract the attention of the research community,
enabling the development of reliable methods for age progression, age
estimation, age-invariant facial characterization, and kinship verification
from visual data. In this paper, we review recent advances in modeling of
facial aging and kinship. In particular, we provide an up-to-date, complete
list of available annotated datasets and an in-depth analysis of geometric,
hand-crafted, and learned facial representations that are used for facial aging
and kinship characterization. Moreover, evaluation protocols and metrics are
reviewed and notable experimental results for each surveyed task are analyzed.
This survey allows us to identify challenges and discuss future research
directions for the development of robust facial models in real-world
conditions.
Error-Correcting Factorization
Error Correcting Output Codes (ECOC) is a successful technique in multi-class
classification, which is a core problem in Pattern Recognition and Machine
Learning. A major advantage of ECOC over other methods is that the multi-class
problem is decoupled into a set of binary problems that are solved
independently. However, the literature defines a general error-correcting
capability for ECOCs without analyzing how it distributes among classes,
hindering a deeper analysis of pair-wise error-correction. To address these
limitations, this paper proposes an Error-Correcting Factorization (ECF) method.
Our contribution is fourfold: (I) We propose a novel representation of the
error-correction capability, called the design matrix, that enables us to build
an ECOC on the basis of allocating correction to pairs of classes. (II) We
derive the optimal code length of an ECOC using rank properties of the design
matrix. (III) ECF is formulated as a discrete optimization problem, and a
relaxed solution is found using an efficient constrained block coordinate
descent approach. (IV) Enabled by the flexibility introduced with the design
matrix, we propose to allocate the error-correction to classes that are prone to
confusion. Experimental results on several databases show that when allocating
the error-correction to confusable classes, ECF outperforms state-of-the-art
approaches.
Comment: Under review at TPAMI
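As a rough illustration of the ECOC idea summarized above (not the ECF method itself), the following sketch decodes the outputs of several binary classifiers by nearest codeword in Hamming distance and reports how far apart each pair of class codewords is. The class names `c1` to `c4`, the 7-bit code matrix (a standard exhaustive code for four classes), and all function names are assumptions made for this example.

```python
from itertools import combinations

# Illustrative exhaustive code for 4 classes: one row per class,
# one column per binary problem. This matrix is an assumption for
# the sketch, not a design produced by ECF.
CODE_MATRIX = {
    "c1": (1, 1, 1, 1, 1, 1, 1),
    "c2": (0, 0, 0, 0, 1, 1, 1),
    "c3": (0, 0, 1, 1, 0, 0, 1),
    "c4": (0, 1, 0, 1, 0, 1, 0),
}


def hamming(a, b):
    """Number of positions where two bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(bits):
    """Return the class whose codeword is closest to the predicted bits."""
    return min(CODE_MATRIX, key=lambda label: hamming(CODE_MATRIX[label], bits))


def pairwise_distances():
    """Hamming distance between the codewords of every pair of classes."""
    return {
        (p, q): hamming(CODE_MATRIX[p], CODE_MATRIX[q])
        for p, q in combinations(CODE_MATRIX, 2)
    }


if __name__ == "__main__":
    # The first binary classifier is wrong, yet decoding still recovers c2.
    print(decode((1, 0, 0, 0, 1, 1, 1)))
    print(pairwise_distances())
```

In this toy code every pair of classes is separated by the same number of bits, so the correction capability is spread uniformly. The abstract's point is that this allocation could instead be concentrated on pairs of classes that are easily confused.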
Regression-based Hypergraph Learning for Image Clustering and Classification
Inspired by the recent remarkable successes of Sparse Representation (SR),
Collaborative Representation (CR) and sparse graphs, we present a novel
hypergraph model named Regression-based Hypergraph (RH), which uses
regression models to construct high-quality hypergraphs. Moreover, we plug
RH into two conventional hypergraph learning frameworks, namely hypergraph
spectral clustering and hypergraph transduction, to present Regression-based
Hypergraph Spectral Clustering (RHSC) and Regression-based Hypergraph
Transduction (RHT) models for addressing the image clustering and
classification issues. Sparse Representation and Collaborative Representation
are employed to instantiate two RH instances and their RHSC and RHT algorithms.
The experimental results on six popular image databases demonstrate that the
proposed RH learning algorithms achieve promising image clustering and
classification performances, and also validate that RH can inherit the
desirable properties from both hypergraph models and regression models.
Comment: 11 pages
Lifting Object Detection Datasets into 3D
While data has certainly taken the center stage in computer vision in recent
years, it can still be difficult to obtain in certain scenarios. In particular,
acquiring ground truth 3D shapes of objects pictured in 2D images remains a
challenging feat and this has hampered progress in recognition-based object
reconstruction from a single image. Here we propose to bypass previous
solutions such as 3D scanning or manual design, which scale poorly, and instead
populate object category detection datasets semi-automatically with dense,
per-object 3D reconstructions, bootstrapped from: (i) class labels, (ii) ground
truth figure-ground segmentations and (iii) a small set of keypoint
annotations. Our proposed algorithm first estimates camera viewpoint using
rigid structure-from-motion and then reconstructs object shapes by optimizing
over visual hull proposals guided by loose within-class shape similarity
assumptions. The visual hull sampling process attempts to intersect an object's
projection cone with the cones of minimal subsets of other similar objects
among those pictured from certain vantage points. We show that our method is
able to produce convincing per-object 3D reconstructions and to accurately
estimate camera viewpoints on one of the most challenging existing
object-category detection datasets, PASCAL VOC. We hope that our results will
re-stimulate interest in joint object recognition and 3D reconstruction from a
single image.
Advances in Human Action Recognition: A Survey
Human action recognition has been an important topic in computer vision due
to its many applications such as video surveillance, human machine interaction
and video retrieval. One core problem behind these applications is
automatically recognizing low-level actions and high-level activities of
interest. The former is usually the basis for the latter. This survey gives an
overview of the most recent advances in human action recognition during the
past several years, following a well-formed taxonomy proposed by a previous
survey. From this state-of-the-art survey, researchers can view a panorama of
progress in this area for future research.
How an Electrical Engineer Became an Artificial Intelligence Researcher, a Multiphase Active Contours Analysis
This essay examines how what is considered to be artificial intelligence (AI)
has changed over time and come to intersect with the expertise of the author.
Initially, AI developed on a separate trajectory, both topically and
institutionally, from pattern recognition, neural information processing,
decision and control systems, and allied topics by focusing on symbolic systems
within computer science departments rather than on continuous systems in
electrical engineering departments. The separate evolutions continued
throughout the author's lifetime, with some crossover in reinforcement learning
and graphical models, but were shocked into converging by the virality of deep
learning, thus making an electrical engineer into an AI researcher. Now that
this convergence has happened, opportunity exists to pursue an agenda that
combines learning and reasoning bridged by interpretable machine learning
models.
Parameter Estimation in Finite Mixture Models by Regularized Optimal Transport: A Unified Framework for Hard and Soft Clustering
In this short paper, we formulate parameter estimation for finite mixture
models in the context of discrete optimal transportation with convex
regularization. The proposed framework unifies hard and soft clustering methods
for general mixture models. It also generalizes the celebrated
k-means and expectation-maximization algorithms in relation to
associated Bregman divergences when applied to exponential family mixture
models.
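To make the hard versus soft clustering contrast concrete, here is a minimal sketch of the two familiar assignment steps for a single 1D point and fixed cluster centers. It is not the paper's optimal-transport formulation: the squared Euclidean cost, the temperature parameter `eps`, and the function names are assumptions chosen for illustration. The soft rule matches the responsibilities of a Gaussian mixture with equal weights and a shared variance, and it approaches the hard k-means rule as `eps` shrinks.

```python
import math


def sq_dist(x, c):
    """Squared Euclidean cost between a point and a center."""
    return (x - c) ** 2


def hard_assign(x, centers):
    """k-means style step: all mass goes to the cheapest center."""
    costs = [sq_dist(x, c) for c in centers]
    best = costs.index(min(costs))
    return [1.0 if i == best else 0.0 for i in range(len(centers))]


def soft_assign(x, centers, eps):
    """EM style step: weights proportional to exp(-cost / eps).

    eps is a hypothetical temperature; smaller values give
    assignments closer to the hard rule.
    """
    costs = [sq_dist(x, c) for c in centers]
    lowest = min(costs)  # subtract the minimum for numerical stability
    weights = [math.exp(-(c - lowest) / eps) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    centers = [0.0, 3.0]
    print(hard_assign(1.0, centers))
    print(soft_assign(1.0, centers, eps=10.0))
    print(soft_assign(1.0, centers, eps=0.01))
```

Running the script with a large and a small `eps` shows the soft weights moving from nearly uniform toward the one-hot hard assignment.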
Single Image Action Recognition by Predicting Space-Time Saliency
We propose a novel approach based on deep Convolutional Neural Networks (CNN)
to recognize human actions in still images by predicting the future motion, and
detecting the shape and location of the salient parts of the image. We make the
following major contributions to this important area of research: (i) We use
the predicted future motion in the static image (Walker et al., 2015) as a
means of compensating for the missing temporal information, while using the
saliency map to represent the spatial information in the form of location
and shape of what is predicted as significant. (ii) We cast action
classification in static images as a domain adaptation problem by transfer
learning. We first map the input static image to a new domain that we refer to
as the Predicted Optical Flow-Saliency Map domain (POF-SM), and then fine-tune
the layers of a deep CNN model trained on classifying the ImageNet dataset to
perform action classification in the POF-SM domain. (iii) We tested our method
on the popular Willow dataset. But unlike existing methods, we also tested on a
more realistic and challenging dataset of over 2M still images that we
collected and labeled by taking random frames from the UCF-101 video dataset.
We call our dataset the UCF Still Image dataset, or UCFSI-101 for short. Our
results outperform the state of the art.
Survey of state-of-the-art mixed data clustering algorithms
Mixed data comprises both numeric and categorical features, and mixed
datasets occur frequently in many domains, such as health, finance, and
marketing. Clustering is often applied to mixed datasets to find structures and
to group similar objects for further analysis. However, clustering mixed data
is challenging because it is difficult to directly apply mathematical
operations, such as summation or averaging, to the feature values of these
datasets. In this paper, we present a taxonomy for the study of mixed data
clustering algorithms by identifying five major research themes. We then
present a state-of-the-art review of the research works within each research
theme. We analyze the strengths and weaknesses of these methods with pointers
for future research directions. Lastly, we present an in-depth analysis of the
overall challenges in this field, highlight open research questions and discuss
guidelines to make progress in the field.
Comment: 20 pages, 2 columns, 6 tables, 209 references
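The difficulty mentioned above, that sums and averages do not apply directly to categorical values, can be illustrated with a small sketch in the style of k-prototypes, one well-known approach to mixed data. It is not taken from the survey's text. The tuple layout of the records, the index arguments, the weight `gamma`, and the function names are all assumptions for this example.

```python
from collections import Counter


def mixed_dissimilarity(a, b, numeric_idx, categorical_idx, gamma=1.0):
    """Squared Euclidean distance on numeric features plus
    gamma times the number of mismatched categorical features."""
    numeric_part = sum((a[i] - b[i]) ** 2 for i in numeric_idx)
    categorical_part = sum(1 for i in categorical_idx if a[i] != b[i])
    return numeric_part + gamma * categorical_part


def prototype(rows, numeric_idx, categorical_idx):
    """Cluster representative: mean for numeric features,
    most frequent value (mode) for categorical features."""
    proto = list(rows[0])
    for i in numeric_idx:
        proto[i] = sum(r[i] for r in rows) / len(rows)
    for i in categorical_idx:
        proto[i] = Counter(r[i] for r in rows).most_common(1)[0][0]
    return tuple(proto)


if __name__ == "__main__":
    # Hypothetical records: (age, sector)
    rows = [(30.0, "health"), (50.0, "health"), (40.0, "finance")]
    print(prototype(rows, numeric_idx=[0], categorical_idx=[1]))
    print(mixed_dissimilarity(rows[0], rows[2], [0], [1], gamma=0.5))
```

The weight `gamma` controls how much a categorical mismatch counts relative to numeric distance, and choosing it is itself a design decision that depends on the scale of the numeric features.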