Facial Expression Recognition in the Wild using Rich Deep Features
Facial Expression Recognition is an active area of research in computer
vision with a wide range of applications. Several approaches have been
developed to solve this problem for different benchmark datasets. However,
Facial Expression Recognition in the wild remains an area where much work is
still needed to serve real-world applications. To this end, in this paper we
present a novel approach towards facial expression recognition. We fuse rich
deep features with domain knowledge through encoding discriminant facial
patches. We conduct experiments on two of the most popular benchmark datasets:
CK and TFE. Moreover, we present a novel dataset that, unlike its predecessors,
consists of natural, not acted, expression images. Experimental results show
that our approach achieves state-of-the-art results over standard benchmarks
and our own dataset.
Comment: in International Conference on Image Processing, 201
Human and Sheep Facial Landmarks Localisation by Triplet Interpolated Features
In this paper we present a method for localisation of facial landmarks on
human and sheep. We introduce a new feature extraction scheme called
triplet-interpolated feature used at each iteration of the cascaded shape
regression framework. It is able to extract features from similar semantic
location given an estimated shape, even when head pose variations are large and
the facial landmarks are very sparsely distributed. Furthermore, we study the
impact of training data imbalance on model performance and propose a training
sample augmentation scheme that produces more initialisations for training
samples from the minority. More specifically, the augmentation number for a
training sample is made to be negatively correlated to the value of the fitted
probability density function at the sample's position. We evaluate the proposed
scheme on both human and sheep facial landmarks localisation. On the benchmark
300w human face dataset, we demonstrate the benefits of our proposed methods
and show very competitive performance compared to other methods. On a
newly created sheep face dataset, we achieve very good performance despite
having only a limited number of training samples and a sparse set of
annotated landmarks.
Comment: submitted to WACV201
Autofocus Correction of Azimuth Phase Error and Residual Range Cell Migration in Spotlight SAR Polar Format Imagery
Synthetic aperture radar (SAR) images are often blurred by phase
perturbations induced by uncompensated sensor motion and/or unknown
propagation effects caused by turbulent media. To obtain refocused images,
autofocus proves to be a useful post-processing technique applied to estimate
and compensate for the unknown phase errors. However, a severe drawback of the
conventional autofocus algorithms is that they are only capable of removing
one-dimensional azimuth phase errors (APE). As the resolution becomes finer,
residual range cell migration (RCM), which makes the defocus inherently
two-dimensional, becomes a new challenge. In this paper, correction of APE and
residual RCM is presented in the framework of the polar format algorithm (PFA).
First, an insight into the underlying mathematical mechanism of polar
reformatting is presented. Then based on this new formulation, the effect of
polar reformatting on the uncompensated APE and residual RCM is investigated in
detail. By using the derived analytical relationship between APE and residual
RCM, an efficient two-dimensional (2-D) autofocus method is proposed.
Experimental results indicate the effectiveness of the proposed method.
Comment: 29 pages, 14 figure
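As a minimal illustration of the conventional one-dimensional APE compensation that the abstract contrasts with (not the paper's 2-D PFA-based method), a phase error applied in the azimuth frequency domain can be removed by multiplying with the conjugate of its estimate; the synthetic point target and the assumed perfect estimate are illustrative:

```python
import numpy as np

# A defocused azimuth line modelled as the ideal azimuth spectrum multiplied
# by an unknown phase error exp(j*phi); conventional 1-D autofocus estimates
# phi and applies its conjugate. A perfect estimate is assumed here to show
# only the compensation mechanics.
rng = np.random.default_rng(0)
n = 256
ideal = np.zeros(n, dtype=complex)
ideal[n // 2] = 1.0                          # a single focused point target

spectrum = np.fft.fft(ideal)
phi = 2.0 * np.pi * rng.standard_normal(n).cumsum() / n   # slowly varying APE
blurred = np.fft.ifft(spectrum * np.exp(1j * phi))

phi_hat = phi                                # assume a perfect APE estimate
refocused = np.fft.ifft(np.fft.fft(blurred) * np.exp(-1j * phi_hat))

peak_blurred = float(np.abs(blurred).max())
peak_refocused = float(np.abs(refocused).max())
```

With the phase error removed, the smeared energy collapses back into the point response; residual RCM, being two-dimensional, cannot be removed by such a purely azimuthal correction, which is the gap the paper addresses.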
An Approximation of the Outage Probability for Multi-hop AF Fixed Gain Relay
In this letter, we present a closed-form approximation of the outage
probability for the multi-hop amplify-and-forward (AF) relaying systems with
fixed gain in Rayleigh fading channel. The approximation is derived from the
outage event for each hop. The simulation results show the tightness of the
proposed approximation in the low and high signal-to-noise ratio (SNR) regions.
Comment: 3 pages, 3 figures, Submitted to IEEE Communication Letter
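One common way to assemble such a closed-form expression from the per-hop outage events, treating the hops as approximately independent, can be sketched as follows; this is the standard construction in Rayleigh fading, not necessarily the letter's exact result:

```latex
% Per-hop outage in Rayleigh fading (\gamma_i: instantaneous SNR of hop i,
% \bar{\gamma}_i: its mean, \gamma_{\mathrm{th}}: outage threshold):
P_i = \Pr\left[\gamma_i < \gamma_{\mathrm{th}}\right]
    = 1 - \exp\left(-\frac{\gamma_{\mathrm{th}}}{\bar{\gamma}_i}\right)
% End-to-end outage approximated from the per-hop outage events, assuming
% (approximately) independent hops:
P_{\mathrm{out}} \approx 1 - \prod_{i=1}^{N}\left(1 - P_i\right)
```

Each factor is the probability that hop i does not fail, so the product approximates the probability that no hop is in outage; the letter's contribution is a closed-form approximation of this kind for fixed-gain AF relaying.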
Group Emotion Recognition Using Machine Learning
Automatic facial emotion recognition is a challenging task that has gained
significant scientific interest over the past few years, but the problem of
emotion recognition for a group of people has been less extensively studied.
However, it is slowly gaining popularity due to the massive amount of data
available on social networking sites containing images of groups of people
participating in various social events. Group emotion recognition is a
challenging problem due to obstructions like head and body pose variations,
occlusions, variable lighting conditions, variance of actors, varied indoor and
outdoor settings and image quality. The objective of this task is to classify a
group's perceived emotion as Positive, Neutral or Negative. In this report, we
describe our solution which is a hybrid machine learning system that
incorporates deep neural networks and Bayesian classifiers. Deep Convolutional
Neural Networks (CNNs) work from bottom to top, analysing facial expressions
expressed by individual faces extracted from the image. The Bayesian network
works from top to bottom, inferring the global emotion for the image, by
integrating the visual features of the contents of the image obtained through a
scene descriptor. In the final pipeline, the group emotion category predicted
by an ensemble of CNNs in the bottom-up module is passed as input to the
Bayesian Network in the top-down module and an overall prediction for the image
is obtained. Experimental results show that the stated system achieves 65.27%
accuracy on the validation set which is in line with state-of-the-art results.
As an outcome of this project, a Progressive Web Application and an
accompanying Android app with a simple and intuitive user interface are
presented, allowing users to test out the system with their own pictures.
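The top-down/bottom-up fusion described above can be sketched as a Bayes-rule combination of the two modules; all probabilities below are illustrative placeholders, not the paper's learned values:

```python
import numpy as np

# The bottom-up CNN ensemble yields a distribution over the three group
# emotion classes; the top-down scene descriptor yields a scene-conditioned
# likelihood; Bayes' rule combines them into the overall prediction.
labels = ["Positive", "Neutral", "Negative"]

bottom_up = np.array([0.5, 0.3, 0.2])         # P(emotion | faces), illustrative
scene_likelihood = np.array([0.7, 0.2, 0.1])  # P(scene features | emotion), illustrative

posterior = bottom_up * scene_likelihood      # unnormalized Bayes combination
posterior /= posterior.sum()

prediction = labels[int(np.argmax(posterior))]
```

Here the scene evidence reinforces the face-level vote, so the fused posterior is more confident than either module alone.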
Learning for Multi-Model and Multi-Type Fitting
Multi-model fitting has been extensively studied from the random sampling and
clustering perspectives. Most methods assume that only a single type/class of
model is present, and their generalizations to fitting multiple types of
models/structures simultaneously are non-trivial. The inherent challenges
include choice of types and numbers of models, sampling imbalance and parameter
tuning, all of which render conventional approaches ineffective. In this work,
we formulate the multi-model multi-type fitting problem as one of learning deep
feature embedding that is clustering-friendly. In other words, points of the
same clusters are embedded closer together through the network. For inference,
we apply K-means to cluster the data in the embedded feature space and model
selection is enabled by analyzing the K-means residuals. Experiments are
carried out on both synthetic and real world multi-type fitting datasets,
producing state-of-the-art results. Comparisons are also made on single-type
multi-model fitting tasks, with promising results as well.
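The inference stage described above (K-means in the embedded space, with model selection from the residuals) can be sketched as follows; the toy 2-D "embedding" and the plain Lloyd iterations are illustrative stand-ins for the learned deep features:

```python
import numpy as np

def kmeans_residual(X, k, iters=50):
    """Plain Lloyd's K-means; returns the summed squared residual."""
    # Deterministic spread-out initialisation along the sample order.
    centers = X[np.linspace(0, len(X) - 1, k).astype(int)].copy()
    assign = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = d.argmin(1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = X[assign == j].mean(0)
    return float(((X - centers[assign]) ** 2).sum())

# Toy stand-in for the clustering-friendly embedding: three well-separated
# clusters, i.e. three underlying model instances.
rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(c, 0.05, size=(40, 2)) for c in (0.0, 3.0, 6.0)])

# Model selection: the residual drops sharply until k reaches the true
# number of structures, then levels off (an elbow at k = 3).
residuals = {k: kmeans_residual(X, k) for k in (1, 2, 3, 4)}
```

Reading off the number of models from where the residual curve flattens is the kind of residual analysis the abstract refers to.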
Towards Holistic Scene Understanding: Feedback Enabled Cascaded Classification Models
Scene understanding includes many related sub-tasks, such as scene
categorization, depth estimation, object detection, etc. Each of these
sub-tasks is often notoriously hard, and state-of-the-art classifiers already
exist for many of them. These classifiers operate on the same raw image and
provide correlated outputs. It is desirable to have an algorithm that can
capture such correlation without requiring any changes to the inner workings of
any classifier.
We propose Feedback Enabled Cascaded Classification Models (FE-CCM), which
jointly optimize all the sub-tasks while requiring only a `black-box'
interface to the original classifier for each sub-task. We use a two-layer
cascade of classifiers, which are repeated instantiations of the original ones,
with the output of the first layer fed into the second layer as input. Our
training method involves a feedback step that allows later classifiers to
provide earlier classifiers information about which error modes to focus on. We
show that our method significantly improves performance in all the sub-tasks in
the domain of scene understanding, where we consider depth estimation, scene
categorization, event categorization, object detection, geometric labeling and
saliency detection. Our method also improves performance in two robotic
applications: an object-grasping robot and an object-finding robot.
Comment: 14 pages, 11 figure
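The two-layer cascade can be sketched with stand-in black-box classifiers; the tasks, weights, and linear scorers below are hypothetical, and only the wiring (layer-2 inputs = raw features plus all layer-1 outputs) mirrors the description:

```python
import numpy as np

# Layer-1 classifiers see only the raw features; layer-2 instantiations of
# the same tasks additionally see every layer-1 output, so correlated
# sub-tasks can inform each other through the cascade.
def make_classifier(weights):
    """A 'black-box' classifier interface: feature vector -> scalar output."""
    w = np.asarray(weights, dtype=float)
    return lambda x: float(np.dot(w, x))

raw = np.array([0.2, 0.7, 0.1])               # shared raw image features

layer1 = [make_classifier([1.0, 0.0, 0.0]),   # e.g. a depth-like score
          make_classifier([0.0, 1.0, 0.0])]   # e.g. a scene-like score
outputs1 = [clf(raw) for clf in layer1]

# Layer 2 repeats each task on [raw features + all layer-1 outputs].
augmented = np.concatenate([raw, outputs1])
layer2 = [make_classifier([1.0, 0.0, 0.0, 0.0, 0.5]),
          make_classifier([0.0, 1.0, 0.0, 0.5, 0.0])]
outputs2 = [clf(augmented) for clf in layer2]
```

Because each sub-task is touched only through its input/output interface, the cascade never needs access to the inner workings of any classifier, which is the point of the black-box requirement.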
weedNet: Dense Semantic Weed Classification Using Multispectral Images and MAV for Smart Farming
Selective weed treatment is a critical step in autonomous crop management as
related to crop health and yield. However, a key challenge is reliable and
accurate weed detection to minimize damage to surrounding plants. In this
paper, we present an approach for dense semantic weed classification with
multispectral images collected by a micro aerial vehicle (MAV). We use the
recently developed encoder-decoder cascaded Convolutional Neural Network (CNN),
SegNet, which infers dense semantic classes while allowing any number of input
image channels and class balancing, on our sugar beet and weed datasets. To
obtain training datasets, we established an experimental field with varying
herbicide levels resulting in field plots containing only either crop or weed,
enabling us to use the Normalized Difference Vegetation Index (NDVI) as a
distinguishable feature for automatic ground truth generation. We train six
models with different numbers of input channels and fine-tune them to
achieve about 0.8 F1-score and 0.78 Area Under the Curve (AUC) classification
metrics. For model deployment, an embedded GPU system (Jetson TX2) is tested
for MAV integration. The dataset used in this paper is released to support the
community and future work.
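The NDVI-based ground-truth generation can be sketched as follows; the threshold and reflectance values are illustrative, not the paper's calibration:

```python
import numpy as np

# In field plots known to contain only crop (or only weed), thresholding the
# NDVI separates vegetation from soil, and the plot identity supplies the
# class label automatically.
def ndvi(nir, red, eps=1e-8):
    """Normalized Difference Vegetation Index, computed pixel-wise."""
    return (nir - red) / (nir + red + eps)

nir = np.array([[0.8, 0.1], [0.7, 0.2]])   # near-infrared channel
red = np.array([[0.1, 0.1], [0.1, 0.2]])   # red channel

vegetation_mask = ndvi(nir, red) > 0.4     # True where a plant is present

# In a crop-only plot every vegetation pixel becomes class 1 ("crop");
# background stays class 0. A weed-only plot would label class 2 instead.
labels = np.where(vegetation_mask, 1, 0)
```

Since vegetation reflects strongly in the near-infrared band, the NDVI is high on plant pixels and near zero on soil, which is what makes it usable as a distinguishable feature for automatic labelling.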
Modeling and identification of uncertain-input systems
In this work, we present a new class of models, called uncertain-input
models, that allows us to treat system-identification problems in which a
linear system is subject to a partially unknown input signal. To encode prior
information about the input or the linear system, we use Gaussian-process
models. We estimate the model from data using the empirical Bayes approach: the
input and the impulse responses of the linear system are estimated using the
posterior means of the Gaussian-process models given the data, and the
hyperparameters that characterize the Gaussian-process models are estimated
from the marginal likelihood of the data. We propose an iterative algorithm to
find the hyperparameters that relies on the EM method and results in simple
update steps. In the most general formulation, neither the marginal likelihood
nor the posterior distribution of the unknowns is tractable. Therefore, we
propose two approximation approaches, one based on Markov-chain Monte Carlo
techniques and one based on a variational Bayes approximation. We also show
special model structures for which the distributions are exactly tractable.
Through numerical simulations, we study the application of the uncertain-input
model to the identification of Hammerstein systems and cascaded linear systems.
As part of the contribution of the paper, we show that this model structure
encompasses many classical problems in system identification such as classical
PEM, Hammerstein models, errors-in-variables problems, blind system
identification, and cascaded linear systems. This allows us to build a
systematic procedure to apply the algorithms proposed in this work to a wide
class of classical problems.
Comment: 27 pages, submitted to Automatic
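The empirical Bayes step can be illustrated on a drastically simplified scalar model (an i.i.d. Gaussian prior instead of the paper's Gaussian-process models, and a grid search instead of EM); all quantities are illustrative:

```python
import numpy as np

# Toy model: y_i = w_i + e_i with prior w_i ~ N(0, lam) and known noise
# variance sigma2, so marginally y_i ~ N(0, lam + sigma2). The prior
# hyperparameter lam is chosen by maximizing the marginal likelihood of the
# data; the posterior mean of w then follows in closed form.
rng = np.random.default_rng(0)
sigma2 = 0.5
n = 2000
w = rng.normal(0.0, np.sqrt(2.0), n)          # true prior variance is 2.0
y = w + rng.normal(0.0, np.sqrt(sigma2), n)

def neg_log_marginal(lam):
    v = lam + sigma2
    return 0.5 * n * np.log(2 * np.pi * v) + 0.5 * (y ** 2).sum() / v

# Hyperparameter estimate: maximize the marginal likelihood over a grid.
grid = np.linspace(0.1, 5.0, 200)
lam_hat = grid[np.argmin([neg_log_marginal(l) for l in grid])]

# Empirical Bayes posterior mean: the data shrunk toward the prior mean.
w_post = lam_hat / (lam_hat + sigma2) * y
```

The same principle, maximizing the marginal likelihood over hyperparameters and then taking posterior means, is what the paper applies with Gaussian-process models and EM-based updates.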
Weighted Null-Space Fitting for Identification of Cascade Networks
For identification of systems embedded in dynamic networks, applying the
prediction error method (PEM) to a correct, tailor-made parametrization of the
complete network provides asymptotically efficient estimates. However, the
network complexity often hinders a successful application of PEM, which
requires minimizing a non-convex cost function that in general becomes more
difficult for more complex networks. For this reason, identification in dynamic
networks often focuses on obtaining consistent estimates of particular network
modules of interest. A downside of such approaches is that splitting the
network into several modules for identification often costs asymptotic
efficiency. In this paper, we consider the particular case of a dynamic network
with the individual systems connected in a serial cascaded manner, with
measurements affected by sensor noise. We propose an algorithm that estimates
all the modules in the network simultaneously without requiring the
minimization of a non-convex cost function. This algorithm is an extension of
Weighted Null-Space Fitting (WNSF), a weighted least-squares method that
provides asymptotically efficient estimates for single-input single-output
systems. We illustrate the performance of the algorithm with simulation
studies, which suggest that the network WNSF method may also provide
asymptotically efficient estimates when applied to cascade networks, and we
discuss the possibility of extension to more general networks affected by
sensor noise.
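The weighted least-squares machinery that WNSF-style estimates build on can be sketched generically; this shows only the standard WLS step with inverse-covariance weighting on a toy regression, not the paper's multi-step null-space construction:

```python
import numpy as np

# For y = Phi @ theta + e with heteroskedastic noise, weighting by the
# inverse noise covariance gives the efficient linear estimate without any
# non-convex optimization, which is the appeal over direct PEM.
rng = np.random.default_rng(0)
n, p = 200, 2
theta_true = np.array([1.0, -0.5])
Phi = rng.standard_normal((n, p))
noise_var = 0.1 + rng.random(n)                 # per-sample noise variance
y = Phi @ theta_true + np.sqrt(noise_var) * rng.standard_normal(n)

W = np.diag(1.0 / noise_var)                    # optimal weighting matrix
theta_wls = np.linalg.solve(Phi.T @ W @ Phi, Phi.T @ W @ y)
```

Each step of such a scheme is a convex least-squares problem, which is why a WNSF-type algorithm can estimate all modules simultaneously without minimizing a non-convex cost function.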