Automatic Facial Expression Recognition Using Features of Salient Facial Patches
Extraction of discriminative features from salient facial patches plays a
vital role in effective facial expression recognition. The accurate detection
of facial landmarks improves the localization of the salient patches on face
images. This paper proposes a novel framework for expression recognition by
using appearance features of selected facial patches. A few prominent facial
patches, depending on the position of facial landmarks, are extracted which are
active during emotion elicitation. These active patches are further processed
to obtain the salient patches which contain discriminative features for
classification of each pair of expressions, thereby selecting different facial
patches as salient for different pairs of expression classes. A one-against-one
classification method is adopted using these features. In addition, an
automated, learning-free facial landmark detection technique is proposed,
which achieves performance similar to that of other state-of-the-art landmark
detection methods, yet requires significantly less execution time. The proposed
method performs consistently well across different resolutions, thus
providing a solution for expression recognition in low-resolution images.
Experiments on the CK+ and JAFFE facial expression databases show the
effectiveness of the proposed system.
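The one-against-one scheme described above can be sketched as follows. This is an illustrative toy, not the paper's actual classifier: each class pair gets its own binary classifier over its own pair-specific salient-patch feature, and the final label is decided by majority vote. The class names, thresholds, and scalar features below are all hypothetical.

```python
from collections import Counter

def one_vs_one_predict(classifiers, sample):
    """Majority vote over all pairwise classifiers.

    `classifiers` maps a class pair (a, b) to a function that, given
    the feature for that pair's salient patches, returns a or b.
    """
    votes = Counter(clf(sample[pair]) for pair, clf in classifiers.items())
    return votes.most_common(1)[0][0]

# Toy setup: three expression classes; each pair uses its own
# (hypothetical) salient-patch feature, here a single scalar.
def make_threshold_clf(a, b, threshold):
    return lambda x: a if x < threshold else b

classifiers = {
    ("anger", "joy"): make_threshold_clf("anger", "joy", 0.5),
    ("anger", "surprise"): make_threshold_clf("anger", "surprise", 0.5),
    ("joy", "surprise"): make_threshold_clf("joy", "surprise", 0.5),
}

# A sample stores one feature per class pair, since different salient
# patches are selected for different pairs of expression classes.
sample = {
    ("anger", "joy"): 0.9,        # votes "joy"
    ("anger", "surprise"): 0.8,   # votes "surprise"
    ("joy", "surprise"): 0.2,     # votes "joy"
}
print(one_vs_one_predict(classifiers, sample))  # → joy
```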
What is the right way to represent document images?
In this article we study the problem of document image representation based
on visual features. We propose a comprehensive experimental study that compares
three types of visual document image representations: (1) traditional so-called
shallow features, such as the RunLength and the Fisher-Vector descriptors, (2)
deep features based on Convolutional Neural Networks, and (3) features
extracted from hybrid architectures that take inspiration from the two previous
ones.
We evaluate these features in several tasks (i.e. classification, clustering,
and retrieval) and in different setups (e.g. domain transfer) using several
public and in-house datasets. Our results show that deep features generally
outperform other types of features when there is no domain shift and the new
task is closely related to the one used to train the model. However, when a
large domain or task shift is present, the Fisher-Vector shallow features
generalize better and often obtain the best results.
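The domain-transfer setup described above can be illustrated with a minimal sketch: fit a classifier on feature vectors extracted in one domain and measure its accuracy on vectors from a shifted domain. The nearest-centroid classifier and the 2-D descriptors below are stand-ins chosen for brevity, not the descriptors or classifiers evaluated in the study.

```python
def nearest_centroid_accuracy(train, test):
    """Fit per-class centroids on source-domain features and score
    accuracy on target-domain features, a minimal stand-in for the
    domain-transfer evaluation described above."""
    centroids = {
        label: [sum(f[i] for f in feats) / len(feats)
                for i in range(len(feats[0]))]
        for label, feats in train.items()
    }

    def sqdist(f, c):
        return sum((fi - ci) ** 2 for fi, ci in zip(f, c))

    correct = total = 0
    for label, feats in test.items():
        for f in feats:
            pred = min(centroids, key=lambda c: sqdist(f, centroids[c]))
            correct += pred == label
            total += 1
    return correct / total

# Hypothetical 2-D descriptors for two document classes, extracted
# from a source domain (train) and a shifted target domain (test).
source = {"letter": [[0.0, 0.0], [0.2, 0.1]],
          "form": [[1.0, 1.0], [0.9, 1.1]]}
target = {"letter": [[0.1, 0.3]], "form": [[0.8, 0.9]]}
print(nearest_centroid_accuracy(source, target))  # → 1.0
```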
The Cross-Depiction Problem: Computer Vision Algorithms for Recognising Objects in Artwork and in Photographs
The cross-depiction problem is that of recognising visual objects regardless
of whether they are photographed, painted, drawn, etc. It is a potentially
significant yet under-researched problem. Emulating the remarkable human
ability to recognise objects in an astonishingly wide variety of depictive
forms is likely to advance both the foundations and the applications of
Computer Vision.
In this paper we benchmark classification, domain adaptation, and deep
learning methods; demonstrating that none perform consistently well in the
cross-depiction problem. Given the current interest in deep learning, it is
notable that such methods exhibit the same behaviour as all but one other
method: they show a significant fall in performance on inhomogeneous databases
compared to their peak performance, which is always achieved on data comprising
photographs only.
Rather, we find the methods that have strong models of spatial relations
between parts tend to be more robust and therefore conclude that such
information is important in modelling object classes regardless of appearance
details.
Comment: 12 pages, 6 figures
Patterns of Activity in a Global Model of a Solar Active Region
In this work we investigate the global activity patterns predicted from a
model active region heated by distributions of nanoflares that have a range of
frequencies. What differs is the average frequency of the distributions. The
activity patterns are manifested in time lag maps of narrow-band instrument
channel pairs. We combine hydrodynamic and forward modeling codes with a
magnetic field extrapolation to create a model active region and apply the time
lag method to synthetic observations. Our aim is not to reproduce a particular
set of observations in detail, but to recover some typical properties and
patterns observed in active regions. Our key findings are the following. (1)
Cooling dominates the time lag signature, and the time lags between the channel
pairs are generally consistent with observed values. (2) Shorter coronal loops
in the core cool more quickly than longer loops at the periphery. (3) All
channel pairs show zero time lag when the line of sight passes through coronal
loop foot-points. (4) There is strong evidence that plasma must be re-energized
on a time scale comparable to the cooling timescale to reproduce the observed
coronal activity, but it is likely that a relatively broad spectrum of heating
frequencies operates across active regions. (5) Due to their highly dynamic
nature, nanoflare trains produce zero time lags along entire flux tubes in our
model active region, matching those seen between the same channel pairs in
observed active regions.
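At the core of the time lag method is finding, for a pair of channel light curves, the temporal offset that maximizes their cross-correlation. A minimal sketch, using a synthetic transient and its delayed copy (the actual analysis works on normalized narrow-band channel light curves, not the toy arrays used here):

```python
def peak_time_lag(a, b, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation
    of b against a. In `xcorr(lag)`, a[t] is paired with b[t + lag], so
    a positive result means b responds after a, as when plasma cools
    from the hotter channel's passband into the cooler channel's."""
    def xcorr(lag):
        if lag >= 0:
            pairs = zip(a[:len(a) - lag], b[lag:])
        else:
            pairs = zip(a[-lag:], b[:len(b) + lag])
        return sum(x * y for x, y in pairs)
    return max(range(-max_lag, max_lag + 1), key=xcorr)

a = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]  # transient in the hotter channel
b = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]  # same transient, delayed 3 samples
print(peak_time_lag(a, b, 5))  # → 3
```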
Iterated Function System Models in Data Analysis: Detection and Separation
We investigate the use of iterated function system (IFS) models for data
analysis. An IFS is a discrete dynamical system in which each time step
corresponds to the application of one of a finite collection of maps. The maps,
which represent distinct dynamical regimes, may act in some pre-determined
sequence or may be applied in random order. An algorithm is developed to detect
the sequence of regime switches under the assumption of continuity. This method
is tested on a simple IFS and applied to an experimental computer performance
data set. This methodology has a wide range of potential uses: from
change-point detection in time-series data to the field of digital
communications.
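The IFS setting above can be made concrete with a small sketch. Note one simplification: the paper's algorithm detects regime switches without knowing the maps, using a continuity assumption; the sketch below assumes the maps are known and simply attributes each transition to the map that produced it, to illustrate what a recovered regime sequence looks like.

```python
def detect_regimes(series, maps):
    """Attribute each transition x_t -> x_{t+1} to the map whose
    prediction f(x_t) lies closest to the observed x_{t+1}.
    A simplified stand-in for the continuity-based detection
    developed in the paper, which does not assume known maps."""
    labels = []
    for x, x_next in zip(series, series[1:]):
        errors = [abs(f(x) - x_next) for f in maps]
        labels.append(min(range(len(maps)), key=errors.__getitem__))
    return labels

# Two affine maps acting on [0, 1]; each represents one dynamical
# regime, applied in the (known, for this toy) sequence below.
f0 = lambda x: 0.5 * x          # contracts toward 0
f1 = lambda x: 0.5 * x + 0.5    # contracts toward 1
true_seq = [0, 0, 1, 1, 0, 1]
x, series = 0.3, [0.3]
for r in true_seq:
    x = (f0, f1)[r](x)
    series.append(x)
print(detect_regimes(series, [f0, f1]))  # → [0, 0, 1, 1, 0, 1]
```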
Visualizing Deep Convolutional Neural Networks Using Natural Pre-Images
Image representations, from SIFT and bag-of-visual-words to Convolutional
Neural Networks (CNNs), are a crucial component of almost all computer vision
systems. However, our understanding of them remains limited. In this paper we
study several landmark representations, both shallow and deep, by a number of
complementary visualization techniques. These visualizations are based on the
concept of "natural pre-image", namely a natural-looking image whose
representation has some notable property. We study in particular three such
visualizations: inversion, in which the aim is to reconstruct an image from its
representation, activation maximization, in which we search for patterns that
maximally stimulate a representation component, and caricaturization, in which
the visual patterns that a representation detects in an image are exaggerated.
We pose these as a regularized energy-minimization framework and demonstrate
its generality and effectiveness. In particular, we show that this method can
invert representations such as HOG more accurately than recent alternatives
while being applicable to CNNs too. Among our findings, we show that several
layers in CNNs retain photographically accurate information about the image,
with different degrees of geometric and photometric invariance.
Comment: A substantially extended version of
http://www.robots.ox.ac.uk/~vedaldi/assets/pubs/mahendran15understanding.pdf.
arXiv admin note: text overlap with arXiv:1412.003
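Activation maximization as a regularized energy problem can be sketched on a toy linear "representation component": maximize the component's response w . x minus an L2 regularizer by gradient ascent. This is an assumption-laden miniature of the framework described above; for a real CNN, w . x would be replaced by the chosen unit's activation and the gradient would come from backpropagation, and the paper uses richer natural-image regularizers than plain L2.

```python
def activation_maximization(w, lam=0.5, lr=0.1, steps=200):
    """Gradient ascent on  w . x - lam * ||x||^2  for a toy linear
    representation component w. The fixed point is x = w / (2 * lam),
    so with lam = 0.5 the iterate converges to w itself."""
    x = [0.0] * len(w)
    for _ in range(steps):
        # Ascent step along the gradient  w - 2 * lam * x.
        x = [xi + lr * (wi - 2 * lam * xi) for xi, wi in zip(x, w)]
    return x

w = [1.0, -2.0, 0.0]   # hypothetical component weights
x_star = activation_maximization(w)
print([round(v, 3) for v in x_star])  # → [1.0, -2.0, 0.0]
```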
SemanticPaint: A Framework for the Interactive Segmentation of 3D Scenes
We present an open-source, real-time implementation of SemanticPaint, a
system for geometric reconstruction, object-class segmentation and learning of
3D scenes. Using our system, a user can walk into a room wearing a depth camera
and a virtual reality headset, and both densely reconstruct the 3D scene and
interactively segment the environment into object classes such as 'chair',
'floor' and 'table'. The user interacts physically with the real-world scene,
touching objects and using voice commands to assign them appropriate labels.
These user-generated labels are leveraged by an online random forest-based
machine learning algorithm, which is used to predict labels for previously
unseen parts of the scene. The entire pipeline runs in real time, and the user
stays 'in the loop' throughout the process, receiving immediate feedback about
the progress of the labelling and interacting with the scene as necessary to
refine the predicted segmentation.
Comment: 33 pages, Project: http://www.semantic-paint.com, Code:
https://github.com/torrvision/spain
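The incremental-labelling idea above can be sketched with a deliberately simple stand-in: SemanticPaint uses an online random forest, whereas the toy below uses an incrementally updated nearest-centroid classifier, and the 2-D features and class names are hypothetical. What it shows is the loop structure: each user "touch" adds a labelled feature vector, and unseen scene points are then labelled immediately from the model so far.

```python
class OnlineNearestCentroid:
    """Incrementally updated nearest-centroid classifier: a simple
    stand-in for the online random forest used by the real system."""

    def __init__(self):
        self.sums, self.counts = {}, {}

    def add(self, label, feat):
        # One user-labelled sample; update that class's running sum.
        s = self.sums.setdefault(label, [0.0] * len(feat))
        for i, v in enumerate(feat):
            s[i] += v
        self.counts[label] = self.counts.get(label, 0) + 1

    def predict(self, feat):
        # Label an unseen point by its nearest class centroid.
        def sqdist(label):
            n = self.counts[label]
            return sum((feat[i] - self.sums[label][i] / n) ** 2
                       for i in range(len(feat)))
        return min(self.sums, key=sqdist)

clf = OnlineNearestCentroid()
# Hypothetical 2-D features (e.g. height above floor, flatness).
clf.add("floor", [0.0, 0.9]); clf.add("floor", [0.1, 0.8])
clf.add("table", [0.7, 0.9]); clf.add("chair", [0.45, 0.3])
print(clf.predict([0.65, 0.85]))  # → table
```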
A Monocular Vision System for Playing Soccer in Low Color Information Environments
Humanoid soccer robots perceive their environment exclusively through
cameras. This paper presents a monocular vision system that was originally
developed for use in the RoboCup Humanoid League, but is expected to be
transferable to other soccer leagues. Recent changes in the Humanoid League
rules resulted in a soccer environment with less color coding than in previous
years, which makes perception of the game situation more challenging. The
proposed vision system addresses these challenges by using brightness and
texture for the detection of the required field features and objects. Our
system is robust to changes in lighting conditions, and is designed for
real-time use on a humanoid soccer robot. This paper describes the main
components of the detection algorithms in use, and presents experimental
results from the soccer field, using ROS and the igus Humanoid Open Platform as
a testbed. The proposed vision system was used successfully at RoboCup 2015.
Comment: Proceedings of 10th Workshop on Humanoid Soccer Robots, International
Conference on Humanoid Robots (Humanoids), Seoul, Korea, 201
A review of EO image information mining
We analyze the state of the art of content-based retrieval in Earth
observation image archives focusing on complete systems showing promise for
operational implementation. The different paradigms at the basis of the main
system families are introduced. The approaches taken are analyzed, focusing in
particular on the phases after primitive feature extraction. The solutions
envisaged for the issues related to feature simplification and synthesis,
indexing, and semantic labeling are reviewed. The methodologies for query
specification and execution are analyzed.
Automatic Segmentation and Overall Survival Prediction in Gliomas using Fully Convolutional Neural Network and Texture Analysis
In this paper, we use a fully convolutional neural network (FCNN) for the
segmentation of gliomas from Magnetic Resonance Images (MRI). A fully
automatic, voxel based classification was achieved by training a 23 layer deep
FCNN on 2-D slices extracted from patient volumes. The network was trained on
slices extracted from 130 patients and validated on 50 patients. For the task
of survival prediction, texture and shape based features were extracted from T1
post-contrast volumes to train an XGBoost regressor. On the BraTS 2017
validation set, the proposed scheme achieved mean Dice scores of 0.83, 0.69,
and 0.69 for whole tumor, tumor core, and active tumor, respectively, and an
accuracy of 52% for overall survival prediction.
Comment: 10 pages, 6 figures
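The overlap metric reported above is the Dice coefficient, 2|A ∩ B| / (|A| + |B|) for a predicted and a ground-truth segmentation. A minimal sketch on flat binary voxel masks (the actual evaluation is over full 3-D MRI volumes):

```python
def dice(pred, truth):
    """Dice coefficient between two binary masks, given here as flat
    lists of 0/1 voxel labels: twice the intersection size divided by
    the sum of the two mask sizes."""
    inter = sum(p & t for p, t in zip(pred, truth))
    return 2 * inter / (sum(pred) + sum(truth))

pred  = [1, 1, 1, 0, 0, 1]   # toy predicted tumor mask
truth = [1, 1, 0, 0, 1, 1]   # toy ground-truth mask
print(dice(pred, truth))  # → 0.75
```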