Fast and Accurate Algorithm for Eye Localization for Gaze Tracking in Low Resolution Images
Iris centre localization in low-resolution visible images is a challenging
problem in the computer vision community due to noise, shadows, occlusions, pose
variations, eye blinks, etc. This paper proposes an efficient method for
determining iris centre in low-resolution images in the visible spectrum. Even
low-cost consumer-grade webcams can be used for gaze tracking without any
additional hardware. A two-stage algorithm is proposed for iris centre
localization. The proposed method uses geometrical characteristics of the eye.
In the first stage, a fast convolution based approach is used for obtaining the
coarse location of iris centre (IC). The IC location is further refined in the
second stage using boundary tracing and ellipse fitting. The algorithm has been
evaluated on public databases such as BioID and Gi4E and is found to outperform
state-of-the-art methods.
Comment: 12 pages, 10 figures, IET Computer Vision, 201
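The coarse stage can be illustrated with a toy example. The sketch below is not the paper's filter: it assumes the iris is simply the darkest compact region and swaps in a plain box-filter convolution to find it. The function name `coarse_dark_centre`, the `radius` parameter, and the synthetic image are all hypothetical. A second stage (boundary tracing and ellipse fitting in the abstract) would then refine this estimate.

```python
def coarse_dark_centre(image, radius):
    """Return (row, col) of the window centre with the lowest summed intensity.

    image  : list of equal-length lists of grey levels (0 = black)
    radius : half-width of the square box kernel (hypothetical parameter)
    """
    rows, cols = len(image), len(image[0])
    best, best_pos = None, None
    # Slide a (2*radius+1)^2 box kernel over every position where it fits fully.
    for r in range(radius, rows - radius):
        for c in range(radius, cols - radius):
            total = sum(
                image[rr][cc]
                for rr in range(r - radius, r + radius + 1)
                for cc in range(c - radius, c + radius + 1)
            )
            if best is None or total < best:
                best, best_pos = total, (r, c)
    return best_pos


# Synthetic 9x9 "eye": bright background with a dark 3x3 blob centred at (4, 5).
eye = [[200] * 9 for _ in range(9)]
for r in range(3, 6):
    for c in range(4, 7):
        eye[r][c] = 20

centre = coarse_dark_centre(eye, radius=1)
```

The brute-force loop is quadratic in the kernel size; a practical version would more likely use an integral image or an FFT-based convolution.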
Recognition of 3-D Objects from Multiple 2-D Views by a Self-Organizing Neural Architecture
The recognition of 3-D objects from sequences of their 2-D views is modeled by a neural architecture called VIEWNET, which uses View Information Encoded With NETworks. VIEWNET illustrates how several types of noise and variability in image data can be progressively removed while incomplete image features are restored and invariant features are discovered using an appropriately designed cascade of processing stages. VIEWNET first processes 2-D views of 3-D objects using the CORT-X 2 filter, which discounts the illuminant, regularizes and completes figural boundaries, and removes noise from the images. Boundary regularization and completion are achieved by the same mechanisms that suppress image noise. A log-polar transform is taken with respect to the centroid of the resulting figure and then re-centered to achieve 2-D scale and rotation invariance. The invariant images are coarse coded to further reduce noise, reduce foreshortening effects, and increase generalization. These compressed codes are input into a supervised learning system based on the fuzzy ARTMAP algorithm. Recognition categories of 2-D views are learned before evidence from sequences of 2-D view categories is accumulated to improve object recognition. Recognition is studied with noisy and clean images using slow and fast learning. VIEWNET is demonstrated on an MIT Lincoln Laboratory database of 2-D views of jet aircraft with and without additive noise. A recognition rate of 90% is achieved with one 2-D view category and of 98.5% with three 2-D view categories.
National Science Foundation (IRI 90-24877); Office of Naval Research (N00014-91-J-1309, N00014-91-J-4100, N00014-92-J-0499); Air Force Office of Scientific Research (F9620-92-J-0499, 90-0083
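The role of the log-polar step can be seen in a small numerical sketch. It is not VIEWNET code; the helper names `log_polar` and `transform` and the sample coordinates are hypothetical. It shows why the transform helps: scaling a figure about its centroid becomes a constant shift along the log-radius axis, and rotating it becomes a constant shift along the angle axis, so re-centering the log-polar image removes both.

```python
import math


def log_polar(x, y, cx, cy):
    """Map a point to (log radius, angle) relative to the centroid (cx, cy)."""
    dx, dy = x - cx, y - cy
    return math.log(math.hypot(dx, dy)), math.atan2(dy, dx)


def transform(x, y, cx, cy, scale, angle):
    """Scale and rotate a point about the centroid (cx, cy)."""
    dx, dy = x - cx, y - cy
    ca, sa = math.cos(angle), math.sin(angle)
    return cx + scale * (ca * dx - sa * dy), cy + scale * (sa * dx + ca * dy)


centroid = (10.0, 10.0)
point = (13.0, 14.0)  # a boundary point of some figure (illustrative values)

rho0, theta0 = log_polar(*point, *centroid)
moved = transform(*point, *centroid, scale=2.0, angle=math.pi / 6)
rho1, theta1 = log_polar(*moved, *centroid)

# Scaling by 2 shifts log-radius by log(2); rotating by 30 degrees shifts the angle by pi/6.
d_rho = rho1 - rho0
d_theta = theta1 - theta0
```

Because every point of the figure receives the same two shifts, the transformed image is a translated copy of the original in log-polar coordinates (up to wrap-around in the angle axis).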
Ensemble of Hankel Matrices for Face Emotion Recognition
In this paper, a face emotion is considered as the result of the composition
of multiple concurrent signals, each corresponding to the movements of a
specific facial muscle. These concurrent signals are represented by means of a
set of multi-scale appearance features that might be correlated with one or
more concurrent signals. The extraction of these appearance features from a
sequence of face images yields a set of time series. This paper proposes to
use the dynamics regulating each appearance feature time series to distinguish
among different face emotions. To this end, an ensemble of Hankel matrices
corresponding to the extracted time series is used for emotion classification
within a framework that combines nearest neighbor and a majority vote schema.
Experimental results on a publicly available dataset show that the adopted
representation is promising and yields state-of-the-art accuracy in emotion
classification.
Comment: Paper to appear in Proc. of ICIAP 2015. arXiv admin note: text
overlap with arXiv:1506.0500
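For readers unfamiliar with the representation, a Hankel matrix stacks shifted windows of a time series so that entry (i, j) equals sample i + j. The sketch below builds one from a scalar series and shows a plain majority vote. The function names, the toy series, and the per-feature labels are hypothetical, and the paper's actual distance measure and feature extraction are not reproduced here.

```python
from collections import Counter


def hankel(series, rows):
    """Build a Hankel matrix H with H[i][j] = series[i + j]."""
    cols = len(series) - rows + 1
    if rows < 1 or cols < 1:
        raise ValueError("rows must be between 1 and len(series)")
    return [[series[i + j] for j in range(cols)] for i in range(rows)]


def majority_vote(labels):
    """Return the most frequent label among per-feature decisions."""
    return Counter(labels).most_common(1)[0][0]


# One appearance-feature time series (illustrative values only).
feature_series = [1, 2, 3, 4, 5]
H = hankel(feature_series, rows=3)

# Hypothetical nearest-neighbour decisions, one per feature's Hankel matrix.
decision = majority_vote(["happy", "sad", "happy"])
```

Each anti-diagonal of `H` is constant, which is the defining property of a Hankel matrix.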
Determining the Mass of Kepler-78b With Nonparametric Gaussian Process Estimation
Kepler-78b is a transiting planet that is 1.2 times the radius of Earth and
orbits a young, active K dwarf every 8 hours. The mass of Kepler-78b has been
independently reported by two teams based on radial velocity measurements using
the HIRES and HARPS-N spectrographs. Due to the active nature of the host star,
a stellar activity model is required to distinguish and isolate the planetary
signal in radial velocity data. Whereas previous studies tested parametric
stellar activity models, we modeled this system using nonparametric Gaussian
process (GP) regression. We produced a GP regression of relevant Kepler
photometry. We then used the posterior parameter distribution for our
photometric fit as a prior for our simultaneous GP + Keplerian orbit models of
the radial velocity datasets. We tested three simple kernel functions for our
GP regressions. Based on a Bayesian likelihood analysis, we selected a
quasi-periodic kernel model with GP hyperparameters coupled between the two RV
datasets, giving a Doppler amplitude of $1.86 \pm 0.25$ m s$^{-1}$ and
supporting our belief that the correlated noise we are modeling is
astrophysical. The corresponding mass of $1.87\,M_\oplus$
is consistent with that measured in previous studies, and more robust due to
our nonparametric signal estimation. Based on our mass and the radius
measurement from transit photometry, Kepler-78b has a bulk density of
$6.0$ g cm$^{-3}$. We estimate that Kepler-78b is $32 \pm 26\%$ iron
using a two-component rock-iron model. This is consistent with an Earth-like
composition, with uncertainty spanning Moon-like to Mercury-like compositions.
Comment: 10 pages, 5 figures, accepted to ApJ 6/16/201
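The quoted bulk density can be cross-checked from the abstract's own numbers. The sketch below takes the mass as 1.87 Earth masses and the radius as 1.2 Earth radii, and assumes a mean Earth density of 5.514 g cm^-3 (a standard reference value, not taken from the abstract). Density scales as mass divided by radius cubed, so working in Earth units avoids any unit conversion.

```python
EARTH_MEAN_DENSITY = 5.514  # g cm^-3, assumed reference value


def bulk_density(mass_earth, radius_earth):
    """Bulk density in g cm^-3 for a mass and radius given in Earth units."""
    return EARTH_MEAN_DENSITY * mass_earth / radius_earth ** 3


# Kepler-78b values as quoted in the abstract.
rho = bulk_density(mass_earth=1.87, radius_earth=1.2)
```

The result is close to 6 g cm^-3, in line with the abstract. The check ignores the measurement uncertainties, which dominate the error on the real estimate.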
Interpretable Transformations with Encoder-Decoder Networks
Deep feature spaces have the capacity to encode complex transformations of
their input data. However, understanding the relative feature-space
relationship between two transformed encoded images is difficult. For instance,
what is the relative feature space relationship between two rotated images?
What is decoded when we interpolate in feature space? Ideally, we want to
disentangle confounding factors, such as pose, appearance, and illumination,
from object identity. Disentangling these is difficult because they interact in
very nonlinear ways. We propose a simple method to construct a deep feature
space, with explicitly disentangled representations of several known
transformations. A person or algorithm can then manipulate the disentangled
representation, for example, to re-render an image with explicit control over
parameterized degrees of freedom. The feature space is constructed using a
transforming encoder-decoder network with a custom feature transform layer,
acting on the hidden representations. We demonstrate the advantages of explicit
disentangling on a variety of datasets and transformations, and as an aid for
traditional tasks, such as classification.
Comment: Accepted at ICCV 201
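The abstract does not say how the feature transform layer acts, so the following is only one plausible reading, stated as an assumption: the hidden code is split into consecutive pairs, and each pair is rotated by a known angle with a 2-D rotation matrix. The function name `rotate_pairs` and the sample code vector are hypothetical. Under this assumption, transformations compose additively in feature space, which is what makes the representation easy to manipulate.

```python
import math


def rotate_pairs(features, angle):
    """Rotate each consecutive (a, b) pair of a feature vector by `angle` radians."""
    if len(features) % 2:
        raise ValueError("feature vector must have even length")
    c, s = math.cos(angle), math.sin(angle)
    out = []
    for i in range(0, len(features), 2):
        a, b = features[i], features[i + 1]
        out.extend([c * a - s * b, s * a + c * b])
    return out


code = [1.0, 0.0, 0.5, 0.5]  # illustrative encoder output

# A single 90-degree transform in feature space.
quarter = rotate_pairs(code, math.pi / 2)

# Applying 30 degrees and then 60 degrees gives the same result.
twice = rotate_pairs(rotate_pairs(code, math.pi / 6), math.pi / 3)
```

In an encoder-decoder setting, a decoder would then render the rotated code back to an image; that part is omitted here.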