From 3D Point Clouds to Pose-Normalised Depth Maps
We consider the problem of generating either pairwise-aligned or pose-normalised depth maps from noisy 3D point clouds in relatively unrestricted poses. Our system is deployed in a 3D face alignment application and consists of four stages: (i) data filtering; (ii) nose tip identification and sub-vertex localisation; (iii) computation of the (relative) face orientation; and (iv) generation of either a pose-aligned or a pose-normalised depth map. We generate an implicit radial basis function (RBF) model of the facial surface, which is employed in all four stages of the process. For example, in stage (ii), construction of novel invariant features is based on sampling this RBF over a set of concentric spheres to give a spherically-sampled RBF (SSR) shape histogram. In stage (iii), a second novel descriptor, called an isoradius contour curvature signal, is defined; it allows rotational alignment to be determined using a simple process of 1D correlation. We test our system on both the University of York (UoY) 3D face dataset and the Face Recognition Grand Challenge (FRGC) 3D data. For the more challenging UoY data, our SSR descriptors significantly outperform three variants of spin images, successfully identifying nose vertices at a rate of 99.6%. Nose localisation performance on the higher-quality FRGC data, which has only small pose variations, is 99.9%. Our best system successfully normalises the pose of 3D faces at rates of 99.1% (UoY data) and 99.6% (FRGC data).
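The abstract does not give implementation details, but the stage (iii) step, recovering a rotation by 1D correlation of periodic signals, can be sketched as follows. This is a hypothetical NumPy illustration; the synthetic signal stands in for an isoradius contour curvature signal and is not the authors' code:

```python
import numpy as np

def rotational_offset(signal, reference):
    """Estimate the circular shift (in samples) between two periodic 1D
    signals, e.g. isoradius contour curvature signals, via FFT-based
    circular cross-correlation."""
    corr = np.fft.ifft(np.fft.fft(signal) * np.conj(np.fft.fft(reference))).real
    return int(np.argmax(corr))  # shift that best aligns reference to signal

# Toy example: a signal with a full period of n samples (one per degree)
# and a copy rotated by 45 degrees.
n = 360
t = np.linspace(0, 2 * np.pi, n, endpoint=False)
ref = np.sin(t) + 0.5 * np.cos(3 * t)
rotated = np.roll(ref, 45)
print(rotational_offset(rotated, ref))  # prints 45
```

With one sample per degree, the recovered shift is directly the rotation angle; a real pipeline would interpolate around the correlation peak for sub-sample accuracy.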
Robust Registration of Dynamic Facial Sequences.
Accurate face registration is a key step for several image analysis applications. However, existing registration methods are prone to temporal drift errors or jitter among consecutive frames. In this paper, we propose an iterative rigid registration framework that estimates the misalignment with trained regressors. The input to the regressors is a robust motion representation that encodes the motion between a misaligned frame and the reference frame(s) and enables reliable performance under non-uniform illumination variations. Drift errors are reduced when the motion representation is computed from multiple reference frames. Furthermore, we use the L2 norm of the representation as a cue for performing coarse-to-fine registration efficiently. Importantly, the framework can identify registration failures and correct them. Experiments show that the proposed approach achieves significantly higher registration accuracy than state-of-the-art techniques on challenging sequences. The research work of Evangelos Sariyanidi and Hatice Gunes has been partially supported by the EPSRC under its IDEAS Factory Sandpits call on Digital Personhood (Grant Ref.: EP/L00416X/1).
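As an illustration of how a misalignment energy can drive coarse-to-fine registration, here is a minimal sketch. It assumes integer translations only and uses the L2 norm of a plain frame difference in place of the paper's illumination-robust motion representation (both are simplifications, not the authors' method):

```python
import numpy as np

def misalignment_energy(frame, reference):
    """L2 norm of a simple motion representation: here the plain frame
    difference (the paper's representation is illumination-robust; this
    is a deliberate simplification)."""
    return np.linalg.norm(frame - reference)

def coarse_to_fine_shift(frame, reference, radius=8, coarse_step=4):
    """Toy coarse-to-fine search for the integer translation aligning
    `frame` to `reference`: scan a coarse grid of shifts first, then
    refine densely around the best coarse candidate."""
    def best(candidates):
        return min(candidates,
                   key=lambda s: misalignment_energy(
                       np.roll(frame, s, axis=(0, 1)), reference))
    coarse = [(dy, dx)
              for dy in range(-radius, radius + 1, coarse_step)
              for dx in range(-radius, radius + 1, coarse_step)]
    cy, cx = best(coarse)
    fine = [(cy + dy, cx + dx)
            for dy in range(-coarse_step, coarse_step + 1)
            for dx in range(-coarse_step, coarse_step + 1)]
    return best(fine)

# Synthetic frames: a smooth blob, and a copy translated by (3, -2).
y, x = np.mgrid[0:32, 0:32]
reference = np.exp(-((y - 16) ** 2 + (x - 16) ** 2) / 40.0)
frame = np.roll(reference, (3, -2), axis=(0, 1))
print(coarse_to_fine_shift(frame, reference))  # prints (-3, 2)
```

The coarse pass evaluates far fewer candidates than an exhaustive scan; the fine pass only searches the neighbourhood of the coarse minimum.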
SmileNet: Registration-Free Smiling Face Detection in the Wild
We present a novel smiling face detection framework called SmileNet for detecting faces and recognising smiles in the wild. SmileNet uses a Fully Convolutional Neural Network (FCNN) to detect multiple smiling faces in images of varying resolution. Our contributions are three-fold: 1) SmileNet is the first smiling face detection network that does not require pre-processing, such as face detection and registration, to generate a normalised (cropped and aligned) input image in advance; 2) SmileNet is a simple, single FCNN architecture that simultaneously performs face detection and smile recognition, which are conventionally treated as separate consecutive pipelines; and 3) SmileNet ensures real-time processing speed (21.15 FPS) even when detecting multiple smiling faces in a given 300x300 image. Experimental results show that SmileNet delivers state-of-the-art performance (95.76%), even under occlusions and variations in pose, scale, and illumination. This work is supported by the Technology Strategy Board / Innovate UK project Sensing Feeling (project no. 102547).
The Value of Seizure Semiology in Epilepsy Surgery: Epileptogenic-Zone Localisation in Presurgical Patients using Machine Learning and Semiology Visualisation Tool
Background
Eight million individuals have focal drug resistant epilepsy worldwide. If their epileptogenic focus is identified and resected, they may become seizure-free and experience significant improvements in quality of life. However, seizure-freedom occurs in less than half of surgical resections.
Seizure semiology - the signs and symptoms during a seizure - along with brain imaging and electroencephalography (EEG) are amongst the mainstays of seizure localisation. Although there have been advances in algorithmic identification of abnormalities on EEG and imaging, semiological analysis has remained more subjective.
The primary objective of this research was to investigate the localising value of clinician-identified semiology, and secondarily to improve personalised prognostication for epilepsy surgery.
Methods
I data mined retrospective hospital records to link semiology to outcomes. I trained machine learning models to predict temporal lobe epilepsy (TLE) and determine the value of semiology compared to a benchmark of hippocampal sclerosis (HS).
Because the hospital dataset was relatively small, we also collected data through a systematic review of the literature to curate an open-access Semio2Brain database. We built the Semiology-to-Brain Visualisation Tool (SVT) on this database and retrospectively validated SVT in two separate groups: randomly selected patients, and individuals with frontal lobe epilepsy.
Separately, a systematic review of multimodal prognostic features of epilepsy surgery was undertaken.
The concept of a semiological connectome was devised and compared to structural connectivity to investigate probabilistic propagation and semiology generation.
Results
Although a (non-chronological) list of patients’ semiologies did not improve localisation beyond the initial semiology, the list added value when combined with an imaging feature. The absolute added value of semiology in a support vector classifier diagnosing TLE, compared to the HS benchmark, was 25%. Semiology was, however, unable to predict postsurgical outcomes. To help future prognostic models, a list of essential multimodal prognostic features for epilepsy surgery was extracted from meta-analyses and a structural causal model was proposed.
Semio2Brain consists of over 13,000 semiological datapoints from 4,643 patients across 309 studies and uniquely enabled a Bayesian approach to localisation that mitigates TLE publication bias. SVT performed well in retrospective validation, matching the best expert clinician’s localisation scores and exceeding them for lateralisation, and showed modest localisation value in individuals with frontal lobe epilepsy (FLE).
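As a hedged sketch of what a Bayesian approach to semiological localisation might look like, the following toy example applies Bayes' rule over a handful of invented counts. The zones, semiologies, and numbers are placeholders, not Semio2Brain data:

```python
# Hypothetical sketch of Bayesian semiological localisation: a posterior
# over brain zones given an observed semiology, with an explicit prior
# that can down-weight over-reported zones (e.g. temporal lobe
# publication bias). Zones, semiologies and counts are all invented.
datapoints = {  # semiology -> {zone: number of reported datapoints}
    "epigastric aura": {"temporal": 80, "frontal": 10, "parietal": 5},
    "hypermotor":      {"temporal": 20, "frontal": 60, "parietal": 5},
}

def likelihood(semiology, zone):
    """P(semiology | zone): reports of this semiology arising from the
    zone, over all reports arising from that zone."""
    zone_total = sum(counts.get(zone, 0) for counts in datapoints.values())
    return datapoints[semiology].get(zone, 0) / zone_total

def posterior(semiology, prior):
    """Bayes' rule: P(zone | semiology) is proportional to
    P(semiology | zone) * P(zone)."""
    unnorm = {z: likelihood(semiology, z) * p for z, p in prior.items()}
    total = sum(unnorm.values())
    return {z: v / total for z, v in unnorm.items()}

flat_prior = {"temporal": 1 / 3, "frontal": 1 / 3, "parietal": 1 / 3}
post = posterior("hypermotor", flat_prior)  # frontal has the highest mass
```

Normalising the likelihood per zone, rather than using raw counts, is what lets a prior counteract the over-representation of heavily published zones.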
There was a significant correlation between the number of connecting fibres between brain regions and the seizure semiologies that can arise from these regions.
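A minimal illustration of such a correlation analysis, with invented counts (not the study's data):

```python
import numpy as np

# Paired observations for several region pairs: number of connecting
# fibres (structural connectivity) and number of seizure semiologies
# shared by the pair (semiological connectome). All values are invented.
fibre_counts = np.array([120, 45, 300, 80, 10, 200], dtype=float)
shared_semiologies = np.array([9, 4, 15, 6, 1, 11], dtype=float)

# Pearson correlation coefficient between the two measures.
r = np.corrcoef(fibre_counts, shared_semiologies)[0, 1]
```

In practice a rank correlation or a permutation test would likely be preferred, given the skewed distribution of fibre counts.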
Conclusions
Semiology is valuable in localisation, but multimodal concordance is more valuable and highly prognostic. SVT could be suitable for use in multimodal models to predict the seizure focus.
A Comprehensive Performance Evaluation of Deformable Face Tracking "In-the-Wild"
Recently, technologies such as face detection, facial landmark localisation
and face recognition and verification have matured enough to provide effective
and efficient solutions for imagery captured under arbitrary conditions
(referred to as "in-the-wild"). This is partially attributed to the fact that
comprehensive "in-the-wild" benchmarks have been developed for face detection,
landmark localisation and recognition/verification. A very important technology
that has not been thoroughly evaluated yet is deformable face tracking
"in-the-wild". Until now, the performance has mainly been assessed
qualitatively by visually assessing the result of a deformable face tracking
technology on short videos. In this paper, we perform the first, to the best of
our knowledge, thorough evaluation of state-of-the-art deformable face tracking
pipelines using the recently introduced 300VW benchmark. We evaluate many
different architectures focusing mainly on the task of on-line deformable face
tracking. In particular, we compare the following general strategies: (a)
generic face detection plus generic facial landmark localisation, (b) generic
model free tracking plus generic facial landmark localisation, as well as (c)
hybrid approaches using state-of-the-art face detection, model free tracking
and facial landmark localisation technologies. Our evaluation reveals future
avenues for further research on the topic. Comment: E. Antonakos and P. Snape contributed equally and have joint second authorship.
Pupil Localisation and Eye Centre Estimation using Machine Learning and Computer Vision
Various methods have been used to estimate the pupil location within an image or a real-time video frame in many fields. However, these methods perform poorly on low-resolution images and under varying background conditions. We propose a coarse-to-fine pupil localisation method using a composite of machine learning and image processing algorithms. First, a pre-trained model is employed for facial landmark identification to extract the desired eye frames within the input image. We then use multi-stage convolution to find the optimal horizontal and vertical coordinates of the pupil within the identified eye frames. For this purpose, we define an adaptive kernel to deal with the varying resolution and size of input images. Furthermore, a dynamic threshold is calculated recursively for reliable identification of the best-matched candidate. We evaluated our method using various statistical and standard metrics, along with a standardised distance metric introduced for the first time in this study. The proposed method outperforms previous works in terms of accuracy and reliability when benchmarked on multiple standard datasets. The work has diverse artificial intelligence and industrial applications, including human-computer interfaces, emotion recognition, psychological profiling, healthcare, and automated deception detection.
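The multi-stage convolution with an adaptive kernel is not specified in the abstract; the following simplified sketch only conveys the underlying idea of convolving intensity profiles with a size-adaptive kernel to localise the pupil as the darkest compact region. It is an illustrative stand-in, not the published method:

```python
import numpy as np

def locate_pupil(eye_frame):
    """Toy pupil estimate: assume the pupil is the darkest compact
    region, convolve the inverted row/column intensity profiles with a
    box kernel whose width adapts to the frame height, and take the
    response peaks as the (row, column) coordinates."""
    inv = eye_frame.max() - eye_frame.astype(float)  # darker -> larger
    k = max(3, eye_frame.shape[0] // 8)              # adaptive kernel width
    kernel = np.ones(k) / k
    row_profile = np.convolve(inv.sum(axis=1), kernel, mode="same")
    col_profile = np.convolve(inv.sum(axis=0), kernel, mode="same")
    return int(np.argmax(row_profile)), int(np.argmax(col_profile))

# Synthetic eye frame: bright background with a dark disc at (20, 34).
y, x = np.mgrid[0:40, 0:60]
frame = np.full((40, 60), 200.0)
frame[(y - 20) ** 2 + (x - 34) ** 2 < 25] = 10.0
print(locate_pupil(frame))  # prints (20, 34)
```

A real system would add the coarse-to-fine refinement and the recursive dynamic threshold described above to reject weak or ambiguous candidates.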
Hybrid component-based face recognition.
Masters Degree. University of KwaZulu-Natal, Pietermaritzburg. Facial recognition (FR) is a trusted biometric method for authentication. Compared to other biometrics such as the signature, which can be compromised, facial recognition is non-intrusive and can be captured at a distance in a concealed manner. It has a significant role in conveying the identity of a person in social interaction, and its performance largely depends on a variety of factors such as illumination, facial pose, expression, age span, hair, facial wear, and motion. In light of these considerations, this dissertation proposes a hybrid component-based approach that seeks to utilise any successfully detected components.
This research proposes a facial recognition technique that recognises faces at the component level. It employs the texture descriptors Grey-Level Co-occurrence Matrix (GLCM), Gabor filters, Speeded-Up Robust Features (SURF) and Scale-Invariant Feature Transform (SIFT), and the shape descriptor Zernike moments. The advantage of the texture attributes is their simplicity; however, they cannot completely characterise a face on their own, hence the Zernike moments descriptor was used to compute the shape properties of the selected facial components. These descriptors are effective feature representations of facial components and are robust to illumination and pose changes.
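Of the descriptors listed, the GLCM is straightforward to sketch. Below is a minimal NumPy implementation of a single-offset co-occurrence matrix and one classic Haralick feature (contrast); it follows the standard textbook definition rather than the dissertation's exact configuration:

```python
import numpy as np

def glcm(image, levels, offset=(0, 1)):
    """Grey-Level Co-occurrence Matrix for one (non-negative) pixel
    offset: m[i, j] counts how often grey level i co-occurs with grey
    level j at the given (dy, dx) offset."""
    dy, dx = offset
    m = np.zeros((levels, levels), dtype=int)
    h, w = image.shape
    for yy in range(h - dy):
        for xx in range(w - dx):
            m[image[yy, xx], image[yy + dy, xx + dx]] += 1
    return m

def contrast(m):
    """A classic Haralick texture feature derived from the GLCM:
    expected squared grey-level difference between co-occurring pixels."""
    p = m / m.sum()                 # normalise counts to probabilities
    i, j = np.indices(p.shape)
    return float(((i - j) ** 2 * p).sum())

# Tiny 3-level image; horizontal neighbours only (offset (0, 1)).
img = np.array([[0, 0, 1],
                [0, 1, 1],
                [2, 2, 2]], dtype=int)
m = glcm(img, levels=3)
print(m)
print(contrast(m))  # prints 0.333... (two of six pairs differ by 1)
```

In a component-based pipeline, such features would be computed per facial component and concatenated with the other descriptors before classification.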
Experiments were performed on four state-of-the-art facial databases (FERET, FEI, SCface and CMU), and Error-Correcting Output Codes (ECOC) were used for classification. The results show that component-based facial recognition is more effective than whole-face recognition, with the proposed methods achieving a recognition accuracy rate of 98.75%. This approach performs well compared to other component-based facial recognition approaches.