3D Monitor Based on Head Pose Detection
With advances in image processing and stereoscopic display, falling webcam prices, and growing computing power, it has become possible to enhance the user's experience when working with 3D programs. The position of the user's head can be estimated from a webcam image, and the three-dimensional scene shown on the monitor can be rotated according to that position. As the user moves their head, the monitor then appears to be a window through which the scene behind it can be viewed. The system resulting from this work makes it possible to add this behaviour to any 3D program simply and cheaply.
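One common way to obtain the "monitor as a window" effect is an off-axis (asymmetric) view frustum driven by the tracked head position. The sketch below is an illustration under stated assumptions, not the thesis's implementation: the abstract only says the scene view is adjusted according to head position. The function name `off_axis_frustum`, the coordinate convention (origin at the screen centre, z pointing towards the viewer, all lengths in metres) and the parameters are hypothetical.

```python
def off_axis_frustum(head, screen_w, screen_h, near):
    """Return (left, right, bottom, top) frustum bounds at the near plane.

    head     -- (x, y, z) head position relative to the screen centre,
                with z > 0 the distance from the screen plane (assumed convention)
    screen_w -- physical screen width
    screen_h -- physical screen height
    near     -- distance of the near clipping plane from the eye
    """
    ex, ey, ez = head
    if ez <= 0:
        raise ValueError("head must be in front of the screen (z > 0)")
    # Similar triangles: scale the screen edges, seen from the eye,
    # from the screen plane (distance ez) down to the near plane.
    scale = near / ez
    left = (-screen_w / 2 - ex) * scale
    right = (screen_w / 2 - ex) * scale
    bottom = (-screen_h / 2 - ey) * scale
    top = (screen_h / 2 - ey) * scale
    return left, right, bottom, top


# Head 10 cm to the right of centre, 50 cm from a 50 cm x 30 cm screen.
bounds = off_axis_frustum((0.1, 0.0, 0.5), 0.5, 0.3, 0.1)
```

The four bounds would typically be passed to a frustum-style projection call in the rendering API, together with a camera translation to the head position. When the head is centred the frustum is symmetric and reduces to an ordinary perspective projection.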
Parametric face alignment : generative and discriminative approaches
Doctoral thesis in Electrical and Computer Engineering, presented to the Faculty of Sciences and Technology of the University of Coimbra.
This thesis addresses the matching of deformable human face models to 2D images. Two different approaches are detailed: generative and discriminative methods. Generative or holistic methods model the appearance/texture of all image pixels describing the face by synthesizing the expected appearance (they build synthetic versions of the target face). Discriminative or patch-based methods model the local correlations between pixel values. Such an approach uses an ensemble of local feature detectors, all connected by a shape regularization model. Typically, generative approaches can achieve higher fitting accuracy, but discriminative methods perform considerably better on unseen images.
Active Appearance Models (AAMs) are probably the most widely used generative technique. AAMs match parametric models of shape and appearance to new images by solving a nonlinear optimization that minimizes the difference between a synthetic template and the real appearance. The first part of this thesis describes the 2.5D AAM, an extension of the original 2D AAM that deals with a full perspective projection model. The 2.5D AAM uses a 3D Point Distribution Model (PDM) and a 2D appearance model whose control points are defined by a perspective projection of the PDM. Two model fitting algorithms and their computationally efficient approximations are proposed: the Simultaneous Forwards Additive (SFA) and the Normalization Forwards Additive (NFA). Robust solutions for the SFA and NFA are also proposed in order to take into account the self-occlusion and/or partial occlusion of the face. Extensive results are shown, covering fitting convergence, fitting performance on unseen data, robustness to occlusion, tracking performance and pose estimation.
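To make the "perspective projection of the PDM" concrete, here is a minimal sketch, assuming a linear PDM (mean shape plus weighted deformation modes) and a simple pinhole camera. The function names, the data layout and the camera parameters are hypothetical and are not taken from the thesis; the fitting algorithms (SFA, NFA) are not shown.

```python
def pdm_shape(mean_shape, modes, params):
    """Linear PDM: mean shape plus a weighted sum of deformation modes.

    mean_shape -- list of (X, Y, Z) points
    modes      -- list of modes, each a list of (dX, dY, dZ) per point
    params     -- one weight per mode
    """
    shape = [list(pt) for pt in mean_shape]
    for weight, mode in zip(params, modes):
        for pt, delta in zip(shape, mode):
            for k in range(3):
                pt[k] += weight * delta[k]
    return shape


def project(shape, rotation, translation, focal, center=(0.0, 0.0)):
    """Rigidly transform 3D points, then apply a pinhole perspective projection."""
    points_2d = []
    for point in shape:
        # Camera-frame coordinates: R * point + t
        cam = [
            sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
            for r in range(3)
        ]
        if cam[2] <= 0:
            raise ValueError("point is behind the camera")
        u = focal * cam[0] / cam[2] + center[0]
        v = focal * cam[1] / cam[2] + center[1]
        points_2d.append((u, v))
    return points_2d


# Toy two-point model with one deformation mode (illustrative values only).
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shape_3d = pdm_shape([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                     [[(0.0, 1.0, 0.0), (0.0, 0.0, 0.0)]],
                     [2.0])
control_points = project(shape_3d, identity, (0.0, 0.0, 10.0), focal=100.0)
```

In such a setup the projected 2D points would serve as the control points of the 2D appearance model, so the image-plane shape changes with both the deformation weights and the 3D pose.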
The second main part of this thesis concerns discriminative methods such as Constrained Local Models (CLM) or Active Shape Models (ASM), where an ensemble of local feature detectors is constrained to lie within the subspace spanned by a PDM. Fitting such a model to an image typically involves two steps: (1) a local search using a detector, obtaining response maps for each landmark, and (2) a global optimization that finds the shape parameters that jointly maximize all the detection responses. This work proposes Discriminative Bayesian Active Shape Models (DBASM), a new global optimization strategy using a Bayesian approach, where the posterior distribution of the shape parameters is inferred in a maximum a posteriori (MAP) sense by means of a Linear Dynamical System (LDS). The DBASM approach models the covariance of the latent variables, i.e. it uses second-order statistics of the shape (and pose) parameters. Later, Bayesian Active Shape Models (BASM) is presented. BASM is an extension of the previous DBASM formulation where the prior distribution is explicitly modeled by means of recursive Bayesian estimation. Extensive results are presented, evaluating the DBASM and BASM global optimization strategies, local face parts detectors and tracking performance on several standard datasets. Qualitative results taken from the challenging Labeled Faces in the Wild (LFW) dataset are also shown.
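As a rough illustration of MAP inference with a Gaussian prior and a Gaussian observation model, the sketch below applies the standard Kalman-style measurement update to a single scalar shape parameter. It is a deliberately simplified stand-in: the thesis works with full covariances over shape and pose parameters, and the function name and the numbers here are hypothetical. Reusing each posterior as the next prior mimics the idea of recursive Bayesian estimation.

```python
def map_update(prior_mean, prior_var, obs, obs_var):
    """MAP estimate of a scalar parameter under Gaussian prior and likelihood.

    For Gaussians the posterior is Gaussian, so its mean is also the MAP
    estimate. This is the scalar form of the Kalman measurement update.
    """
    gain = prior_var / (prior_var + obs_var)      # trust in the observation
    post_mean = prior_mean + gain * (obs - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var


# Recursive use: the posterior from one observation becomes the next prior.
mean, var = 0.0, 1.0                   # assumed initial prior
for observation in (2.0, 2.0):         # assumed detector-derived estimates
    mean, var = map_update(mean, var, observation, obs_var=1.0)
```

A noisy detector response (large `obs_var`) yields a small gain, so the estimate stays close to the prior; a confident response pulls the estimate towards the observation.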
Finally, the last part of this thesis addresses identity and facial expression recognition. Face geometry is extracted from input images using the AAM, and low-dimensional manifolds are then derived using Laplacian EigenMaps (LE), resulting in two types of manifolds, one representing identity and the other person-specific facial expression. The identity and facial expression recognition system uses a two-stage approach: first, a Support Vector Machine (SVM) is used to establish identity across expression changes; then the second stage deals with person-specific expression recognition with a network of Hidden Markov Models (HMMs). Results taken from people exhibiting the six basic expressions (happiness, sadness, anger, fear, surprise
and disgust) plus the neutral emotion are shown.
Towards 3D facial morphometry: facial image analysis applications in anesthesiology and 3D spectral nonrigid registration
In anesthesiology, the detection and anticipation of difficult tracheal intubation is crucial for patient safety. When undergoing general anesthesia, a patient who is unexpectedly difficult to intubate risks potential life-threatening complications with poor clinical outcomes, ranging from severe harm to brain damage or death. Conversely, in cases of suspected difficulty, specific equipment and personnel will be called upon to increase safety and the chances of successful intubation. Research in anesthesiology has associated a certain number of morphological features of the face and neck with higher risk of difficult intubation. Detecting and analyzing these and other potential features, thus allowing the prediction of difficulty of tracheal intubation in a robust, objective, and automatic way, may therefore improve the patients' safety. In this thesis, we first present a method to automatically classify images of the mouth cavity according to the visibility of certain oropharyngeal structures. This method is then integrated into a novel and completely automatic method, based on frontal and profile images of the patient's face, to predict the difficulty of intubation. We also provide a new database of three dimensional (3D) facial scans and present the initial steps towards a complete 3D model of the face suitable for facial morphometry applications, which include difficult tracheal intubation prediction. In order to develop and test our proposed method, we collected a large database of multimodal recordings of over 2700 patients undergoing general anesthesia. In the first part of this thesis, using two dimensional (2D) facial image analysis methods, we automatically extract morphological and appearance-based features from these images. These are used to train a classifier, which learns to discriminate between patients as being easy or difficult to intubate. 
We validate our approach on two different scenarios, one of them being close to a real-world clinical scenario, using 966 patients, and demonstrate that the proposed method achieves performance comparable to medical diagnosis-based predictions by experienced anesthesiologists. In the second part of this thesis, we focus on the development of a new 3D statistical model of the face to overcome some of the limitations of 2D methods. We first present EPFL3DFace, a new database of 3D facial expression scans, containing 120 subjects, performing 35 different facial expressions. Then, we develop a nonrigid alignment method to register the scans and allow for statistical analysis. Our proposed method is based on spectral geometry processing and makes use of an implicit representation of the scans in order to be robust to noise or holes in the surfaces. It presents the significant advantage of reducing the number of free parameters to optimize for in the alignment process by two orders of magnitude. We apply our proposed method on the data collected and discuss qualitative results. At its current level of performance, our fully automatic method to predict difficult intubation already has the potential to reduce the cost, and increase the availability of such predictions, by not relying on qualified anesthesiologists with years of medical training. Further data collection, in order to increase the number of patients who are difficult to intubate, as well as extracting morphological features from a 3D representation of the face are key elements to further improve the performance