11 research outputs found

    Adaptive face modelling for reconstructing 3D face shapes from single 2D images

    Example-based statistical face models using principal component analysis (PCA) have been widely deployed for three-dimensional (3D) face reconstruction and face recognition. Two factors are of general concern in such models: the size of the training dataset and the selection of examples in the training set. The representational power (RP) of an example-based model is its capability to depict a new 3D face for a given 2D face image, and it can be increased by correspondingly increasing the number of training samples. In this contribution, a novel approach is proposed to increase the RP of the 3D face reconstruction model by deforming a set of examples in the training dataset. A PCA-based 3D face model is adapted for each new near-frontal input face image to reconstruct the 3D face shape. Further, an extended Tikhonov regularisation method has been
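The core fitting step this abstract describes, recovering PCA shape coefficients under Tikhonov (ridge) regularisation, admits a closed form. The sketch below is illustrative only, not the authors' implementation; the function name, the toy data and the regularisation weight are all invented for the example.

```python
import numpy as np

def fit_pca_tikhonov(x_obs, mean, basis, lam=0.1):
    """Recover PCA coefficients b minimising
    ||x_obs - (mean + basis @ b)||^2 + lam * ||b||^2   (Tikhonov / ridge).
    Closed form: b = (A^T A + lam I)^{-1} A^T (x_obs - mean)."""
    A = basis                       # (d, k) matrix of principal components
    r = x_obs - mean                # centred observation
    k = A.shape[1]
    b = np.linalg.solve(A.T @ A + lam * np.eye(k), A.T @ r)
    return mean + A @ b, b

# Toy example: 5-dimensional "shapes" with a 2-component orthonormal basis.
rng = np.random.default_rng(0)
mean = np.zeros(5)
basis, _ = np.linalg.qr(rng.normal(size=(5, 2)))
x_true = mean + basis @ np.array([1.0, -0.5])
x_rec, b = fit_pca_tikhonov(x_true, mean, basis, lam=1e-6)
```

With an orthonormal basis the solution reduces to a shrunken projection, so a near-zero `lam` recovers the true shape almost exactly; larger `lam` trades fidelity for plausibility under the model.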

    3D reconstruction for plastic surgery simulation based on statistical shape models

    This thesis was carried out at Crisalix in collaboration with the Universitat Pompeu Fabra within the Doctorats Industrials programme. Crisalix has the mission of enhancing the communication between plastic surgery professionals and patients by providing an answer to the most common question during the surgery planning process: "How will I look after the surgery?". The solution proposed by Crisalix is based on 3D imaging technology. This technology generates a 3D reconstruction that accurately represents the area of the patient that is going to be operated on, followed by the possibility of creating multiple simulations of the plastic procedure, resulting in a representation of the possible outcomes of the surgery. This thesis presents a framework capable of reconstructing 3D shapes of faces and breasts of plastic surgery patients from 2D images and 3D scans. The 3D reconstruction of an object is a challenging problem with many inherent ambiguities. Statistical-model-based methods are a powerful approach to overcome some of these ambiguities. We follow the intuition of maximizing the use of available prior information by introducing it into statistical-model-based methods to enhance their properties. First, we explore Active Shape Models (ASM), a well-known method for 2D shape alignment. However, it is challenging to keep prior information (e.g. a small set of given landmarks) unchanged once the statistical model constraints are applied. We propose a new weighted regularized projection into the parameter space which allows us to obtain shapes that both fulfill the imposed shape constraints and are plausible according to the statistical model. Second, we extend this methodology to 3D Morphable Models (3DMM), a widespread method for 3D reconstruction. However, existing methods present some limitations. Some are based on computationally expensive non-linear optimizations that can get stuck in local minima. Another limitation is that not all methods provide enough resolution to accurately represent the anatomical details needed for this application. Given the medical use of the application, accuracy and robustness are important factors to take into consideration. We show how 3DMM initialization and 3DMM fitting can be improved using our weighted regularized projection. Finally, we present a framework capable of reconstructing 3D shapes of plastic surgery patients from two possible inputs: 2D images and 3D scans. Our method is used at different stages of the 3D reconstruction pipeline: shape alignment, 3DMM initialization and 3DMM fitting. The developed methods have been integrated into the production environment of Crisalix, proving their validity.
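A weighted regularized projection of the kind this abstract describes, keeping heavily weighted landmark coordinates (almost) fixed while the rest of the shape stays plausible under the statistical model, has a closed form. The sketch below is an illustrative reconstruction, not the thesis's code; the function name, weights and toy setup are assumptions.

```python
import numpy as np

def weighted_regularized_projection(x, mean, P, w, lam=1.0):
    """Project shape x into PCA parameter space with per-coordinate weights w.

    Solves  b = argmin (x - mean - P b)^T W (x - mean - P b) + lam * b^T b,
    with W = diag(w); closed form b = (P^T W P + lam I)^{-1} P^T W (x - mean).
    Large weights on given landmark coordinates keep them nearly unchanged,
    while lam pulls the shape towards the statistical model's plausible set."""
    W = np.diag(w)
    k = P.shape[1]
    b = np.linalg.solve(P.T @ W @ P + lam * np.eye(k), P.T @ W @ (x - mean))
    return b, mean + P @ b

# Toy check: uniform weights and a tiny lam reduce to an ordinary projection.
rng = np.random.default_rng(0)
mean = np.zeros(6)
P, _ = np.linalg.qr(rng.normal(size=(6, 3)))     # orthonormal PCA basis
x = mean + P @ np.array([0.5, -1.0, 2.0])        # shape inside the model span
w = np.ones(6)
b, x_proj = weighted_regularized_projection(x, mean, P, w, lam=1e-8)
```

Raising individual entries of `w` (e.g. the coordinates of clinician-placed landmarks) biases the solution towards matching those coordinates exactly, which is the behaviour the abstract attributes to the method.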

    Multimodal Three Dimensional Scene Reconstruction, The Gaussian Fields Framework

    The focus of this research is on building 3D representations of real-world scenes and objects using different imaging sensors: primarily range acquisition devices (such as laser scanners and stereo systems) that allow the recovery of 3D geometry, and multi-spectral image sequences, including visual and thermal IR images, that provide additional scene characteristics. The crucial technical challenge we addressed is the automatic point-set registration task. In this context our main contribution is the development of an optimization-based method at whose core lies a unified criterion that solves simultaneously for the dense point correspondence and transformation recovery problems. The new criterion has a straightforward expression in terms of the datasets and the alignment parameters and was used primarily for 3D rigid registration of point-sets; however, it also proved useful for feature-based multimodal image alignment. We derived our method from simple Boolean matching principles by approximation and relaxation. One of the main advantages of the proposed approach, compared to the widely used class of Iterative Closest Point (ICP) algorithms, is convexity in the neighborhood of the registration parameters and continuous differentiability, allowing the use of standard gradient-based optimization techniques. Physically, the criterion is interpreted in terms of a Gaussian force field exerted by one point-set on the other. This formulation proved useful for controlling and increasing the region of convergence, and hence allows more autonomy in correspondence tasks. Furthermore, the criterion can be computed with linear complexity using recently developed Fast Gauss Transform numerical techniques. In addition, we introduced a new local feature descriptor, derived from visual saliency principles, which significantly enhanced the performance of the registration algorithm.
    The resulting technique was subjected to a thorough experimental analysis that highlighted its strengths and showed its limitations. Our current applications are in the field of 3D modeling for inspection, surveillance, and biometrics. However, since this matching framework can be applied to any type of data that can be represented as N-dimensional point-sets, the scope of the method reaches many more pattern analysis applications.
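The Gaussian-field criterion described above, a smooth sum of Gaussians over all point pairs, can be sketched directly from its definition. The code below is a naive O(|P|·|Q|) illustration (the abstract's Fast Gauss Transform acceleration is omitted), with invented names and a toy 2D rigid-alignment check.

```python
import numpy as np

def gaussian_field_energy(P, Q, sigma=1.0):
    """Gaussian-field registration criterion between point sets P and Q:
    E = sum_ij exp(-||p_i - q_j||^2 / sigma^2).
    Smooth and differentiable in the alignment parameters, unlike ICP's
    hard nearest-neighbour assignment; alignment maximizes E."""
    d2 = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(-1)   # all pairwise squared distances
    return np.exp(-d2 / sigma**2).sum()

def rigid2d(P, theta, t):
    """Apply a 2D rotation by theta followed by translation t."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return P @ R.T + t

# Toy check: the energy is larger when the transform undoes the misalignment.
rng = np.random.default_rng(1)
t = np.array([1.0, -0.5])
Q = rng.normal(size=(30, 2))              # reference point set
P = rigid2d(Q, 0.3, t)                    # misaligned copy of Q
E_bad = gaussian_field_energy(P, Q)
E_good = gaussian_field_energy(rigid2d(P - t, -0.3, np.zeros(2)), Q)
```

Because `E` is differentiable in `theta` and `t`, a standard gradient-based optimizer can climb it, which is the advantage over ICP that the abstract emphasizes.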

    Interactive speech-driven facial animation

    One of the fastest-developing areas in the entertainment industry is digital animation. Television programmes and movies frequently use 3D animations to enhance or replace actors and scenery. With the increase in computing power, research is also being done to apply these animations in an interactive manner. Two of the biggest obstacles to the success of these undertakings are control (manipulating the models) and realism. This text describes many ways to improve the control and realism aspects so that interactive animation becomes possible. Specifically, lip-synchronisation (driven by human speech) and various modelling and rendering techniques are discussed. A prototype showing that interactive animation is feasible is also described.
    Mr. A. Hardy, Prof. S. von Solm

    3D face modelling from sparse data

    EThOS - Electronic Theses Online Service, United Kingdom

    Animation of a hierarchical image based facial model and perceptual analysis of visual speech

    In this thesis a hierarchical image-based 2D talking head model is presented, together with robust automatic and semi-automatic animation techniques, and a novel perceptual method for evaluating visual speech based on the McGurk effect. The novelty of the hierarchical facial model stems from the fact that sub-facial areas are modelled individually. To produce a facial animation, animations for a set of chosen facial areas are first produced, either by key-framing sub-facial parameter values or by using a continuous input speech signal, and then combined into a full facial output. Modelling hierarchically has several attractive qualities. It isolates variation in sub-facial regions from the rest of the face, and therefore provides a high degree of control over different facial parts along with meaningful image-based animation parameters. The automatic synthesis of animations may be achieved using speech not originally included in the training set. The model is also able to automatically animate pauses, hesitations and non-verbal (or non-speech-related) sounds and actions. To automatically produce visual speech, two novel analysis and synthesis methods are proposed. The first utilises a Speech-Appearance Model (SAM), and the second uses a Hidden Markov Coarticulation Model (HMCM) based on a Hidden Markov Model (HMM). To evaluate synthesised animations (irrespective of whether they are rendered semi-automatically or using speech), a new perceptual analysis approach based on the McGurk effect is proposed. This measure provides an unbiased and quantitative method for evaluating talking-head visual-speech quality and overall perceptual realism. A combination of this new approach and other objective and perceptual evaluation techniques is employed for a thorough evaluation of hierarchical model animations.
    EThOS - Electronic Theses Online Service, United Kingdom
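The hierarchical composition step, animating sub-facial regions independently and then combining them into a full facial output, can be illustrated with a minimal mask-based blend. This sketch is a loose stand-in for the thesis's method, which operates on a learned image-based model; the names, the masks and the constant "frames" are invented.

```python
import numpy as np

def combine_subfacial(region_frames, masks):
    """Blend independently animated sub-facial regions into one face image.
    region_frames: {name: (H, W) image for that region's current frame}
    masks:         {name: (H, W) weights; at each pixel the used weights sum to 1}
    Each region (mouth, eyes, ...) is animated on its own, then the per-region
    frames are combined into the full facial output."""
    out = np.zeros_like(next(iter(region_frames.values())), dtype=float)
    for name, frame in region_frames.items():
        out += masks[name] * frame
    return out

# Toy example: two regions covering complementary halves of a 4x4 "face".
H = W = 4
frames = {"mouth": np.full((H, W), 2.0), "eyes": np.full((H, W), 5.0)}
masks = {"mouth": np.zeros((H, W)), "eyes": np.zeros((H, W))}
masks["mouth"][2:, :] = 1.0   # lower half of the face
masks["eyes"][:2, :] = 1.0    # upper half of the face
face = combine_subfacial(frames, masks)
```

In practice the masks would feather smoothly at region borders, and each region's frame would come from key-framed parameters or from the speech-driven models (SAM, HMCM) named in the abstract.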

    A new method for generic three dimensional human face modelling for emotional bio-robots

    Existing 3D human face modelling methods are confronted with difficulties in applying flexible control over all facial features and in generating a large number of different face models. The gap between the existing methods and the requirements of emotional bio-robot applications urges the creation of a generic 3D human face model. This thesis focuses on proposing and developing two new methods involved in the research of emotional bio-robots: face detection in complex background images based on a skin colour model, and establishment of a generic 3D human face model based on NURBS. The contributions of this thesis are:
    A new skin-colour-based face detection method has been proposed and developed. The method consists of a skin colour model for detecting skin regions and geometric rules for distinguishing faces among the detected regions. Compared with previous methods, it achieved better results: a detection rate of 86.15% and a detection speed of 0.4-1.2 seconds, without any training datasets.
    A generic 3D human face modelling method has been proposed and developed. This generic parametric face model offers flexible control over all facial features and can generate various face models for different applications. It includes: the segmentation of a human face into 21 surface features bounded by 34 boundary curves, a feature-based segmentation that enables the independent manipulation of different geometric regions of the human face; and the NURBS curve face model and NURBS surface face model, both built using cubic NURBS reverse computation, whose elements can be manipulated to change the appearance of the models through parameters obtained by that reverse computation.
    A new 3D human face modelling method has been proposed and implemented based on bi-cubic NURBS, through analysing the characteristic features and boundary conditions of NURBS techniques. This model can be manipulated through control points on the NURBS facial features to build specific face models for any kind of appearance and to simulate dynamic facial expressions for applications such as emotional bio-robots, aesthetic surgery, films and games, and crime investigation and prevention.
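The NURBS machinery underlying this model, the Cox-de Boor basis recursion and the rational weighting of control points, can be sketched for a single cubic curve. This is standard NURBS evaluation, not the thesis's reverse-computation procedure; the function names and the toy control polygon are invented.

```python
import numpy as np

def bspline_basis(i, p, u, knots):
    """Cox-de Boor recursion for the B-spline basis function N_{i,p}(u)."""
    if p == 0:
        return 1.0 if knots[i] <= u < knots[i + 1] else 0.0
    left = right = 0.0
    if knots[i + p] != knots[i]:
        left = ((u - knots[i]) / (knots[i + p] - knots[i])
                * bspline_basis(i, p - 1, u, knots))
    if knots[i + p + 1] != knots[i + 1]:
        right = ((knots[i + p + 1] - u) / (knots[i + p + 1] - knots[i + 1])
                 * bspline_basis(i + 1, p - 1, u, knots))
    return left + right

def nurbs_point(u, ctrl, weights, knots, p=3):
    """Evaluate a degree-p NURBS curve:
    C(u) = sum_i N_{i,p}(u) w_i P_i / sum_i N_{i,p}(u) w_i.
    Moving a control point P_i or raising its weight w_i pulls the curve
    towards it -- the kind of local control the face model relies on."""
    N = np.array([bspline_basis(i, p, u, knots) for i in range(len(ctrl))])
    wN = N * weights
    return (wN[:, None] * ctrl).sum(0) / wN.sum()

# Toy example: a clamped cubic with 4 control points (a single Bezier span).
ctrl = np.array([[0., 0.], [1., 2.], [2., 2.], [3., 0.]])
weights = np.ones(4)
knots = [0., 0., 0., 0., 1., 1., 1., 1.]
start = nurbs_point(0.0, ctrl, weights, knots)   # clamped curve starts at ctrl[0]
mid_pt = nurbs_point(0.5, ctrl, weights, knots)
```

A bi-cubic NURBS surface, as used for the face patches, applies the same basis in two parameters over a grid of weighted control points.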

    Síntesis Audiovisual Realista Personalizable (Personalizable Realistic Audiovisual Synthesis)

    A unified framework is presented for realistic, personalizable audiovisual synthesis and analysis of audiovisual sequences of talking faces and of visual sequences of sign language, in a domestic setting. The former offers animation fully synchronized to a text or speech source; the latter uses hand finger-spelling of words. Its personalization capabilities ease the creation of audiovisual sequences by non-expert users. Possible applications range from realistic virtual characters for natural interaction or video games, to very-low-bandwidth video conferencing and visual telephony for people with hearing impairments, including aids to pronunciation and communication for that same group. The system can process long sequences with very low resource consumption, especially in storage, thanks to a new incremental computation procedure for the singular value decomposition with update of the mean information. This procedure is complemented by three others: decremental, split and composition.
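An incremental SVD of the kind this abstract relies on appends new frames to an existing thin decomposition without refactoring the whole data matrix. The sketch below shows a standard (Brand-style) single-column update; it is not the thesis's exact procedure, and the mean-update part of the contribution is omitted here for brevity (the data is assumed pre-centred). All names are invented.

```python
import numpy as np

def svd_append_column(U, s, Vt, c):
    """Update the thin SVD X = U diag(s) Vt when a column c is appended.
    Returns the thin SVD of [X  c]; the rank may grow by one.
    Only a small (k+1)x(k+1) SVD is computed, not one of the full matrix."""
    m = U.T @ c                    # component of c inside the current subspace
    p = c - U @ m                  # residual orthogonal to the subspace
    pn = np.linalg.norm(p)
    if pn > 1e-12:                 # c adds a new direction: rank grows
        P = p / pn
        K = np.block([[np.diag(s), m[:, None]],
                      [np.zeros((1, len(s))), np.array([[pn]])]])
        Uk, sk, Vtk = np.linalg.svd(K)
        U2 = np.hstack([U, P[:, None]]) @ Uk
    else:                          # c already lies in the current subspace
        K = np.hstack([np.diag(s), m[:, None]])
        Uk, sk, Vtk = np.linalg.svd(K, full_matrices=False)
        U2 = U @ Uk
    n = Vt.shape[1]
    V_ext = np.block([[Vt.T, np.zeros((n, 1))],
                      [np.zeros((1, Vt.shape[0])), np.ones((1, 1))]])
    Vt2 = (V_ext @ Vtk.T).T        # right factor extended to the new column
    return U2, sk, Vt2

# Toy check: the updated factors reproduce the matrix with the column appended.
rng = np.random.default_rng(2)
X = rng.normal(size=(5, 3))
U, s, Vt = np.linalg.svd(X, full_matrices=False)
c = rng.normal(size=5)
U2, s2, Vt2 = svd_append_column(U, s, Vt, c)
X2 = np.hstack([X, c[:, None]])
```

Processing video frame by frame this way keeps memory proportional to the retained rank rather than to the sequence length, which is the storage saving the abstract highlights.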
