104,354 research outputs found

    HeadOn: Real-time Reenactment of Human Portrait Videos

We propose HeadOn, the first real-time source-to-target reenactment approach for complete human portrait videos that enables transfer of torso and head motion, face expression, and eye gaze. Given a short RGB-D video of the target actor, we automatically construct a personalized geometry proxy that embeds a parametric head, eye, and kinematic torso model. A novel real-time reenactment algorithm employs this proxy to photo-realistically map the captured motion from the source actor to the target actor. On top of the coarse geometric proxy, we propose a video-based rendering technique that composites the modified target portrait video via view- and pose-dependent texturing, and creates photo-realistic imagery of the target actor under novel torso and head poses, facial expressions, and gaze directions. To this end, we propose a robust method for tracking the face and torso of the source actor. We extensively evaluate our approach and show that it enables significantly greater flexibility in creating realistic reenacted output videos.
Comment: Video: https://www.youtube.com/watch?v=7Dg49wv2c_g. Presented at Siggraph '18.
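As a rough illustration of the view- and pose-dependent texturing step, the sketch below composites an output texture as a soft blend of the captured target-actor frames whose head poses lie closest to the requested pose. The pose descriptor, the exponential weighting, and all names are assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np

def blend_textures(target_pose, frame_poses, frame_textures, k=4, beta=8.0):
    """Pose-dependent texture blending (illustrative sketch).

    target_pose:    (d,) pose descriptor, e.g. head rotation + gaze angles
    frame_poses:    (n, d) descriptors of the captured target-actor frames
    frame_textures: (n, H, W, 3) per-frame textures in a shared UV layout
    """
    dist = np.linalg.norm(frame_poses - target_pose, axis=1)
    nearest = np.argsort(dist)[:k]            # k frames with closest pose
    w = np.exp(-beta * dist[nearest])         # closer pose -> larger weight
    w /= w.sum()
    # weighted sum over the k selected textures -> (H, W, 3)
    return np.tensordot(w, frame_textures[nearest], axes=1)
```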

3D visualization of cadastre: assessing the suitability of visual variables and enhancement techniques in the 3D model of condominium property units

3D visualization is widely used in GIS (geographic information system) and CAD (computer-aided design) applications. It has also been introduced in cadastre studies to better communicate situations where property units vertically stretch over or cover part of a land parcel. Researchers believe that 3D visualization could give viewers a more intuitive perception and can demonstrate overlapping property units in condominiums unambiguously. However, 3D visualization poses many challenges compared with 2D visualization. Many cadastre researchers have adopted 3D visualization without thoroughly investigating the potential users, the visual tasks involved in decision-making, and the appropriateness of their representation design.
Neither designers nor users may be aware of the risk of producing an inadequate 3D visualization, especially in an era when 3D visualization is still relatively novel in the cadastre domain. With the general aim of improving the 3D visualization of cadastre data, this dissertation addresses the design of the 3D cadastre model from a graphic semiotics viewpoint, covering visual variables and enhancement techniques. The research questions are, first, how suitable are the visual variables and enhancement techniques in the 3D cadastre model for supporting the intended users' decision-making goal of delimiting condominium property units, and second, how do the perceptual properties of visual variables in 3D visualization compare with those in 2D visualization? The dissertation first identifies a theoretical framework for the interpretation of visual variables in 3D visualization, together with cadastre-related knowledge, through a literature review. We then carry out a preliminary evaluation of the feasibility of visual variables and enhancement techniques in the form of an expert-group review. Building on that evaluation, the research follows a hypothetico-deductive approach, establishing a list of hypotheses about the suitability of visual variables and enhancement techniques in a cartographic representation of condominium property units, to be validated by empirical tests. The evaluation is based on a usability specification with three measurements: effectiveness, efficiency, and preference. Several empirical tests were conducted with cadastral users in the form of face-to-face interviews and online questionnaires, followed by statistical analysis. Size, shape, brightness, saturation, hue, orientation, texture, and transparency are the most discussed and used visual variables in existing cartographic research and implementations; these eight visual variables were therefore included in the tests. Their perceptual properties, as exhibited in the empirical tests with concrete 3D models, are compared with those in 2D visualization, derived from a literature-based synthesis. Three enhancement techniques, labeling, 3D explosion, and highlighting, were tested as well. There are three main outcomes. First, we established a list of visual tasks, adapted to notaries, for delimiting property units in the context of 3D visualization of condominium cadastres. Second, we describe the suitability of the eight visual variables and three enhancement techniques for these delimiting tasks, based on the usability specification. For example, brightness only performs well in helping users distinguish private and common parts, while color hue and saturation are both effective and preferred. The third outcome is a statement of the differences in the perceptual properties of visual variables between 3D and 2D visualization. For example, according to Bertin's (1983) definition, orientation is associative and selective in 2D, yet it loses both properties in a 3D visualization. In addition, 3D visualization affects the performance of brightness, making it only marginally dissociative and selective.
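As a small aside, the three usability measures named above can be pictured with a minimal aggregation sketch; the trial-record format and function name are assumptions, not the dissertation's instrumentation.

```python
from statistics import mean

# One trial = (answered_correctly, seconds_taken, preference_rating_1_to_5)
Trial = tuple[bool, float, int]

def usability_summary(trials: list[Trial]) -> dict:
    """Aggregate raw trials into effectiveness, efficiency, and preference."""
    correct = [t for t in trials if t[0]]
    return {
        # share of tasks answered correctly
        "effectiveness": len(correct) / len(trials),
        # mean completion time on the correctly answered tasks
        "efficiency": mean(t[1] for t in correct) if correct else None,
        # mean satisfaction rating across all trials
        "preference": mean(t[2] for t in trials),
    }
```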

    Text-based Editing of Talking-head Video

Editing talking-head video to change the speech content or to remove filler words is challenging. We propose a novel method to edit talking-head video based on its transcript, producing a realistic output video in which the dialogue of the speaker has been modified while maintaining a seamless audio-visual flow (i.e. no jump cuts). Our method automatically annotates an input talking-head video with phonemes, visemes, 3D face pose and geometry, reflectance, expression, and scene illumination per frame. To edit a video, the user only has to edit the transcript; an optimization strategy then chooses segments of the input corpus as base material. The annotated parameters corresponding to the selected segments are seamlessly stitched together and used to produce an intermediate video representation in which the lower half of the face is rendered with a parametric face model. Finally, a recurrent video generation network transforms this representation into a photorealistic video that matches the edited transcript. We demonstrate a large variety of edits, such as the addition, removal, and alteration of words, as well as convincing language translation and full-sentence synthesis.
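The segment-selection step can be conveyed with a toy version: cover the phoneme sequence of the edited transcript with the longest contiguous runs found in the annotated input corpus. The paper describes an optimization strategy (e.g. over viseme similarity); this greedy sketch with assumed names only illustrates the idea.

```python
from typing import List, Tuple

def select_segments(edit_phones: List[str],
                    corpus_phones: List[str]) -> List[Tuple[int, int]]:
    """Greedily cover edit_phones with contiguous runs of corpus_phones.

    Returns (start, end) index pairs into corpus_phones, end exclusive.
    """
    segments, i = [], 0
    while i < len(edit_phones):
        best = None                                    # (run_length, start)
        for s in range(len(corpus_phones)):
            n = 0
            while (i + n < len(edit_phones) and s + n < len(corpus_phones)
                   and corpus_phones[s + n] == edit_phones[i + n]):
                n += 1
            if n and (best is None or n > best[0]):
                best = (n, s)
        if best is None:                # phoneme never occurs in the corpus
            raise ValueError(f"no corpus match for {edit_phones[i]!r}")
        n, s = best
        segments.append((s, s + n))     # reuse this run of corpus frames
        i += n
    return segments
```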

    Mean value coordinates–based caricature and expression synthesis

We present a novel method for caricature synthesis based on mean value coordinates (MVC). Our method can be applied to any single frontal face image, learning from a specified caricature face pair, for frontal and 3D caricature synthesis. The technique requires only one or a small number of exemplar pairs together with a training set of natural frontal face images, and the system can transfer the style of the exemplar pair across individuals. Further exaggeration can be achieved in a controllable way. We also apply our method to facial expression transfer, interpolation, and exaggeration, which are applications of expression editing. Additionally, we extend the approach to 3D caricature synthesis based on the 3D version of MVC. Experiments demonstrate that the transferred expressions are credible and that the resulting caricatures can be characterized and recognized.
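For readers unfamiliar with mean value coordinates, a minimal 2D sketch of Floater's MVC weights for a point inside a polygon is given below; the caricature and expression pipeline builds on these coordinates, but this generic function is not the authors' code.

```python
import numpy as np

def mean_value_coordinates(x, verts):
    """MVC weights of point x w.r.t. a closed planar polygon.

    verts: (n, 2) vertices in order; x must lie strictly inside the
    polygon (boundary cases are not handled in this sketch).
    Returns lam with lam @ verts == x and lam.sum() == 1.
    """
    d = verts - x                      # vectors from x to each vertex
    r = np.linalg.norm(d, axis=1)      # distances ||v_i - x||
    n = len(verts)
    alpha = np.empty(n)                # signed angle between d_i and d_{i+1}
    for i in range(n):
        j = (i + 1) % n
        cross = d[i, 0] * d[j, 1] - d[i, 1] * d[j, 0]
        alpha[i] = np.arctan2(cross, d[i] @ d[j])
    t = np.tan(alpha / 2.0)
    # w_i = (tan(alpha_{i-1}/2) + tan(alpha_i/2)) / r_i
    w = (np.roll(t, 1) + t) / r
    return w / w.sum()
```

As a quick sanity check, for the unit square and an interior point p, `mean_value_coordinates(p, square) @ square` reproduces p.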

    A survey on mouth modeling and analysis for Sign Language recognition

© 2015 IEEE. Around 70 million Deaf people worldwide use Sign Languages (SLs) as their native languages. At the same time, they have limited reading/writing skills in the spoken language. This puts them at a severe disadvantage in many contexts, including education, work, and usage of computers and the Internet. Automatic Sign Language Recognition (ASLR) can support the Deaf in many ways, e.g. by enabling the development of systems for Human-Computer Interaction in SL and translation between sign and spoken language. Research in ASLR usually revolves around automatic understanding of manual signs. Recently, the ASLR research community has started to appreciate the importance of non-manuals, since they are related to the lexical meaning of a sign, the syntax, and the prosody. Non-manuals include body and head pose, movement of the eyebrows and the eyes, as well as blinks and squints. Arguably, the mouth is one of the most involved parts of the face in non-manuals. Mouth actions related to ASLR can be either mouthings, i.e. visual syllables articulated with the mouth while signing, or non-verbal mouth gestures. Both are very important in ASLR. In this paper, we present the first survey on mouth non-manuals in ASLR. We start by showing why mouth motion is important in SL and reviewing the relevant techniques that exist within ASLR. Since limited research has been conducted on automatic analysis of mouth motion in the context of ASLR, we proceed by surveying relevant techniques from the areas of automatic mouth expression and visual speech recognition which can be applied to the task. Finally, we conclude by presenting the challenges and potential of automatic analysis of mouth motion in the context of ASLR.

    Morphable Face Models - An Open Framework

In this paper, we present a novel open-source pipeline for face registration based on Gaussian processes, together with an application to face image analysis. Non-rigid registration of faces is important for many applications in computer vision, such as the construction of 3D Morphable Face Models (3DMMs). Gaussian Process Morphable Models (GPMMs) unify a variety of non-rigid deformation models, with B-splines and PCA models as examples. GPMMs separate problem-specific requirements from the registration algorithm by incorporating domain-specific adaptations as a prior model. The novelties of this paper are the following: (i) We present a strategy and modeling technique for face registration that considers symmetry, multi-scale structure, and spatially varying details; the registration is applied to neutral faces and facial expressions. (ii) We release an open-source software framework for registration and model building, demonstrated on the publicly available BU3D-FE database. The released pipeline also contains an implementation of Analysis-by-Synthesis model adaptation for 2D face images, tested on the Multi-PIE and LFW databases. This enables the community to reproduce, evaluate, and compare the individual steps from registration to model building and 3D/2D model fitting. (iii) Along with the framework release, we publish a new version of the Basel Face Model (BFM-2017) with an improved age distribution and an additional facial expression model.
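The GPMM idea of a prior over deformation fields can be sketched compactly: an RBF kernel couples nearby template points, its eigendecomposition yields a PCA-like low-rank basis, and sampling coefficients produces plausible deformations of the template. The kernel choice and all parameter names below are assumptions for illustration, not the released framework's API.

```python
import numpy as np

def sample_gp_deformation(points, sigma=0.1, scale=0.02, rank=20, rng=None):
    """Draw one random smooth deformation of a template point set.

    points: (n, 3) template vertices; sigma controls how far correlations
    reach, scale the deformation magnitude, rank the low-rank truncation.
    """
    rng = rng or np.random.default_rng(0)
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    K = scale**2 * np.exp(-d2 / (2 * sigma**2))     # RBF covariance matrix
    evals, evecs = np.linalg.eigh(K)                # ascending eigenvalues
    evals = evals[::-1][:rank]                      # keep the top `rank`
    evecs = evecs[:, ::-1][:, :rank]
    # one independent GP sample per spatial coordinate (x, y, z)
    coeffs = rng.standard_normal((rank, points.shape[1]))
    u = evecs @ (np.sqrt(np.maximum(evals, 0))[:, None] * coeffs)
    return points + u                               # deformed template
```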