EgoFace: Egocentric Face Performance Capture and Videorealistic Reenactment
Face performance capture and reenactment techniques use multiple cameras and sensors, positioned at a distance from the face or mounted on heavy wearable devices. This limits their applications in mobile and outdoor environments. We present EgoFace, a radically new lightweight setup for face performance capture and front-view videorealistic reenactment using a single egocentric RGB camera. Our lightweight setup allows operations in uncontrolled environments, and lends itself to telepresence applications such as video-conferencing from dynamic environments. The input image is projected into a low-dimensional latent space of the facial expression parameters. Through careful adversarial training of the parameter-space synthetic rendering, a videorealistic animation is produced. Our problem is challenging as the human visual system is sensitive to the smallest face irregularities that could occur in the final results. This sensitivity is even stronger for video results. Our solution is trained in a pre-processing stage, in a supervised manner without manual annotations. EgoFace captures a wide variety of facial expressions, including mouth movements and asymmetrical expressions. It works under varying illumination, backgrounds, and movements, handles people of different ethnicities, and can operate in real time.
Shape-from-intrinsic operator
Shape-from-X is an important class of problems in the fields of geometry
processing, computer graphics, and vision, attempting to recover the structure
of a shape from some observations. In this paper, we formulate the problem of
shape-from-operator (SfO), recovering an embedding of a mesh from intrinsic
differential operators defined on the mesh. Particularly interesting instances
of our SfO problem include synthesis of shape analogies, shape-from-Laplacian
reconstruction, and shape exaggeration. Numerically, we approach the SfO
problem by splitting it into two optimization sub-problems that are applied in
an alternating scheme: metric-from-operator (reconstruction of the discrete
metric from the intrinsic operator) and embedding-from-metric (finding a shape
embedding that would realize a given metric, a setting of the multidimensional
scaling problem).
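The embedding-from-metric step mentioned in the abstract can be illustrated with classical multidimensional scaling. The sketch below is not the authors' solver: it assumes a full matrix of pairwise Euclidean distances is available, whereas the paper works with a discrete metric on a mesh. The function names `classical_mds` and `pairwise_distances` are chosen here for illustration only.

```python
import numpy as np


def pairwise_distances(points):
    """Return the (n, n) matrix of Euclidean distances between rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def classical_mds(dist, dim=3):
    """Embed n points in `dim` dimensions from an (n, n) distance matrix."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-centre the squared distances to obtain a Gram (inner-product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # Keep the `dim` largest eigenpairs; clip tiny negative eigenvalues that come
    # from rounding or from a metric that is not exactly Euclidean.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dim]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


# Toy input (an assumption for the demo): four vertices of a corner tetrahedron.
vertices = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
embedding = classical_mds(pairwise_distances(vertices), dim=3)
```

The recovered `embedding` matches the input only up to a rigid motion or reflection, so the meaningful comparison is between the two distance matrices rather than between the coordinates themselves.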
Real-Time Cleaning and Refinement of Facial Animation Signals
With the increasing demand for real-time animated 3D content in the
entertainment industry and beyond, performance-based animation has garnered
interest among both academic and industrial communities. While recent solutions
for motion-capture animation have achieved impressive results, handmade
post-processing is often needed, as the generated animations often contain
artifacts. Existing real-time motion capture solutions have opted for standard
signal processing methods to strengthen temporal coherence of the resulting
animations and remove inaccuracies. While these methods produce smooth results,
they inherently filter out part of the dynamics of facial motion, such as high
frequency transient movements. In this work, we propose a real-time animation
refining system that preserves -- or even restores -- the natural dynamics of
facial motions. To do so, we leverage an off-the-shelf recurrent neural network
architecture that learns proper facial dynamics patterns on clean animation
data. We parametrize our system using the temporal derivatives of the signal,
enabling our network to process animations at any framerate. Qualitative
results show that our system is able to retrieve natural motion signals from
noisy or degraded input animation.
Comment: ICGSP 2020: Proceedings of the 2020 The 4th International Conference
on Graphics and Signal Processing
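The idea of parametrizing a signal by its temporal derivatives, so that the representation does not depend on the frame rate, can be sketched with finite differences. This is only an illustration under simple assumptions (first-order differences, explicit timestamps in seconds); the abstract does not specify how the authors compute the derivatives, and the helper names `to_derivatives` and `from_derivatives` are hypothetical.

```python
import numpy as np


def to_derivatives(values, times):
    """Per-second rate of change between consecutive frames.

    values: (frames, channels) animation signal; times: (frames,) timestamps in seconds.
    """
    values = np.asarray(values, dtype=float)
    times = np.asarray(times, dtype=float)
    return np.diff(values, axis=0) / np.diff(times)[:, None]


def from_derivatives(start, derivs, times):
    """Integrate derivatives back into a signal, starting from the first frame."""
    steps = derivs * np.diff(np.asarray(times, dtype=float))[:, None]
    return np.vstack([start, start + np.cumsum(steps, axis=0)])


# The same linear motion (2 units per second) sampled at 30 fps and at 60 fps.
times_30 = np.arange(0, 31) / 30.0
times_60 = np.arange(0, 61) / 60.0
signal_30 = (2.0 * times_30)[:, None]
signal_60 = (2.0 * times_60)[:, None]

derivs_30 = to_derivatives(signal_30, times_30)
derivs_60 = to_derivatives(signal_60, times_60)
rebuilt_30 = from_derivatives(signal_30[0], derivs_30, times_30)
```

Both sampling rates produce the same derivative value for the same motion, which is the property a network trained on derivatives could rely on when the input frame rate changes.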
Deep Insights of Deepfake Technology : A Review
Advances in computer vision and deep learning have introduced a new technique
that lets anyone create highly realistic but fake videos and images, and even
manipulate voices. This technology is widely known as Deepfake technology.
Although it seems an interesting technique for making fake videos or images of
things or individuals, such content could spread as misinformation via the
internet. Deepfake content could be dangerous for individuals as well as for
communities, organizations, countries, religions, etc. Because Deepfake content
creation involves a high level of expertise combined with several deep learning
algorithms, the result seems almost real and genuine and is difficult to
differentiate. In this paper, a wide range of articles has been examined to
understand Deepfake technology more extensively. We examined several articles
to find insights such as what Deepfake is, who is responsible for it, whether
Deepfake has any benefits, and what the challenges of this technology are. We
also examined several creation and detection techniques. Our study revealed
that although Deepfake is a threat to our societies, proper measures and strict
regulations could prevent this.
CNN-based Real-time Dense Face Reconstruction with Inverse-rendered Photo-realistic Face Images
Powered by convolutional neural networks (CNNs), CNN-based face
reconstruction has recently shown promising performance in reconstructing
detailed face shape from 2D face images. The success of CNN-based methods
relies on a large amount of labeled data. The state of the art synthesizes such
data using a coarse morphable face model, which however has difficulty
generating detailed photo-realistic images of faces (with wrinkles). This paper
presents a novel face data generation method. Specifically, we render a large
number of photo-realistic face images with different attributes based on
inverse rendering. Furthermore, we construct a fine-detailed face image dataset
by transferring different scales of details from one image to another. We also
construct a large number of video-type adjacent frame pairs by simulating the
distribution of real video data. With these nicely constructed datasets, we
propose a coarse-to-fine learning framework consisting of three convolutional
networks. The networks are trained for real-time detailed 3D face
reconstruction from monocular video as well as from a single image. Extensive
experimental results demonstrate that our framework can produce high-quality
reconstruction with much less computation time than the
state of the art. Moreover, our method is robust to pose, expression and
lighting due to the diversity of data.
Comment: Accepted by IEEE Transactions on Pattern Analysis and Machine
Intelligence, 201
FaceVR: Real-Time Facial Reenactment and Eye Gaze Control in Virtual Reality
We introduce FaceVR, a novel method for gaze-aware facial reenactment in the Virtual Reality (VR) context. The key component of FaceVR is a robust algorithm to perform real-time facial motion capture of an actor who is wearing a head-mounted display (HMD), as well as a new data-driven approach for eye tracking from monocular videos. In addition to these face reconstruction components, FaceVR incorporates photo-realistic re-rendering in real time, thus allowing artificial modifications of face and eye appearances. For instance, we can alter facial expressions, change gaze directions, or remove the VR goggles in realistic re-renderings. In a live setup with a source and a target actor, we apply these newly-introduced algorithmic components. We assume that the source actor is wearing a VR device, and we capture his facial expressions and eye movement in real-time. For the target video, we mimic a similar tracking process; however, we use the source input to drive the animations of the target video, thus enabling gaze-aware facial reenactment. To render the modified target video on a stereo display, we augment our capture and reconstruction process with stereo data. In the end, FaceVR produces compelling results for a variety of applications, such as gaze-aware facial reenactment, reenactment in virtual reality, removal of VR goggles, and re-targeting of somebody's gaze direction in a video conferencing call
Dynamic 3D Avatar Creation from Hand-held Video Input
We present a complete pipeline for creating fully rigged, personalized 3D facial avatars from hand-held video. Our system faithfully recovers facial expression dynamics of the user by adapting a blendshape template to an image sequence of recorded expressions using an optimization that integrates feature tracking, optical flow, and shape from shading. Fine-scale details such as wrinkles are captured separately in normal maps and ambient occlusion maps. From this user- and expression-specific data, we learn a regressor for on-the-fly detail synthesis during animation to enhance the perceptual realism of the avatars. Our system demonstrates that the use of appropriate reconstruction priors yields compelling face rigs even with a minimalistic acquisition system and limited user assistance. This facilitates a range of new applications in computer animation and consumer-level online communication based on personalized avatars. We present realtime application demos to validate our method
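A blendshape rig of the kind mentioned in the abstract is commonly modelled as a neutral shape plus a weighted sum of per-expression offsets. The sketch below shows that linear model and a plain least-squares fit of the weights to observed vertex positions. It is not the authors' optimization, which integrates feature tracking, optical flow, and shape from shading; the array shapes, the random toy data, and the names `blendshape_eval` and `fit_weights` are assumptions made for illustration.

```python
import numpy as np


def blendshape_eval(neutral, deltas, weights):
    """Evaluate a linear blendshape model.

    neutral: (V, 3) rest-pose vertices; deltas: (K, V, 3) per-blendshape offsets;
    weights: (K,) activation of each blendshape.
    """
    return neutral + np.tensordot(weights, deltas, axes=1)


def fit_weights(neutral, deltas, observed):
    """Least-squares estimate of blendshape weights for observed vertices.

    The result is clipped to [0, 1] afterwards; this is a simplification and
    not a properly constrained solve.
    """
    k = deltas.shape[0]
    basis = deltas.reshape(k, -1).T            # (3V, K) design matrix
    target = (observed - neutral).ravel()      # (3V,) offset from the rest pose
    weights, *_ = np.linalg.lstsq(basis, target, rcond=None)
    return np.clip(weights, 0.0, 1.0)


# Toy data (assumed): 5 vertices, 3 blendshapes, known ground-truth weights.
rng = np.random.default_rng(0)
neutral = rng.normal(size=(5, 3))
deltas = rng.normal(size=(3, 5, 3))
true_weights = np.array([0.2, 0.7, 0.0])

observed = blendshape_eval(neutral, deltas, true_weights)
estimated = fit_weights(neutral, deltas, observed)
```

With noise-free observations the estimated weights match the ground truth; with real image data the fit would have to work from 2D evidence and include regularization, which this sketch leaves out.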
Captura e Reprodução de Expressões Faciais (Capture and Reproduction of Facial Expressions)
In the dubbing of audio-visual productions, only the lines are translated, and there is a discrepancy between the audio and what the character says (movements of the face, mouth, and lips), which results in poor speech comprehension and unrealistic dubs. In recent years, methods of capturing and reproducing facial movements have been developed that capture the facial movements of a dubbing actor and reproduce these movements on the face of a previously recorded actor. This document contains the analysis and evaluation of the state of the art in methods of capture and reproduction of facial movements, and the description of a real-time capture and reproduction solution, using a normal camera, developed to address existing problems with traditional dubbing. The implemented solution was evaluated through questionnaires, showing a quality that is still inferior to traditional dubbing.