HeadOn: Real-time Reenactment of Human Portrait Videos
We propose HeadOn, the first real-time source-to-target reenactment approach
for complete human portrait videos that enables transfer of torso and head
motion, face expression, and eye gaze. Given a short RGB-D video of the target
actor, we automatically construct a personalized geometry proxy that embeds a
parametric head, eye, and kinematic torso model. A novel real-time reenactment
algorithm employs this proxy to photo-realistically map the captured motion
from the source actor to the target actor. On top of the coarse geometric
proxy, we propose a video-based rendering technique that composites the
modified target portrait video via view- and pose-dependent texturing, and
creates photo-realistic imagery of the target actor under novel torso and head
poses, facial expressions, and gaze directions. To this end, we propose a
robust tracking of the face and torso of the source actor. We extensively
evaluate our approach and demonstrate significantly greater flexibility in creating realistic reenacted output videos.
Comment: Video: https://www.youtube.com/watch?v=7Dg49wv2c_g Presented at Siggraph'1
Object segmentation from low depth of field images and video sequences
This thesis addresses the problem of autonomous object segmentation. To do so,
the proposed segmentation method uses some prior information, namely that the
image to be segmented will have a low depth of field and that the object of interest
will be more in focus than the background. To differentiate the object from the
background scene, a multiscale wavelet-based focus assessment is proposed. The focus
assessment is used to generate a focus intensity map, and a sparse-fields level-set
implementation of active contours is used to segment the object of interest. The
initial contour is generated using a grid-based technique.
The method is extended to segment low depth of field video sequences, with
each successive initialisation for the active contours generated from the binary dilation
of the previous frame's segmentation. Experimental results show that good segmentations
can be achieved with a variety of different images, video sequences, and
objects, with no user interaction or input.
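As a rough illustration of the focus-map idea above, the sketch below computes a per-pixel focus measure from multiscale high-frequency energy. It is only a hedged stand-in for the thesis's wavelet assessment: band-pass responses are approximated by differences against box blurs at several radii, and the function names and radii are invented for illustration.

```python
import numpy as np

def box_blur(img, radius):
    """Mean filter over a (2*radius+1)^2 window, edge-padded."""
    h, w = img.shape
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros((h, w), dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def focus_intensity_map(gray, radii=(1, 2, 4)):
    """Toy focus measure: summed squared band-pass energy over scales.

    The difference between the image and its blurred version keeps
    high-frequency content, which is strong in focused regions and
    weak in defocused ones; energies over several radii approximate
    a multiscale (wavelet-like) assessment.
    """
    gray = gray.astype(float)
    focus = np.zeros_like(gray)
    for r in radii:
        focus += (gray - box_blur(gray, r)) ** 2
    m = focus.max()
    return focus / m if m > 0 else focus
```

A focus intensity map like this could then seed an active-contour segmentation; the thesis's actual wavelet decomposition and sparse-fields level set are more sophisticated.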
The method is applied to two different areas. In the first, the segmentations
are used to automatically generate trimaps for use with matting algorithms. In the
second, the method is used as part of a shape-from-silhouettes 3D object reconstruction
system, replacing the need for a constrained background when generating
silhouettes. In addition, avoiding thresholding for the silhouette segmentation
allows objects with dark components or areas to be segmented accurately.
Some examples of 3D models generated using silhouettes are shown.
Artistic minimal rendering with lines and blocks
Many non-photorealistic rendering techniques exist to produce artistic effects from given images. Inspired by various artists, interesting effects can be produced using a minimal rendering, where "minimal" refers both to the number of tones and to the number and complexity of the primitives used for rendering. Our method is based on various computer vision techniques, and uses a combination of refined lines and blocks (potentially simplified), as well as a small number of tones, to produce an abstracted artistic rendering that retains sufficient elements of the original image. We also consider a variety of methods to produce different artistic styles, such as colour and two-tone drawings, and use semantic information to improve renderings of faces. By changing a few intuitive parameters, a wide range of visually pleasing results can be produced. Our method is fully automatic. We demonstrate its effectiveness with extensive experiments and a user study.
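The "small number of tones" idea can be sketched as a simple posterization: grayscale values are binned into a few uniform bands and replaced by each band's midpoint. This is only an illustrative assumption; the paper's actual tone selection and line/block primitives are more refined.

```python
import numpy as np

def quantize_tones(gray, n_tones=3):
    """Reduce a [0, 1] grayscale image to n_tones flat tones.

    Pixels are binned into n_tones equal-width bands and each pixel
    is replaced by the midpoint value of its band.
    """
    edges = np.linspace(0.0, 1.0, n_tones + 1)
    idx = np.clip(np.digitize(gray, edges[1:-1]), 0, n_tones - 1)
    levels = (edges[:-1] + edges[1:]) / 2  # midpoint of each band
    return levels[idx]
```

A minimal rendering would then overlay extracted lines on these flat tonal regions.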
Stochastic Methods for Fine-Grained Image Segmentation and Uncertainty Estimation in Computer Vision
In this dissertation, we exploit concepts of probability theory, stochastic methods and machine learning to address three existing limitations of deep learning-based models for image understanding. First, although convolutional neural networks (CNN) have substantially improved the state of the art in image understanding, conventional CNNs provide segmentation masks that poorly adhere to object boundaries, a critical limitation for many potential applications. Second, training deep learning models requires large amounts of carefully selected and annotated data, but large-scale annotation of image segmentation datasets is often prohibitively expensive. Third, conventional deep learning models also lack the capability of uncertainty estimation, which compromises both decision making and model interpretability. To address these limitations, we introduce the Region Growing Refinement (RGR) algorithm, an unsupervised post-processing algorithm that exploits Monte Carlo sampling and pixel similarities to propagate high-confidence labels into regions of low-confidence classification. The probabilistic Region Growing Refinement (pRGR) provides RGR with a rigorous mathematical foundation that exploits concepts of Bayesian estimation and variance reduction techniques. Experiments demonstrate both the effectiveness of (p)RGR for the refinement of segmentation predictions and its suitability for uncertainty estimation, since its variance estimates obtained in the Monte Carlo iterations are highly correlated with segmentation accuracy. We also introduce FreeLabel, an intuitive open-source web interface that exploits RGR to allow users to obtain high-quality segmentation masks with just a few freehand scribbles, in a matter of seconds. Designed to benefit the computer vision community, FreeLabel can be used for both crowdsourced and private annotation and has a modular structure that can be easily adapted for any image dataset.
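To make the label-propagation idea concrete, here is a toy sketch in the spirit of RGR, not the authors' implementation: in each Monte Carlo iteration a random subset of high-confidence pixels seeds a region growing that claims neighbouring uncertain pixels with similar color, and votes are averaged across iterations. All thresholds and parameter names are assumptions for illustration.

```python
import numpy as np
from collections import deque

def rgr_refine(image, scores, hi=0.8, lo=0.2, sim=0.15,
               n_iters=10, seed_frac=0.5, rng=None):
    """Toy region-growing refinement (illustrative, not the RGR paper's code).

    image:  (H, W, 3) float array of pixel colors.
    scores: (H, W) soft foreground scores in [0, 1].
    Pixels with score > hi (foreground) or < lo (background) act as seeds;
    each iteration grows a random seed subset into similar-colored
    neighbours, and votes are averaged into a refined score map.
    """
    rng = np.random.default_rng(rng)
    h, w = scores.shape
    votes = np.zeros((h, w))
    counts = np.zeros((h, w))
    for _ in range(n_iters):
        label = np.full((h, w), -1, dtype=int)  # -1 = unassigned
        q = deque()
        for mask, val in ((scores > hi, 1), (scores < lo, 0)):
            ys, xs = np.nonzero(mask)
            keep = rng.random(len(ys)) < seed_frac  # Monte Carlo seed subset
            for y, x in zip(ys[keep], xs[keep]):
                label[y, x] = val
                q.append((y, x))
        while q:  # breadth-first growth into similar neighbours
            y, x = q.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and label[ny, nx] == -1:
                    if np.linalg.norm(image[ny, nx] - image[y, x]) < sim:
                        label[ny, nx] = label[y, x]
                        q.append((ny, nx))
        assigned = label >= 0
        votes[assigned] += label[assigned]
        counts[assigned] += 1
    # average votes where any iteration reached the pixel; else keep the score
    return np.where(counts > 0, votes / np.maximum(counts, 1), scores)
```

The per-pixel variance of the votes across iterations would play the role of the uncertainty estimate discussed in the abstract.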
The practical relevance of the methods developed in this dissertation is illustrated through applications in agricultural and healthcare-related domains. We have combined RGR and modern CNNs for fine segmentation of fruit flowers, motivated by the importance of automated bloom intensity estimation for optimization of fruit orchard management and, possibly, automating procedures such as flower thinning and pollination. We also exploited an early version of FreeLabel to annotate novel datasets for segmentation of fruit flowers, which are currently publicly available. Finally, this dissertation also describes work on fine segmentation and gaze estimation for images collected from assisted living environments, with the ultimate goal of assisting geriatricians in evaluating the health status of patients in such facilities.
ClearPhoto - augmented photography
The widespread use of mobile devices has introduced the general public to areas
that were hitherto confined to specialized devices. In general, the smartphone gives
all users the ability to perform multiple tasks, among them taking photographs with the integrated cameras.
Although these devices continuously receive improved cameras, their manufacturers do not take full advantage of their potential, since the operating systems normally offer only simple APIs and applications for shooting. This mobile environment is therefore an ideal setting for developing
applications that help the user obtain a good result when shooting.
In an attempt to provide a set of techniques and tools better suited to the task, this
dissertation presents, as a contribution, a set of tools for mobile devices that provides
real-time information on the composition of the scene before an image is captured.
The proposed solution thus supports a user while capturing a scene with a
mobile device. The user receives multiple suggestions on the composition of the scene, based on rules of photography and other tools useful to photographers. The tools include horizon detection and graphical visualization of the color palette present in the scene being photographed. These tools were evaluated with regard to their implementation on mobile devices and how users assess their usefulness.
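The color-palette visualization could plausibly be driven by clustering the scene's pixels into a few dominant colors. The sketch below uses a tiny Lloyd's-algorithm k-means as an assumed implementation; the dissertation's actual method may differ, and all names here are illustrative.

```python
import numpy as np

def dominant_palette(pixels, k=4, iters=20, rng=0):
    """Extract k dominant colors from an image with a minimal k-means.

    pixels: array reshapable to (N, 3) RGB values.
    Returns a (k, 3) array of cluster-center colors.
    """
    rng = np.random.default_rng(rng)
    pts = pixels.reshape(-1, 3).astype(float)
    # initialise centers from k distinct random pixels
    centers = pts[rng.choice(len(pts), size=k, replace=False)]
    for _ in range(iters):
        # assign every pixel to its nearest center
        d = np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        # move each center to the mean of its assigned pixels
        for j in range(k):
            sel = pts[assign == j]
            if len(sel):
                centers[j] = sel.mean(axis=0)
    return centers
```

On a phone, such a palette would be recomputed on a downsampled preview frame and drawn as swatches over the viewfinder.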