HeadOn: Real-time Reenactment of Human Portrait Videos
We propose HeadOn, the first real-time source-to-target reenactment approach
for complete human portrait videos that enables transfer of torso and head
motion, face expression, and eye gaze. Given a short RGB-D video of the target
actor, we automatically construct a personalized geometry proxy that embeds a
parametric head, eye, and kinematic torso model. A novel real-time reenactment
algorithm employs this proxy to photo-realistically map the captured motion
from the source actor to the target actor. On top of the coarse geometric
proxy, we propose a video-based rendering technique that composites the
modified target portrait video via view- and pose-dependent texturing, and
creates photo-realistic imagery of the target actor under novel torso and head
poses, facial expressions, and gaze directions. To this end, we propose a
robust method for tracking the face and torso of the source actor. We extensively
evaluate our approach and show significant improvements in enabling much
greater flexibility in creating realistic reenacted output videos. Comment: Video: https://www.youtube.com/watch?v=7Dg49wv2c_g Presented at
Siggraph'1
Exploiting flow dynamics for super-resolution in contrast-enhanced ultrasound
Ultrasound localization microscopy offers new radiation-free diagnostic tools
for vascular imaging deep within the tissue. Sequential localization of echoes
returned from inert microbubbles at low concentration within the bloodstream
reveals the vasculature with capillary resolution. Despite its high spatial
resolution, low microbubble concentrations dictate the acquisition of tens of
thousands of images, over the course of several seconds to tens of seconds, to
produce a single super-resolved image, since each echo is required to be well
separated from adjacent microbubbles. Such long acquisition times and stringent
constraints on microbubble concentration are undesirable in many clinical
scenarios. To address these restrictions, sparsity-based approaches have
recently been developed. These methods reduce the total acquisition time
dramatically, while maintaining good spatial resolution in settings with
considerable microbubble overlap. Yet none of the reported methods exploits the
fact that microbubbles actually flow within the bloodstream to improve
recovery. Here, we further improve sparsity-based super-resolution ultrasound
imaging by exploiting the inherent flow of microbubbles and utilizing their
motion kinematics. While doing so, we also provide quantitative measurements of
microbubble velocities. Our method relies on simultaneous tracking and
super-localization of individual microbubbles in a frame-by-frame manner, and
as such, may be suitable for real-time implementation. We demonstrate the
effectiveness of the proposed approach on both simulations and in vivo
contrast-enhanced human prostate scans, acquired with a clinically approved
scanner. Comment: 11 pages, 9 figure
Low-rank and Sparse Soft Targets to Learn Better DNN Acoustic Models
Conventional deep neural networks (DNN) for speech acoustic modeling rely on
Gaussian mixture models (GMM) and hidden Markov models (HMM) to obtain binary
class labels as the targets for DNN training. Subword classes in speech
recognition systems correspond to context-dependent tied states or senones. The
present work addresses some limitations of GMM-HMM senone alignments for DNN
training. We hypothesize that the senone probabilities obtained from a DNN
trained with binary labels can provide more accurate targets to learn better
acoustic models. However, DNN outputs bear inaccuracies which are exhibited as
high dimensional unstructured noise, whereas the informative components are
structured and low-dimensional. We exploit principal component analysis (PCA)
and sparse coding to characterize the senone subspaces. Enhanced probabilities
obtained from low-rank and sparse reconstructions are used as soft-targets for
DNN acoustic modeling, which also enables training with untranscribed data.
Experiments conducted on the AMI corpus show a 4.6% relative reduction in word error
rate
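
The low-rank half of the idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single global PCA over log-posteriors (the abstract does not say how the senone subspaces are built), it omits the sparse-coding step entirely, and the function name `low_rank_soft_targets`, the `rank` parameter, and the synthetic data are all hypothetical.

```python
import numpy as np

def low_rank_soft_targets(posteriors, rank, eps=1e-10):
    """Project log-posteriors onto their top `rank` principal components
    and renormalize the result into probability vectors.

    Assumption: one global PCA over all frames, chosen for brevity.
    """
    log_post = np.log(posteriors + eps)
    mean = log_post.mean(axis=0, keepdims=True)
    centered = log_post - mean
    # Rows of vt are principal directions, sorted by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:rank]
    # Low-rank reconstruction in the log domain.
    recon = centered @ basis.T @ basis + mean
    # Softmax over classes maps the reconstruction back to probabilities.
    recon = recon - recon.max(axis=1, keepdims=True)
    soft = np.exp(recon)
    return soft / soft.sum(axis=1, keepdims=True)

# Synthetic stand-in for DNN senone posteriors: 200 frames, 8 classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(200, 8))
posteriors = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

soft_targets = low_rank_soft_targets(posteriors, rank=3)
full_rank = low_rank_soft_targets(posteriors, rank=8)
```

In this sketch, discarding the trailing principal components plays the role of removing the high-dimensional unstructured noise; keeping all components (`rank=8` here) returns the input posteriors up to numerical precision.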
The Pandora multi-algorithm approach to automated pattern recognition of cosmic-ray muon and neutrino events in the MicroBooNE detector.
The development and operation of liquid-argon time-projection chambers for neutrino physics have created a need for new approaches to pattern recognition in order to fully exploit the imaging capabilities offered by this technology. Whereas the human brain can excel at identifying features in the recorded events, it is a significant challenge to develop an automated, algorithmic solution. The Pandora Software Development Kit provides functionality to aid the design and implementation of pattern-recognition algorithms. It promotes the use of a multi-algorithm approach to pattern recognition, in which individual algorithms each address a specific task in a particular topology. Many tens of algorithms then carefully build up a picture of the event and, together, provide a robust automated pattern-recognition solution. This paper describes details of the chain of over one hundred Pandora algorithms and tools used to reconstruct cosmic-ray muon and neutrino events in the MicroBooNE detector. Metrics that assess the current pattern-recognition performance are presented for simulated MicroBooNE events, using a selection of final-state event topologies