Low-rank and Sparse Soft Targets to Learn Better DNN Acoustic Models
Conventional deep neural networks (DNN) for speech acoustic modeling rely on
Gaussian mixture models (GMM) and hidden Markov models (HMM) to obtain binary
class labels as the targets for DNN training. Subword classes in speech
recognition systems correspond to context-dependent tied states or senones. The
present work addresses some limitations of GMM-HMM senone alignments for DNN
training. We hypothesize that the senone probabilities obtained from a DNN
trained with binary labels can provide more accurate targets to learn better
acoustic models. However, DNN outputs bear inaccuracies which are exhibited as
high-dimensional unstructured noise, whereas the informative components are
structured and low-dimensional. We exploit principal component analysis (PCA)
and sparse coding to characterize the senone subspaces. Enhanced probabilities
obtained from low-rank and sparse reconstructions are used as soft-targets for
DNN acoustic modeling, which also enables training with untranscribed data.
Experiments conducted on the AMI corpus show a 4.6% relative reduction in word
error rate.
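The low-rank reconstruction idea described in this abstract can be illustrated with a minimal sketch. It assumes a single global PCA over log-posteriors, whereas the abstract speaks of senone subspaces, so this is a simplification and not the authors' procedure. The function name `low_rank_soft_targets`, the `rank` parameter, and the use of log-domain projection followed by a softmax are all assumptions made for illustration; the sparse-coding variant is not shown.

```python
import numpy as np

def low_rank_soft_targets(posteriors, rank):
    """Return enhanced soft targets from a (frames x senones) posterior matrix.

    Hypothetical helper: projects log-posteriors onto their top `rank`
    principal components and renormalizes each frame with a softmax.
    """
    eps = 1e-10  # avoids log(0) for hard zeros in the posteriors
    logp = np.log(posteriors + eps)

    # Center the data, as PCA requires.
    mean = logp.mean(axis=0, keepdims=True)
    centered = logp - mean

    # Rows of vt are the principal directions, sorted by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:rank]

    # Keep only the low-dimensional structured part, then undo the centering.
    recon = centered @ basis.T @ basis + mean

    # Softmax so that each frame is a valid probability distribution again.
    recon = recon - recon.max(axis=1, keepdims=True)
    probs = np.exp(recon)
    return probs / probs.sum(axis=1, keepdims=True)
```

Discarding the trailing components is what removes the unstructured part of the DNN outputs in this sketch; choosing `rank` equal to the number of senones leaves the posteriors essentially unchanged.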
Extreme 3D Face Reconstruction: Seeing Through Occlusions
Existing single-view 3D face reconstruction methods can produce beautifully
detailed 3D results, but typically only for near-frontal, unobstructed
viewpoints. We describe a system designed to provide detailed 3D
reconstructions of faces viewed under extreme conditions, out-of-plane
rotations, and occlusions. Motivated by the concept of bump mapping, we propose
a layered approach which decouples estimation of a global shape from its
mid-level details (e.g., wrinkles). We estimate a coarse 3D face shape which
acts as a foundation and then separately layer this foundation with details
represented by a bump map. We show how a deep convolutional encoder-decoder can
be used to estimate such bump maps. We further show how this approach naturally
extends to generate plausible details for occluded facial regions. We test our
approach and its components extensively, quantitatively demonstrating the
invariance of our estimated facial details. We further provide numerous
qualitative examples showing that our method produces detailed 3D face shapes
in viewing conditions where existing state-of-the-art methods often break down.
Comment: Accepted to CVPR'18. Previously titled: "Extreme 3D Face
Reconstruction: Looking Past Occlusions"
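The layered idea in this abstract (a coarse shape plus a separate bump map of mid-level details) can be sketched as follows. This is only an illustration under stated assumptions: both layers are taken to be depth images on the same pixel grid, the bump map is treated as an additive depth displacement, and the names `layer_details`, `surface_normals`, and `scale` are hypothetical. How the bump map is estimated (the encoder-decoder network) is not covered.

```python
import numpy as np

def layer_details(coarse_depth, bump_map, scale=1.0):
    """Add a detail layer (bump map) on top of a coarse depth map."""
    if coarse_depth.shape != bump_map.shape:
        raise ValueError("coarse depth and bump map must share a shape")
    return coarse_depth + scale * bump_map

def surface_normals(depth):
    """Per-pixel unit normals of a depth image z(x, y)."""
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    dz_dy, dz_dx = np.gradient(depth.astype(float))
    normals = np.stack([-dz_dx, -dz_dy, np.ones_like(dz_dx)], axis=-1)
    return normals / np.linalg.norm(normals, axis=-1, keepdims=True)
```

Keeping the two layers separate means the coarse shape can be estimated from a robust global model while the detail layer is only ever a small perturbation on top of it, which is the decoupling the abstract describes.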