Real-Time Cleaning and Refinement of Facial Animation Signals
With the increasing demand for real-time animated 3D content in the entertainment industry and beyond, performance-based animation has garnered interest in both academic and industrial communities. While recent solutions for motion-capture animation have achieved impressive results, manual post-processing is often needed because the generated animations contain artifacts. Existing real-time motion capture solutions have opted for standard signal processing methods to strengthen the temporal coherence of the resulting animations and remove inaccuracies. While these methods produce smooth results, they inherently filter out part of the dynamics of facial motion, such as high-frequency transient movements. In this work, we propose a real-time animation refinement system that preserves -- or even restores -- the natural dynamics of facial motions. To do so, we leverage an off-the-shelf recurrent neural network architecture that learns the patterns of natural facial dynamics from clean animation data. We parametrize our system using the temporal derivatives of the signal, enabling our network to process animations at any framerate. Qualitative results show that our system is able to retrieve natural motion signals from noisy or degraded input animations.

Comment: ICGSP 2020: Proceedings of the 2020 4th International Conference on Graphics and Signal Processing
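To make the derivative parametrization concrete, the following is a minimal sketch of such a refiner, assuming a PyTorch setup; the GRU, the layer sizes, and the finite-difference scheme are illustrative assumptions, not the paper's exact architecture.

    import torch
    import torch.nn as nn

    class DerivativeRefiner(nn.Module):
        # Illustrative sketch: refines a noisy animation signal in the
        # derivative domain, which makes the model framerate-agnostic.
        def __init__(self, n_channels=50, hidden=128):
            super().__init__()
            self.rnn = nn.GRU(n_channels, hidden, batch_first=True)
            self.head = nn.Linear(hidden, n_channels)

        def forward(self, x, dt):
            # x: (batch, frames, channels) noisy signal; dt: frame duration in s.
            dx = torch.diff(x, dim=1) / dt        # temporal derivatives
            h, _ = self.rnn(dx)
            dx_clean = self.head(h)               # refined derivatives
            # Re-integrate from the first frame to recover the cleaned signal.
            x_clean = x[:, :1] + torch.cumsum(dx_clean * dt, dim=1)
            return torch.cat([x[:, :1], x_clean], dim=1)

Because the network only ever sees derivatives scaled by dt, the same trained weights can, in principle, be applied to animations captured at different framerates.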
Motion Capture of Hands in Action Using Discriminative Salient Points
Capturing the motion of two hands interacting with an object is a very challenging task due to the large number of degrees of freedom, self-occlusions, and the similarity between fingers, even when multiple cameras observe the scene. In this paper we propose to use discriminatively learned salient points on the fingers and to estimate the finger-to-salient-point associations simultaneously with the hand pose. We introduce a differentiable objective function that also takes edges, optical flow, and collisions into account. Our qualitative and quantitative evaluations show that the proposed approach achieves very accurate results on several challenging sequences containing hands and objects in action.
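The combined objective can be pictured as a weighted sum of differentiable energy terms over the hand pose θ; the decomposition below is a hedged sketch of that idea, with the term names and the weights λ chosen for illustration rather than taken from the paper:

    E(θ) = λ_s·E_salient(θ) + λ_e·E_edge(θ) + λ_f·E_flow(θ) + λ_c·E_collision(θ)

Since each term is differentiable in θ, the hand pose and the finger-to-salient-point associations can be refined jointly with gradient-based optimization.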
Continuous Audio-Visual Speech Recognition
We address the problem of robust lip tracking, visual speech feature extraction, and sensor integration for audio-visual speech recognition applications. An appearance-based model of the articulators, which represents linguistically important features, is learned from example images and is used to locate, track, and recover visual speech information. We tackle the problem of jointly modelling the temporal behaviour of the acoustic and visual speech signals by applying multi-stream hidden Markov models. This approach allows different temporal topologies and levels of stream integration and hence enables temporal dependencies to be modelled more accurately. The system has been evaluated on a continuously spoken digit recognition task with 37 subjects.
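A standard way multi-stream HMMs integrate modalities is to weight each stream's state likelihood by a stream exponent. The snippet below is a minimal sketch of that combination in the log domain; the function name and the default weight are illustrative assumptions, not the system's actual code.

    import numpy as np

    def multistream_loglik(log_b_audio, log_b_video, gamma=0.7):
        # Weighted log-domain combination of per-state stream likelihoods:
        #   log b_j(o_t) = gamma * log b_audio + (1 - gamma) * log b_video
        # gamma = 0.7 is an illustrative audio weight, not from the paper.
        return gamma * np.asarray(log_b_audio) \
            + (1.0 - gamma) * np.asarray(log_b_video)

Varying gamma shifts trust between the acoustic and visual streams, which is what lets such systems stay robust when one modality (typically audio) is degraded by noise.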