6 research outputs found

An audio-visual corpus for multimodal automatic speech recognition

Publication venue: Springer
Publication date: 07/01/2017

Springer - Publisher Connector

An audio-visual corpus for multimodal automatic speech recognition

Author: A Czyzewski
AG Chitu
Andrzej Czyzewski
Bozena Kostek
D Petrovska-Delacrétaz
D Stewart
E Trentin
H Lane
H McGurk
Jozef Kotus
K Noda
M Cooke
Marcin Szykulski
P Zelasko
Piotr Bratoszewski
RS Bolia
S Pigeon
YW Wong
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Deep learning for audio-visual speech recognition

Author: Κουμπαρούλης Αλέξανδρος Κ.
Publication date: 01/01/2017

University of Thessaly Institutional Repository

A new multi-purpose audio-visual UNMC-VIER database with multiple variabilities

Author: Ang L.
Chew W.J.
Chin S.W.
Ch’ng S.I.
Lim Hann
Seng K.P.
Wong Y.W.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2011

Audio-visual recognition systems are becoming popular because they overcome certain problems of traditional audio-only recognition systems. However, visual variations in video sequences can significantly degrade recognition performance. The problem is further complicated when more than one visual variation occurs at the same time. Although several databases have been created in this area, none of them includes realistic visual variations in video sequences. To facilitate the development of robust audio-visual recognition systems, the new audio-visual UNMC-VIER database was created. This database contains various visual variations, including illumination, facial expression, head pose, and image resolution variations. Its most distinctive aspect is that it includes more than one visual variation in the same video recording. For the audio part, the utterances are spoken at slow and normal speech pace to improve the learning process of audio-visual speech recognition systems. Hence, this database is useful for the development of robust audio-visual person recognition, speech recognition, and face recognition systems.

espace@Curtin