Confidence-based adaptive frame rate up-conversion

Author: Dong-Gyu Sim
Kyung-Yeon Min
Publication venue: Springer Nature
Publication date: 01/01/2013

Springer - Publisher Connector

Confidence-based adaptive frame rate up-conversion

Author: B-D Choi
C Cafforio
C Wang
D Alfonso
D Wang
G Haan
G-I Lee
H Sasai
H Sheikh
J Shi
J Zhai
KY Min
N Damera-Venkata
R Castagno
S Fujiwara
S-H Lee
SJ Kang
T Chen
T Thaipanich
T-H Tsai
Y Zhang
Y-K Chen
Y-T Yang
YL Lee
Z Wang
Publication venue: 'Springer Science and Business Media LLC'

Crossref

A Comparison of Front-Ends for Bitstream-Based ASR over IP

Author: Díaz de María Fernando
Gallardo Antolín Ascensión
Gómez Cajas D. F.
Peláez Moreno Carmen
Publication venue: 'Elsevier BV'
Publication date: 01/01/2006

Automatic speech recognition (ASR) is expected to play an important role in providing spoken interfaces for IP-based applications. However, because the speech signal travels over IP networks, ASR systems face two new challenges: degraded speech quality caused by the compression needed to fit the channel capacity, and the inevitable occurrence of packet losses. In this framework, bitstream-based approaches, which obtain the ASR feature vectors directly from the coded bitstream and so avoid the speech decoding process, have been proposed to improve the robustness of ASR systems ([S.H. Choi, H.K. Kim, H.S. Lee, Speech recognition using quantized LSP parameters and their transformations in digital communications, Speech Commun. 30 (4) (2000) 223–233; A. Gallardo-Antolín, C. Peláez-Moreno, F. Díaz-de-María, Recognizing GSM digital speech, IEEE Trans. Speech Audio Process., to appear; H.K. Kim, R.V. Cox, R.C. Rose, Performance improvement of a bitstream-based front-end for wireless speech recognition in adverse environments, IEEE Trans. Speech Audio Process. 10 (8) (2002) 591–604; C. Peláez-Moreno, A. Gallardo-Antolín, F. Díaz-de-María, Recognizing voice over IP networks: a robust front-end for speech recognition on the WWW, IEEE Trans. Multimedia 3 (2) (2001) 209–218], among others). LSP (Line Spectral Pairs) are the preferred parameters for describing the speech spectral envelope in most modern speech coders. Nevertheless, LSP have proved unsuitable for ASR and must be transformed into cepstrum-type parameters. In this paper we compare the robustness of the most significant LSP-to-cepstrum transformations in a simulated VoIP (voice over IP) environment that includes two of the most popular codecs used on such networks (G.723.1 and G.729) and several network conditions. In particular, we compare the 'pseudocepstrum' [H.K. Kim, S.H. Choi, H.S. Lee, On approximating Line Spectral Frequencies to LPC cepstral coefficients, IEEE Trans. Speech Audio Process. 8 (2) (2000) 195–199], an approximate but straightforward transformation of LSP into LP cepstral coefficients, with an exact but more computationally demanding one. Our results show that the pseudocepstrum is preferable when network conditions are good or computational resources are limited, while the exact procedure is recommended when network conditions become more adverse.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

Video Streaming in Evolving Networks under Fuzzy Logic Control

Author: Fleury M
Ghanbari M
Jammeh EA
Moiron S
Razavi R
Publication venue: 'IntechOpen'
Publication date: 01/03/2010

University of Essex Research Repository

IntechOpen

Crossref