1,274,771 research outputs found

Phoneme recognition with statistical modeling of the prediction error of neural networks

Author: Freitag Fèlix
Monte Moreno Enrique
Publication venue: International Speech Communication Association (ISCA)
Publication date: 01/01/1998

This paper presents a speech recognition system that incorporates predictive neural networks. The neural networks predict the observation vectors of speech. The prediction error vectors are modeled at the state level by Gaussian densities, which provide the local similarity measure for the Viterbi algorithm during recognition. The system is evaluated on a continuous-speech phoneme recognition task. Compared with an HMM reference system, the proposed system obtained better results in the speech recognition experiments. Peer reviewed. Postprint (published version).
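
The scoring idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the network architecture, features, or model topology, so everything here is an assumption. The two toy "predictors" stand in for trained neural networks, and the one-dimensional frames, error statistics, and transition probabilities are hypothetical values chosen for the demonstration. Each state predicts the current frame from the previous one, the prediction error is scored with a diagonal Gaussian log density, and those scores serve as the local scores of a standard Viterbi search.

```python
import math


def gaussian_log_density(error, mean, var):
    """Log density of a diagonal-covariance Gaussian at a prediction error vector."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (e - m) ** 2 / v)
        for e, m, v in zip(error, mean, var)
    )


def score_frames(frames, predictors, error_means, error_vars):
    """Per-frame, per-state log scores of the prediction errors.

    Each state's predictor estimates frame t from frame t-1, so scoring
    starts at the second frame.
    """
    scores = []
    for t in range(1, len(frames)):
        row = []
        for predict, mean, var in zip(predictors, error_means, error_vars):
            predicted = predict(frames[t - 1])
            error = [x - p for x, p in zip(frames[t], predicted)]
            row.append(gaussian_log_density(error, mean, var))
        scores.append(row)
    return scores


def viterbi(local_scores, log_trans, log_init):
    """Most likely state sequence given local log scores and log transitions."""
    n_states = len(log_init)
    delta = [log_init[s] + local_scores[0][s] for s in range(n_states)]
    backptr = []
    for t in range(1, len(local_scores)):
        new_delta, ptr = [], []
        for j in range(n_states):
            # Best predecessor of state j at this frame.
            best_i = max(range(n_states), key=lambda i: delta[i] + log_trans[i][j])
            ptr.append(best_i)
            new_delta.append(delta[best_i] + log_trans[best_i][j] + local_scores[t][j])
        delta = new_delta
        backptr.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: delta[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    return path[::-1]


# Hypothetical stand-ins for trained predictive networks:
# state 0 expects the signal to stay the same, state 1 expects a sign flip.
predictors = [lambda prev: prev, lambda prev: [-x for x in prev]]
error_means = [[0.0], [0.0]]  # assumed zero-mean prediction errors
error_vars = [[1.0], [1.0]]   # assumed unit error variance

frames = [[1.0], [1.0], [1.0], [-1.0], [1.0], [-1.0]]
scores = score_frames(frames, predictors, error_means, error_vars)

log_trans = [[math.log(0.9), math.log(0.1)], [math.log(0.1), math.log(0.9)]]
log_init = [math.log(0.5), math.log(0.5)]
path = viterbi(scores, log_trans, log_init)
print(path)  # [0, 0, 1, 1, 1]
```

The path has one entry per scored frame (the second frame onward). The first two scored frames repeat the previous value and are assigned to state 0, and the remaining frames alternate in sign and are assigned to state 1.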

UPCommons. Portal del coneixement obert de la UPC

Filtering the Unknown: Speech Activity Detection in Heterogeneous Video Collections

Author: Huijbregts Marijn
Ordelman Roeland
Wooters Chuck
Publication venue: International Speech Communication Association
Publication date: 01/01/2007

In this paper we discuss the speech activity detection system that we used to detect speech regions in the Dutch TRECVID video collection. The system is designed to filter non-speech, such as music or sound effects, out of the signal without using predefined non-speech models. Because the system trains its models on-line, it is robust to out-of-domain data. The speech activity error rate on an out-of-domain test set, recordings of English conference meetings, was 4.4%. The overall error rate on twelve randomly selected five-minute TRECVID fragments was 11.5%.

University of Twente Research Information

Generating expressive speech for storytelling applications

Author: Bailly G.
Campbell N.
Hamza W.
Heylen Dirk K.J.
Hoge H.
Jianhua T.
Meijs Koen
Ordelman Roeland J.F.
Theune Mariet
Publication venue: IEEE
Publication date: 01/01/2006

Work on expressive speech synthesis has long focused on the expression of basic emotions. In recent years, however, interest in other expressive styles has been increasing. The research presented in this paper aims at the generation of a storytelling speaking style, which is suitable for storytelling applications and, more generally, for applications aimed at children. Based on an analysis of human storytellers' speech, we designed and implemented a set of prosodic rules for converting "neutral" speech, as produced by a text-to-speech system, into storytelling speech. An evaluation of our storytelling speech generation system showed encouraging results.

University of Twente Research Information