3 research outputs found
A baseline system for the transcription of Catalan broadcast conversation
The paper describes aspects, methods and results of the development of an automatic transcription system for Catalan broadcast conversation by means of speech recognition. Emphasis is given to the Catalan language, acoustic and language
modelling methods, and recognition. Results are discussed in the context of phenomena and challenges in spontaneous speech, in particular regarding phoneme duration and feature space reduction.
Postprint (published version)
A Catalan broadcast conversational speech database
Data-driven methods in speech and linguistic research, and system development, require appropriate speech databases. A
new Catalan speech database has been developed with a particular
emphasis on broadcast conversational speech. The article describes the origin and nature of the broadcasts and their acoustic
environment. Annotation and transcription provide statistics on specific phenomena of the exhibited speech, speaker characteristics
and acoustic events. It concludes with prospective uses and limitations.
Peer Reviewed. Postprint (published version)