7 research outputs found

Automatic transcription of Czech, Russian, and Slovak spontaneous speech in the MALACH project

Authors: Byrne WJ, Hajic J, Ircing P, Mirovsky J, Psutka J, Psutka JV
Publication venue: Institut für Kommunikationsforschung und Phonetik, Universität Bonn
Publication date: 16/09/2005

CUED - Cambridge University Engineering Department

Towards automatic transcription of spontaneous Czech speech in the MALACH project

Authors: Byrne WJ, Gustman S, Hajic J, Ircing P, Psutka J, Psutka JV, Radova V
Publication venue: Springer Science and Business Media LLC
Publication date: 16/09/2003

CUED - Cambridge University Engineering Department

Issues in annotation of the Czech spontaneous speech corpus in the MALACH project

Authors: Byrne WJ, Gustman S, Hajic J, Ircing P, Psutka J, Psutka JV, Radova V
Publication venue: European Language Resources Association
Publication date: 16/09/2004

CUED - Cambridge University Engineering Department

Large vocabulary ASR for spontaneous Czech in the MALACH project

Authors: Byrne WJ, Gustman S, Hajic J, Ircing P, Mirovsky J, Psutka J, Psutka JV, Radova V
Publication venue: International Speech Communication Association
Publication date: 01/01/2003

This paper describes LVCSR research into the automatic transcription of spontaneous Czech speech in the MALACH (Multilingual Access to Large Spoken Archives) project. This project attempts to provide improved access to the large multilingual spoken archives collected by the Survivors of the Shoah Visual History Foundation (VHF) (www.vhf.org) by advancing the state of the art in automated speech recognition. We describe a baseline ASR system and discuss the problems in language modeling that arise from the nature of Czech as a highly inflectional language that also exhibits diglossia between its written and spontaneous forms. The difficulties of this task are compounded by heavily accented, emotional and disfluent speech along with frequent switching between languages. To overcome the limited amount of relevant language model data, we use statistical techniques for selecting an appropriate training corpus from a large unstructured text collection, resulting in significant reductions in word error rate.

CiteSeerX

CUED - Cambridge University Engineering Department

Automatic transcription of Czech language oral history in the MALACH project: resources and initial experiments

Authors: Byrne WJ, Gustman S, Hajic J, Ircing P, Psutka J, Psutka JV, Radova V, Ramabhadran B
Publication venue: Springer Science and Business Media LLC
Publication date: 16/09/2002

CUED - Cambridge University Engineering Department

Building LVCSR system for transcription of spontaneously pronounced Russian testimonies in the MALACH project: initial steps and first results

Authors: Byrne WJ, Gustman S, Hajic J, Iljuchin I, Ircing P, Psutka J, Psutka JV, Trejbal V
Publication venue: Springer Science and Business Media LLC
Publication date: 16/09/2003

CUED - Cambridge University Engineering Department

Awareness of sexually transmitted infection and protection methods among university students in Ireland

Authors: A Horgan, AV Hollub, BJ Hill, C Cook, C Gunby, C. Dunne, D Mehra, D. McGrath, D. Meagher, E Marek, E O’Connell, EL McCave, EW Moore, F Brink van den, F Straw, J. C. Coffey, JA Higgins, JV Bailey, K Guse, K. Lally, KE Lechner, KP Mark, KR Shin, M Ekelin, M Taylor, ME Eisenberg, ME Patrick, MV Tolli, O McCarthy, R Psutka, R Vivancos, S. Dunne, SA Vasilenko, T Korhonen, W. Cullen, X Chi, Y. Nathan-V
Publication venue: Springer Science and Business Media LLC

Crossref