Search CORE

2 research outputs found

Detecting gross alignment errors in the Spoken British National Corpus

Author: Baghai-Ravary Ladan
Grau Sergio
Kochanski Greg
Publication date: 01/01/2010

The paper presents methods for evaluating the accuracy of alignments between transcriptions and audio recordings. The methods have been applied to the Spoken British National Corpus, which is an extensive and varied corpus of natural unscripted speech. Early results show good agreement with human ratings of alignment accuracy. The methods also provide an indication of the location of likely alignment problems; this should allow efficient manual examination of large corpora. Automatic checking of such alignments is crucial when analysing any very large corpus, since even the best current speech alignment systems will occasionally make serious errors. The methods described here use a hybrid approach based on statistics of the speech signal itself, statistics of the labels being evaluated, and statistics linking the two.

Comment: Four pages, 3 figures. Presented at "New Tools and Methods for Very-Large-Scale Phonetics Research", University of Pennsylvania, January 28-31, 201

arXiv.org e-Print Archive

Oxford University Research Archive

Precision of Phoneme Boundaries Derived using Hidden Markov Models

Author: Greg Kochanski
John Coleman
Ladan Baghai-Ravary
Publication date: 01/01/2009

Some phoneme boundaries correspond to abrupt changes in the acoustic signal. Others are less clear-cut because the transition from one phoneme to the next is gradual. This paper compares the phoneme boundaries identified by a large number of different alignment systems, using different signal representations and Hidden Markov Model structures. The variability of the different boundaries is analysed statistically, with the boundaries grouped in terms of the broad phonetic classes of the respective phonemes. The mutual consistency between the boundaries from the various systems is analysed to identify which classes of phoneme boundary can be identified reliably by an automatic labelling system, and which are ill-defined and ambiguous. The results presented here provide a starting point for future development of techniques for objective comparisons between systems without giving undue weight to variations in those phoneme boundaries which are inherently ambiguous. Such techniques should improve the efficiency with which new alignment and HMM training algorithms can be developed.

CiteSeerX