“Automatic segmentation and labeling of continuous speech without bootstrapping” (2004)

Authors: Hema A. Murthy, N. Hemalatha, T. Nagarajan
Publication date: 28/10/2011

In this paper, a novel approach is proposed for automatically segmenting and transcribing a continuous speech signal without the use of manually annotated speech corpora. The continuous speech signal is first segmented into syllable-like units by treating its short-term energy as the magnitude spectrum of some arbitrary signal. Similar syllable segments are then grouped together using an unsupervised incremental clustering technique. A separate model is generated for each cluster of syllable segments. At this stage, a label is assigned manually to each group of syllable segments. The syllable models of these clusters are then used to transcribe/recognize the continuous speech signal of closed-set as well as open-set speakers. Our initial results for the syllable recognizer on Indian television news bulletins in Tamil and Telugu show a performance of 43.3% and 32.9%, respectively.
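The first stage of the abstract (segmenting speech by its short-term energy) can be illustrated with a minimal sketch. This is a simplification, not the paper's method: the abstract says the energy is treated as a magnitude spectrum but gives no further detail, so the sketch below only computes frame energies and marks local energy minima as candidate boundaries. The function names `short_term_energy` and `candidate_boundaries`, the frame length, and the hop size are all hypothetical choices made for illustration.

```python
def short_term_energy(samples, frame_len, hop):
    """Return the sum of squared samples for each frame of the signal."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def candidate_boundaries(energies):
    """Return frame indices where the energy dips to a local minimum.

    Assumption: a dip in energy between two louder regions is taken as a
    candidate boundary between syllable-like units. The paper's actual
    processing of the energy contour is not reproduced here.
    """
    return [
        i
        for i in range(1, len(energies) - 1)
        if energies[i] < energies[i - 1] and energies[i] <= energies[i + 1]
    ]


# Toy signal: two loud stretches separated by a quiet gap.
samples = [1.0] * 40 + [0.1] * 20 + [1.0] * 40
energies = short_term_energy(samples, frame_len=20, hop=20)
boundaries = candidate_boundaries(energies)
```

On real speech the raw energy contour has many spurious minima, so some smoothing or further processing of the contour would be needed before boundaries like these are usable.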

CiteSeerX