
    Language Model Combination and Adaptation Using Weighted Finite State Transducers

    In speech recognition systems, language models (LMs) are often constructed by training and combining multiple n-gram models. These can be used either to represent different genres or tasks found in diverse text sources, or to capture the stochastic properties of different linguistic symbol sequences, for example syllables and words. Unsupervised LM adaptation may also be used to further improve robustness to varying styles or tasks. When using these techniques, extensive software changes are often required. In this paper an alternative and more general approach based on weighted finite state transducers (WFSTs) is investigated for LM combination and adaptation. As it is entirely based on well-defined WFST operations, minimal changes to decoding tools are needed. A wide range of LM combination configurations can be flexibly supported. An efficient on-the-fly WFST decoding algorithm is also proposed. Significant error rate gains of 7.3% relative were obtained on a state-of-the-art broadcast audio recognition task using a history-dependently adapted multi-level LM modelling both syllable and word sequences.
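    At the level of probabilities, combining several component n-gram LMs can be viewed as weighting their predictions; the sketch below shows plain linear interpolation in Python as a minimal illustration of that idea. The class name, the uniform back-off for unseen n-grams, and the example weights are assumptions for the sketch, not the paper's WFST-based implementation.

```python
# Minimal sketch of linear LM interpolation: P(w|h) = sum_m lambda_m * P_m(w|h).
# The WFST approach described above encodes such combinations as transducer
# operations; here the component models are plain dictionaries with toy data.

class NgramLM:
    def __init__(self, probs, vocab_size):
        self.probs = probs            # {(history, word): probability}
        self.vocab_size = vocab_size  # used for the uniform back-off

    def prob(self, history, word):
        # Fall back to a uniform distribution if the n-gram is unseen.
        return self.probs.get((history, word), 1.0 / self.vocab_size)

def interpolated_prob(models, weights, history, word):
    assert abs(sum(weights) - 1.0) < 1e-6
    return sum(w * m.prob(history, word) for m, w in zip(models, weights))

# Toy usage: a general model and a genre-specific model with fixed weights.
lm_general = NgramLM({(("the",), "cat"): 0.2}, vocab_size=10000)
lm_genre   = NgramLM({(("the",), "cat"): 0.5}, vocab_size=10000)
print(interpolated_prob([lm_general, lm_genre], [0.7, 0.3], ("the",), "cat"))
```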

    Improving lightly supervised training for broadcast transcription

    This paper investigates improving lightly supervised acoustic model training for an archive of broadcast data. Standard lightly supervised training uses decoding hypotheses automatically derived with a biased language model. However, as the actual speech can deviate significantly from the original programme scripts that are supplied, the quality of standard lightly supervised hypotheses can be poor. To address this issue, word and segment level combination approaches are applied between the lightly supervised transcripts and the original programme scripts, yielding improved transcriptions. Experimental results show that systems trained using these improved transcriptions consistently outperform those trained using only the original lightly supervised decoding hypotheses. This is shown to be the case for both maximum likelihood and minimum phone error trained systems. The research leading to these results was supported by EPSRC Programme Grant EP/I031022/1 (Natural Speech Technology). This is the accepted manuscript version. The final version is available at http://www.isca-speech.org/archive/interspeech_2013/i13_2187.html
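    As one way to picture segment-level combination, a segment can keep the original script only when it agrees closely with the decoder output; the sketch below uses difflib for the agreement measure and a fixed threshold, both of which are assumptions rather than the selection rule used in the paper.

```python
# Hypothetical sketch of segment-level transcript combination: for each segment,
# compare the lightly supervised decoding hypothesis against the original
# programme script and keep the script only when the two agree closely enough.
from difflib import SequenceMatcher

def word_agreement(hyp: str, script: str) -> float:
    # Ratio of matching words between the two transcripts (0.0 to 1.0).
    return SequenceMatcher(None, hyp.split(), script.split()).ratio()

def combine_segment(hyp: str, script: str, threshold: float = 0.8) -> str:
    # Assumed rule: trust the script when it is close to the decoder output,
    # otherwise keep the decoding hypothesis (the speech may have deviated).
    return script if word_agreement(hyp, script) >= threshold else hyp

hyp = "good evening and welcome to the programme"
script = "good evening welcome to the programme"
print(combine_segment(hyp, script))
```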

    Effect of Sitting Posture on Development of Scoliosis in Duchenne Muscular Dystrophy Cases

    Background: Scoliosis frequently develops in boys with Duchenne Muscular Dystrophy (DMD) once the ability to walk is lost, around nine to 12 years of age. This study assessed the contribution of physical factors, including lumbar posture, to scoliosis in non-ambulatory youth with DMD in Nepal. Methods: Linear regression was used to assess the effects of time since loss of ambulation, muscle strength, functional severity and lumbar angle (as a binary variable) on coronal Cobb angle, and logistic regression was used to assess the effects of muscle strength and cross-legged sitting on the presence of a lordotic lumbar posture in 22 non-ambulant boys and young men. Results: The boys and young men had a mean (SD) age of 15.1 (4.0) years, had been non-ambulant for 48.6 (33.8) months and used a median of 3.5 (range 2 to 7) postures a day. The mean Cobb angle was 15.1 (range 0 to 70) degrees. Optimal accuracy in predicting scoliosis was obtained with a lumbar angle of -6° as measured by skin markers, and both a lumbar angle ≀-6° (P=0.112) and better functional ability (P=0.102) were associated with less scoliosis. Use of cross-legged sitting postures during the day was associated with a lumbar angle ≀-6° (OR 0.061; 95% CI 0.005 - 0.672; P=0.022). Conclusions: Use of a cross-legged sitting posture was associated with increased lumbar lordosis. A higher angle of lumbar lordosis and better functional ability were associated with a lesser degree of scoliosis.
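    For readers who want to reproduce this style of analysis, a minimal sketch of the logistic regression step (odds ratio with 95% CI) is shown below using statsmodels. The variable names and all data values are invented toy examples, not the study data.

```python
# Sketch of the logistic-regression analysis described above: presence of a
# lordotic lumbar posture regressed on use of cross-legged sitting.
# Data here are invented toy values for illustration only.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "lordotic":     [1, 1, 0, 0, 1, 0, 1, 0, 1, 0],  # lumbar angle <= -6 deg
    "cross_legged": [1, 1, 0, 0, 1, 1, 1, 0, 0, 1],  # uses cross-legged sitting
})

model = smf.logit("lordotic ~ cross_legged", data=df).fit(disp=False)
odds_ratios = np.exp(model.params)    # exponentiated coefficients = odds ratios
conf_int = np.exp(model.conf_int())   # 95% confidence intervals on the OR scale
print(odds_ratios, conf_int, sep="\n")
```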

    All Politics is Local: The Renminbi's Prospects as a Future Global Currency

    In this article we describe methods for improving the RWTH German speech recognizer used within the VERBMOBIL project. In particular, we present acceleration methods for the search based on both within-word and across-word phoneme models. We also study incremental methods to reduce the response time of the online speech recognizer. Finally, we present experimental off-line results for the three VERBMOBIL scenarios. We report on word error rates and real-time factors for both speaker-independent and speaker-dependent recognition.

    1 Introduction
    The goal of the VERBMOBIL project is to develop a speech-to-speech translation system that performs close to real time. In this system, speech recognition is followed by subsequent VERBMOBIL modules (like syntactic analysis and translation) which depend on the recognition result. Therefore, in this application it is particularly important to keep the recognition time as short as possible. There are VERBMOBIL modules which are capable of working ...
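    Since the text reports real-time factors, a brief reminder of how that metric is computed may help; the helper below is a generic sketch with an assumed function name and illustrative numbers, where a value below 1.0 means the recognizer keeps up with the audio.

```python
# Sketch of the real-time factor (RTF) metric referred to above:
# RTF = processing time / duration of the audio being processed.
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds

# Example: 45 s of computation for a 60 s utterance gives RTF 0.75,
# i.e. faster than real time (illustrative numbers only).
print(real_time_factor(45.0, 60.0))
```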

    The MGB Challenge: Evaluating Multi-genre Broadcast Media Recognition

    This paper describes the Multi-Genre Broadcast (MGB) Challenge at ASRU 2015, an evaluation focused on speech recognition, speaker diarization, and "lightly supervised" alignment of BBC TV recordings. The challenge training data covered the full range of seven weeks of BBC TV output across four channels, resulting in about 1,600 hours of broadcast audio. In addition, several hundred million words of BBC subtitle text were provided for language modelling. A novel aspect of the evaluation was the exploration of speech recognition and speaker diarization in a longitudinal setting, i.e. recognition of several episodes of the same show, and speaker diarization across these episodes, linking speakers. The longitudinal tasks also offered the opportunity for systems to make use of supplied metadata including show title, genre tag, and date/time of transmission. This paper describes the task data and evaluation process used in the MGB challenge, and summarises the results obtained.
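    Recognition systems in evaluations such as this are typically scored by word error rate; the sketch below computes WER with a standard word-level Levenshtein alignment, which is the usual textbook definition rather than anything specific to the MGB scoring scripts.

```python
# Generic word error rate (WER) computation via Levenshtein alignment:
# WER = (substitutions + deletions + insertions) / number of reference words.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # one deletion -> ~0.167
```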