    A Posterior-Based Multistream Formulation for G2P Conversion


    An HMM-Based Formalism for Automatic Subword Unit Derivation and Pronunciation Generation

    We propose a novel hidden Markov model (HMM) formalism for automatic derivation of subword units and pronunciation generation using only transcribed speech data. In this approach, the subword units are derived from the clustered context-dependent units in a grapheme-based system using the maximum-likelihood criterion. The subword-unit-based pronunciations are then learned in the framework of Kullback-Leibler divergence based HMM. The automatic speech recognition (ASR) experiments on the WSJ0 English corpus show that the approach leads to a 12.7% relative reduction in word error rate compared to the grapheme-based system. Our approach can be beneficial in reducing the need for expert knowledge in the development of ASR as well as text-to-speech systems. Index Terms: automatic subword unit derivation, pronunciation generation, hidden Markov model, Kullback-Leibler divergence based hidden Markov model
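
As a rough illustration of the Kullback-Leibler divergence based HMM mentioned in this abstract, the sketch below scores a frame-level phone posterior against a state's categorical distribution. The variable names, toy numbers, and the choice of KL direction are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def kl_local_score(state_dist, frame_posterior, eps=1e-10):
    """KL divergence between an HMM state's categorical phone distribution
    and a frame-level phone posterior (one common KL-HMM local score;
    other variants reverse the arguments or symmetrize)."""
    y = np.asarray(state_dist, dtype=float) + eps
    z = np.asarray(frame_posterior, dtype=float) + eps
    y, z = y / y.sum(), z / z.sum()
    return float(np.sum(y * np.log(y / z)))

# Hypothetical 4-phone example: a state dominated by phone 1, scored on two frames.
state = [0.05, 0.85, 0.05, 0.05]
print(kl_local_score(state, [0.10, 0.80, 0.05, 0.05]))  # small score: good match
print(kl_local_score(state, [0.70, 0.10, 0.10, 0.10]))  # large score: poor match
```

In decoding, such local scores replace the usual emission likelihoods, so lower accumulated divergence along a state path corresponds to a better-matching pronunciation.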

    Predictive Value of Risk Factors for Chronic Idiopathic Thrombocytopenic Purpura in Patients with Acute Type of Disease

    BACKGROUND: Immune thrombocytopenic purpura (ITP) is an autoimmune disease in which autoantibodies react with platelet surface antigens, resulting in mild to severe thrombocytopenia through a decreased platelet count or inhibition of platelet production. Given the relatively high prevalence of ITP among children and the lack of standard diagnostic testing for chronic disease, this study evaluated the predictive value of risk factors for chronic ITP in hospitalized patients. METHODS: This prospective cohort study was performed on 65 children with ITP who were referred to Ali Asghar and Rasool Akram Hospitals in Tehran, Iran, during 2017 and 2018. Relationships between the incidence of chronic disease and different risk factors, including age at diagnosis, gender, white cell count, primary platelet count, mean platelet volume (MPV), history and type of previous infection, FCG gene mutation, and type of FCG mutation, were investigated using a multiple logistic regression model. RESULTS: Of the 65 patients included in the study, 31 (47.69%) were male and 34 (52.31%) were female. Twenty-eight patients (43.08%) had acute ITP and 37 (56.92%) had chronic ITP. The frequency of FCG gene mutation in patients with chronic and acute ITP was 16.36% and 7.27%, respectively (p = 0.51). No association was found between the history or type of previous infection and the incidence of chronic ITP. The multiple logistic regression model showed that three factors, namely the absolute lymphocyte count, age at diagnosis, and primary white blood cell (WBC) count, were directly related to chronic ITP, whereas platelet count, sex, and MPV were inversely related to chronic ITP. In addition, the absolute lymphocyte count, age at diagnosis, and primary WBC count were significantly associated with chronic ITP. Receiver operating characteristic analysis showed that the cutoff for these factors was 0.31. Further analysis of these risk factors against the gold standard demonstrated a diagnostic sensitivity of 73.08% and a specificity of 88.57% for chronic ITP, indicating the high importance and predictive power of these risk factors. CONCLUSIONS: According to the results of this study, for the first time in Iran, six risk factors, namely the absolute lymphocyte count, age at diagnosis, sex, MPV level, platelet level at the time of diagnosis, and primary WBC count, were identified as the most important risk factors affecting the incidence of chronic ITP. More comprehensive studies could, of course, lead to more comprehensive models.
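
For readers unfamiliar with the analysis pipeline described in this abstract, the sketch below shows one way a multiple logistic regression model and an ROC-derived cutoff could be obtained. The synthetic data, feature names, and threshold rule are illustrative assumptions, not the study's data or code.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
n = 65  # cohort size matching the abstract; all values below are synthetic
df = pd.DataFrame({
    "lymphocytes": rng.normal(3.0, 1.0, n),
    "age_at_diagnosis": rng.uniform(1, 15, n),
    "wbc": rng.normal(8.0, 2.0, n),
    "platelets": rng.normal(30, 10, n),
    "mpv": rng.normal(9.5, 1.0, n),
    "sex": rng.integers(0, 2, n),
})
y = rng.integers(0, 2, n)  # 1 = chronic ITP (synthetic labels)

# Fit the multiple logistic regression model on all candidate risk factors.
model = LogisticRegression(max_iter=1000).fit(df, y)
probs = model.predict_proba(df)[:, 1]

# Choose the cutoff that maximizes Youden's J (sensitivity + specificity - 1).
fpr, tpr, thresholds = roc_curve(y, probs)
best = np.argmax(tpr - fpr)
print("cutoff:", thresholds[best],
      "sensitivity:", tpr[best],
      "specificity:", 1 - fpr[best])
```

The sensitivity and specificity reported in the abstract would come from evaluating such a cutoff against the gold-standard diagnosis of chronic ITP.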

    Integrated Pronunciation Learning for Automatic Speech Recognition Using Probabilistic Lexical Modeling

    Standard automatic speech recognition (ASR) systems use a phoneme-based pronunciation lexicon prepared by linguistic experts. When the hand-crafted pronunciations fail to cover the vocabulary of a new domain, a grapheme-to-phoneme (G2P) converter is used to extract pronunciations for new words, and a phoneme-based ASR system is then trained. G2P converters are typically trained only on the existing lexicons. In this paper, we propose a grapheme-based ASR approach in the framework of probabilistic lexical modeling that integrates pronunciation learning as a stage in ASR system training and exploits both acoustic and lexical resources (not necessarily from the domain or language of interest). The proposed approach is evaluated on four lexical-resource-constrained ASR tasks and compared with the conventional two-stage approach in which G2P training is followed by ASR system development.

    Acoustic Data-Driven Grapheme-to-Phoneme Conversion in the Probabilistic Lexical Modeling Framework

    One of the primary steps in building automatic speech recognition (ASR) and text-to-speech systems is the development of a phonemic lexicon that provides a mapping between each word and its pronunciation as a sequence of phonemes. Phoneme lexicons can be developed by humans through the use of linguistic knowledge; however, this is a costly and time-consuming task. To facilitate this process, grapheme-to-phoneme (G2P) conversion techniques are used in which, given an initial phoneme lexicon, the relationship between graphemes and phonemes is learned through data-driven methods. This article presents a novel G2P formalism which learns the grapheme-to-phoneme relationship through acoustic data and potentially relaxes the need for an initial phonemic lexicon in the target language. The formalism involves a training part followed by an inference part. In the training part, the grapheme-to-phoneme relationship is captured in a probabilistic lexical modeling framework. In this framework, a hidden Markov model (HMM) is trained in which each HMM state representing a grapheme is parameterized by a categorical distribution of phonemes. In the inference part, given the orthographic transcription of the word and the learned HMM, the most probable sequence of phonemes is inferred. In this article, we show that the recently proposed acoustic G2P approach in the Kullback-Leibler divergence-based HMM (KL-HMM) framework is a particular case of this formalism. We then benchmark the approach against two popular G2P approaches, namely the joint multigram approach and the decision-tree-based approach. Our experimental studies on English and French show that, despite relatively poor performance at the pronunciation level, the performance of the proposed approach is not significantly different from that of state-of-the-art G2P methods at the ASR level.
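
To make the inference step concrete, the toy sketch below decodes a phoneme sequence from per-grapheme categorical distributions. It ignores the context-dependent states and alignment handled by the actual HMM, so the probability table and example word are illustrative assumptions only.

```python
import math

# Toy per-grapheme phoneme distributions (invented numbers, not learned parameters).
grapheme_phoneme_probs = {
    "c": {"k": 0.7, "s": 0.3},
    "a": {"ae": 0.6, "ah": 0.4},
    "t": {"t": 0.9, "d": 0.1},
}

def infer_pronunciation(word):
    """Greedy per-grapheme decoding; the actual formalism decodes over
    context-dependent HMM states rather than isolated graphemes."""
    phonemes, log_prob = [], 0.0
    for g in word:
        dist = grapheme_phoneme_probs[g]
        best = max(dist, key=dist.get)
        phonemes.append(best)
        log_prob += math.log(dist[best])
    return phonemes, log_prob

print(infer_pronunciation("cat"))  # (['k', 'ae', 't'], summed log-probability)
```

In the full formalism, the categorical distributions are estimated from acoustic data, which is what allows the method to operate without a complete initial phonemic lexicon.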

    Towards Weakly Supervised Acoustic Subword Unit Discovery and Lexicon Development Using Hidden Markov Models

    Developing a phonetic lexicon for a language requires linguistic knowledge as well as human effort, which may not be available, particularly for under-resourced languages. An alternative to developing a phonetic lexicon is to automatically derive subword units using acoustic information and generate the associated pronunciations. In the literature, this has mostly been studied from the pronunciation variation modeling perspective. In this article, we investigate automatic subword unit derivation from the under-resourced language point of view. Towards that, we present a novel hidden Markov model (HMM) formalism for automatic derivation of subword units and pronunciation generation using only transcribed speech data. In this approach, the subword units are derived from the clustered context-dependent units in a grapheme-based system using the maximum-likelihood criterion. The subword-unit-based pronunciations are then generated either by deterministic or probabilistic learning of the relationship between the graphemes and the acoustic subword units (ASWUs). In this article, we first establish the proposed framework on a well-resourced language by comparing it against related approaches in the literature and investigating the transferability of the derived subword units to other domains. We then show the scalability of the proposed approach in real under-resourced scenarios by conducting studies on Scottish Gaelic, a genuinely minority and endangered language, and comparing the approach against state-of-the-art grapheme-based approaches in under-resourced scenarios. Our experimental studies on English show that the derived subword units can not only lead to better ASR systems compared to graphemes, but can also be exploited to build out-of-domain ASR systems. The experimental studies on Scottish Gaelic show that the proposed ASWU-based lexicon development approach retains its dominance over grapheme-based lexicons. Furthermore, the proposed approach yields significant gains in ASR performance even when multilingual resources from resource-rich languages are exploited in the development of ASR systems.
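
One plausible reading of the "deterministic" pronunciation learning mentioned above is to assign each word the ASWU sequence most frequently decoded across its training utterances. The sketch below illustrates that idea with invented decodings and is not the paper's actual procedure.

```python
from collections import Counter, defaultdict

# Hypothetical decoded ASWU sequences per (word, utterance); invented for illustration.
decoded = [
    ("speech", ("u3", "u7", "u1")),
    ("speech", ("u3", "u7", "u1")),
    ("speech", ("u3", "u2", "u1")),
    ("data",   ("u5", "u4")),
]

def deterministic_lexicon(decodings):
    """Assign each word its most frequently decoded ASWU sequence."""
    by_word = defaultdict(Counter)
    for word, seq in decodings:
        by_word[word][seq] += 1
    return {word: list(counts.most_common(1)[0][0]) for word, counts in by_word.items()}

print(deterministic_lexicon(decoded))
# {'speech': ['u3', 'u7', 'u1'], 'data': ['u5', 'u4']}
```

A probabilistic variant would instead keep the relative frequencies as a distribution over pronunciations rather than committing to a single sequence per word.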