    Articulatory and bottleneck features for speaker-independent ASR of dysarthric speech

    Rapid population aging has stimulated the development of assistive devices that provide personalized medical support to people in need who suffer from conditions of various etiologies. One prominent clinical application is a computer-assisted speech training system that enables personalized speech therapy in the home environment for patients with communicative disorders. Such a system relies on robust automatic speech recognition (ASR) technology to provide accurate articulation feedback. With the long-term aim of developing off-the-shelf ASR systems that can be incorporated in a clinical context without prior speaker information, we compare the ASR performance of speaker-independent bottleneck and articulatory features on dysarthric speech, used in conjunction with dedicated neural network-based acoustic models that have been shown to be robust against spectrotemporal deviations. We report the ASR performance of these systems on two dysarthric speech datasets with different characteristics to quantify the achieved performance gains. Despite the remaining performance gap between dysarthric and normal speech, significant improvements are reported on both datasets using speaker-independent ASR architectures.
    Comment: to appear in Computer Speech & Language - https://doi.org/10.1016/j.csl.2019.05.002 - arXiv admin note: substantial text overlap with arXiv:1807.1094

    Automated assessment of second language comprehensibility: Review, training, validation, and generalization studies

    Whereas many scholars have emphasized the relative importance of comprehensibility as an ecologically valid goal for L2 speech training, testing, and development, eliciting listeners’ judgments is time-consuming. Following calls for research on more efficient L2 speech rating methods in applied linguistics, and growing attention toward using machine learning on spontaneous unscripted speech in speech engineering, the current study examined the possibility of establishing quick and reliable automated comprehensibility assessments. Orchestrating a set of phonological (maximum posterior probabilities and gaps between L1 and L2 speech), prosodic (pitch and intensity variation), and temporal measures (articulation rate, pause frequency), the regression model significantly predicted how naïve listeners intuitively judged low, mid, high, and nativelike comprehensibility among 100 L1 and L2 speakers’ picture descriptions. The strength of the correlation (r = .823 for machine vs. human ratings) was comparable to naïve listeners’ interrater agreement (r = .760 for humans vs. humans). The findings were successfully replicated when the model was applied to a new dataset of 45 L1 and L2 speakers (r = .827) and tested under a more freely constructed interview task condition (r = .809).

    The Impact of Fast ForWord on Phonological Awareness and Literacy Skills

    This study investigated the impact of Fast ForWord on individual children's phonological awareness and reading skills as well as general language and auditory processing skills. Five children, ages six through eight years, served as experimental subjects. Participation in the Fast ForWord (FFW) training program lasted approximately 2 hours per day, 5 days a week, for 6 to 8 weeks. Fast ForWord training was considered complete when the subject reached at least 90 percent completion on five of the seven training exercises or when the Fast ForWord professional determined that the child had received maximum benefit from the program. Three children, within the same age range, served as control subjects and did not receive any type of speech or language training. Results indicated that the experimental subjects did not show significant group mean gains (a minimal increase of one standard deviation) on any of the five assessment measures. The largest mean standard score increases were noted on the Language Processing Test-Revised (8 points) and the Test of Language Development-Primary:2 (4 points). Individually, two subjects increased standard scores by a minimum of one standard deviation. None of the five children reached the FFW completion rate of 90% completion on five of the seven games. Posttest data for the control subjects revealed an increase in standard scores similar to, and slightly larger than, that of the experimental subjects.