1,251 research outputs found

    A scientific and clinical evaluation of dental implants placed in compromised bone volume

    Predictive Values of Factors Affecting Reading Comprehension Assessment

    The complex nature of reading comprehension makes it challenging to assess. Comprehension assessment results frequently do not directly indicate which skills should be addressed to remediate comprehension difficulties. The current study investigated which underlying skills are most related to performance on a common reading comprehension test. The reading skills measured in this study include single-word reading accuracy, single-word reading fluency, textual reading fluency and accuracy, oral reading comprehension abilities, and silent reading comprehension abilities. The findings indicate that reading rate is associated with oral reading comprehension abilities and that word reading accuracy is important for oral reading comprehension abilities.

    Articulatory features for conversational speech recognition

    Relationships between cognitive status, speech impairment and communicative participation in Parkinson’s disease

    Aim: To assess the relationships between cognitive status, speech impairment and communicative participation in Parkinson’s disease. Introduction: Speech and communication difficulties, as well as cognitive impairment, are prevalent in Parkinson’s. The contributions of cognitive impairment and acoustic speech characteristics remain equivocal. Relationships between Impairment and Participation levels of the International Classification of Functioning, Disability and Health (ICF) have not been thoroughly investigated. Methods: 45 people with Parkinson’s and 29 familiar controls performed read, mood and conversational speech tasks as part of a multimethod investigation. Data analysis formed three main parts. Depression, cognition and communication were assessed using questionnaires. Phonetic analysis was used to produce an acoustic characterisation of speech. Listener assessment was used to assess conveyance of emotion and intelligibility. Qualitative Content Analysis was used to provide a participant’s insight into speech and communicative difficulties associated with Parkinson’s disease. Results: Cognitive status was significantly associated with certain read speech acoustic characteristics, emotional conveyance and communicative participation. No association was found with intelligibility or conversational speech acoustic characteristics. The only acoustic speech characteristics that predicted intelligibility were intensity and pause in the read speech condition. The contribution of intelligibility to communicative participation was modest. People with Parkinson’s disease reported a range of psychosocial, cognitive and physical factors affecting their speech and communication. Conclusions: I provide evidence for a role for cognitive status in emotional conveyance and communicative participation, but not necessarily general speech production, in Parkinson’s disease. 
    I demonstrate that there may not be a strong relationship between ICF Impairment level speech measures and functional measures of communication. I also highlight the distinction between measures of communication at the ICF Activity and Participation levels. This study demonstrates that reduced participation in everyday communication in Parkinson’s disease appears to result from a complex interplay of physical, cognitive and psychosocial factors. Further research is required to apply these findings to contribute to future advances in speech and language therapy for Parkinson’s disease.

    Speech Recognition

    Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation of speech signals and the methods for speech-feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition for speaker identification and tracking, for prosody modeling in emotion-detection systems and in other speech processing applications that are able to operate in real-world environments, like mobile communication services and smart homes.

    Phonological and Speech Motor Abilities in Children with Childhood Apraxia of Speech and Phonological Disorder

    This thesis investigated whether childhood apraxia of speech (CAS) differs from phonological disorder (PD) in causal origin. After developing and validating measures targeting components of phonology and speech motor control, we explored whether speech motor ability constrained phonological development in CAS more than in PD. The thesis demonstrated that children with CAS show a distinct profile of speech impairments, but found little evidence that their motor deficit constrains phonological development in a way distinct from PD.

    Celebrating 50 years of ACAL

    The papers in this volume were presented at the 50th Annual Conference on African Linguistics held at the University of British Columbia in 2019. The contributions span a range of theoretical topics as well as topics in descriptive and applied linguistics. The papers reflect the typological and genetic diversity of languages in Africa and also represent the breadth of the ACAL community, with papers from both students and more senior scholars, based in North America and beyond. They thus provide a snapshot of current research in African linguistics, from multiple perspectives. To mark the 50th anniversary of the conference, the volume editors reminisce, in the introductory chapter, about their memorable ACALs.

    Discriminative and adaptive training for robust speech recognition and understanding

    Robust automatic speech recognition (ASR) and understanding (ASU) under various conditions remains a challenging problem even with the advances of deep learning. To achieve robust ASU, two discriminative training objectives are proposed for keyword spotting and topic classification: (1) To accurately recognize the semantically important keywords, non-uniform error cost minimum classification error training of deep neural network (DNN) and bi-directional long short-term memory (BLSTM) acoustic models is proposed to minimize the recognition errors of only the keywords. (2) To compensate for the mismatched objectives of speech recognition and understanding, minimum semantic error cost training of the BLSTM acoustic model is proposed to generate semantically accurate lattices for topic classification. Further, to expand the application of the ASU system to various conditions, four adaptive training approaches are proposed to improve the robustness of the ASR under different conditions: (1) To suppress the effect of inter-speaker variability on the speaker-independent DNN acoustic model, speaker-invariant training is proposed to learn a deep representation in the DNN that is both senone-discriminative and speaker-invariant through adversarial multi-task training. (2) To achieve condition-robust unsupervised adaptation with parallel data, adversarial teacher-student learning is proposed to suppress multiple factors of condition variability in the procedure of knowledge transfer from a well-trained source domain LSTM acoustic model to the target domain. (3) To further improve the adversarial learning for unsupervised adaptation with unparallel data, domain separation networks are used to enhance the domain-invariance of the senone-discriminative deep representation by explicitly modeling the private component that is unique to each domain. (4) To achieve robust far-field ASR, an LSTM adaptive beamforming network is proposed to estimate the real-time beamforming filter coefficients to cope with non-stationary environmental noise and the dynamic nature of source and microphone positions.

    A Comparative Study of Spectral Peaks Versus Global Spectral Shape as Invariant Acoustic Cues for Vowels

    The primary objective of this study was to compare two sets of vowel spectral features, formants and global spectral shape parameters, as invariant acoustic cues to vowel identity. Both automatic vowel recognition experiments and perceptual experiments were performed to evaluate these two feature sets. First, these features were compared using the static spectrum sampled in the middle of each steady-state vowel versus features based on dynamic spectra. Second, the role of dynamic and contextual information was investigated in terms of improvements in automatic vowel classification rates. Third, several speaker normalizing methods were examined for each of the feature sets. Finally, perceptual experiments were performed to determine whether vowel perception is more correlated with formants or global spectral shape. Results of the automatic vowel classification experiments indicate that global spectral shape features contain more information than do formants. For both feature sets, dynamic features are superior to static features. Spectral features spanning a time interval beginning with the start of the on-glide region of the acoustic vowel segment and ending at the end of the off-glide region of the acoustic vowel segment are required for maximum vowel recognition accuracy. Speaker normalization of both static and dynamic features can also be used to improve the automatic vowel recognition accuracy. Results of the perceptual experiments with synthesized vowel segments indicate that if formants are kept fixed, global spectral shape can, at least for some conditions, be modified such that the synthetic speech token will be perceived according to spectral shape cues rather than formant cues. This result implies that overall spectral shape may be more important perceptually than the spectral prominences represented by the formants. The results of this research contribute to a fundamental understanding of the information-encoding process in speech. 
    The signal processing techniques used and the acoustic features found in this study can also be used to improve the preprocessing of acoustic signals in the front-end of automatic speech recognition systems.

    Connectionist multivariate density-estimation and its application to speech synthesis

    Autoregressive models factorize a multivariate joint probability distribution into a product of one-dimensional conditional distributions. The variables are assigned an ordering, and the conditional distribution of each variable is modelled using all variables preceding it in that ordering as predictors. Calculating normalized probabilities and sampling have polynomial computational complexity under autoregressive models. Moreover, binary autoregressive models based on neural networks obtain statistical performance similar to that of some intractable models, like restricted Boltzmann machines, on several datasets. The use of autoregressive probability density estimators based on neural networks to model real-valued data, while proposed before, has never been properly investigated and reported. In this thesis we extend the formulation of neural autoregressive distribution estimators (NADE) to real-valued data; a model we call the real-valued neural autoregressive density estimator (RNADE). Its statistical performance on several datasets, including visual and auditory data, is reported and compared to that of other models. RNADE obtained higher test likelihoods than other tractable models, while retaining all the attractive computational properties of autoregressive models. However, autoregressive models are limited by the ordering of the variables inherent to their formulation. Marginalization and imputation tasks can only be solved analytically if the missing variables are at the end of the ordering. We present a new training technique that obtains a set of parameters that can be used for any ordering of the variables. By choosing a model with a convenient ordering of the dimensions at test time, it is possible to solve any marginalization and imputation tasks analytically. The same training procedure also makes it practical to train NADEs and RNADEs with several hidden layers. The resulting deep and tractable models display higher test likelihoods than the equivalent one-hidden-layer models for all the datasets tested. Ensembles of NADEs or RNADEs can be created inexpensively by combining models that share their parameters but differ in the ordering of the variables. These ensembles of autoregressive models obtain state-of-the-art statistical performance for several datasets. Finally, we demonstrate the application of RNADE to speech synthesis, and confirm that capturing the phone-conditional dependencies of acoustic features improves the quality of synthetic speech. Our model generates synthetic speech that was judged by naive listeners as being of higher quality than that generated by mixture density networks, which are considered a state-of-the-art synthesis technique.
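
    The autoregressive factorization described in this abstract can be illustrated with a minimal sketch. It is not the thesis's RNADE model: it assumes a toy linear-Gaussian conditional for each variable, and the names `ar_log_density`, `ar_sample`, `W`, `b` and `s` are hypothetical, chosen only for illustration. It shows how a joint log density is the sum of one-dimensional conditional log densities, and how sampling proceeds one variable at a time in the chosen ordering.

```python
import numpy as np


def ar_log_density(x, W, b, s):
    """Joint log density as a sum of 1-D conditional Gaussian log densities.

    Assumed toy model: x[d] | x[:d] ~ Normal(b[d] + W[d, :d] . x[:d], s[d]**2).
    Only the strictly lower-triangular part of W is used.
    """
    total = 0.0
    for d in range(len(x)):
        # The conditional mean depends only on variables earlier in the ordering.
        mean_d = b[d] + np.dot(W[d, :d], x[:d])
        total += (-0.5 * np.log(2.0 * np.pi * s[d] ** 2)
                  - 0.5 * ((x[d] - mean_d) / s[d]) ** 2)
    return total


def ar_sample(W, b, s, rng):
    """Draw one sample by sampling each variable given the preceding ones."""
    x = np.zeros(len(b))
    for d in range(len(b)):
        mean_d = b[d] + np.dot(W[d, :d], x[:d])
        x[d] = rng.normal(mean_d, s[d])
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = 4
    W = np.tril(rng.normal(size=(D, D)), k=-1)  # strictly lower triangular
    b = rng.normal(size=D)
    s = np.array([0.5, 1.0, 1.5, 2.0])
    x = ar_sample(W, b, s, rng)
    print(ar_log_density(x, W, b, s))
```

    Both functions make a single pass over the variables, which reflects the tractability the abstract attributes to autoregressive models. In this linear-Gaussian special case the joint distribution is itself a multivariate Gaussian, so the result can be checked against the closed-form density.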