
    Improved status following behavioural intervention in a case of severe dysarthria with stroke aetiology

    There is little published intervention outcome literature concerning dysarthria acquired from stroke. Single case studies have the potential to provide more detailed specification and interpretation than is generally possible with larger participant numbers, and are thus informative for clinicians who may deal with similar cases. Such research also contributes to the future planning of larger scale investigations. A behavioural intervention is described which was carried out with a man with severe dysarthria following stroke, beginning at seven and ending at nine months after stroke. Pre-intervention stability between five and seven months contrasted with significant improvements post-intervention on listener-rated measures of word and reading intelligibility and of communication effectiveness in conversation. A range of speech analyses was undertaken (comprising rate, pause and intonation characteristics in connected speech, and phonetic transcription of single word production), with the aim of identifying components of speech which might explain the listeners' perceptions of improvement. Pre- to post-intervention changes could be detected mainly in parameters related to utterance segmentation and intonation. The basis of improvement in dysarthria following intervention is complex, both in terms of the active therapeutic dimensions and the specific speech alterations which account for changes in intelligibility and effectiveness. Single case results are not necessarily generalisable to other cases, and outcomes may be affected by participant factors and therapeutic variables which are not readily controllable.
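    The rate and pause analyses mentioned above can be approximated from the waveform alone. The sketch below is a minimal, hypothetical example, assuming a mono NumPy waveform and a syllable count taken from a transcription; the energy threshold and minimum pause duration are arbitrary choices for illustration, not the measures used in the study.

```python
import numpy as np

def pause_and_rate_features(signal, fs, n_syllables,
                            frame_ms=25, hop_ms=10,
                            silence_db=-35.0, min_pause_s=0.20):
    """Rough pause and rate measures for a stretch of connected speech.

    Pauses are detected with a simple short-time energy threshold relative to
    the loudest frame; `n_syllables` is assumed to come from a transcription.
    """
    frame = int(fs * frame_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    x = signal.astype(float)

    # Short-time RMS level in dB relative to the loudest frame
    frames = np.lib.stride_tricks.sliding_window_view(x, frame)[::hop]
    rms = np.sqrt(np.mean(frames ** 2, axis=1) + 1e-12)
    level_db = 20 * np.log10(rms / (rms.max() + 1e-12) + 1e-12)
    silent = level_db < silence_db

    # Collect runs of silent frames longer than `min_pause_s` as pauses
    pauses, run = [], 0
    for s in np.append(silent, False):          # sentinel flushes the final run
        if s:
            run += 1
        else:
            if run * hop / fs >= min_pause_s:
                pauses.append(run * hop / fs)
            run = 0

    total_s = len(x) / fs
    pause_s = float(sum(pauses))
    return {
        "speaking_rate_syll_per_s": n_syllables / total_s,
        "articulation_rate_syll_per_s": n_syllables / max(total_s - pause_s, 1e-6),
        "n_pauses": len(pauses),
        "mean_pause_s": float(np.mean(pauses)) if pauses else 0.0,
        "pause_time_ratio": pause_s / total_s,
    }
```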

    Using word graphs as intermediate representation of uttered sentences

    We present an algorithm for building graphs of words as an intermediate representation of uttered sentences. No language model is used. The input data for the algorithm are the pronunciation lexicon, organized as a tree, and the sequence of acoustic frames. Transitions between consecutive units are considered as additional units. Nodes represent discrete instants of time, arcs are labelled with words, and a confidence measure is assigned to each detected word, computed from the phonetic probabilities of the subsequence of acoustic frames used to complete the word. We evaluated the obtained word graphs by searching for the path that best matches the correct sentence and then measuring the word accuracy, i.e. the oracle word accuracy.

    This work was supported by the Spanish MICINN under contract TIN2011-28169-C05-01 and the Vic. d'Investigació of the UPV under contract 20110897.

    Gómez Adrian, J.A.; Sanchís Arnal, E. (2012). Using word graphs as intermediate representation of uttered sentences. In: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. Springer Verlag (Germany), 284-291. doi:10.1007/978-3-642-33275-3_35
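    As a concrete illustration of the structure described above (nodes as discrete instants of time, arcs labelled with words and a confidence value) and of the oracle evaluation, here is a minimal Python sketch. The class and function names are hypothetical, and the edit-distance dynamic program is a generic illustration, not the authors' implementation.

```python
from collections import defaultdict

class WordGraph:
    """Word graph: nodes are discrete time instants, arcs carry (word, confidence)."""
    def __init__(self):
        self.arcs = defaultdict(list)   # start_node -> [(end_node, word, confidence)]
        self.nodes = set()

    def add_arc(self, start, end, word, confidence):
        assert start < end, "arcs must move forward in time"
        self.arcs[start].append((end, word, confidence))
        self.nodes.update((start, end))


def oracle_word_accuracy(graph, first_node, last_node, reference):
    """Word accuracy of the graph path that best matches the reference sentence.

    Dynamic programming over states (graph node, number of reference words
    consumed), minimising edit operations (substitutions, insertions, deletions).
    """
    INF = float("inf")
    n_ref = len(reference)
    # best[(node, j)] = minimum edits to reach `node` with the first j reference words matched
    best = {(first_node, 0): 0}
    for node in sorted(graph.nodes):            # nodes are time instants, hence sortable
        for j in range(n_ref + 1):
            cost = best.get((node, j), INF)
            if cost == INF:
                continue
            # Deletion: skip a reference word without consuming a graph arc
            if j < n_ref:
                key = (node, j + 1)
                best[key] = min(best.get(key, INF), cost + 1)
            for end, word, _conf in graph.arcs.get(node, []):
                # Insertion: take the arc without consuming a reference word
                key = (end, j)
                best[key] = min(best.get(key, INF), cost + 1)
                # Match / substitution: take the arc and consume a reference word
                if j < n_ref:
                    sub = 0 if word == reference[j] else 1
                    key = (end, j + 1)
                    best[key] = min(best.get(key, INF), cost + sub)
    min_edits = best.get((last_node, n_ref), INF)
    return 1.0 - min_edits / max(n_ref, 1)


# Toy usage: the best path reproduces the reference exactly, so oracle accuracy is 1.0
g = WordGraph()
g.add_arc(0, 10, "the", 0.9)
g.add_arc(0, 10, "a", 0.4)
g.add_arc(10, 25, "cat", 0.8)
g.add_arc(25, 40, "sat", 0.7)
print(oracle_word_accuracy(g, 0, 40, ["the", "cat", "sat"]))
```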

    Measuring Mimicry in Task-Oriented Conversations: The More the Task is Difficult, The More we Mimick our Interlocutors

    The tendency to unconsciously imitate others in conversations is referred to as mimicry, accommodation, interpersonal adaptation, etc. In recent years, the computing community has made significant efforts towards the automatic detection of the phenomenon, but a widely accepted approach is still missing. Given that mimicry is the unconscious tendency to imitate others, this article proposes the adoption of speaker verification methodologies that were originally conceived to spot people trying to forge the voice of others. Preliminary experiments suggest that mimicry can be detected by measuring how much speakers converge or diverge with respect to one another in terms of acoustic evidence. As a validation of the approach, the experiments show that convergence (the speakers become more similar in terms of acoustic properties) tends to appear more frequently when a task is difficult and, therefore, requires more time to be addressed.
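    The paper builds on speaker verification techniques; the sketch below illustrates the underlying idea of convergence with a much simpler stand-in: per-window acoustic statistics for each speaker and the trend of the distance between them over the conversation. The feature set, window length, and interpretation of the slope are assumptions for illustration, not the authors' method.

```python
import numpy as np

def window_features(signal, fs, win_s=10.0):
    """Per-window acoustic statistics: log energy, zero-crossing rate, spectral centroid."""
    win = int(win_s * fs)
    feats = []
    for start in range(0, len(signal) - win + 1, win):
        x = signal[start:start + win].astype(float)
        energy = np.log(np.mean(x ** 2) + 1e-12)
        zcr = np.mean(np.abs(np.diff(np.sign(x)))) / 2.0          # crossings per sample
        spec = np.abs(np.fft.rfft(x))
        freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
        centroid = np.sum(freqs * spec) / (np.sum(spec) + 1e-12)
        feats.append([energy, zcr, centroid])
    return np.array(feats)

def convergence_slope(speaker_a, speaker_b, fs, win_s=10.0):
    """Slope of the inter-speaker feature distance over time.

    A negative slope means the speakers become acoustically more similar
    (convergence); a positive slope suggests divergence.
    """
    fa = window_features(speaker_a, fs, win_s)
    fb = window_features(speaker_b, fs, win_s)
    n = min(len(fa), len(fb))
    fa, fb = fa[:n], fb[:n]
    # z-normalise each feature dimension so distances are comparable across features
    stacked = np.vstack([fa, fb])
    mu, sd = stacked.mean(axis=0), stacked.std(axis=0) + 1e-12
    dist = np.linalg.norm((fa - mu) / sd - (fb - mu) / sd, axis=1)
    return np.polyfit(np.arange(n), dist, 1)[0]   # least-squares linear trend
```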

    Does training with amplitude modulated tones affect tone-vocoded speech perception?

    Temporal-envelope cues are essential for successful speech perception. We asked here whether training on stimuli containing temporal-envelope cues without speech content can improve the perception of spectrally-degraded (vocoded) speech in which the temporal envelope (but not the temporal fine structure) is mainly preserved. Two groups of listeners were trained on different amplitude-modulation (AM) based tasks, either AM detection or AM-rate discrimination (21 blocks of 60 trials over two days, 1260 trials in total; AM rates: 4, 8 and 16 Hz), while an additional control group did not undertake any training. Consonant identification in vocoded vowel-consonant-vowel stimuli was tested before and after training on the AM tasks (or at an equivalent time interval for the control group). Following training, only the trained groups showed a significant improvement in the perception of vocoded speech, but the improvement did not significantly differ from that observed for controls. Thus, we do not find convincing evidence that this amount of training with temporal-envelope cues without speech content provides a significant benefit for vocoded speech intelligibility. Alternative training regimens using vocoded speech along the linguistic hierarchy should be explored.
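    To make the two stimulus types concrete, the sketch below generates a sinusoidally amplitude-modulated (SAM) tone of the kind used in the AM tasks, and applies a crude tone vocoder that keeps each band's temporal envelope while discarding the fine structure. The band count, filter orders and envelope cutoff are assumptions for illustration (assuming a float waveform), not the exact stimuli of the study.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def sam_tone(fc, fm, depth, dur, fs):
    """Sinusoidally amplitude-modulated tone: carrier fc (Hz), modulation rate fm (Hz)."""
    t = np.arange(int(dur * fs)) / fs
    return (1.0 + depth * np.sin(2 * np.pi * fm * t)) * np.sin(2 * np.pi * fc * t)

def tone_vocode(signal, fs, n_bands=8, f_lo=100.0, f_hi=8000.0, env_cutoff=16.0):
    """Crude tone vocoder: keep each band's temporal envelope, replace the fine
    structure with a pure tone at the band centre."""
    edges = np.geomspace(f_lo, min(f_hi, 0.45 * fs), n_bands + 1)
    t = np.arange(len(signal)) / fs
    out = np.zeros(len(signal))
    for lo, hi in zip(edges[:-1], edges[1:]):
        b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        band = filtfilt(b, a, signal)
        env = np.abs(hilbert(band))                      # temporal envelope of the band
        b_e, a_e = butter(2, env_cutoff / (fs / 2), btype="low")
        env = filtfilt(b_e, a_e, env)                    # smooth envelope, fine structure gone
        out += env * np.sin(2 * np.pi * np.sqrt(lo * hi) * t)
    return out / (np.max(np.abs(out)) + 1e-12)

# Example: an 8 Hz AM tone on a 1 kHz carrier, as in the AM-detection task
stimulus = sam_tone(fc=1000.0, fm=8.0, depth=0.5, dur=1.0, fs=16000)
```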

    Towards an artificial therapy assistant: Measuring excessive stress from speech

    The measurement of (excessive) stress is still a challenging endeavor. Most tools rely on either introspection or expert opinion and are, therefore, often less reliable or a burden on the patient. An objective method could relieve these problems and, consequently, assist diagnostics. Speech was considered an excellent candidate for an objective, unobtrusive measure of emotion. True stress was successfully induced using two storytelling sessions performed by 25 patients suffering from a stress disorder. While reading either a happy or a sad story, patients reported different stress levels using the Subjective Unit of Distress (SUD). A linear regression model consisting of the high-frequency energy, pitch, and zero crossings of the speech signal was able to explain 70% of the variance in the subjectively reported stress. The results demonstrate the feasibility of an objective measurement of stress in speech. As such, the foundation for an Artificial Therapeutic Agent is laid, capable of assisting therapists through an objective measurement of experienced stress.
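    The reported model is a linear regression of SUD scores on high-frequency energy, pitch and zero crossings. The sketch below shows how such a model might be fitted, with a crude autocorrelation pitch estimate and an arbitrary 1 kHz cutoff for "high-frequency" energy; both are assumptions for illustration, not the authors' exact feature definitions.

```python
import numpy as np

def speech_stress_features(signal, fs, hf_cutoff=1000.0):
    """Utterance-level features: high-frequency energy ratio, median pitch, zero crossings/s."""
    x = signal.astype(float) - np.mean(signal)
    # Share of spectral energy above `hf_cutoff`
    spec = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    hf_energy = spec[freqs >= hf_cutoff].sum() / (spec.sum() + 1e-12)
    # Zero crossings per second
    zcr = (np.abs(np.diff(np.sign(x))) > 0).sum() / (len(x) / fs)
    # Very crude pitch: autocorrelation peak in the 75-400 Hz range, per 40 ms frame
    frame = int(0.040 * fs)
    pitches = []
    for start in range(0, len(x) - frame + 1, frame):
        f = x[start:start + frame]
        ac = np.correlate(f, f, mode="full")[frame - 1:]
        lo, hi = int(fs / 400), int(fs / 75)
        if hi < len(ac):
            pitches.append(fs / (lo + np.argmax(ac[lo:hi])))
    pitch = float(np.median(pitches)) if pitches else 0.0
    return np.array([hf_energy, pitch, zcr])

def fit_stress_model(recordings, fs, sud_scores):
    """Ordinary least squares: SUD ~ [1, hf_energy, pitch, zcr]; returns weights and R^2."""
    X = np.array([speech_stress_features(r, fs) for r in recordings])
    X = np.hstack([np.ones((len(X), 1)), X])
    y = np.asarray(sud_scores, dtype=float)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ w
    r2 = 1.0 - resid @ resid / (((y - y.mean()) ** 2).sum() + 1e-12)
    return w, r2
```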

    Tone-in-noise detection deficits in elderly patients with clinically normal hearing

    One of the most common complaints among the elderly is the inability to understand speech in noisy environments. In many cases, these deficits are due to age-related hearing loss; however, some elderly listeners who have difficulty hearing in noise have clinically normal pure-tone thresholds. While speech-in-noise testing is informative, it fails to identify the specific frequencies responsible for the speech processing deficit. Auditory neuropathy patients and animal models of hidden hearing loss suggest that tone-in-noise thresholds may provide frequency-specific information for patients who report difficulty but have normal thresholds in quiet. Therefore, we aimed to determine whether tone-in-noise thresholds could be a useful measure for detecting age-related hearing deficits in listeners with normal audiometric thresholds.
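    A tone-in-noise threshold measurement can be sketched as a pure tone added to broadband noise, with the tone level varied by an adaptive staircase. The 2-down/1-up rule below (tracking roughly the 70.7%-correct point) and the step size are common psychophysical conventions assumed here for illustration; the study's actual procedure may differ.

```python
import numpy as np

def tone_in_noise(f_tone, snr_db, dur=0.5, fs=44100, rng=None):
    """A pure tone embedded in Gaussian noise at a given tone-to-noise SNR (dB)."""
    rng = rng or np.random.default_rng()
    t = np.arange(int(dur * fs)) / fs
    noise = rng.standard_normal(len(t))
    noise /= np.sqrt(np.mean(noise ** 2))                      # unit-RMS noise
    tone = np.sqrt(2) * 10 ** (snr_db / 20) * np.sin(2 * np.pi * f_tone * t)  # tone RMS = 10^(snr/20)
    return tone + noise

def staircase_threshold(respond, start_snr=10.0, step_db=2.0, n_reversals=8):
    """2-down/1-up adaptive staircase; `respond(snr_db) -> bool` models the listener.

    Converges near the 70.7%-correct point; the threshold is the mean SNR at reversals.
    """
    snr, correct_in_row, direction = start_snr, 0, 0
    reversals = []
    while len(reversals) < n_reversals:
        if respond(snr):
            correct_in_row += 1
            if correct_in_row == 2:                            # two correct in a row -> harder
                correct_in_row = 0
                if direction == +1:
                    reversals.append(snr)
                direction = -1
                snr -= step_db
        else:
            correct_in_row = 0                                 # one wrong -> easier
            if direction == -1:
                reversals.append(snr)
            direction = +1
            snr += step_db
    return float(np.mean(reversals))
```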