431 research outputs found

    Systematic Review of Machine Learning Approaches for Detecting Developmental Stuttering

    A systematic review of the literature on statistical and machine learning schemes for identifying symptoms of developmental stuttering from audio recordings is reported. Twenty-seven papers met the quality standards that were set. Comparison of results across studies was not possible because training and testing data, model architectures and feature inputs varied from study to study. The limitations identified included: no indication of the intended application of the work, data selected for training and testing models in ways that could introduce bias, different datasets and different target symptom types across studies, feature inputs reported in inconsistent ways, and no standard way of reporting performance statistics. Recommendations are made about how these problems can be addressed in future work on this topic.

    Virtual Reality Technology and Speech Analysis for People Who Stutter

    Virtual reality (VR) technology provides an interactive computer-generated experience that artificially simulates real-life situations by creating a virtual environment that looks real and stimulates the user's feelings. During the past few years, the use of VR technology in clinical interventions for assessment, rehabilitation and treatment has received increased attention. Accordingly, many clinical studies and applications have been proposed in the field of mental health, including anxiety disorders. Stuttering is a speech disorder in which affected individuals have a problem with the flow of speech. This can manifest in the repetition and prolongation of words or phrases, as well as in involuntary silent pauses or blocks during which the individual is unable to produce sounds. Stuttering is often accompanied by social anxiety disorder as a secondary symptom, which requires separate treatment. In this study, we evaluated the effectiveness of using a VR environment as a medium for presenting speech training tasks. In addition, we evaluated the accuracy of a speech analyzer module in detecting stuttering events.

    The Infiltration of Hedging and Rising Terminal Pitch in an ASL to English Interpreted Presentation

    Interpreting from ASL to English requires an interpreter to accurately portray a deaf individual's character, personality, and language. This representation is accomplished through the language the interpreter uses and their way of speaking. When an interpreter's own language infiltrates an ASL to English interpretation, it can misrepresent the deaf individual and create a poor perception among a hearing audience. This research examines common weak-language occurrences of hedging (e.g., you know, kind of, so) and rising terminal pitch (ending clauses or statements with an upward inflection). A linguistic analysis of an ASL to English interpretation was conducted to find the influencing factors that led the participant to employ hedging strategies and/or rising terminal pitch within the interpretation. This research benefits current interpreters and interpreter training programs by providing opportunities for further training and intentional practice targeting the perceptual challenges caused by hedging and rising terminal pitch. The analysis allows future research to further assess the influencing factors behind an interpreter's language patterns and their impact on effectively representing deaf individuals through ASL to English interpreting.

    Automatic Framework to Aid Therapists to Diagnose Children who Stutter


    Filler Word Detection and Classification: A Dataset and Benchmark

    Filler words such as 'uh' or 'um' are sounds or words people use to signal they are pausing to think. Finding and removing filler words from recordings is a common and tedious task in media editing. Automatically detecting and classifying filler words could greatly aid in this task, but few studies have been published on this problem. A key reason is the absence of a dataset with annotated filler words for training and evaluation. In this work, we present a novel speech dataset, PodcastFillers, with 35K annotated filler words and 50K annotations of other sounds that commonly occur in podcasts, such as breaths, laughter, and word repetitions. We propose a pipeline that leverages voice activity detection (VAD) and automatic speech recognition (ASR) to detect filler candidates and a classifier to distinguish between filler word types. We evaluate our proposed pipeline on PodcastFillers, compare it to several baselines, and present a detailed ablation study. In particular, we evaluate the importance of using ASR and how it compares to a transcription-free approach resembling keyword spotting. We show that our pipeline obtains state-of-the-art results, and that leveraging ASR strongly outperforms a keyword spotting approach. We make PodcastFillers publicly available, and hope our work serves as a benchmark for future research. Comment: Submitted to Interspeech 202
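
    As a rough illustration of the pipeline described above (VAD/ASR candidate detection followed by a filler-type classifier), the Python sketch below wires the stages together with placeholder components. The detect_candidates stub, the feature dimensionality, and the toy training labels are assumptions made here for illustration; none of this is the PodcastFillers implementation.

```python
# A minimal sketch of a VAD + ASR + classifier filler-detection pipeline.
# The candidate-detection stage is a placeholder stub and the training data
# are random toy features: this is not the PodcastFillers implementation.
from dataclasses import dataclass
from typing import List

import numpy as np
from sklearn.linear_model import LogisticRegression

@dataclass
class Candidate:
    start: float            # seconds
    end: float
    features: np.ndarray    # acoustic features around the candidate word

def detect_candidates(audio: np.ndarray, sr: int) -> List[Candidate]:
    """Stand-in for the VAD/ASR stage: in a real pipeline, voiced regions whose
    transcript suggests a filler (or no word at all) would become candidates."""
    rng = np.random.default_rng(0)
    return [Candidate(t, t + 0.3, rng.normal(size=16)) for t in (1.2, 4.7, 9.0)]

# Toy labelled candidates standing in for an annotated corpus:
# 0 = "uh", 1 = "um", 2 = other sound (breath, laughter, repetition, ...).
rng = np.random.default_rng(1)
X_train = rng.normal(size=(300, 16))
y_train = rng.integers(0, 3, size=300)
classifier = LogisticRegression(max_iter=1000).fit(X_train, y_train)

audio, sr = np.zeros(16000 * 10), 16000     # dummy 10-second clip
for cand in detect_candidates(audio, sr):
    label = classifier.predict(cand.features[None, :])[0]
    print(f"{cand.start:5.1f}-{cand.end:4.1f} s -> class {label}")
```

    In a real system the stub would be replaced by an actual VAD and ASR front end, and the classifier would be trained on acoustic features extracted around each candidate word.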

    DETECTION OF FILLERS IN THE SPEECH BY PEOPLE WHO STUTTER

    Stuttering is a speech impediment and a very complex disorder. It is difficult to diagnose and treat, and its origin remains unknown despite the large number of studies in this field. Stuttering can take many forms, varies from person to person, and can change under the influence of external factors. Diagnosing and treating speech disorders such as stuttering requires from a speech therapist not only good professional preparation but also experience gained through research and practice in the field. The use of acoustic methods in combination with elements of artificial intelligence makes it possible to objectively assess the disorder, as well as to monitor the effects of treatment. The main aim of the study was to present an algorithm for the automatic recognition of filler disfluencies in the utterances of people who stutter, based on their parameterized features in the amplitude-frequency space. The work also provides example results demonstrating the feasibility and effectiveness of the approach. To verify and optimize the procedures, utterances of seven people who stutter, each 2 to 4 minutes long, were selected. Over 70% efficiency and predictability in the automatic detection of these disfluencies was achieved. The use of an automatic method in conjunction with therapy for a person who stutters provides the opportunity to objectively assess the disorder, as well as to evaluate the progress of therapy.
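
    The abstract does not spell out the parameterization, so the sketch below is only a guess at what amplitude-frequency features might look like in practice: frame energy and spectral centroid computed from a spectrogram, with a crude rule that flags sustained, low-centroid voiced segments as filler-like. The thresholds and the rule itself are assumptions, not the authors' algorithm.

```python
# Illustrative guess at amplitude-frequency filler detection: frame energy
# plus spectral centroid, with a crude rule that flags sustained, low-centroid
# voiced segments. Thresholds and the rule are assumptions for illustration.
import numpy as np
from scipy.signal import spectrogram

def filler_like_segments(x, fs, min_dur=0.25):
    f, t, S = spectrogram(x, fs, nperseg=512, noverlap=256)
    energy = S.sum(axis=0)
    centroid = (f[:, None] * S).sum(axis=0) / (energy + 1e-12)
    voiced = (energy > 0.1 * energy.max()) & (centroid < 1000.0)

    segments, start = [], None
    for i, flag in enumerate(voiced):
        if flag and start is None:
            start = t[i]
        elif not flag and start is not None:
            if t[i] - start >= min_dur:
                segments.append((start, t[i]))
            start = None
    if start is not None and t[-1] - start >= min_dur:
        segments.append((start, t[-1]))
    return segments

fs = 16000
tt = np.arange(0, 3.0, 1 / fs)
tone = 0.5 * np.sin(2 * np.pi * 220 * tt)   # stands in for a sustained "uh"
print(filler_like_segments(tone, fs))       # one long filler-like segment
```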

    Detecting stuttering events in transcripts of children’s speech

    Stuttering is a common problem in childhood that may persist into adulthood if not treated in its early stages. Techniques from spoken language understanding may be applied to provide automated diagnosis of stuttering from children's speech. The main challenges, however, lie in the lack of training data and the high dimensionality of this data. This study investigates the applicability of machine learning approaches for detecting stuttering events in transcripts. Two machine learning approaches were applied, namely HELM and conditional random fields (CRF). The performance of the two approaches is compared, and the effect of data augmentation is examined for both. Experimental results show that CRF outperforms HELM by 2.2% in the baseline experiments. Data augmentation helps improve system performance, especially for rarely occurring events. In addition to the annotated augmented data, this study also contributes annotated human transcriptions of real stuttered children's speech to help expand research in this field.
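
    For readers unfamiliar with CRF tagging of transcripts, the minimal sketch below (using the third-party sklearn-crfsuite package) shows the general shape of such a system: each token gets a feature dictionary and the CRF predicts a dysfluency tag per token. The tag set, features and toy transcripts are invented here for illustration and are not the paper's own setup.

```python
# Minimal sketch of token-level dysfluency tagging with a linear-chain CRF,
# using the third-party sklearn-crfsuite package. Tags, features and toy
# transcripts are invented for illustration, not the paper's own setup.
import sklearn_crfsuite

def token_features(tokens, i):
    word = tokens[i]
    prev = tokens[i - 1] if i > 0 else "<s>"
    return {
        "word": word.lower(),
        "prev": prev.lower(),
        "repeats_prev": word.lower() == prev.lower(),  # whole-word repetition cue
        "is_fragment": word.endswith("-"),             # part-word repetition cue
    }

# Toy annotated transcripts: O = fluent, REP = repetition, FRAG = part-word.
sentences = [["I", "I", "want", "to", "go"],
             ["the", "b-", "b-", "ball", "is", "red"]]
tags = [["O", "REP", "O", "O", "O"],
        ["O", "FRAG", "FRAG", "O", "O", "O"]]

X = [[token_features(s, i) for i in range(len(s))] for s in sentences]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X, tags)
print(crf.predict(X))
```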

    Bridging the gap in internet treatments for mental health: A fully automated online cognitive behaviour therapy for social anxiety for those who stutter.

    Introduction: CBTpsych.com is a fully functional intervention that aims to bridge the gap between Eliza (Weizenbaum, 1966) and modern internet treatments for anxiety disorders (Helgadottir, Menzies, Onslow, Packman & O'Brien, 2009a). A Phase I trial demonstrated that two participants no longer met the diagnosis of social phobia on the DSM-IV and ICD-10 after being treated by CBTpsych.com. The quality of the interaction appeared to be similar to face-to-face therapy. The automated techniques were successful in engaging the participants and in encouraging them to log on regularly and complete the treatment (Helgadottir, Menzies, Onslow, Packman & O'Brien, 2009b). The current study is a Phase II trial targeting social anxiety in stuttering.
    Method: File audit data were gathered from 10 years of experience in the psychological management of social anxiety and stuttering (St Clare et al., 2008). These data were used to create a fully automated, human-like intervention using algorithms alone. CBTpsych.com creates cognitive restructuring exercises, behavioural experiments, and negative thoughts checklists for the automated treatment program. Furthermore, other features such as imagery rescripting were incorporated to adhere to the Clark and Wells (1995) model of social anxiety treatment. Eighteen participants presenting for treatment of stuttering with social anxiety at the Australian Stuttering Research Centre (ASRC) were offered 5 months' access to the "computer psychologist"; 16 participants started using the computer program. The treatment did not involve any contact with clinical psychologists at the ASRC.
    Results: Post-treatment analysis revealed that of the 16 participants who started using the program, 78% no longer met the criteria for social phobia. Furthermore, none of the 10 participants who completed all 7 sections of the computer program within 5 months had a social phobia diagnosis at post-treatment. Consequently, in this study the "computer psychologist" was able to identify the specific problem areas participants reported and to design individualized formulations and tailored treatment components with corrective feedback. Furthermore, participants were able to engage in highly specific cognitive restructuring exercises and, with the help of the computer psychologist, managed to build behavioural experiments to test out unhelpful cognitions. Data from this Phase II trial suggest that the computerised treatment protocol can lead to substantial reductions in anxiety and avoidance, and improve mood and quality of life as measured by the FNE, SPAI, BDI-II, UTBAS, SASS and OASES.
    Conclusions: The computer psychologist motivated people both to log on regularly (with use of an automated email process) and to complete the program. It engaged clients in detailed cognitive restructuring work, created successful behavioural experiments for exposure, and produced promising preliminary results. However, these results need to be tested in a randomized controlled trial to establish efficacy. An RCT is currently under way. The preliminary findings presented here suggest that it would be viable to develop this type of treatment intervention for other anxiety disorders. Limitations of this approach and suggestions for future research are discussed.

    A Stutter Seldom Comes Alone -- Cross-Corpus Stuttering Detection as a Multi-label Problem

    Most stuttering detection and classification research has viewed stuttering as a multi-class classification problem or as a binary detection task for each dysfluency type; however, this does not match the nature of stuttering, in which one dysfluency seldom comes alone but rather co-occurs with others. This paper explores multi-language and cross-corpus end-to-end stuttering detection as a multi-label problem using a modified wav2vec 2.0 system with an attention-based classification head and multi-task learning. We evaluate the method using combinations of three datasets containing English and German stuttered speech, one of which contains speech modified by fluency shaping. The experimental results and an error analysis show that multi-label stuttering detection systems trained on cross-corpus and multi-language data achieve competitive results, but performance on samples with multiple labels stays below overall detection results. Comment: Accepted for presentation at Interspeech 2023. arXiv admin note: substantial text overlap with arXiv:2210.1598
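
    A hedged sketch of what such a multi-label setup can look like is given below: a wav2vec 2.0 encoder (randomly initialised here via Hugging Face transformers rather than the authors' fine-tuned checkpoint), attention pooling over frames, and one sigmoid output per dysfluency type trained with binary cross-entropy so that co-occurring labels are possible. The label set, pooling design and hyperparameters are illustrative assumptions, and the paper's multi-task objective is omitted.

```python
# Hedged sketch of multi-label stuttering detection on a wav2vec 2.0 encoder:
# attention pooling plus one logit per dysfluency type, trained with BCE so a
# clip can carry several labels. Label set, pooling design and hyperparameters
# are assumptions, not the authors' exact architecture.
import torch
import torch.nn as nn
from transformers import Wav2Vec2Config, Wav2Vec2Model

LABELS = ["block", "prolongation", "sound_repetition", "word_repetition", "interjection"]

class MultiLabelStutterNet(nn.Module):
    def __init__(self, n_labels: int = len(LABELS)):
        super().__init__()
        self.encoder = Wav2Vec2Model(Wav2Vec2Config())  # swap in a pretrained model in practice
        hidden = self.encoder.config.hidden_size
        self.attention = nn.Linear(hidden, 1)           # frame-level attention scores
        self.head = nn.Linear(hidden, n_labels)         # one logit per dysfluency type

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        frames = self.encoder(input_values=waveform).last_hidden_state   # (B, T, hidden)
        weights = torch.softmax(self.attention(frames), dim=1)           # (B, T, 1)
        pooled = (weights * frames).sum(dim=1)                           # (B, hidden)
        return self.head(pooled)                                         # (B, n_labels)

model = MultiLabelStutterNet()
clips = torch.randn(2, 3 * 16000)                 # two dummy 3-second clips at 16 kHz
targets = torch.tensor([[1., 0., 1., 0., 0.],     # a clip can carry several labels
                        [0., 1., 0., 0., 1.]])
loss = nn.BCEWithLogitsLoss()(model(clips), targets)
loss.backward()
print(loss.item())
```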