90 research outputs found

    ELO-SPHERES intelligibility prediction model for the Clarity Prediction Challenge 2022

    This paper describes and evaluates the ELO-SPHERES project sentence intelligibility model for the Clarity Prediction Challenge 2022. The aim of the model is to predict the intelligibility of enhanced speech to hearing-impaired listeners. Inputs to the model are binaural processed audio of short sentences generated in a simulated noisy and reverberant environment, together with the original source audio. The output of the model is a prediction of the intelligibility of each sentence in terms of percentage words correct for a known hearing-impaired listener characterized by a pure-tone audiogram. Models are evaluated in terms of the root mean squared error of prediction. We approached this problem in three stages: (i) evaluation of the influences of the scene metadata on scores, (ii) construction of classifiers for estimation of scene metadata from audio, and (iii) training a non-linear regression model on the challenge data and evaluation using 5-fold cross validation. On the test data, a baseline system using only the standard short-time objective intelligibility metric on the better ear achieved an RMS prediction error of 27%, while our model, which also took into account given and estimated scene data, achieved an RMS error of 22%

    Automated Voice Pathology Discrimination from Continuous Speech Benefits from Analysis by Phonetic Context

    In contrast to previous studies that look only at discriminating pathological voice from the normal voice, in this study we focus on the discrimination between cases of spasmodic dysphonia (SD) and vocal fold palsy (VP) using automated analysis of speech recordings. The hypothesis is that discrimination will be enhanced by studying continuous speech, since the different pathologies are likely to have different effects in different phonetic contexts. We collected audio recordings of isolated vowels and of a read passage from 60 patients diagnosed with SD (N=38) or VP (N=22). Baseline classifiers on features extracted from the recordings taken as a whole gave a cross-validated unweighted average recall of up to 75% for discriminating the two pathologies. We used an automated method to divide the read passage into phone-labelled regions and built classifiers for each phone. Results show that the discriminability of the pathologies varied with phonetic context as predicted. Since different phone contexts provide different information about the pathologies, classification is improved by fusing phone predictions, to achieve a classification accuracy of 83%. The work has implications for the differential diagnosis of voice pathologies and contributes to a better understanding of their impact on speech
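The abstract above reports unweighted average recall (UAR) rather than plain accuracy, which matters because the two groups are unequal in size (38 SD, 22 VP). The following is a minimal sketch of the metric, not the authors' code: the function name `unweighted_average_recall` and the predictions are hypothetical, and only the class sizes come from the abstract. It shows that a classifier labelling every patient as SD scores about 63% accuracy but only 50% UAR.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of the per-class recalls, so each class counts equally."""
    hits = defaultdict(int)    # correctly predicted items per true class
    totals = defaultdict(int)  # number of items per true class
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Class sizes taken from the abstract; the predictions are invented
# purely to illustrate a degenerate "always SD" classifier.
y_true = ["SD"] * 38 + ["VP"] * 22
y_pred = ["SD"] * 60

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
uar = unweighted_average_recall(y_true, y_pred)
print(f"accuracy = {accuracy:.3f}, UAR = {uar:.3f}")
```

UAR is the same quantity as macro-averaged recall, so a chance-level result sits at 50% for two classes regardless of class imbalance.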

    Avatar therapy for persecutory auditory hallucinations: What is it and how does it work?

    We have developed a novel therapy based on a computer program, which enables the patient to create an avatar of the entity, human or non-human, which they believe is persecuting them. The therapist encourages the patient to enter into a dialogue with their avatar, and is able to use the program to change the avatar so that it comes under the patient's control over the course of six 30-min sessions and alters from being abusive to becoming friendly and supportive. The therapy was evaluated in a randomised controlled trial with a partial crossover design. One group went straight into the therapy arm: "immediate therapy". The other continued with standard clinical care for 7 weeks then crossed over into Avatar therapy: "delayed therapy". There was a significant reduction in the frequency and intensity of the voices and in their omnipotence and malevolence. Several individuals had a dramatic response, their voices ceasing completely after a few sessions of the therapy. The average effect size of the therapy was 0.8. We discuss the possible psychological mechanisms for the success of Avatar therapy and the implications for the origins of persecutory voices

    An Utterance Verification System for Word Naming Therapy in Aphasia

    Anomia (word finding difficulties) is the hallmark of aphasia, an acquired language disorder most commonly caused by stroke. Assessment of speech performance using picture naming tasks is therefore a key method for identification of the disorder and monitoring patients’ response to treatment interventions. Currently, this assessment is conducted manually by speech and language therapists (SLT). Surprisingly, despite advancements in ASR and artificial intelligence with technologies like deep learning, research on developing automated systems for this task has been scarce. Here we present an utterance verification system incorporating a deep learning element that classifies ‘correct’/‘incorrect’ naming attempts from aphasic stroke patients. When tested on 8 native British-English speaking aphasics the system’s performance accuracy ranged from 83.6% to 93.6%, with a 10-fold cross-validation mean of 89.5%. This performance was not only significantly better than one of the leading commercially available ASRs (Google speech-to-text service) but also comparable in some instances with two independent SLT ratings for the same dataset

    NUVA: A Naming Utterance Verifier for Aphasia Treatment

    Anomia (word-finding difficulties) is the hallmark of aphasia, an acquired language disorder most commonly caused by stroke. Assessment of speech performance using picture naming tasks is a key method for both diagnosis and monitoring of responses to treatment interventions by people with aphasia (PWA). Currently, this assessment is conducted manually by speech and language therapists (SLT). Surprisingly, despite advancements in automatic speech recognition (ASR) and artificial intelligence with technologies like deep learning, research on developing automated systems for this task has been scarce. Here we present NUVA, an utterance verification system incorporating a deep learning element that classifies 'correct' versus 'incorrect' naming attempts from aphasic stroke patients. When tested on eight native British-English speaking PWA the system's performance accuracy ranged from 83.6% to 93.6%, with a 10-fold cross-validation mean of 89.5%. This performance was not only significantly better than a baseline created for this study using one of the leading commercially available ASRs (Google speech-to-text service) but also comparable in some instances with two independent SLT ratings for the same dataset

    Issues for eHealth in Psychiatry: Results of an Expert Survey

    Background: Technology has changed the landscape in which psychiatry operates. Effective, evidence-based treatments for mental health care are now available at the fingertips of anyone with Internet access. However, technological solutions for mental health are not necessarily sought by consumers nor recommended by clinicians. Objective: The objectives of this study are to identify and discuss the barriers to introducing eHealth technology-supported interventions within mental health. Methods: An interactive polling tool was used to ask "In this brave new world, what are the key issues that need to be addressed to improve mental health (using technology)?" Respondents were the multidisciplinary attendees of the "Humans and Machines: A Quest for Better Mental Health" conference, held in Sydney, Australia, in 2016. Responses were categorized into 10 key issues using team-based qualitative analysis. Results: A total of 155 responses to the question were received from 66 audience members. Responses were categorized into 10 issues and ordered by importance: access to care, integration and collaboration, education and awareness, mental health stigma, data privacy, trust, understanding and assessment of mental health, government and policy, optimal design, and engagement. In this paper, each of the 10 issues is outlined, and potential solutions are discussed. Many of the issues were interrelated, having implications for other key areas identified. Conclusions: As many of the issues identified directly related to barriers to care, priority should be given to addressing these issues that are common across mental health delivery.
Despite new challenges raised by technology, technology-supported mental health interventions represent a tremendous opportunity to address in a timely way these major concerns and improve the receipt of effective, evidence-based therapy by those in need. This study is supported by a grant from the National Health and Medical Research Council (NHMRC) and forms part of research conducted by the NHMRC Centre for Research Excellence in Suicide Prevention (CRESP; APP1042580). Additional support for the conference was provided by UNSW Brain Sciences. JN is supported by an Australian Postgraduate Award, ML is supported by a Society of Mental Health Research 2015 Early Career Research Award, and PJB is supported by NHMRC Fellowship 1083311

    The voice characterisation checklist: psychometric properties of a brief clinical assessment of voices as social agents

    Aim: There is growing interest in tailoring psychological interventions for distressing voices and a need for reliable tools to assess phenomenological features which might influence treatment response. This study examines the reliability and internal consistency of the Voice Characterisation Checklist (VoCC), a novel 10-item tool which assesses degree of voice characterisation, identified as relevant to a new wave of relational approaches. Methods: The sample comprised participants experiencing distressing voices, recruited at baseline on the AVATAR2 trial between January 2021 and July 2022 (n = 170). Inter-rater reliability (IRR) and internal consistency analyses (Cronbach’s alpha) were conducted. Results: The majority of participants reported some degree of voice personification (94%) with high endorsement of voices as distinct auditory experiences (87%) with basic attributes of gender and age (82%). While most identified a voice intention (75%) and personality (76%), attribution of mental states (35%) to the voice (‘What are they thinking?’) and a known historical relationship (36%) were less common. The internal consistency of the VoCC was acceptable (10 items, α = 0.71). IRR analysis indicated acceptable to excellent reliability at the item-level for 9/10 items and moderate agreement between raters’ global (binary) classification of more vs. less highly characterised voices, κ = 0.549 (95% CI, 0.240–0.859), p < 0.05. Conclusion: The VoCC is a reliable and internally consistent tool for assessing voice characterisation and will be used to test whether voice characterisation moderates treatment outcome to AVATAR therapy. There is potential wider utility within clinical trials of other relational therapies as well as routine clinical practice
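The internal-consistency figure quoted above (α = 0.71 over 10 items) is a Cronbach's alpha. As a minimal sketch of how that statistic is computed, the snippet below applies the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores). The function name `cronbach_alpha` and the toy ratings are hypothetical; the abstract does not state how VoCC items are scored, so binary items are assumed here purely for illustration.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented toy data: 5 respondents x 3 binary items (not VoCC data).
toy = [
    [1, 1, 0],
    [0, 0, 0],
    [1, 1, 1],
    [1, 0, 1],
    [0, 0, 1],
]
print(f"alpha = {cronbach_alpha(toy):.2f}")
```

Alpha approaches 1 when items rise and fall together across respondents, which is why it is read as a measure of how consistently the items tap one construct.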

    Pairwise Correlation Analysis of the Alzheimer’s Disease Neuroimaging Initiative (ADNI) Dataset Reveals Significant Feature Correlation

    The Alzheimer’s Disease Neuroimaging Initiative (ADNI) contains extensive patient measurements (e.g., magnetic resonance imaging [MRI], biometrics, RNA expression, etc.) from Alzheimer’s disease (AD) cases and controls that have recently been used by machine learning algorithms to evaluate AD onset and progression. While using a variety of biomarkers is essential to AD research, highly correlated input features can significantly decrease machine learning model generalizability and performance. Additionally, redundant features unnecessarily increase computational time and resources necessary to train predictive models. Therefore, we used 49,288 biomarkers and 793,600 extracted MRI features to assess feature correlation within the ADNI dataset to determine the extent to which this issue might impact large scale analyses using these data. We found that 93.457% of biomarkers, 92.549% of the gene expression values, and 100% of MRI features were strongly correlated with at least one other feature in ADNI based on our Bonferroni corrected α (p-value ≤ 1.40754 × 10⁻¹³). We provide a comprehensive mapping of all ADNI biomarkers to highly correlated features within the dataset. Additionally, we show that significant correlation within the ADNI dataset should be resolved before performing bulk data analyses, and we provide recommendations to address these issues. We anticipate that these recommendations and resources will help guide researchers utilizing the ADNI dataset to increase model performance and reduce the cost and complexity of their analyses
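The Bonferroni-corrected threshold quoted in this abstract can be reproduced with simple arithmetic under two assumptions that the abstract does not state explicitly: a family-wise α of 0.05, and one test for every unordered pair among the 49,288 + 793,600 features. The sketch below is an illustrative reading of the figure, not the authors' code.

```python
from math import comb

n_biomarkers = 49_288   # from the abstract
n_mri = 793_600         # from the abstract
n_features = n_biomarkers + n_mri

# Assumption: every unordered pair of features is tested once.
n_pairs = comb(n_features, 2)

# Assumption: conventional family-wise error rate of 0.05.
alpha = 0.05
threshold = alpha / n_pairs

print(f"{n_pairs:,} pairwise tests")
print(f"Bonferroni threshold = {threshold:.5e}")
```

Under these assumptions the threshold comes out at 1.40754e-13, matching the value reported in the abstract.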

    The Voice Characterisation Checklist: Psychometric Properties of a Brief Clinical Assessment of Voices as Social Agents

    Aim: There is growing interest in tailoring psychological interventions for distressing voices and a need for reliable tools to assess phenomenological features which might influence treatment response. This study examines the reliability and internal consistency of the Voice Characterisation Checklist (VoCC), a novel 10-item tool which assesses degree of voice characterisation, identified as relevant to a new wave of relational approaches. Methods: The sample comprised participants experiencing distressing voices, recruited at baseline on the AVATAR2 trial between January 2021 and July 2022 (n = 170). Inter-rater reliability (IRR) and internal consistency analyses (Cronbach’s alpha) were conducted. Results: The majority of participants reported some degree of voice personification (94%) with high endorsement of voices as distinct auditory experiences (87%) with basic attributes of gender and age (82%). While most identified a voice intention (75%) and personality (76%), attribution of mental states (35%) to the voice (‘What are they thinking?’) and a known historical relationship (36%) were less common. The internal consistency of the VoCC was acceptable (10 items, α = 0.71). IRR analysis indicated acceptable to excellent reliability at the item-level for 9/10 items and moderate agreement between raters’ global (binary) classification of more vs. less highly characterised voices, κ = 0.549 (95% CI, 0.240–0.859), p < 0.05. Conclusion: The VoCC is a reliable and internally consistent tool for assessing voice characterisation and will be used to test whether voice characterisation moderates treatment outcome to AVATAR therapy. There is potential wider utility within clinical trials of other relational therapies as well as routine clinical practice

    A trial protocol for the effectiveness of digital interventions for preventing depression in adolescents: The Future Proofing Study

    Background: Depression frequently first emerges during adolescence, and one in five young people will experience an episode of depression by the age of 18 years. Despite advances in treatment, there has been limited progress in addressing the burden at a population level. Accordingly, there has been growing interest in prevention approaches as an additional pathway to address depression. Depression can be prevented using evidence-based psychological programmes. However, barriers to implementing and accessing these programmes remain, typically reflecting a requirement for delivery by clinical experts and high associated delivery costs. Digital technologies, specifically smartphones, are now considered a key strategy to overcome the barriers inhibiting access to mental health programmes. The Future Proofing Study is a large-scale school-based trial investigating whether cognitive behaviour therapies (CBT) delivered by smartphone application can prevent depression. Methods: A randomised controlled trial targeting up to 10,000 Year 8 Australian secondary school students will be conducted. In Stage I, schools will be randomised at the cluster level either to receive the CBT intervention app (SPARX) or to a non-active control group comparator. The primary outcome will be symptoms of depression, and secondary outcomes include psychological distress, anxiety and insomnia. At the 12-month follow-up, participants in the intervention arm with elevated depressive symptoms will participate in an individual-level randomised controlled trial (Stage II) and be randomised to receive a second CBT app which targets sleep difficulties (Sleep Ninja) or a control condition. Assessments will occur post intervention (both trial stages) and at 6, 12, 24, 36, 48 and 60 months post baseline. Primary analyses will use an intention-to-treat approach and compare changes in symptoms from baseline to follow-up relative to the control group using mixed-effect models. 
Discussion: This is the first trial testing the effectiveness of smartphone apps delivered to school students to prevent depression at scale. Results from this trial will provide much-needed insight into the feasibility of this approach. They stand to inform policy and commission decisions concerning if and how such programmes should be deployed in school-based settings in Australia and beyond