2,017 research outputs found

    Increase Apparent Public Speaking Fluency By Speech Augmentation

    Fluent, confident speech is desirable to every speaker, but delivering speech professionally requires a great deal of experience and practice. In this paper, we propose a speech stream manipulation system that can help non-professional speakers produce fluent, professional-sounding speech, in turn contributing to better listener engagement and comprehension. We achieve this by manipulating the disfluencies in human speech, such as the sounds 'uh' and 'um', filler words and awkwardly long silences. Given any unrehearsed speech, we segment and silence the filled pauses and adjust the duration of the imposed silence, as well as of other long ('disfluent') pauses, using a predictive model learned from a professional speech dataset. Finally, we output an audio stream in which the speaker sounds more fluent, confident and practiced than in the original recording. According to our quantitative evaluation, we significantly increase the fluency of speech by reducing the rate of pauses and fillers.

    Improving the Applicability of AI for Psychiatric Applications through Human-in-the-loop Methodologies

    Objectives: Machine learning (ML) and natural language processing have great potential to improve efficiency and accuracy in diagnosis, treatment recommendations, predictive interventions, and scarce resource allocation within psychiatry. Researchers often conceptualize such an approach as operating in isolation without much need for human involvement, yet it remains crucial to harness human-in-the-loop practices when developing and implementing such techniques, as their absence may be catastrophic. We advocate for building ML-based technologies that collaborate with experts within psychiatry in all stages of implementation and use to increase model performance while simultaneously increasing the practicality, robustness, and reliability of the process. Methods: We showcase pitfalls of the traditional ML framework and explain how it can be improved with human-in-the-loop techniques. Specifically, we applied active learning strategies to the automatic scoring of a story recall task and compared the results to a traditional approach. Results: Human-in-the-loop methodologies supplied a greater understanding of where the model was least confident or had knowledge gaps during training. As compared to the traditional framework, less than half of the training data were needed to reach a given accuracy. Conclusions: Human-in-the-loop ML is an approach to data collection and model creation that harnesses active learning to select the most critical data needed to increase a model’s accuracy and generalizability more efficiently than classic random sampling would otherwise allow. Such techniques may additionally operate as safeguards from spurious predictions and can aid in decreasing disparities that artificial intelligence systems otherwise propagate.
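
    The abstract above does not say which active learning strategy was used. As an illustration only, the following sketch shows least-confidence sampling, one common strategy: the items whose top predicted class probability is lowest are sent to a human expert for labeling. The function name and the example probabilities are hypothetical.

```python
from typing import List, Sequence


def least_confident(probs: Sequence[Sequence[float]], k: int) -> List[int]:
    """Return indices of the k unlabeled items the model is least sure about.

    probs[i] holds the predicted class probabilities for item i.
    Confidence is taken as the highest class probability (an assumption;
    entropy or margin sampling are equally valid alternatives).
    """
    confidence = [max(p) for p in probs]
    ranked = sorted(range(len(probs)), key=lambda i: confidence[i])
    return ranked[:k]


# Hypothetical model outputs for four unlabeled story-recall responses.
pool_probs = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.70, 0.30],
    [0.51, 0.49],
]

# Items 3 and 1 have the lowest top probability, so they are queried first.
query = least_confident(pool_probs, k=2)
print(query)
```

    In a full loop, the queried items would be labeled by an expert, added to the training set, and the model retrained before the next round of selection.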

    Distinguishing between True and False Stories using various Linguistic Features

    This paper analyzes what linguistic features differentiate true and false stories written in Hebrew. To do so, we have defined four feature sets containing 145 features: POS-tags, quantitative, repetition, and special expressions. The examined corpus contains stories that were composed by 48 native Hebrew speakers who were asked to tell both false and true stories. Classification experiments on all possible combinations of these four feature sets using five supervised machine learning methods have been applied. The Part of Speech (POS) set was superior to all others and has been found as a key component. The best accuracy result (89.6%) has been achieved by a combination of sixteen POS-tags and one quantitative feature.
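
    To make the size of that experiment grid concrete, the sketch below enumerates every non-empty combination of the four feature sets named in the abstract. The identifiers are hypothetical, and pairing each combination with each of the five learning methods is an assumption about how the experiments were laid out.

```python
from itertools import combinations

# The four feature sets named in the abstract (identifiers are illustrative).
FEATURE_SETS = ["pos_tags", "quantitative", "repetition", "special_expressions"]
NUM_METHODS = 5  # five supervised learning methods, per the abstract


def all_feature_set_combinations(sets):
    """Return every non-empty subset of the given feature sets."""
    return [
        combo
        for size in range(1, len(sets) + 1)
        for combo in combinations(sets, size)
    ]


combos = all_feature_set_combinations(FEATURE_SETS)
print(len(combos))                # 2**4 - 1 = 15 combinations
print(len(combos) * NUM_METHODS)  # 75 runs if each method is tried on each
```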

    Modeling Incoherent Discourse in Non-Affective Psychosis

    Background: Computational linguistic methodology allows quantification of speech abnormalities in non-affective psychosis. For this patient group, incoherent speech has long been described as a symptom of formal thought disorder. Our study is an interdisciplinary attempt at developing a model of incoherence in non-affective psychosis, informed by computational linguistic methodology as well as psychiatric research, which both conceptualize incoherence as associative loosening. The primary aim of this pilot study was methodological: to validate the model against clinical data and reduce bias in automated coherence analysis. Methods: Speech samples were obtained from patients with a diagnosis of schizophrenia or schizoaffective disorder, who were divided into two groups of n = 20 subjects each, based on different clinical ratings of positive formal thought disorder, and n = 20 healthy control subjects. Results: Coherence metrics that were automatically derived from interview transcripts significantly predicted clinical ratings of thought disorder. Significant results from multinomial regression analysis revealed that group membership (controls vs. patients with vs. without formal thought disorder) could be predicted based on automated coherence analysis when bias was considered. Further improvement of the regression model was reached by including variables that psychiatric research has shown to inform clinical diagnostics of positive formal thought disorder. Conclusions: Automated coherence analysis may capture different features of incoherent speech than clinical ratings of formal thought disorder. Models of incoherence in non-affective psychosis should include automatically derived coherence metrics as well as lexical and syntactic features that influence the comprehensibility of speech

    Language production impairments in patients with a first episode of psychosis

    Language production has often been described as impaired in psychiatric diseases such as in psychosis. Nevertheless, little is known about the characteristics of linguistic difficulties and their relation with other cognitive domains in patients with a first episode of psychosis (FEP), either affective or non-affective. To deepen our comprehension of linguistic profile in FEP, 133 patients with FEP (95 non-affective, FEP-NA; 38 affective, FEP-A) and 133 healthy controls (HC) were assessed with a narrative discourse task. Speech samples were systematically analyzed with a well-established multilevel procedure investigating both micro- (lexicon, morphology, syntax) and macro-linguistic (discourse coherence, pragmatics) levels of linguistic processing. Executive functioning and IQ were also evaluated. Both linguistic and neuropsychological measures were secondarily implemented with a machine learning approach in order to explore their predictive accuracy in classifying participants as FEP or HC. Compared to HC, FEP patients showed language production difficulty at both micro- and macro-linguistic levels. As for the former, FEP produced shorter and simpler sentences and fewer words per minute, along with a reduced number of lexical fillers, compared to HC. At the macro-linguistic level, FEP performance was impaired in local coherence, which was paired with a higher percentage of utterances with semantic errors. Linguistic measures were not correlated with any neuropsychological variables. No significant differences emerged between FEP-NA and FEP-A (p≥0.02, after Bonferroni correction). Machine learning analysis showed an accuracy of group prediction of 76.36% using language features only, with semantic variables being the most impactful. Such a percentage was enhanced when paired with clinical and neuropsychological variables. 
    Results confirm the presence of language production deficits already at the first episode of the illness, and this impairment is not related to other cognitive domains. The high accuracy obtained by the linguistic feature set in classifying groups supports the use of machine learning methods in neuroscience investigations.
