Are language production problems apparent in adults who no longer meet diagnostic criteria for attention-deficit/hyperactivity disorder?
In this study, we examined sentence production in a sample of adults (N = 21) who had had attention-deficit/hyperactivity disorder (ADHD) as children but as adults no longer met DSM-IV diagnostic criteria (APA, 2000). This "remitted" group was assessed on a sentence production task. On each trial, participants saw two objects and a verb. Their task was to construct a sentence using the objects as arguments of the verb. Results showed more ungrammatical and disfluent utterances with one particular type of verb (i.e., participles). In a second set of analyses, we compared the remitted group to both control participants and a "persistent" group, who had ADHD as children and as adults. Results showed that remitters were more likely than controls to produce ungrammatical utterances and to make repair disfluencies, and they patterned more similarly to ADHD participants. Conclusions focus on language output in remitted ADHD and the role of executive functions in language production.
Helping, I Mean Assessing Psychiatric Communication: An Application of Incremental Self-Repair Detection
18th SemDial Workshop on the Semantics and Pragmatics of Dialogue (DialWatt), 1-3 September 2014, Edinburgh, Scotland.
Self-repair is pervasive in dialogue, and models thereof have long been a focus of research, particularly for disfluency detection in speech recognition and spoken dialogue systems. However, the generality of such models across domains has received little attention. In this paper we investigate the application of an automatic incremental self-repair detection system, STIR, developed on the Switchboard corpus of telephone speech, to a new domain: psychiatric consultations. We find that word-level accuracy is reduced markedly by the differences in annotation schemes and transcription conventions between corpora, which has implications for the generalisability of all repair detection systems. However, overall rates of repair are detected accurately, promising a useful resource for clinical dialogue studies.
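To illustrate the structure such systems target (the reparandum followed by its repair), here is a minimal sketch of detecting simple repetition-type self-repairs; this toy heuristic is an assumption-laden stand-in, not the STIR system itself:

```python
def mark_repetition_repairs(tokens):
    """Mark simple repetition disfluencies of the form 'I I' or
    'to the to the': the first copy is labelled the reparandum,
    the second the repair. A toy heuristic, not the STIR system."""
    marked = []
    i = 0
    n = len(tokens)
    while i < n:
        found = False
        # try the longest repeated span first (up to 3 words)
        for w in (3, 2, 1):
            if i + 2 * w <= n and tokens[i:i + w] == tokens[i + w:i + 2 * w]:
                marked.append(("reparandum", tokens[i:i + w]))
                marked.append(("repair", tokens[i + w:i + 2 * w]))
                i += 2 * w
                found = True
                break
        if not found:
            marked.append(("fluent", [tokens[i]]))
            i += 1
    return marked

print(mark_repetition_repairs("I I mean to to the store".split()))
```

Real incremental detectors also handle substitutions and deletions ("to Boston, uh, to Denver") and must commit to labels word by word, which this batch sketch ignores.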
DISCO: A Large Scale Human Annotated Corpus for Disfluency Correction in Indo-European Languages
Disfluency correction (DC) is the process of removing disfluent elements like
fillers, repetitions and corrections from spoken utterances to create readable
and interpretable text. DC is a vital post-processing step applied to Automatic
Speech Recognition (ASR) outputs, before subsequent processing by downstream
language understanding tasks. Existing DC research has primarily focused on
English due to the unavailability of large-scale open-source datasets. Towards
the goal of multilingual disfluency correction, we present a high-quality
human-annotated DC corpus covering four important Indo-European languages:
English, Hindi, German and French. We provide extensive analysis of results of
state-of-the-art DC models across all four languages obtaining F1 scores of
97.55 (English), 94.29 (Hindi), 95.89 (German) and 92.97 (French). To
demonstrate the benefits of DC on downstream tasks, we show that DC leads to
5.65 points increase in BLEU scores on average when used in conjunction with a
state-of-the-art Machine Translation (MT) system. We release code to run our
experiments along with our annotated dataset here.
Comment: Accepted at EMNLP 2023 Findings
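As a concrete picture of what DC removes, here is a minimal rule-based sketch that drops a few common fillers and collapses immediate repetitions; the filler list is illustrative, and this baseline is far simpler than the neural DC models evaluated in the paper:

```python
# Illustrative single-word fillers; real filler inventories are
# language-specific and larger.
FILLERS = {"uh", "um", "well"}

def correct_disfluencies(utterance):
    """Rule-based disfluency correction: drop common fillers and
    collapse immediate word repetitions. A toy baseline, not the
    neural DC models evaluated in the paper."""
    tokens = utterance.lower().split()
    # remove single-word fillers
    tokens = [t for t in tokens if t not in FILLERS]
    # collapse immediate repetitions ("the the" -> "the")
    cleaned = []
    for t in tokens:
        if not cleaned or cleaned[-1] != t:
            cleaned.append(t)
    return " ".join(cleaned)

print(correct_disfluencies("well I I want uh want to to book a ticket"))
```

Note that such rules cannot recover corrections ("to Boston, I mean Denver"), which is why learned models are needed for the F1 scores reported above.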
Effects of Duration, Locality, and Surprisal in Speech Disfluency Prediction in English Spontaneous Speech
This study examines the role of two influential theories of language processing, Surprisal Theory and Dependency Locality Theory (DLT), in predicting disfluencies (fillers and reparandums) in the Switchboard corpus of English conversational speech. Using Generalized Linear Mixed Models for this task, we incorporate syntactic factors (DLT-inspired costs and syntactic surprisal) in addition to lexical surprisal and duration, thus going beyond the local lexical frequency and predictability used in previous work on modelling word durations in Switchboard speech. Our results indicate that, compared to fluent words, words preceding disfluencies tend to have lower lexical surprisal (hence higher activation levels) and lower syntactic complexity (low DLT costs and low syntactic surprisal, except for reparandums). Disfluencies tend to occur before upcoming difficulties, i.e., high lexical surprisal words (low activation levels) with high syntactic complexity (high DLT costs and high syntactic surprisal). Further, we see that reparandums behave much like disfluent fillers, with differences possibly arising because effects are present in the word choice of the reparandum, i.e., in the disfluency itself rather than in its surroundings. Moreover, words preceding disfluencies tend to be function words and have longer durations compared to their fluent counterparts, and word duration is a very effective predictor of disfluencies. Overall, speakers may be leveraging the differences in access between content and function words during planning as part of a mechanism to adapt to disfluencies while coordinating planning and articulation.
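The modelling setup can be sketched as a regression from duration, surprisal, and DLT-style cost features to a binary disfluency outcome. The sketch below uses synthetic data and a plain logistic regression fit by gradient ascent; the feature names, coefficients, and data are invented for illustration, and the paper's mixed models with speaker-level random effects are omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

def zscore(x):
    return (x - x.mean()) / x.std()

# Toy per-word predictors mirroring the study's feature set (synthetic):
# duration of the current word, plus lexical surprisal and a DLT-style
# integration cost of the upcoming word.
n = 500
duration = rng.normal(0.3, 0.1, n)
surprisal_next = rng.normal(8.0, 2.0, n)
dlt_cost_next = rng.poisson(2, n).astype(float)

# Synthetic labels: disfluencies are more likely before high-surprisal,
# high-cost words and after longer words (the direction reported above).
true_logit = (-1.5 + 0.4 * zscore(duration)
              + 0.8 * zscore(surprisal_next)
              + 0.5 * zscore(dlt_cost_next))
y = rng.random(n) < 1 / (1 + np.exp(-true_logit))

X = np.column_stack([np.ones(n), zscore(duration),
                     zscore(surprisal_next), zscore(dlt_cost_next)])

# Plain logistic regression fit by gradient ascent; a GLMM would add
# random intercepts per speaker on top of these fixed effects.
w = np.zeros(4)
for _ in range(3000):
    p = 1 / (1 + np.exp(-X @ w))
    w += 0.1 * X.T @ (y - p) / n

print(w)  # weights on the three predictors should come out positive
```

Standardizing the predictors, as here, also makes the fitted coefficients directly comparable in magnitude, which matters when arguing that duration is the strongest predictor.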
A novel multimodal dynamic fusion network for disfluency detection in spoken utterances
Disfluency, though originating from human spoken utterances, is primarily
studied as a uni-modal text-based Natural Language Processing (NLP) task. Based
on early-fusion and self-attention-based multimodal interaction between text
and acoustic modalities, in this paper, we propose a novel multimodal
architecture for disfluency detection from individual utterances. Our
architecture leverages a multimodal dynamic fusion network that adds minimal
parameters over an existing text encoder commonly used in prior art to leverage
the prosodic and acoustic cues hidden in speech. Through experiments, we show
that our proposed model achieves state-of-the-art results on the widely used
English Switchboard corpus for disfluency detection and outperforms prior
unimodal and multimodal systems in the literature by a significant margin. In
addition, we make
a thorough qualitative analysis and show that, unlike text-only systems, which
suffer from spurious correlations in the data, our system overcomes this
problem through additional cues from speech signals. We make all our code
publicly available on GitHub.
Comment: Submitted to ICASSP 2023. arXiv admin note: text overlap with
arXiv:2203.1679
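The fusion idea can be sketched schematically: concatenate a text embedding with acoustic features, gate the fused vector, and score it. Everything below (dimensions, the gating form, the random weights) is an invented illustration of early fusion in general, not the paper's dynamic fusion network:

```python
import numpy as np

rng = np.random.default_rng(1)

def fuse_and_score(text_emb, acoustic_feats, W_gate, W_out):
    """Gated early fusion of a text embedding with acoustic features
    (e.g. pitch/energy statistics), then a linear disfluency score.
    A schematic sketch, not the paper's architecture."""
    fused = np.concatenate([text_emb, acoustic_feats])
    gate = 1 / (1 + np.exp(-W_gate @ fused))   # per-dimension gate
    return W_out @ (gate * fused)              # scalar logit

d_text, d_ac = 8, 4
W_gate = rng.normal(size=(d_text + d_ac, d_text + d_ac)) * 0.1
W_out = rng.normal(size=d_text + d_ac) * 0.1

text_emb = rng.normal(size=d_text)       # stand-in for a text encoder output
acoustic_feats = rng.normal(size=d_ac)   # stand-in for prosodic features
score = fuse_and_score(text_emb, acoustic_feats, W_gate, W_out)
print(float(score))
```

The gate is one simple way to let the model downweight an unreliable modality per example, which is the intuition behind letting acoustic cues override spurious text-only correlations.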