375 research outputs found

    Zero-Shot Learning for Semantic Utterance Classification

    We propose a novel zero-shot learning method for semantic utterance classification (SUC). It learns a classifier $f: X \to Y$ for problems where none of the semantic categories $Y$ are present in the training set. The framework uncovers the link between categories and utterances using a semantic space. We show that this semantic space can be learned by deep neural networks trained on large amounts of search engine query log data. More precisely, we propose a novel method that can learn discriminative semantic features without supervision. It uses the zero-shot learning framework to guide the learning of the semantic features. We demonstrate the effectiveness of the zero-shot semantic learning algorithm on the SUC dataset collected by (Tur, 2012). Furthermore, we achieve state-of-the-art results by combining the semantic features with a supervised method.
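The idea of linking utterances to unseen categories through a shared semantic space can be illustrated with a minimal sketch. It is not the method from the abstract: the three-dimensional vectors are hand-made toy values standing in for learned embeddings, cosine similarity is an assumed choice of scoring function, and the names `cosine`, `zero_shot_classify`, and `CATEGORY_VECS` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def zero_shot_classify(utterance_vec, category_vecs):
    """Return the category whose embedding is most similar to the utterance embedding.

    No labelled utterances are needed for any category: only an embedding of
    the category itself (for example, of its name) in the same space.
    """
    return max(category_vecs, key=lambda name: cosine(utterance_vec, category_vecs[name]))


# Toy embeddings (assumed values, not learned from any data).
CATEGORY_VECS = {
    "restaurant": [0.9, 0.1, 0.0],
    "weather": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of an utterance such as "find me a place to eat".
utterance_vec = [0.8, 0.3, 0.1]
predicted = zero_shot_classify(utterance_vec, CATEGORY_VECS)
print(predicted)
```

In a real system the embeddings would come from a trained model rather than being written by hand; the nearest-category rule is the only part this sketch is meant to show.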

    A Dialogue-Act Taxonomy for a Virtual Coach Designed to Improve the Life of Elderly

    This paper presents a dialogue act taxonomy designed for the development of a conversational agent for the elderly. The main goal of this conversational agent is to improve the user's quality of life by means of coaching sessions on different topics. In contrast to other approaches such as task-oriented dialogue systems and chit-chat implementations, the agent should display a proactive attitude, driving the conversation to reach a number of diverse coaching goals. Therefore, the main characteristic of the introduced dialogue act taxonomy is its capacity for supporting communication based on the GROW model for coaching. In addition, the taxonomy has a hierarchical structure between the tags and it is multimodal. We use the taxonomy to annotate a Spanish dialogue corpus collected from a group of elderly people. We also present a preliminary examination of the annotated corpus and discuss the multiple possibilities it presents for further research. The research presented in this paper is conducted as part of the project EMPATHIC that has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 769872. The authors would also like to thank the Basque Government for its support through the project IT-1244-19.

    Out-of-vocabulary spoken term detection

    Spoken term detection (STD) is a fundamental task for multimedia information retrieval. A major challenge faced by an STD system is the serious performance reduction when detecting out-of-vocabulary (OOV) terms. The difficulties arise not only from the absence of pronunciations for such terms in the system dictionaries, but also from intrinsic uncertainty in pronunciations, significant diversity in term properties and a high degree of weakness in acoustic and language modelling. To tackle the OOV issue, we first applied the joint-multigram model to predict pronunciations for OOV terms in a stochastic way. Based on this, we propose a stochastic pronunciation model that considers all possible pronunciations for OOV terms so that the high pronunciation uncertainty is compensated for. Furthermore, to deal with the diversity in term properties, we propose a term-dependent discriminative decision strategy, which employs discriminative models to integrate multiple informative factors and confidence measures into a classification probability, which gives rise to minimum decision cost. In addition, to address the weakness in acoustic and language modelling, we propose a direct posterior confidence measure which replaces the generative models with a discriminative model, such as a multi-layer perceptron (MLP), to obtain a robust confidence for OOV term detection. With these novel techniques, the STD performance on OOV terms was improved substantially and significantly in our experiments on meeting speech data.

    Characterizing Spoken Discourse in Individuals with Parkinson Disease Without Dementia

    Background: The effects of Parkinson disease (PD) on cognition, word retrieval, syntax, and speech/voice processes may interact to manifest uniquely in spoken language tasks. A handful of studies have explored spoken discourse production in PD and, while not ubiquitously, have reported a number of impairments including: reduced words per minute, reduced grammatical complexity, reduced informativeness, and increased verbal disruption. Methodological differences have impeded cross-study comparisons. As such, the profile of spoken language impairments in PD remains ambiguous. Method: A cross-genre, multi-level discourse analysis, prospective, cross-sectional between-groups study design was conducted with 19 PD participants (M_age = 70.74, M_UPDRS-III = 30.26) and 19 healthy controls (M_age = 68.16) without dementia. The extensive protocol included a battery of cognitive, language, and speech measures in addition to four discourse tasks. Two tasks each from two discourse genres (picture sequence description; story retelling) were collected. Discourse samples were analysed using both microlinguistic and macrostructural measures. Discourse variables were collapsed statistically to a primal set of variables used to distinguish the spoken discourse of PD vs. controls. Results: Participants with PD differed significantly from controls along a continuum of productivity, grammar, informativeness, and verbal disruption domains including total words F(1,36) = 3.87, p = .06; words/minute F(1,36) = 7.74, p = .01; % grammatical utterances F(1,36) = 11.92, p = .001; total CIUs (Correct Information Units) F(1,36) = 13.30, p = .001; % CIUs F(1,36) = 9.35, p = .004; CIUs/minute F(1,36) = 14.06, p = .001; and verbal disruptions/100 words F(1,36) = 3.87, p = .06 (α = .10). Discriminant function analyses showed that optimally weighted discourse variables discriminated the spoken discourse of PD vs. controls with 81.6% sensitivity and 86.8% specificity. For both discourse genres, discourse performance showed robust, positive correlations with global cognition. In PD (picture sequence description), more impaired discourse performance correlated significantly with more severe motor impairment, more advanced disease staging, and higher doses of PD medications. Conclusions: The spoken discourse in PD without dementia differs significantly and predictably from controls. Results have both research and clinical implications.

    Detecting emotions from speech using machine learning techniques

    D.Phil. (Electronic Engineering)