
    Towards Cognitive Dialog Systems


    Combining heterogeneous inputs for the development of adaptive and multimodal interaction systems

    In this paper we present a novel framework for the integration of visual sensor networks and speech-based interfaces. Our proposal follows the standard reference architecture in fusion systems (JDL) and combines different techniques related to Artificial Intelligence, Natural Language Processing and User Modeling to provide an enhanced interaction with its users. Firstly, the framework integrates a Cooperative Surveillance Multi-Agent System (CS-MAS), which includes several types of autonomous agents working in a coalition to track and make inferences on the positions of the targets. Secondly, enhanced conversational agents facilitate human-computer interaction by means of speech. Thirdly, a statistical methodology models the user's conversational behavior, which is learned from an initial corpus and improved with the knowledge acquired from successive interactions. A technique is proposed to facilitate the multimodal fusion of these information sources and to consider the result when deciding the next system action.

    This work was supported in part by Projects MEyC TEC2012-37832-C02-01, CICYT TEC2011-28626-C02-02, and CAM CONTEXTS S2009/TIC-1485.
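
    As an illustration of the statistical dialog-management idea this abstract describes (learn the next system action from a corpus of fused multimodal states, then keep refining with new interactions), the following minimal Python sketch may help. It is not the authors' implementation; the class, the fuse() helper, and all state and action names are hypothetical.

        # Minimal sketch of statistical next-action selection over a fused state.
        # Not the paper's system; all names here are hypothetical.
        from collections import Counter, defaultdict

        class NextActionModel:
            """Estimates the most likely action for a fused dialog state from
            (state, action) pairs, updated as new interactions arrive."""

            def __init__(self):
                self._counts = defaultdict(Counter)  # state key -> action counts

            def observe(self, state_key, action):
                # Learn from the initial corpus or from a live interaction.
                self._counts[state_key][action] += 1

            def next_action(self, state_key, default="ask_clarification"):
                # Pick the most frequent action seen for this fused state.
                actions = self._counts.get(state_key)
                return actions.most_common(1)[0][0] if actions else default

        def fuse(position, intent, asr_confidence):
            # Collapse heterogeneous inputs (CS-MAS target position, spoken
            # intent, recognizer confidence bucket) into one hashable key.
            bucket = "high" if asr_confidence >= 0.7 else "low"
            return (position, intent, bucket)

        model = NextActionModel()
        model.observe(fuse("lobby", "request_track", 0.9), "confirm_tracking")
        print(model.next_action(fuse("lobby", "request_track", 0.85)))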

    SCREEN: Learning a Flat Syntactic and Semantic Spoken Language Analysis Using Artificial Neural Networks

    In this paper, we describe a so-called screening approach for learning robust processing of spontaneously spoken language. A screening approach is a flat analysis which uses shallow sequences of category representations for analyzing an utterance at various syntactic, semantic and dialog levels. Rather than using a deeply structured symbolic analysis, we use a flat connectionist analysis. This screening approach aims at supporting speech and language processing by using (1) data-driven learning and (2) the robustness of connectionist networks. In order to test this approach, we have developed the SCREEN system, which is based on this new robust, learned and flat analysis. In this paper, we focus on a detailed description of SCREEN's architecture, the flat syntactic and semantic analysis, the interaction with a speech recognizer, and a detailed evaluation of its robustness under the influence of noisy or incomplete input. The main result of this paper is that flat representations allow more robust processing of spontaneous spoken language than deeply structured representations. In particular, we show how the fault tolerance and learning capability of connectionist networks can support a flat analysis to provide more robust spoken-language processing within an overall hybrid symbolic/connectionist framework.

    Comment: 51 pages, Postscript. To be published in Journal of Artificial Intelligence Research 6(1), 199
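
    To make the contrast with deep symbolic parsing concrete, here is a small Python sketch of the *flat* part of the idea only: an utterance maps to parallel, shallow category sequences instead of a parse tree. The lexicon and tag sets are invented for illustration, and the sketch deliberately omits the connectionist learning that SCREEN actually uses.

        # Flat analysis sketch: shallow category sequences, no parse tree.
        # Lexicon and tags are invented; SCREEN learns these with networks.
        SYNTACTIC = {"i": "PRON", "want": "VERB", "a": "DET", "flight": "NOUN"}
        SEMANTIC = {"i": "AGENT", "want": "DESIRE", "a": "NONE", "flight": "OBJECT"}

        def flat_analysis(utterance):
            words = utterance.lower().split()
            # Unknown words get a tag instead of breaking the analysis,
            # mirroring the robustness argument for flat representations.
            syn = [SYNTACTIC.get(w, "UNKNOWN") for w in words]
            sem = [SEMANTIC.get(w, "UNKNOWN") for w in words]
            return list(zip(words, syn, sem))

        for row in flat_analysis("I want a flight uh"):
            print(row)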

    A Satisfaction-based Model for Affect Recognition from Conversational Features in Spoken Dialog Systems

    Detecting user affect automatically during real-time conversation is the main challenge towards our greater aim of infusing social intelligence into a natural-language, mixed-initiative High-Fidelity (Hi-Fi) audio control spoken dialog agent. In recent years, studies on affect detection from voice have moved on to using realistic, non-acted data, which is subtler. However, subtler emotions are more challenging to perceive, as demonstrated in tasks such as labelling and machine prediction. This paper addresses part of this challenge by considering the role of user satisfaction ratings and of conversational/dialog features in discriminating contentment and frustration, two types of emotions that are known to be prevalent within spoken human-computer interaction. However, given the laboratory constraints, users might be positively biased when rating the system, indirectly making the reliability of the satisfaction data questionable. Machine learning experiments were conducted on two datasets, from users and from annotators, which were then compared in order to assess their reliability. Our results indicated that standard classifiers were significantly more successful in discriminating these emotions and their intensities (reflected by user satisfaction ratings) from annotator data than from user data. These results corroborated two points: first, that satisfaction data can be used directly as an alternative target variable to model affect, and that it can be predicted exclusively from dialog features; second, that this holds only when predicting these emotions from annotator data, suggesting that user bias does exist in a laboratory-led evaluation.
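
    The core experimental comparison (same features, two label sources) can be pictured with a short, hedged Python sketch using scikit-learn; the conversational features and the bias simulation below are synthetic stand-ins, not the paper's data.

        # Train one standard classifier on dialog features against two label
        # sources (user ratings vs. annotator labels) and compare accuracy.
        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        n = 200
        # Hypothetical features: turn count, barge-ins, ASR rejections.
        X = rng.normal(size=(n, 3))
        frustrated = (X[:, 1] + X[:, 2] > 0).astype(int)  # latent "truth"
        # Simulated positivity bias: users rate contentment ~30% of the time
        # regardless; annotators only mislabel ~10% of cases.
        user_y = np.where(rng.random(n) < 0.3, 0, frustrated)
        annotator_y = np.where(rng.random(n) < 0.1, 1 - frustrated, frustrated)

        clf = LogisticRegression()
        for name, y in [("user", user_y), ("annotator", annotator_y)]:
            acc = cross_val_score(clf, X, y, cv=5).mean()
            print(f"{name} labels: mean CV accuracy = {acc:.2f}")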

    A novel approach for data fusion and dialog management in user-adapted multimodal dialog systems

    Proceedings of: 17th International Conference on Information Fusion (FUSION 2014), Salamanca, Spain, 7-10 July 2014.

    Multimodal dialog systems have demonstrated a high potential for more flexible, usable and natural human-computer interaction. These improvements are highly dependent on the fusion and dialog management processes, which respectively integrate and interpret multimodal information and decide the next system response for the current dialog state. In this paper we propose to carry out the multimodal fusion and dialog management processes at the dialog level in a single step. To do this, we describe an approach based on a statistical model that takes the user's intention into account, generates a single representation obtained from the different input modalities and their confidence scores, and selects the next system action based on this representation. The paper also describes the practical application of the proposed approach to develop a multimodal dialog system providing travel and tourist information.

    This work was supported in part by Projects MINECO TEC2012-37832-C02-01, CICYT TEC2011-28626-C02-02, and CAM CONTEXTS (S2009/TIC-1485).
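
    A minimal Python sketch of the single-step idea, assuming a frame of slot hypotheses per modality: fuse by confidence, then decide on the fused frame directly. The slot names, threshold, and actions are illustrative assumptions, not the paper's statistical model.

        def fuse_and_decide(hypotheses):
            """hypotheses: list of (modality, slot, value, confidence)."""
            fused = {}
            for modality, slot, value, conf in hypotheses:
                # Keep the highest-confidence value per slot across modalities.
                if slot not in fused or conf > fused[slot][1]:
                    fused[slot] = (value, conf)
            # Decide in the same step: confirm weak slots, else answer.
            weak = [s for s, (_, c) in fused.items() if c < 0.6]
            action = ("confirm", weak[0]) if weak else ("provide_info", dict(fused))
            return fused, action

        hyps = [("speech", "destination", "Salamanca", 0.55),
                ("touch", "destination", "Salamanca", 0.95),
                ("speech", "date", "tomorrow", 0.80)]
        print(fuse_and_decide(hyps))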

    Fostering parent–child dialog through automated discussion suggestions

    The development of early literacy skills has been critically linked to a child’s later academic success. In particular, repeated studies have shown that reading aloud to children and providing opportunities for them to discuss the stories that they hear is of utmost importance to later academic success. CloudPrimer is a tablet-based interactive reading primer that aims to foster early literacy skills by supporting parents in shared reading with their children through user-targeted discussion topic suggestions. The tablet application records discussions between parents and children as they read a story and, in combination with a common-sense knowledge base, leverages this information to produce suggestions. Because of the unique challenges presented by our application, the suggestion generation relies on a novel topic modeling method that is based on semantic graph topology. We conducted a user study comparing suggestions generated by our approach against expert-crafted suggestions. Our results show that our system can successfully improve engagement and parent–child reading practices in the absence of a literacy expert’s tutoring.

    National Science Foundation (U.S.) (Award Number 1117584)
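
    One way to picture "topic modeling based on semantic graph topology" is the following hedged Python sketch: link concepts heard in the transcript through a toy common-sense relation list, then rank candidates by graph centrality. The relations and the choice of PageRank are assumptions for illustration; the paper's actual method is more involved.

        import networkx as nx

        transcript_concepts = ["dog", "bone", "park", "happy"]
        common_sense = [("dog", "bone"), ("dog", "park"), ("dog", "pet"),
                        ("park", "play"), ("play", "happy"), ("pet", "happy")]

        # Build the semantic graph and score nodes by centrality.
        G = nx.Graph()
        G.add_edges_from(common_sense)
        scores = nx.pagerank(G)
        # Prefer well-connected concepts the family has already touched on.
        topics = sorted(transcript_concepts,
                        key=lambda c: scores.get(c, 0), reverse=True)
        print("Suggested discussion topics:", topics[:2])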

    Processing and fusioning multiple heterogeneous information sources in multimodal dialog systems

    Proceedings of: 17th International Conference on Information Fusion (FUSION 2014), Salamanca, Spain, 7-10 July 2014.

    Context-aware dialog systems must be able to process very heterogeneous information sources and user input modes. In this paper we propose a method to fuse multimodal inputs into a unified representation. This representation allows the dialog manager of the system to find the best interaction strategy and to select the next system response. We show the applicability of our proposal by implementing a dialog system that considers spoken and tactile input, as well as information related to the context of the interaction with its users. Context information comprises the user's intention during the dialog and their emotional state (internal context), and the user's location (external context).

    This work was supported in part by Projects MINECO TEC2012-37832-C02-01, CICYT TEC2011-28626-C02-02, and CAM CONTEXTS (S2009/TIC-1485).
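
    The unified representation could look like this Python sketch: one structure carrying spoken and tactile input together with internal context (intention, emotion) and external context (location), which the dialog manager reads regardless of input mode. Field names and the strategy rules are illustrative assumptions, not the system's actual schema.

        from dataclasses import dataclass
        from typing import Optional

        @dataclass
        class UnifiedInput:
            spoken_intent: Optional[str] = None
            tactile_selection: Optional[str] = None
            intention: Optional[str] = None  # internal context
            emotion: Optional[str] = None    # internal context
            location: Optional[str] = None   # external context

        def choose_strategy(u):
            # One structure drives strategy selection for every input mode.
            if u.emotion == "frustrated":
                return "switch_to_guided_dialog"
            if u.tactile_selection and not u.spoken_intent:
                return "confirm_on_screen"
            return "answer_query"

        u = UnifiedInput(spoken_intent="find_restaurant", emotion="neutral",
                         location="city_center")
        print(choose_strategy(u))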

    Adaptive Cognitive Interaction Systems

    Adaptive cognitive interaction systems observe and model the state of their user and adapt the system's behavior accordingly. Such a system consists of three components: the empirical cognitive model, the computational cognitive model, and the adaptive interaction manager. The present work contains numerous contributions to the development of these components as well as to their combination. The results are validated in numerous user studies.
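
    A bare-bones Python sketch of the three-component loop the abstract names: the empirical model estimates the user state from observed signals, the computational model predicts its evolution, and the interaction manager adapts behavior. Every state, signal, and threshold here is invented for illustration.

        def empirical_model(signals):
            # Estimate the current cognitive state from observed signals.
            return "high_load" if signals.get("error_rate", 0) > 0.3 else "normal"

        def computational_model(state):
            # Predict the near-future state (trivially persistent here).
            return state

        def interaction_manager(predicted):
            # Adapt system behavior to the predicted user state.
            return "simplify_dialog" if predicted == "high_load" else "proceed"

        signals = {"error_rate": 0.4}
        print(interaction_manager(computational_model(empirical_model(signals))))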