9 research outputs found

    Intention Detection Based on Siamese Neural Network With Triplet Loss

    Understanding the user's intention is an essential task for the spoken language understanding (SLU) module in a dialogue system, as it provides vital information for managing and generating future actions and responses. In this paper, we propose a triplet training framework based on the multiclass classification approach for the intention detection task. Specifically, we utilize a Siamese neural network architecture with metric learning to construct a robust and discriminative utterance embedding model. We modify the RMCNN model and a fine-tuned BERT model as Siamese encoders to train on utterance triplets from different semantic aspects. The triplet loss effectively distinguishes fine-grained differences between two inputs by learning a mapping from utterance sequences to a compact Euclidean space. Once the mapping is learned, intention detection can be implemented with standard techniques using the pre-trained embeddings as feature vectors. In addition, we use a fusion strategy to enhance the utterance feature representation in the downstream intention detection task. We conduct experiments on several benchmark intention detection datasets: the Snips dataset, the ATIS dataset, the Facebook multilingual task-oriented datasets, the Daily Dialogue dataset, and the MRDA dataset. The results show that the proposed method effectively improves recognition performance on these datasets and achieves new state-of-the-art results on the single-turn task-oriented datasets (Snips, Facebook) and a multi-turn dataset (Daily Dialogue).
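    The triplet objective the abstract describes can be sketched as follows. This is a minimal illustration of the standard margin-based triplet loss over Euclidean distances, not the paper's exact training setup; the embedding values and margin here are invented for demonstration, and in the paper the vectors would come from the shared-weight Siamese encoders (RMCNN or BERT).

    ```python
    import numpy as np

    def triplet_loss(anchor, positive, negative, margin=1.0):
        """Margin-based triplet loss over embedding vectors.

        Pulls the anchor toward the positive (same intent) and pushes
        it away from the negative (different intent) until the negative
        is at least `margin` farther away in Euclidean distance.
        """
        d_pos = np.linalg.norm(anchor - positive)  # anchor-positive distance
        d_neg = np.linalg.norm(anchor - negative)  # anchor-negative distance
        return max(0.0, d_pos - d_neg + margin)

    # Toy 2-D "utterance embeddings" (purely illustrative values).
    anchor   = np.array([1.0, 0.0])
    positive = np.array([0.9, 0.1])   # same intent, close to the anchor
    negative = np.array([0.5, 0.5])   # different intent, too close: loss > 0

    print(triplet_loss(anchor, positive, negative))
    ```

    Once such a loss has shaped the embedding space, intention detection reduces to a standard classifier (e.g. nearest-neighbour or a linear layer) over the frozen embeddings, which is the "standard techniques with pre-trained embeddings" step in the abstract.
    
    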

    Abstractive Summarization of Voice Communications

    Abstractive summarization of conversations is a very challenging task that requires full understanding of the dialogue turns, their roles, and their relationships within the conversation. We present an efficient system, derived from a fully-fledged text analysis system, that performs the necessary linguistic analysis of turns in conversations and provides useful argumentative labels for building synthetic abstractive summaries of conversations.

    Deep Linguistic Processing with GETARUNS for Spoken Dialogue Understanding

    In this paper we present work carried out to scale up the text understanding system GETARUNS and port it to dialogue understanding. The current goal is to automatically extract argumentative information in order to build argumentative structure. The long-term goal is to use argumentative structure to produce automatic summaries of spoken dialogues. Much like other deep linguistic processing systems, our system is a generic text/dialogue understanding system that can be used in connection with an ontology (WordNet) and other similar repositories of commonsense knowledge. We present the adjustments we made in order to cope with transcribed spoken dialogues like those produced in the ICSI Berkeley project. In a final section we present a preliminary evaluation of the system on two tasks: automatic argumentative labeling, and another frequently addressed task, referential vs. non-referential pronominal detection. The results obtained fare much better than those reported in similar experiments with machine learning approaches.

    Towards Automatic Dialogue Understanding

    In this paper we present work carried out to scale up the text understanding system GETARUNS and port it to dialogue understanding. The current goal is to automatically extract argumentative information in order to build argumentative structure. The long-term goal is to use argumentative structure to produce automatic summaries of spoken dialogues. Much like other deep linguistic processing systems (see Allen et al., 2007), our system is a generic text/dialogue understanding system that can be used in connection with an ontology (WordNet) and other similar repositories of commonsense knowledge. Word sense disambiguation takes place at the level of semantic interpretation and is represented in the Discourse Model. We present the adjustments we made in order to cope with transcribed spoken dialogues like those produced in the ICSI Berkeley project. The low-level component is organized according to LFG theory; at this level, the system performs pronominal binding, quantifier raising and temporal interpretation. The high-level component is where the Discourse Model is created from the Logical Form. For longer sentences the system switches from the top-down to the bottom-up system; in case of failure it backs off to a partial system which produces a very lean and shallow semantics with no inference rules. In a final section, we present a preliminary evaluation of the system on two tasks: automatic argumentative labelling, and another frequently addressed task, referential vs. non-referential pronominal detection. The results obtained fare much better than those reported in similar experiments with machine learning approaches.

    Query types in the meeting domain: assessing the role of argumentative structure in answering questions on meeting discussion records

    We define a new task of question answering on meeting records and assess its difficulty in terms of the types of information and retrieval techniques required. The importance of this task is revealed by the growing interest in the design of sophisticated interfaces for accessing meeting records, such as meeting browsers. We ground our work in an empirical analysis of elicited user queries. We assess the type of information sought by users and classify their queries along several semantic dimensions of meeting content. We found that queries about argumentative processes and outcomes represent the majority of elicited queries (about 60%). We also assess the difficulty of answering the queries and focus on the requirements a prospective QA system must meet to deal with them successfully. Our results suggest that standard Information Retrieval and Question Answering alone can account for less than 20% of the queries and need to be complemented with additional types of information and inference.

    Image-Enabled Discourse: Investigating the Creation of Visual Information as Communicative Practice

    Anyone who has clarified a thought or prompted a response during a conversation by drawing a picture has exploited the potential of image making as an interactive tool for conveying information. Images are increasingly ubiquitous in daily communication, in large part due to advances in visually enabled information and communication technologies (ICT), such as information visualization applications, image retrieval systems and visually enabled collaborative work tools. Human abilities to use images to communicate are, however, far more sophisticated and nuanced than these technologies currently support. In order to learn more about the practice of image making as a specialized form of information and communication behavior, this study examined face-to-face conversations involving the creation of ad hoc visualizations (i.e., napkin drawings). A model of image-enabled discourse is introduced, which positions image making as a specialized form of communicative practice. Multimodal analysis of video-recorded conversations focused on identifying image-enabled communicative activities in terms of the interactional sociolinguistic concepts of conversational involvement and coordination, specifically framing, footing and stance. The study shows that when drawing occurs in the context of an ongoing dialogue, the activity of visual representation performs key communicative tasks. Visualization is a form of social interaction that contributes to the maintenance of conversational involvement in ways that are not often evident in the image artifact. For example, drawing enables us to coordinate with each other, to introduce alternative perspectives into a conversation, and even to temporarily suspend the primary thread of a discussion in order to explore a tangential thought. The study compares attributes of the image artifact with those of the activity of image making, described as a series of contrasting affordances.
    Visual information in complex systems is generally represented and managed based on the affordances of the artifact, neglecting to account for all that is communicated through the situated action of creating. These findings have heuristic and best-practice implications for a range of areas related to the design and evaluation of virtual collaboration environments, visual information extraction and retrieval systems, and data visualization tools.

    Ontology-Based Discourse Understanding for a Persistent Meeting Assistant

    In this paper, we present research toward ontology-based understanding of discourse in meetings and describe an ontology of multimodal discourse designed for this purpose. We investigate its application in an integrated but modular architecture that uses semantically annotated knowledge of communicative meeting activity as well as of the discourse subject matter. We highlight how this approach helps improve system performance over time and supports understanding in a changing and persistent environment. We also describe current and future plans for ontology-driven robust natural-language understanding in the presence of the highly ambiguous and error-prone input typical of the meeting domain.