2,541 research outputs found

    Modelling Users, Intentions, and Structure in Spoken Dialog

    Full text link
    We outline how utterances in dialogs can be interpreted using a partial first order logic. We exploit the capability of this logic to talk about the truth status of formulae to define a notion of coherence between utterances and explain how this coherence relation can serve for the construction of AND/OR trees that represent the segmentation of the dialog. In a BDI model we formalize basic assumptions about dialog and cooperative behaviour of participants. These assumptions provide a basis for inferring speech acts from coherence relations between utterances and attitudes of dialog participants. Speech acts prove to be useful for determining dialog segments defined on the notion of completing expectations of dialog participants. Finally, we sketch how explicit segmentation signalled by cue phrases and performatives is covered by our dialog model.Comment: 17 page

    Guidelines for annotating the LUNA corpus with frame information

    Get PDF
    This document defines the annotation workflow aimed at adding frame information to the LUNA corpus of conversational speech. In particular, it details both the corpus pre-processing steps and the proper annotation process, giving hints about how to choose the frame and the frame element labels. Besides, the description of 20 new domain-specific and language-specific frames is reported. To our knowledge, this is the first attempt to adapt the frame paradigm to dialogs and at the same time to define new frames and frame elements for the specific domain of software/hardware assistance. The technical report is structured as follows: in Section 2 an overview of the FrameNet project is given, while Section 3 introduces the LUNA project and the annotation framework involving the Italian dialogs. Section 4 details the annotation workflow, including the format preparation of the dialog files and the annotation strategy. In Section 5 we discuss the main issues of the annotation of frame information in dialogs and we describe how the standard annotation procedure was changed in order to face such issues. Then, the 20 newly introduced frames are reported in Section 6

    An Empirical Approach to Temporal Reference Resolution

    Full text link
    This paper presents the results of an empirical investigation of temporal reference resolution in scheduling dialogs. The algorithm adopted is primarily a linear-recency based approach that does not include a model of global focus. A fully automatic system has been developed and evaluated on unseen test data with good results. This paper presents the results of an intercoder reliability study, a model of temporal reference resolution that supports linear recency and has very good coverage, the results of the system evaluated on unseen test data, and a detailed analysis of the dialogs assessing the viability of the approach.Comment: 13 pages, latex using aclap.st

    Summarizing Dialogic Arguments from Social Media

    Full text link
    Online argumentative dialog is a rich source of information on popular beliefs and opinions that could be useful to companies as well as governmental or public policy agencies. Compact, easy to read, summaries of these dialogues would thus be highly valuable. A priori, it is not even clear what form such a summary should take. Previous work on summarization has primarily focused on summarizing written texts, where the notion of an abstract of the text is well defined. We collect gold standard training data consisting of five human summaries for each of 161 dialogues on the topics of Gay Marriage, Gun Control and Abortion. We present several different computational models aimed at identifying segments of the dialogues whose content should be used for the summary, using linguistic features and Word2vec features with both SVMs and Bidirectional LSTMs. We show that we can identify the most important arguments by using the dialog context with a best F-measure of 0.74 for gun control, 0.71 for gay marriage, and 0.67 for abortion.Comment: Proceedings of the 21th Workshop on the Semantics and Pragmatics of Dialogue (SemDial 2017

    Incremental LSTM-based Dialog State Tracker

    Full text link
    A dialog state tracker is an important component in modern spoken dialog systems. We present an incremental dialog state tracker, based on LSTM networks. It directly uses automatic speech recognition hypotheses to track the state. We also present the key non-standard aspects of the model that bring its performance close to the state-of-the-art and experimentally analyze their contribution: including the ASR confidence scores, abstracting scarcely represented values, including transcriptions in the training data, and model averaging

    Research on speech understanding and related areas at SRI

    Get PDF
    Research capabilities on speech understanding, speech recognition, and voice control are described. Research activities and the activities which involve text input rather than speech are discussed

    Stochastic Language Generation in Dialogue using Recurrent Neural Networks with Convolutional Sentence Reranking

    Full text link
    The natural language generation (NLG) component of a spoken dialogue system (SDS) usually needs a substantial amount of handcrafting or a well-labeled dataset to be trained on. These limitations add significantly to development costs and make cross-domain, multi-lingual dialogue systems intractable. Moreover, human languages are context-aware. The most natural response should be directly learned from data rather than depending on predefined syntaxes or rules. This paper presents a statistical language generator based on a joint recurrent and convolutional neural network structure which can be trained on dialogue act-utterance pairs without any semantic alignments or predefined grammar trees. Objective metrics suggest that this new model outperforms previous methods under the same experimental conditions. Results of an evaluation by human judges indicate that it produces not only high quality but linguistically varied utterances which are preferred compared to n-gram and rule-based systems.Comment: To be appear in SigDial 201
    • …
    corecore