
    Topic Independent Identification of Agreement and Disagreement in Social Media Dialogue

    Research on the structure of dialogue has been hampered for years because large dialogue corpora have not been available. This has impacted the dialogue research community's ability to develop better theories, as well as good off-the-shelf tools for dialogue processing. Happily, an increasing amount of information and opinion exchange occurs in natural dialogue in online forums, where people share their opinions about a vast range of topics. In particular, we are interested in rejection in dialogue, also called disagreement and denial, where the size of available dialogue corpora, for the first time, offers an opportunity to empirically test theoretical accounts of the expression and inference of rejection in dialogue. In this paper, we test whether topic-independent features motivated by theoretical predictions can be used to recognize rejection in online forums in a topic-independent way. Our results show that our theoretically motivated features achieve 66% accuracy, an absolute 6% improvement over a unigram baseline.
    Comment: @inproceedings{Misra2013TopicII, title={Topic Independent Identification of Agreement and Disagreement in Social Media Dialogue}, author={Amita Misra and Marilyn A. Walker}, booktitle={SIGDIAL Conference}, year={2013}}
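
    As a rough illustration of the experimental contrast described above, the sketch below compares a unigram bag-of-words baseline against a classifier built on topic-independent cue features. The cue patterns and toy data are hypothetical stand-ins, not the authors' actual feature set.

```python
import re
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Toy forum posts labelled for disagreement (1) vs. agreement (0).
posts = [
    "No, that's simply not true.",
    "I agree, that's a fair point.",
    "Actually you're wrong about this.",
    "Exactly, well said.",
]
labels = [1, 0, 1, 0]

# Unigram baseline: plain bag-of-words.
unigrams = CountVectorizer().fit_transform(posts)
baseline = cross_val_score(LogisticRegression(), unigrams, labels, cv=2)

# Topic-independent cues: patterns that fire no matter what the thread
# is about (hypothetical examples, not the paper's feature inventory).
CUES = [r"^no\b", r"\bnot\b", r"\bwrong\b", r"\bactually\b", r"\bagree\b"]

def cue_features(text):
    t = text.lower()
    return [1.0 if re.search(p, t) else 0.0 for p in CUES]

X_cues = np.array([cue_features(p) for p in posts])
cue_model = cross_val_score(LogisticRegression(), X_cues, labels, cv=2)

print("unigram baseline:", baseline.mean())
print("cue features:   ", cue_model.mean())
```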

    Analyzing collaborative learning processes automatically

    In this article we describe the emerging area of text classification research focused on the problem of collaborative learning process analysis, both from a broad perspective and more specifically in terms of a publicly available tool set called TagHelper tools. Analyzing the variety of pedagogically valuable facets of learners' interactions is a time-consuming and effortful process. Improving automated analyses of such highly valued processes of collaborative learning by adapting and applying recent text classification technologies would make it a less arduous task to obtain insights from corpus data. This endeavor also holds the potential for enabling substantially improved on-line instruction, both by providing teachers and facilitators with reports about the groups they are moderating and by triggering context-sensitive collaborative learning support on an as-needed basis. In this article, we report on an interdisciplinary research project which has been investigating the effectiveness of applying text classification technology to a large CSCL corpus that has been analyzed by human coders using a theory-based multidimensional coding scheme. We report promising results and include an in-depth discussion of important issues such as reliability, validity, and efficiency that should be considered when deciding on the appropriateness of adopting a new technology such as TagHelper tools. One major technical contribution of this work is a demonstration that an important piece of the work towards making text classification technology effective for this purpose is designing and building linguistic pattern detectors, otherwise known as features, that can be extracted reliably from texts and that have high predictive power for the categories of discourse actions that the CSCL community is interested in.
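
    Since the project validates automated codes against human coders, a chance-corrected agreement statistic such as Cohen's kappa is the natural reliability check. A minimal sketch, with hypothetical discourse codes:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical discourse-action codes from a human coder and a classifier.
human = ["claim", "question", "claim", "support", "question", "support"]
auto  = ["claim", "question", "support", "support", "question", "claim"]

# Kappa corrects raw agreement for agreement expected by chance.
print(f"Cohen's kappa = {cohen_kappa_score(human, auto):.2f}")
```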

    Gaul, conversation and youth genre(s) in Java

    Page numbers follow those recorded in the published version.

    Individual and Domain Adaptation in Sentence Planning for Dialogue

    One of the biggest challenges in the development and deployment of spoken dialogue systems is the design of the spoken language generation module. This challenge arises from the need for the generator to adapt to many features of the dialogue domain, user population, and dialogue context. A promising approach is trainable generation, which uses general-purpose linguistic knowledge that is automatically adapted to the features of interest, such as the application domain, individual user, or user group. In this paper, we present and evaluate a trainable sentence planner for providing restaurant information in the MATCH dialogue system. We show that trainable sentence planning can produce complex information presentations whose quality is comparable to the output of a template-based generator tuned to this domain. We also show that our method easily supports adapting the sentence planner to individuals, and that the individualized sentence planners generally perform better than models trained and tested on a population of individuals. Previous work has documented and utilized individual preferences for content selection, but to our knowledge, these results provide the first demonstration of individual preferences for sentence planning operations, affecting the content order, discourse structure, and sentence structure of system responses. Finally, we evaluate the contribution of different feature sets, and show that, in our application, n-gram features often do as well as features based on higher-level linguistic representations.
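
    The core loop of trainable sentence planning can be pictured as: generate candidate realisations, score them with a model trained on human ratings, and emit the top-ranked one. Below is a minimal sketch of that loop; the n-gram featurisation mirrors the feature sets discussed, but the ranker, candidates, and ratings are illustrative assumptions.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import Ridge

# Candidate sentence plans for one content plan, with human ratings (toy).
train_cands = [
    "Babbo has good food. It has good service.",
    "Babbo has good food and good service.",
    "Babbo, which has good service, has good food.",
]
ratings = [2.5, 4.5, 3.5]

# N-gram features over the realised strings; fit a regression ranker.
vec = CountVectorizer(ngram_range=(1, 2))
ranker = Ridge().fit(vec.fit_transform(train_cands), ratings)

# At run time, rank fresh candidates and keep the best-scoring one.
new_cands = [
    "Uguale has cheap food. It has cheap drinks.",
    "Uguale has cheap food and cheap drinks.",
]
scores = ranker.predict(vec.transform(new_cands))
print(max(zip(scores, new_cands)))
```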

    Domain transfer for deep natural language generation from abstract meaning representations

    Stochastic natural language generation systems that are trained from labelled datasets are often domain-specific in their annotation and in their mapping from semantic input representations to lexical-syntactic outputs. As a result, learnt models fail to generalize across domains, heavily restricting their usability beyond single applications. In this article, we focus on the problem of domain adaptation for natural language generation. We show how linguistic knowledge from a source domain, for which labelled data is available, can be adapted to a target domain by reusing training data across domains. As a key to this, we propose to employ abstract meaning representations as a common semantic representation across domains. We model natural language generation as a long short-term memory recurrent neural network encoder-decoder, in which one recurrent neural network learns a latent representation of a semantic input, and a second recurrent neural network learns to decode it to a sequence of words. We show that the learnt representations can be transferred across domains and can be leveraged effectively to improve training on new unseen domains. Experiments in three different domains and with six datasets demonstrate that the lexical-syntactic constructions learnt in one domain can be transferred to new domains and achieve up to 75-100% of the performance of in-domain training, under objective metrics such as BLEU and semantic error rate as well as a subjective human rating study. Training a policy from prior knowledge from a different domain is consistently better than pure in-domain training by up to 10%.
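
    A minimal sketch of the encoder-decoder idea described above: one LSTM encodes the semantic input into a latent state, and a second LSTM decodes that state into a word sequence. The skeleton below is a generic PyTorch seq2seq, not the authors' exact model; vocabulary sizes and tensor shapes are toy assumptions.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, hidden=64):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, hidden)
        self.tgt_emb = nn.Embedding(tgt_vocab, hidden)
        self.encoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src, tgt):
        # Encoder compresses the meaning representation into (h, c).
        _, state = self.encoder(self.src_emb(src))
        # Decoder starts from that latent state; for domain transfer,
        # the same weights are reused and fine-tuned on the new domain.
        dec_out, _ = self.decoder(self.tgt_emb(tgt), state)
        return self.out(dec_out)

model = Seq2Seq(src_vocab=50, tgt_vocab=100)
src = torch.randint(0, 50, (2, 7))   # batch of meaning-token ids (toy)
tgt = torch.randint(0, 100, (2, 9))  # batch of word ids (toy)
print(model(src, tgt).shape)         # (2, 9, 100) logits over words
```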

    Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems

    Natural language generation (NLG) is a critical component of spoken dialogue systems, and it has a significant impact on both usability and perceived quality. Most NLG systems in common use employ rules and heuristics and tend to generate rigid and stylised responses without the natural variation of human language. They are also not easily scaled to systems covering multiple domains and languages. This paper presents a statistical language generator based on a semantically controlled Long Short-term Memory (LSTM) structure. The LSTM generator can learn from unaligned data by jointly optimising sentence planning and surface realisation using a simple cross-entropy training criterion, and language variation can be easily achieved by sampling from output candidates. An objective evaluation in two differing test domains showed that the proposed method improved performance over previous methods while relying on fewer heuristics. Human judges scored the LSTM system higher on informativeness and naturalness and overall preferred it to the other systems.
    Comment: To appear in EMNLP 2015
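
    A simplified sketch of the semantically controlled cell: a standard LSTM step augmented with a dialogue-act (DA) vector that a sigmoid reading gate gradually consumes, injecting the remainder into the cell state. This is one reading of the mechanism, with illustrative dimensions; a faithful implementation would fold the DA term in before the output gate.

```python
import torch
import torch.nn as nn

class SCLSTMCell(nn.Module):
    """Standard LSTM step plus a decaying dialogue-act (DA) vector."""

    def __init__(self, input_size, hidden_size, da_size):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)
        self.read_gate = nn.Linear(input_size + hidden_size, da_size)
        self.da_to_cell = nn.Linear(da_size, hidden_size, bias=False)

    def forward(self, x, h, c, d):
        # Reading gate decides how much of the remaining DA vector to
        # keep, so slots already realised can be switched off over time.
        r = torch.sigmoid(self.read_gate(torch.cat([x, h], dim=-1)))
        d = r * d
        h, c = self.cell(x, (h, c))             # ordinary LSTM update
        c = c + torch.tanh(self.da_to_cell(d))  # inject DA into the cell
        # Simplification: the DA term reaches h only at the next step;
        # the paper applies it before the output gate.
        return h, c, d

cell = SCLSTMCell(input_size=32, hidden_size=64, da_size=10)
x = torch.zeros(4, 32)
h, c = torch.zeros(4, 64), torch.zeros(4, 64)
d = torch.ones(4, 10)  # toy one-hot-style DA/slot vector
h, c, d = cell(x, h, c, d)
print(h.shape, d.sum().item())
```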
    • 
