16 research outputs found

    Looking into the Past: Evaluating the Effect of Time Gaps in a Personalized Sentiment Model

    This paper concerns personalized sentiment analysis, which aims to improve the prediction of the sentiment expressed in a piece of text by taking individual differences into account. This is mostly done by relating the text to a person’s past expressions (or opinions); however, existing work does not consider the time gaps between messages. We argue that the opinion at a specific time point is affected more by recent opinions with related content than by earlier or unrelated ones, so a sentiment model ought to include such information in the analysis. Using a recurrent neural network with an attention layer as the basic model, we introduce three ways to integrate time gaps into the model. Evaluated on Twitter data from frequent users, performance improves the most when the time information is included in the Hawkes process, and adding the time information in the attention layer is more effective than adding it at the input.

    PERSEUS: A Personalization Framework for Sentiment Categorization with Recurrent Neural Network

    This paper introduces the personalization framework PERSEUS to investigate the impact of individuality in sentiment categorization by looking into the past. The framework rests on the existence of diversity between individuals and a certain consistency within each individual. We focus on relations between documents for user-sensitive predictions. An individual’s lexical choices act as indicators of individuality, so we use a concept-based system that utilizes neural networks to embed concepts and associated topics in text. Furthermore, a recurrent neural network is used to memorize the history of a user’s opinions, to discover user-topic dependence, and to detect implicit relations between users. PERSEUS also offers a solution for data sparsity. As a first stage, we show the benefit of a user-specific system: experiments on a combined Twitter dataset show performance improvements over generalized models. PERSEUS can be used in addition to such generalized systems to enhance the understanding of users’ opinions.

    Corpus of long-term instant messaging based dialogues between advanced learners of German as a foreign language and German native speakers: deL1L2IM

    The deL1L2IM corpus, created between May and August 2012 and last updated in August 2014, was collected within the framework of a PhD project on the development of a learning method involving conversations with an artificial companion. This PhD work is presented as a qualitative investigation of long-term (four months) instant messaging dialogues between advanced learners of German and German native speakers, chatting about whatever topic they wish. The dataset is composed of 72 dialogues, each lasting 20 to 45 minutes. The whole corpus contains ca. 52,000 words and 4,800 messages and has a file size of 0.5 Mb. Nine pairs of participants – i.e. nine learners and four native speakers – were required, with 8 dialogues per pair. The interactions have undergone linguistic analysis, with annotation limited to repair/correction sequences (incomplete learner error annotation). The goal of the project was to create an application for language modelling and to improve learner language applications, tutoring software and dialogue systems. The corpus is delivered in one written text file (in XML format, customized under TEI P5).

    Dealing with Trouble: A Data-Driven Model of a Repair Type for a Conversational Agent

    Troubles in hearing, comprehension or speech production are common in human conversations, especially if participants communicate in a foreign language that they have not yet fully mastered. Here I describe a data-driven model for simulating dialogue sequences in chat in which the learner does not understand the talk of a conversational agent and asks for clarification.

    Data-driven Repair Models for Text Chat with Language Learners

    This research analyses participants' orientation to linguistic identities in chat and introduces data-driven computational models for communicative Intelligent Computer-Assisted Language Learning (communicative ICALL). Based on non-pedagogical chat conversations between native speakers and non-native speakers, computational models of the following types are presented: exposed and embedded corrections, and explanations of unknown words following a learner's request. Conversation Analysis helped to obtain patterns from a corpus of dyadic chat conversations in a longitudinal setting, bringing together German native speakers and advanced learners of German as a foreign language. More specifically, this work sets out a bottom-up, data-driven research design which takes “conversation” from its genuine personalised dyadic environment to a model of a conversational agent. It allows for an informal functional specification of such an agent, for which a technical specification of two specific repair types is provided. Starting with the open research objective of creating a machine that behaves like a language expert in an informal conversation, this research shows that various forms of orientation to linguistic identities are at participants' disposal in chat. In addition, it shows that computational complexity can be handled by separating local models of specific practices from a high-level regulatory mechanism that activates them. More specifically, this work shows that learners' repair initiations may be analysed as turn formats containing resources for signalling trouble and referencing the trouble source. Based on this finding, this work shows how computational models for recognising repair initiations and extracting the trouble source can be formalised and implemented in a chatbot. Further, this work makes clear which level of description of error corrections is required to satisfy computational needs, how these descriptions may be transformed into patterns for various error correction formats, and which technological requirements they imply. Finally, this research shows which factors in interaction influence the decision to correct and how the creation of a high-level decision model for error correction in a Conversation-for-Learning can be approached. In sum, this research enriches the landscape of communication setups between language learners and communicative ICALL systems, explicitly covering Conversations-for-Learning. It strengthens multidisciplinary connections by showing how the multidisciplinary research field of ICALL benefits from including Conversation Analysis in the research paradigm. Using the example of chat with language learners, it highlights the impact on computational modelling of a micro-analytic understanding of the actions accomplished by utterances in talk within a specific speech exchange system.

    A Personalized Sentiment Model with Textual and Contextual Information

    In this paper, we look beyond traditional population-level sentiment modeling and consider the individuality in a person's expressions by discovering both textual and contextual information. In particular, we construct a hierarchical neural network that leverages valuable information from a person's past expressions and offers a better understanding of the sentiment from the expresser's perspective. Additionally, we investigate how a person's sentiment changes over time, so that recent incidents or opinions may have more effect on the person's current sentiment than older ones. Psychological studies have also shown that individual variation exists in how easily people change their sentiments. To model such traits, we develop a modified attention mechanism with a Hawkes process, applied on top of a recurrent network for a user-specific design. Implemented with automatically labeled Twitter data, the proposed model has shown positive results with different input formulations for representing the information concerned.

    L'examination des biais linguistiques dans Telegram avec une approche théorie des jeux (Examining linguistic bias in Telegram with a game-theoretic approach)

    Selective formulations and selective reporting of facts in political news are deliberately used to create particular identities of different political sides. This becomes evident in media dialogue reporting about political conflicts. In contrast to most NLP-based studies of linguistic bias, we engage critically with its nature, aiming at a later de-biasing or at least raising awareness about linguistic bias in political news. We found inspiration in conversation analysis (CA), membership categorisation analysis (MCA) and a game-theoretic approach to discourse called epistemic message exchange (ME) games. We identified three types of bias: selective reports about facts, selective formulations when reporting about the same facts, and different histories built up by the differences in the first two. We extend the epistemic ME games model with findings from a qualitative study.