3,587 research outputs found

    Detecting Calls to Action in Text Using Deep Learning

    How did the discussion go: Discourse act classification in social media conversations

    We propose a novel attention-based hierarchical LSTM model to classify discourse act sequences in social media conversations, aimed at mining data from online discussions using textual meaning beyond the sentence level. What makes the task unique is the complete categorization of possible pragmatic roles in informal textual discussions, in contrast to question-answer extraction, stance detection, or sarcasm identification, which are very role-specific tasks. An early attempt was made on a Reddit discussion dataset. We train our model on the same data and present test results on two different datasets, one from Reddit and one from Facebook. Our proposed model outperformed the previous one in terms of domain independence; without using platform-dependent structural features, our hierarchical LSTM with a word relevance attention mechanism achieved F1-scores of 71% and 66%, respectively, in predicting the discourse roles of comments in Reddit and Facebook discussions. We present and analyze the efficiency of recurrent and convolutional architectures in learning discursive representations on the same task, with different word and comment embedding schemes. Our attention mechanism enables us to inquire into the relevance ordering of text segments according to their roles in discourse. We present a human annotator experiment to unveil important observations about modeling and data annotation. Equipped with our text-based discourse identification model, we inquire into how heterogeneous non-textual features such as location, time, and leaning of information play their roles in characterizing online discussions on Facebook.

    Extracting Keywords from Multi-party Live Chats

    mARC: Memory by Association and Reinforcement of Contexts

    This paper introduces the memory by Association and Reinforcement of Contexts (mARC). mARC is a novel data modeling technology rooted in the second quantization formulation of quantum mechanics. It is an all-purpose, incremental, and unsupervised data storage and retrieval system which can be applied to all types of signal or data, structured or unstructured, textual or not. mARC can be applied to a wide range of information classification and retrieval problems such as e-Discovery or contextual navigation. It can also be formulated in the artificial life framework, a.k.a. Conway's "Game of Life" theory. In contrast to Conway's approach, the objects evolve in a massively multidimensional space. To start evaluating the potential of mARC, we have built a mARC-based Internet search engine demonstrator with contextual functionality. We compare the behavior of the mARC demonstrator with Google search in terms of both performance and relevance. In the study, we find that the mARC search engine demonstrator outperforms Google search by an order of magnitude in response time while providing more relevant results for some classes of queries.

    Predicting Speech Acts in MOOC Forum Posts Using Conditional Random Fields

    Massive Open Online Courses (MOOCs) have emerged as a way to reach large numbers of students by providing course materials as free online resources. The popularity of these courses has been reflected in high enrollment numbers; however, it is unclear how successful MOOCs are at educating their students given their high attrition rates. One cause may be instructors' inability to manage the large number of students that enroll. While discussion forums are available for students to seek help, instructors are unable to monitor the large number of posts written in these forums. This study investigates the effectiveness of using machine learning models to classify posts into speech acts as a way to help instructors monitor these discussion forums. Speech acts describe the purpose of a post and may be indicative of common functions such as asking questions or raising issues. A linear classifier is compared against a conditional random field (CRF) classifier, which is able to leverage contextual information about the forum in order to make predictions. The results show that CRFs outperform the simpler linear classifier, which suggests that casting this prediction problem as a sequence labeling task is fruitful for predicting these speech acts and automatically identifying posts of interest. Master of Science in Information Science