
    Don't Let Me Be Misunderstood: Comparing Intentions and Perceptions in Online Discussions

    Discourse involves two perspectives: a person's intention in making an utterance and others' perception of that utterance. The misalignment between these perspectives can lead to undesirable outcomes, such as misunderstandings, low productivity and even overt strife. In this work, we present a computational framework for exploring and comparing both perspectives in online public discussions. We combine logged data about public comments on Facebook with a survey of over 16,000 people about their intentions in writing these comments or about their perceptions of comments that others had written. Unlike previous studies of online discussions that have largely relied on third-party labels to quantify properties such as sentiment and subjectivity, our approach also directly captures what the speakers actually intended when writing their comments. In particular, our analysis focuses on judgments of whether a comment is stating a fact or an opinion, since these concepts have been shown to be often confused. We show that intentions and perceptions diverge in consequential ways. People are more likely to perceive opinions than to intend them, and linguistic cues that signal how an utterance is intended can differ from those that signal how it will be perceived. Further, this misalignment between intentions and perceptions can be linked to the future health of a conversation: when a comment whose author intended to share a fact is misperceived as sharing an opinion, the subsequent conversation is more likely to derail into uncivil behavior than when the comment is perceived as intended. Altogether, these findings may inform the design of discussion platforms that better promote positive interactions. Comment: Proceedings of The Web Conference (WWW) 2020

    Exploiting Structure For Sentiment Classification

    This thesis studies the problem of sentiment classification at both the document and sentence level using statistical learning methods. In particular, we develop computational models that capture useful structure-based intuitions for solving each task, treating the intuitions as latent representations to be discovered and exploited during learning. For document-level sentiment classification, we exploit structure in the form of informative sentences: those that exhibit the same sentiment as the document and thus explain or support its sentiment label. We first show that incorporating automatically discovered informative sentences in the form of additional constraints for the learner improves performance on the document-level sentiment classification task. Next, we explore joint structured models for this task: our final proposed model does not need sentence-level sentiment labels, and directly optimizes document classification accuracy using inferred sentence-level information. Our empirical evaluation on two publicly available datasets shows improved performance over strong baselines. For phrase-level sentiment classification, we investigate the compositional linguistic structure of phrases using compositional matrix-space models, learning matrix-space word representations and modeling composition as matrix multiplication. Using a publicly available dataset, we show that the matrix-space model outperforms the standard bag-of-words model for the phrase-level sentiment classification task.
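The composition-as-matrix-multiplication idea can be sketched in a few lines. The 2x2 word matrices and the scalar readout below are illustrative assumptions, not the thesis's learned parameters; in the actual model, word matrices and the projection are trained from labeled phrases.

```python
# Minimal sketch of a compositional matrix-space model: each word is a
# matrix and a phrase is the ordered product of its word matrices.

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hand-set illustrative word matrices (learned from data in the real model).
lexicon = {
    "good": [[2, 0], [1, 1]],
    "not":  [[0, 1], [1, 0]],
}

def compose(words):
    """A phrase's representation is the left-to-right product of word matrices."""
    m = [[1, 0], [0, 1]]  # identity
    for w in words:
        m = matmul(m, lexicon[w])
    return m

def score(words):
    # Read a scalar sentiment score off a fixed entry of the phrase matrix,
    # a stand-in for the learned projection used in the actual model.
    return compose(words)[0][0]
```

Because matrix multiplication is not commutative, `score(["not", "good"])` and `score(["good", "not"])` differ, which is exactly the word-order sensitivity a bag-of-words model cannot capture.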

    Multi-level Structured Models for Document-level Sentiment Classification

    In this paper, we investigate structured models for document-level sentiment classification. When predicting the sentiment of a subjective document (e.g., as positive or negative), it is well known that not all sentences are equally discriminative or informative. But identifying the useful sentences automatically is itself a difficult learning problem. This paper proposes a joint two-level approach for document-level sentiment classification that simultaneously extracts useful (i.e., subjective) sentences and predicts document-level sentiment based on the extracted sentences. Unlike previous joint learning methods for the task, our approach (1) does not rely on gold standard sentence-level subjectivity annotations (which may be expensive to obtain), and (2) optimizes directly for document-level performance. Empirical evaluations on movie reviews and U.S. Congressional floor debates show improved performance over previous approaches.
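The two-level idea can be illustrated with a toy sketch: treat sentence usefulness as latent, keep the sentences whose scores are most decisive, and let only those drive the document label. The word weights, the top-k selection rule, and the data below are illustrative assumptions, not the paper's model or inference procedure.

```python
# Toy two-level prediction: select likely-informative sentences, then
# classify the document from the selected sentences only.

def sentence_score(sentence, weights):
    """Linear sentiment score for a sentence: sum of its token weights."""
    return sum(weights.get(tok, 0.0) for tok in sentence.split())

def predict_document(sentences, weights, k=2):
    # Latent step: keep the k sentences with the largest absolute score,
    # a crude proxy for "subjective/informative"; neutral sentences score ~0.
    selected = sorted(sentences, key=lambda s: abs(sentence_score(s, weights)),
                      reverse=True)[:k]
    # Document label is decided by the selected sentences alone.
    total = sum(sentence_score(s, weights) for s in selected)
    return "positive" if total >= 0 else "negative"

w = {"great": 2.0, "boring": -3.0}
doc = ["the film runs two hours", "great acting", "boring plot"]
```

Here `predict_document(doc, w)` returns `"negative"`: the neutral first sentence is filtered out, and the remaining evidence sums below zero.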

    An empirical evaluation of supervised learning in high dimensions

    In this paper we perform an empirical evaluation of supervised learning on high-dimensional data. We evaluate performance on three metrics (accuracy, AUC, and squared loss) and study the effect of increasing dimensionality on the performance of the learning algorithms. Our findings are consistent with previous studies for problems of relatively low dimension, but suggest that as dimensionality increases the relative performance of the learning algorithms changes. To our surprise, the method that performs consistently well across all dimensions is random forests, followed by neural nets, boosted trees, and SVMs.
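The three evaluation metrics named above can each be computed directly from predicted probabilities; a minimal pure-Python sketch (the toy labels and probabilities are illustrative, not the study's data):

```python
# Compute accuracy, AUC, and squared loss from binary labels and
# predicted probabilities of the positive class.

def accuracy(y_true, p):
    """Fraction of examples where thresholding at 0.5 recovers the label."""
    return sum((pi >= 0.5) == yi for yi, pi in zip(y_true, p)) / len(y_true)

def squared_loss(y_true, p):
    """Mean squared difference between label and predicted probability."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y_true, p)) / len(y_true)

def auc(y_true, p):
    # Probability that a random positive outranks a random negative
    # (ties count half): the Wilcoxon-Mann-Whitney form of AUC.
    pos = [pi for yi, pi in zip(y_true, p) if yi == 1]
    neg = [pi for yi, pi in zip(y_true, p) if yi == 0]
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))

y = [1, 1, 0, 0]
probs = [0.9, 0.4, 0.6, 0.1]
```

On this toy example, accuracy is 0.5 while AUC is 0.75: AUC rewards ranking positives above negatives even when the 0.5 threshold misclassifies them, which is why the two metrics can disagree about which learner is best.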

    Computational approaches to sentence completion

    This paper studies the problem of sentence-level semantic coherence by answering SAT-style sentence completion questions. These questions test the ability of algorithms to distinguish sense from nonsense based on a variety of sentence-level phenomena. We tackle the problem with two approaches: methods that use local lexical information, such as the n-grams of a classical language model; and methods that evaluate global coherence, such as latent semantic analysis. We evaluate these methods on a suite of practice SAT questions, and on a recently released sentence completion task based on data taken from five Conan Doyle novels. We find that by fusing local and global information, we can exceed 50% on this task (chance baseline is 20%), and we suggest some avenues for further research.
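The local/global fusion can be sketched as a weighted combination of two candidate scorers. The bigram scores, similarity values, and the mixing weight `alpha` below are toy assumptions standing in for a trained n-gram model and an LSA-based coherence score, not the paper's actual components.

```python
import math

# Rank candidate completion words by fusing a local language-model score
# with a global context-coherence score.

bigram_logprob = {  # toy bigram log-probabilities (assumed, not trained)
    ("the", "detective"): -1.0,
    ("the", "banana"): -1.2,
}

context_similarity = {  # toy stand-in for LSA cosine with the full sentence
    "detective": 0.8,
    "banana": 0.1,
}

def fused_score(prev_word, candidate, alpha=0.5):
    """Log-linear fusion: alpha weights local vs. global evidence."""
    local = bigram_logprob.get((prev_word, candidate), -10.0)
    global_ = math.log(context_similarity.get(candidate, 1e-6))
    return alpha * local + (1 - alpha) * global_

best = max(["detective", "banana"], key=lambda w: fused_score("the", w))
```

Locally both candidates fit "the ___" almost equally well; the global term breaks the tie toward the candidate coherent with the rest of the sentence, which is the intuition behind fusing the two signals.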