Issue Framing in Online Discussion Fora
In online discussion fora, speakers often make arguments for or against
something, say birth control, by highlighting certain aspects of the topic. In
social science, this is referred to as issue framing. In this paper, we
introduce a new issue frame annotated corpus of online discussions. We explore
to what extent models trained to detect issue frames in newswire and social
media can be transferred to the domain of discussion fora, using a combination
of multi-task and adversarial training, assuming only unlabeled training data
in the target domain. Comment: To appear in NAACL-HLT 201
People on Drugs: Credibility of User Statements in Health Communities
Online health communities are a valuable source of information for patients
and physicians. However, such user-generated resources are often plagued by
inaccuracies and misinformation. In this work we propose a method for
automatically establishing the credibility of user-generated medical statements
and the trustworthiness of their authors by exploiting linguistic cues and
distant supervision from expert sources. To this end we introduce a
probabilistic graphical model that jointly learns user trustworthiness,
statement credibility, and language objectivity. We apply this methodology to
the task of extracting rare or unknown side-effects of medical drugs --- this
being one of the problems where large scale non-expert data has the potential
to complement expert medical knowledge. We show that our method can reliably
extract side-effects and filter out false statements, while identifying
trustworthy users that are likely to contribute valuable medical information.
A study of feature extraction techniques for classifying topics and sentiments from news posts
Many news channels now maintain Facebook pages on which they release news posts daily. These posts carry temporal opinions about social events, which may change over time due to external factors, and they can serve as a monitor of significant events around the world. As a result, much text mining research has been conducted in the area of Temporal Sentiment Analysis, one of whose most challenging tasks is to detect and extract the key features from news posts that arrive continuously over time. Extracting these features is difficult because of the posts' complex properties; moreover, posts about a specific topic may grow or vanish over time, which produces imbalanced datasets. This study therefore presents a comparative analysis of feature extraction techniques (TF-IDF, TF, BTO, IG, Chi-square) with three different n-gram features (Unigram, Bigram, Trigram), using SVM as the classifier. The aim is to identify the Feature Extraction Technique (FET) that achieves the best accuracy for both topic and sentiment classification. The analysis is conducted on datasets from three news channels. The experimental results for topic classification show that Chi-square with unigrams is the best FET compared to the other techniques. Furthermore, to overcome the problem of imbalanced data, this study combines the best FET with oversampling. The evaluation results show an improvement in the classifier's performance, with accuracies of 93.37%, 92.89%, and 91.92% for BBC, Al-Arabiya, and Al-Jazeera, respectively, compared to those obtained on the original datasets. Similarly, the same combination (Chi-square + Unigram) was used for sentiment classification and obtained accuracies of 81.87%, 70.01%, and 77.36%. However, testing the identified optimal FET on unseen, randomly selected news posts showed relatively low accuracy for both topic and sentiment classification, owing to changes in topics and sentiments over time.
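To make the Chi-square + Unigram idea concrete, the following is a minimal sketch of chi-square scoring for unigram features using only the Python standard library. It is not the study's implementation: the function name `chi_square_scores`, the whitespace tokenizer, and the toy posts and labels are all assumptions made for illustration.

```python
from collections import Counter


def chi_square_scores(docs, labels, target):
    """Score each unigram by the chi-square statistic of its 2x2
    contingency table (term present/absent vs. target class/other)."""
    n = len(docs)
    n_pos = sum(1 for y in labels if y == target)
    n_neg = n - n_pos

    df_pos = Counter()  # documents of the target class containing the term
    df_neg = Counter()  # documents of other classes containing the term
    for text, y in zip(docs, labels):
        terms = set(text.lower().split())  # assumed tokenizer: whitespace
        (df_pos if y == target else df_neg).update(terms)

    scores = {}
    for term in set(df_pos) | set(df_neg):
        a = df_pos[term]   # target class, term present
        b = df_neg[term]   # other classes, term present
        c = n_pos - a      # target class, term absent
        d = n_neg - b      # other classes, term absent
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


# Hypothetical toy posts, not data from the study.
docs = [
    "election vote results",
    "vote count election",
    "match goal score",
    "goal match win",
]
labels = ["politics", "politics", "sports", "sports"]

scores = chi_square_scores(docs, labels, "politics")
top_terms = sorted(scores, key=scores.get, reverse=True)[:4]
print(top_terms)
```

Terms that occur in every post of one class and in none of the other receive the highest score, so keeping only the top-ranked unigrams yields a compact feature set that could then be passed to a classifier such as an SVM.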
Don't Let Me Be Misunderstood: Comparing Intentions and Perceptions in Online Discussions
Discourse involves two perspectives: a person's intention in making an
utterance and others' perception of that utterance. The misalignment between
these perspectives can lead to undesirable outcomes, such as misunderstandings,
low productivity and even overt strife. In this work, we present a
computational framework for exploring and comparing both perspectives in online
public discussions.
We combine logged data about public comments on Facebook with a survey of
over 16,000 people about their intentions in writing these comments or about
their perceptions of comments that others had written. Unlike previous studies
of online discussions that have largely relied on third-party labels to
quantify properties such as sentiment and subjectivity, our approach also
directly captures what the speakers actually intended when writing their
comments. In particular, our analysis focuses on judgments of whether a comment
is stating a fact or an opinion, since these concepts were shown to be often
confused.
We show that intentions and perceptions diverge in consequential ways. People
are more likely to perceive opinions than to intend them, and linguistic cues
that signal how an utterance is intended can differ from those that signal how
it will be perceived. Further, this misalignment between intentions and
perceptions can be linked to the future health of a conversation: when a
comment whose author intended to share a fact is misperceived as sharing an
opinion, the subsequent conversation is more likely to derail into uncivil
behavior than when the comment is perceived as intended. Altogether, these
findings may inform the design of discussion platforms that better promote
positive interactions. Comment: Proceedings of The Web Conference (WWW) 202
Detecting Comments on News Articles in Microblogs
A reader of a news article would often be interested in the comments of other readers on an article, because comments give insight into popular opinions or feelings toward a given piece of news. In recent years, social media platforms, such as Twitter, have become a social hub for users to communicate and express their thoughts. This includes sharing news articles and commenting on them. In this paper, we propose an approach for identifying "comment-tweets" that comment on news articles. We discuss the nature of comment-tweets and compare them to subjective tweets. We utilize a machine learning-based classification approach for distinguishing between comment-tweets and others that only report the news. Our approach is evaluated on the TREC-2011 Microblog track data after applying additional annotations to tweets containing comments. Results show the effectiveness of our classification approach. Furthermore, we demonstrate the effectiveness of our approach on live news articles.