1,247 research outputs found
Summarizing Dialogic Arguments from Social Media
Online argumentative dialog is a rich source of information on popular
beliefs and opinions that could be useful to companies as well as governmental
or public policy agencies. Compact, easy to read, summaries of these dialogues
would thus be highly valuable. A priori, it is not even clear what form such a
summary should take. Previous work on summarization has primarily focused on
summarizing written texts, where the notion of an abstract of the text is well
defined. We collect gold standard training data consisting of five human
summaries for each of 161 dialogues on the topics of Gay Marriage, Gun Control
and Abortion. We present several different computational models aimed at
identifying segments of the dialogues whose content should be used for the
summary, using linguistic features and Word2vec features with both SVMs and
Bidirectional LSTMs. We show that we can identify the most important arguments
by using the dialog context with a best F-measure of 0.74 for gun control, 0.71
for gay marriage, and 0.67 for abortion.Comment: Proceedings of the 21th Workshop on the Semantics and Pragmatics of
Dialogue (SemDial 2017
Automatic summarization of online debates
Debate summarization is one of the novel and challenging research areas in automatic text summarization which has been largely unexplored. In this paper, we develop a debate summarization pipeline to summarize key topics which are discussed or argued in the two opposing sides of online debates. We view that the generation of debate summaries can be achieved by clustering, cluster labeling, and visualization. In our work, we investigate two different clustering approaches for the generation of the summaries. In the first approach, we generate the summaries by applying purely term-based clustering and cluster labeling. The second approach makes use of X-means for clustering and Mutual Information for labeling the clusters. Both approaches are driven by ontologies. We visualize the results using bar charts. We think that our results are a smooth entry for users aiming to receive the first impression about what is discussed within a debate topic containing waste number of argumentations
Viewpoint Discovery and Understanding in Social Networks
The Web has evolved to a dominant platform where everyone has the opportunity
to express their opinions, to interact with other users, and to debate on
emerging events happening around the world. On the one hand, this has enabled
the presence of different viewpoints and opinions about a - usually
controversial - topic (like Brexit), but at the same time, it has led to
phenomena like media bias, echo chambers and filter bubbles, where users are
exposed to only one point of view on the same topic. Therefore, there is the
need for methods that are able to detect and explain the different viewpoints.
In this paper, we propose a graph partitioning method that exploits social
interactions to enable the discovery of different communities (representing
different viewpoints) discussing about a controversial topic in a social
network like Twitter. To explain the discovered viewpoints, we describe a
method, called Iterative Rank Difference (IRD), which allows detecting
descriptive terms that characterize the different viewpoints as well as
understanding how a specific term is related to a viewpoint (by detecting other
related descriptive terms). The results of an experimental evaluation showed
that our approach outperforms state-of-the-art methods on viewpoint discovery,
while a qualitative analysis of the proposed IRD method on three different
controversial topics showed that IRD provides comprehensive and deep
representations of the different viewpoints
Sentiment analysis and real-time microblog search
This thesis sets out to examine the role played by sentiment in real-time microblog search. The recent prominence of the real-time web is proving both challenging and disruptive for a number of areas of research, notably information retrieval and web data mining. User-generated content on the real-time web is perhaps best epitomised by content on microblogging platforms, such as Twitter. Given the substantial quantity of microblog posts that may be relevant to a user query at a given point in time, automated methods are required to enable users to sift through this information. As an area of research reaching maturity, sentiment analysis offers a promising direction for modelling the text content in microblog streams.
In this thesis we review the real-time web as a new area of focus for sentiment analysis, with a specific focus on microblogging. We propose a system and method for evaluating the effect of sentiment on perceived search quality in real-time microblog search scenarios. Initially we provide an evaluation of sentiment analysis using supervised learning for classi- fying the short, informal content in microblog posts. We then evaluate our sentiment-based filtering system for microblog search in a user study with simulated real-time scenarios. Lastly, we conduct real-time user studies for the live broadcast of the popular television programme, the X Factor, and for the Leaders Debate during the Irish General Election. We find that we are able to satisfactorily classify positive, negative and neutral sentiment in microblog posts. We also find a significant role played by sentiment in many microblog search scenarios, observing some detrimental effects in filtering out certain sentiment types. We make a series of observations regarding associations between document-level sentiment and user feedback, including associations with user profile attributes, and usersā prior topic sentiment
Sentiment Analysis and Opinion Mining within Social Networks using Konstanz Information Miner
Evaluations, opinions, and sentiments have become very obvious due to rapid emerging interest in ecommerce which is also a significant source of expression of opinions and analysis of sentiment. In this study, a general introduction on sentiment analysis, steps of sentiment analysis, sentiments analysis applications, sentiment analysis research challenges, techniques used for sentiment analysis, etc., were discussed in detail. With these details given, it is hoped that researchers will engage in opinion mining and sentiment analysis research to attain more successes correlated to these issues. The research is based on data input from web services and social networks, including an application that performs such actions. The main aspects of this study are to statistically test and evaluate the major social network websites: In this case Twitter, because it is has rich data source and easy within social networks tools. In this study, firstly a good understanding of sentiment analysis and opinion mining research based on recent trends in the field is provided. Secondly, various aspects of sentiment analysis are explained. Thirdly, various steps of sentiment analysis are introduced. Fourthly, various sentiment analysis, research challenges are discussed. Finally, various techniques used for sentiment analysis are explained and Konstanz Information Miner (KNIME) that can be used as sentiment analysis tool is introduced. For future work, recent machine learning techniques including big data platforms may be proposed for efficient solutions for opinion mining and sentiment analysi
Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation
Peer reviewe
Organized Behavior Classification of Tweet Sets using Supervised Learning Methods
During the 2016 US elections Twitter experienced unprecedented levels of
propaganda and fake news through the collaboration of bots and hired persons,
the ramifications of which are still being debated. This work proposes an
approach to identify the presence of organized behavior in tweets. The Random
Forest, Support Vector Machine, and Logistic Regression algorithms are each
used to train a model with a data set of 850 records consisting of 299 features
extracted from tweets gathered during the 2016 US presidential election. The
features represent user and temporal synchronization characteristics to capture
coordinated behavior. These models are trained to classify tweet sets among the
categories: organic vs organized, political vs non-political, and pro-Trump vs
pro-Hillary vs neither. The random forest algorithm performs better with
greater than 95% average accuracy and f-measure scores for each category. The
most valuable features for classification are identified as user based
features, with media use and marking tweets as favorite to be the most
dominant.Comment: 51 pages, 5 figure
- ā¦