Cascading Behavior in Large Blog Graphs
How do blogs cite and influence each other? How do such links evolve? Does
the popularity of old blog posts drop exponentially with time? These are some
of the questions that we address in this work. Our goal is to build a model
that generates realistic cascades, so that it can help us with link prediction
and outlier detection.
Blogs (weblogs) have become an important medium of information because of
their timely publication, ease of use, and wide availability. In fact, they
often make headlines, by discussing and discovering evidence about political
events and facts. Often blogs link to one another, creating a publicly
available record of how information and influence spread through an underlying
social network. Aggregating links from several blog posts creates a directed
graph which we analyze to discover the patterns of information propagation in
blogspace, and thereby understand the underlying social network. Not only are
blogs interesting on their own merit, but our analysis also sheds light on how
rumors, viruses, and ideas propagate over social and computer networks.
Here we report some surprising findings about blog linking and information
propagation structure, after analyzing one of the largest available datasets,
with 45,000 blogs and about 2.2 million blog postings. We also present a
simple model that mimics the spread of information on the blogosphere and
produces information cascades very similar to those found in real life.
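The abstract above says that aggregating links from blog posts yields a directed graph. A minimal sketch of that aggregation step follows; the function name `blog_graph`, the input layout, and the choice to drop links between posts of the same blog are assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter


def blog_graph(post_links, post_to_blog):
    """Aggregate post-to-post links into weighted, directed blog-to-blog edges.

    post_links: iterable of (citing_post, cited_post) pairs (hypothetical layout).
    post_to_blog: dict mapping each post id to the blog that published it.
    """
    edges = Counter()
    for src_post, dst_post in post_links:
        src_blog = post_to_blog[src_post]
        dst_blog = post_to_blog[dst_post]
        # Assumption: links within the same blog are ignored.
        if src_blog != dst_blog:
            edges[(src_blog, dst_blog)] += 1
    return edges


# Toy data, invented for this example.
post_to_blog = {"p1": "A", "p2": "A", "p3": "B", "p4": "C"}
post_links = [("p3", "p1"), ("p4", "p1"), ("p4", "p3"), ("p2", "p1"), ("p3", "p2")]
edges = blog_graph(post_links, post_to_blog)
```

Each key of `edges` is a (citing blog, cited blog) pair and each value counts how many post-level links support that edge.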
Negative emotions boost users activity at BBC Forum
We present an empirical study of user activity in online BBC discussion
forums, measured by the number of posts written by individual debaters and the
average sentiment of these posts. Nearly 2.5 million posts from over 18
thousand users were investigated. Scale-free distributions were observed for
activity in individual discussion threads as well as for overall activity. The
number of unique users in a thread normalized by the thread length decays with
thread length, suggesting that thread life is sustained by mutual discussions
rather than by independent comments. Automatic sentiment analysis shows that
most posts contain negative emotions and the most active users in individual
threads express predominantly negative sentiments. It follows that the average
emotion of longer threads is more negative and that threads can be sustained by
negative comments. An agent-based computer simulation model has been used to
reproduce several essential characteristics of the analyzed system. The model
stresses the role of discussions between users, especially emotionally laden
quarrels between supporters of opposite opinions, and represents many observed
statistics of the forum. Comment: 29 pages, 6 figures
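The abstract above describes the number of unique users in a thread normalized by the thread length. A minimal sketch of that quantity follows, assuming a thread is given as a list of author names, one per post; the function name and the toy threads are hypothetical.

```python
def unique_user_ratio(thread_authors):
    """Return the number of distinct authors divided by the number of posts."""
    if not thread_authors:
        raise ValueError("thread must contain at least one post")
    return len(set(thread_authors)) / len(thread_authors)


# Toy threads, invented for this example: one author name per post.
short_thread = ["ann", "bob", "cat"]
long_thread = ["ann", "bob", "ann", "bob", "ann", "bob", "cat", "ann"]

short_ratio = unique_user_ratio(short_thread)
long_ratio = unique_user_ratio(long_thread)
```

In this toy data the longer thread has the lower ratio because a few users post repeatedly, which is the kind of pattern the abstract attributes to mutual discussion.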
PARAMETRIZED EVENT ANALYSIS FROM SOCIAL NETWORKS
The growth of data in social networks drives demand for data analysis, and event detection is of increasing interest to researchers. Real-life events are actively discussed in the virtual space, and event detection results can be used in a variety of applications, from digital marketing to collecting data about natural disasters. As a result, new algorithms keep emerging alongside improvements to existing solutions. This paper proposes improvements to the SEDTWik (Segment-based Event Detection from Tweets using Wikipedia) algorithm. SEDTWik is designed to detect events without contextual guidance: its detection process has no topic-guided, multi-topic-guided, or semi-supervised component. As a result, some interesting, narrowly focused events go undetected because they are only weakly relevant in a broad context (e.g., Wikipedia), even though they become relevant within a conditioned context. There is therefore a need for an adaptive perspective in which data is analysed against a set of narrower topics of interest. This paper shows that SEDTWik gains expressive power when extended with multi-topic semi-supervision. The proposal is evaluated on Events2012, a well-known corpus with labeled events. Events2012 assigns each event a category, so events are grouped by topic. SEDTWik with topic dictionaries was checked across all categories. The main part of the article also explains how topic dictionaries are constructed from the labeled Events2012 tweets. At this stage of the research, unigrams were used in all tasks. SEDTWik with dictionaries showed improved accuracy, and more events were found within a given category.
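The abstract above mentions building unigram topic dictionaries from category-labeled tweets. One simple way this could be done is sketched below: count unigrams per category and keep the most frequent ones. This is an illustrative assumption, not the procedure described in the paper; the function name, tokenizer, `top_k` parameter, and sample tweets are all hypothetical, and no stop-word filtering is applied.

```python
import re
from collections import Counter, defaultdict


def build_topic_dictionaries(labeled_tweets, top_k=2):
    """Map each category to its top_k most frequent unigrams.

    labeled_tweets: iterable of (category, text) pairs (hypothetical layout).
    """
    counts = defaultdict(Counter)
    for category, text in labeled_tweets:
        # Naive tokenizer: lowercase runs of letters, digits, and apostrophes.
        counts[category].update(re.findall(r"[a-z0-9']+", text.lower()))
    return {
        category: [word for word, _ in counter.most_common(top_k)]
        for category, counter in counts.items()
    }


# Toy tweets, invented for this example.
tweets = [
    ("sports", "Goal! Late goal wins the match"),
    ("sports", "Match report: goal in extra time"),
    ("disasters", "Earthquake hits coast, earthquake warning issued"),
]
dictionaries = build_topic_dictionaries(tweets, top_k=2)
```

A real pipeline would likely need stop-word removal and some weighting so that words common to every category do not dominate the dictionaries.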