8,148 research outputs found
Extracting News Events from Microblogs
Twitter stream has become a large source of information for many people, but
the magnitude of tweets and the noisy nature of its content have made
harvesting the knowledge from Twitter a challenging task for researchers for a
long time. Aiming at overcoming some of the main challenges of extracting the
hidden information from tweet streams, this work proposes a new approach for
real-time detection of news events from the Twitter stream. We divide our
approach into three steps. The first step is to use a neural network or deep
learning to detect news-relevant tweets from the stream. The second step is to
apply a novel streaming data clustering algorithm to the detected news tweets
to form news events. The third and final step is to rank the detected events
based on the size of the event clusters and growth speed of the tweet
frequencies. We evaluate the proposed system on a large, publicly available
corpus of annotated news events from Twitter. As part of the evaluation, we
compare our approach with a related state-of-the-art solution. Overall, our
experiments and user-based evaluation show that our approach on detecting
current (real) news events delivers a state-of-the-art performance
Event detection, tracking, and visualization in Twitter: a mention-anomaly-based approach
The ever-growing number of people using Twitter makes it a valuable source of
timely information. However, detecting events in Twitter is a difficult task,
because tweets that report interesting events are overwhelmed by a large volume
of tweets on unrelated topics. Existing methods focus on the textual content of
tweets and ignore the social aspect of Twitter. In this paper we propose MABED
(i.e. mention-anomaly-based event detection), a novel statistical method that
relies solely on tweets and leverages the creation frequency of dynamic links
(i.e. mentions) that users insert in tweets to detect significant events and
estimate the magnitude of their impact over the crowd. MABED also differs from
the literature in that it dynamically estimates the period of time during which
each event is discussed, rather than assuming a predefined fixed duration for
all events. The experiments we conducted on both English and French Twitter
data show that the mention-anomaly-based approach leads to more accurate event
detection and improved robustness in presence of noisy Twitter content.
Qualitatively speaking, we find that MABED helps with the interpretation of
detected events by providing clear textual descriptions and precise temporal
descriptions. We also show how MABED can help understanding users' interest.
Furthermore, we describe three visualizations designed to favor an efficient
exploration of the detected events.Comment: 17 page
- …