26 research outputs found
Scalable distributed event detection for Twitter
Social media streams, such as Twitter, have shown themselves to be useful sources of real-time information about what is happening in the world. Automatic detection and tracking of events identified in these streams have a variety of real-world applications, e.g. identifying and automatically reporting road accidents for emergency services. However, to be useful, events need to be identified within the stream with a very low latency. This is challenging due to the high volume of posts within these social streams. In this paper, we propose a novel event detection approach that can both effectively detect events within social streams like Twitter and scale to thousands of posts every second. Through experimentation on a large Twitter dataset, we show that our approach can process the equivalent of the full Twitter Firehose stream, while maintaining event detection accuracy and outperforming an alternative distributed event detection system.
SUPER: Towards the Use of Social Sensors for Security Assessments and Proactive Management of Emergencies
Social media statistics during recent disasters (e.g. the 20 million tweets relating to the 'Sandy' storm and the sharing of related photos on Instagram at a rate of 10/sec) suggest that the understanding and management of real-world events by civil protection and law enforcement agencies could benefit from the effective blending of social media information into their resilience processes. In this paper, we argue that despite the widespread use of social media in various domains (e.g. marketing/branding/finance), there is still no easy, standardized and effective way to leverage different social media streams -- also referred to as social sensors -- in security/emergency management applications. We also describe the EU FP7 project SUPER (Social sensors for secUrity assessments and Proactive EmeRgencies management), started in 2014, which aims to tackle this technology gap.
Tweeting Behaviour during Train Disruptions within a City
In a smart city environment, citizens use social media for communicating and reporting events. Existing
work has shown that social media tools, such as Twitter and Facebook, can be used as social sensors to monitor
events in real-time as they happen (e.g. riots, natural disasters and sport events). In this paper, we study the
reactions of citizens in social media towards train disruptions within a city. Our study using 30 days of tweets in a large city shows that citizens react differently to train disruptions by, for instance, displaying unique behaviours in tweeting depending on the time of the disruption. Specifically, for working days, tweets related to train disruptions are typically generated during rush hour periods. In contrast, during weekends, urban citizens tend to tweet about train disruptions during late evenings. Using these insights, we develop a supervised approach to predict whether a train disruption tweet will be retweeted and propagated on the social network, by using features such as time, user, and the content of tweets. Our experimental results show that we can effectively predict when a train disruption tweet is retweeted by using such features.
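The supervised retweet prediction described in this abstract can be illustrated with a minimal sketch. The abstract names only the feature families (time, user, content); the specific features, the rush-hour and late-evening boundaries, and the choice of logistic regression below are assumptions made for illustration, not the authors' method.

```python
import math
from datetime import datetime


def extract_features(text, posted_at, followers):
    """Map one tweet to a numeric vector of time, user and content features.

    All feature definitions here are hypothetical choices for illustration.
    """
    hour = posted_at.hour
    is_weekday = posted_at.weekday() < 5
    # Assumed rush-hour windows on working days: 07:00-10:00 and 16:00-19:00.
    rush_hour = is_weekday and (7 <= hour < 10 or 16 <= hour < 19)
    # Assumed late-evening window on weekends: 21:00 onwards.
    late_evening = (not is_weekday) and hour >= 21
    return [
        1.0,                               # bias term
        1.0 if rush_hour else 0.0,         # time feature
        1.0 if late_evening else 0.0,      # time feature
        math.log1p(followers),             # user feature
        1.0 if "http" in text else 0.0,    # content feature: has a link
        1.0 if "#" in text else 0.0,       # content feature: has a hashtag
    ]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, x):
    """Estimated probability that the tweet with features x is retweeted."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)))


def train(rows, labels, epochs=500, lr=0.1):
    """Fit logistic regression weights with plain stochastic gradient ascent."""
    weights = [0.0] * len(rows[0])
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            error = y - predict_proba(weights, x)
            for i, v in enumerate(x):
                weights[i] += lr * error * v
    return weights
```

In practice the feature rows would come from labelled tweets (retweeted or not), and the learned weights would indicate which of the assumed features carry predictive signal.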
A Tutorial on Event Detection using Social Media Data Analysis: Applications, Challenges, and Open Problems
In recent years, social media has become one of the most popular platforms
for communication. These platforms allow users to report real-world incidents
that might swiftly and widely circulate throughout the whole social network. A
social event is a real-world incident that is documented on social media.
Social gatherings could contain vital documentation of crisis scenarios.
Monitoring and analyzing this rich content can produce information that is
extraordinarily valuable and help people and organizations learn how to take
action. In this paper, a survey on the potential benefits and applications of
event detection with social media data analysis will be presented. Moreover,
the critical challenges and the fundamental tradeoffs in event detection will
be methodically investigated by monitoring social media streams. Then,
fundamental open questions and possible research directions will be introduced.
Analyzing the Language of Food on Social Media
We investigate the predictive power behind the language of food on social
media. We collect a corpus of over three million food-related posts from
Twitter and demonstrate that many latent population characteristics can be
directly predicted from this data: overweight rate, diabetes rate, political
leaning, and home geographical location of authors. For all tasks, our
language-based models significantly outperform the majority-class baselines.
Performance is further improved with more complex natural language processing,
such as topic modeling. We analyze which textual features have most predictive
power for these datasets, providing insight into the connections between the
language of food, geographic locale, and community characteristics. Lastly, we
design and implement an online system for real-time query and visualization of
the dataset. Visualization tools, such as geo-referenced heatmaps,
semantics-preserving wordclouds and temporal histograms, allow us to discover
more complex, global patterns mirrored in the language of food.
Comment: An extended abstract of this paper will appear in IEEE Big Data 201
EAIMS: Emergency Analysis Identification and Management System
Social media has great potential as a means to enable civil
protection and law enforcement agencies to more effectively
tackle disasters and emergencies. However, there is currently
a lack of tools that enable civil protection agencies
to easily make use of social media. The Emergency Analysis
Identification and Management System (EAIMS) is a prototype
service that provides real-time detection of emergency
events, related information finding and credibility analysis
tools for use over social media during emergencies. This
system exploits machine learning over data gathered from
past emergencies and disasters to build effective models for
identifying new events as they occur, tracking developments
within those events and analyzing those developments for
the purposes of enhancing the decision making processes of
emergency response agencies.
Detecting Vital Documents in Massive Data Streams
Existing knowledge bases, including Wikipedia, are typically written and maintained by a group of voluntary editors. Meanwhile, numerous web documents are being published, partly due to the popularization of online news and social media. Some of these web documents, called "vital documents", contain novel information that should be taken into account in updating articles of the knowledge bases. However, it is practically impossible for the editors to manually monitor all the relevant web documents. Consequently, there is a considerable time lag between an edit to the knowledge base and the publication dates of such vital documents. This paper proposes a real-time detection framework for web documents containing novel information flowing in massive document streams. The framework consists of a two-step filter using statistical language models. Further, the framework is implemented on the distributed and fault-tolerant real-time computation system, Apache Storm, in order to process the large number of web documents. On a publicly available web document data set, the TREC KBA Stream Corpus, the validity of the proposed framework is demonstrated in terms of the detection performance and processing time.
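A two-step filter built on statistical language models might look like the following sketch. The abstract does not say what each step does, so the split into a relevance step and a novelty step, the unigram models with add-one smoothing, the fixed vocabulary size and the novelty threshold are all assumptions made for illustration; the Apache Storm deployment is not shown.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately simple assumption)."""
    return text.lower().split()


class UnigramLM:
    """Unigram language model with add-one (Laplace) smoothing."""

    def __init__(self, texts, vocab_size=1000):
        self.counts = Counter(tok for t in texts for tok in tokenize(t))
        self.total = sum(self.counts.values())
        self.vocab_size = vocab_size  # assumed fixed vocabulary size

    def cross_entropy(self, text):
        """Average negative log-probability per token (lower = better fit)."""
        tokens = tokenize(text)
        if not tokens:
            return float("inf")
        log_prob = sum(
            math.log((self.counts[t] + 1) / (self.total + self.vocab_size))
            for t in tokens
        )
        return -log_prob / len(tokens)


def is_vital(doc, topic_lm, background_lm, article_lm, novelty_threshold):
    """Hypothetical two-step filter over one incoming document.

    Step 1 (relevance): keep the document only if the topic model explains
    it better than the background model.
    Step 2 (novelty): keep it only if the model of the current knowledge-base
    article explains it poorly, i.e. it likely says something new.
    """
    if topic_lm.cross_entropy(doc) >= background_lm.cross_entropy(doc):
        return False
    return article_lm.cross_entropy(doc) > novelty_threshold
```

Running the cheap relevance test first means the novelty comparison is only evaluated for documents that already look on-topic, which is one plausible reason to stage the filter in two steps.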
Event Detection from Social Media Stream: Methods, Datasets and Opportunities
Social media streams contain a large and diverse amount of information, ranging
from daily-life stories to the latest global and local events and news.
Twitter, especially, allows a fast spread of events happening in real time, and
enables individuals and organizations to stay informed of the events happening
now. Event detection from social media data poses different challenges from
traditional text and is a research area that has attracted much attention in
recent years. In this paper, we survey a wide range of event detection methods
for Twitter data streams, helping readers understand the recent development in
this area. We present the datasets available to the public. Furthermore, a few
research opportunities
Comment: 8 page