6,321 research outputs found
Engineering Crowdsourced Stream Processing Systems
A crowdsourced stream processing system (CSP) is a system that incorporates
crowdsourced tasks in the processing of a data stream. This can be seen as
enabling crowdsourcing work to be applied on a sample of large-scale data at
high speed, or equivalently, enabling stream processing to employ human
intelligence. It also leads to a substantial expansion of the capabilities of
data processing systems. Engineering a CSP system requires the combination of
human and machine computation elements. From a general systems theory
perspective, this means taking into account inherited as well as emerging
properties from both these elements. In this paper, we position CSP systems
within a broader taxonomy, outline a series of design principles and evaluation
metrics, present an extensible framework for their design, and describe several
design patterns. We showcase the capabilities of CSP systems by performing a
case study that applies our proposed framework to the design and analysis of a
real system (AIDR) that classifies social media messages during time-critical
crisis events. Results show that compared to a pure stream processing system,
AIDR can achieve a higher data classification accuracy, while compared to a
pure crowdsourcing solution, the system makes better use of human workers by
requiring much less manual work effort
Crowdsourced Rumour Identification During Emergencies
When a significant event occurs, many social media users leverage platforms such as Twitter to track that event. Moreover, emergency response agencies are increasingly looking to social media as a source of real-time information about such events. However, false information and rumours are often spread during such events, which can influence public opinion and limit the usefulness of social media for emergency management. In this paper, we present an initial study into rumour identification during emergencies using crowdsourcing. In particular, through an analysis of three tweet datasets relating to emergency events from 2014, we propose a taxonomy of tweets relating to rumours. We then perform a crowdsourced labeling experiment to determine whether crowd assessors can identify rumour-related tweets and where such labeling can fail. Our results show that overall, agreement over the tweet labels produced were high (0.7634 Fleiss Kappa), indicating that crowd-based rumour labeling is possible. However, not all tweets are of equal difficulty to assess. Indeed, we show that tweets containing disputed/controversial information tend to be some of the most difficult to identify
Crisis Analytics: Big Data Driven Crisis Response
Disasters have long been a scourge for humanity. With the advances in
technology (in terms of computing, communications, and the ability to process
and analyze big data), our ability to respond to disasters is at an inflection
point. There is great optimism that big data tools can be leveraged to process
the large amounts of crisis-related data (in the form of user generated data in
addition to the traditional humanitarian data) to provide an insight into the
fast-changing situation and help drive an effective disaster response. This
article introduces the history and the future of big crisis data analytics,
along with a discussion on its promise, challenges, and pitfalls
- …