Crowdsourcing Without a Crowd: Reliable Online Species Identification Using Bayesian Models to Minimize Crowd Size
We present an incremental Bayesian model that resolves key issues of crowd size and data quality for consensus labeling. We evaluate our method using data collected from a real-world citizen science program, BeeWatch, which invites members of the public in the United Kingdom to classify (label) photographs of bumblebees as one of 22 possible species. The biological recording domain poses two key and hitherto unaddressed challenges for consensus models of crowdsourcing: (1) the large number of potential species makes classification difficult, and (2) this is compounded by limited crowd availability, stemming from both the inherent difficulty of the task and the lack of relevant skills among the general public. We demonstrate that consensus labels can be reliably found in such circumstances with very small crowd sizes of around three to five users (i.e., through group sourcing). Our incremental Bayesian model, which minimizes crowd size by re-evaluating the quality of the consensus label following each species identification solicited from the crowd, is competitive with a Bayesian approach that uses a larger but fixed crowd size and outperforms majority voting. These results have important ecological applicability: biological recording programs such as BeeWatch can sustain themselves when resources such as taxonomic experts to confirm identifications by photo submitters are scarce (as is typically the case), and feedback can be provided to submitters in a timely fashion. More generally, our model provides benefits to any crowdsourced consensus labeling task where there is a cost (financial or otherwise) associated with soliciting a label.
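The idea of re-evaluating the consensus after each solicited label can be illustrated with a minimal sketch. This is not the authors' model: it assumes a uniform prior over the 22 species, a single shared worker accuracy, and errors spread evenly over the remaining species. The function name `incremental_consensus` and the parameters `accuracy`, `threshold`, and `max_labels` are hypothetical, chosen only for illustration.

```python
from typing import Iterable, Tuple


def incremental_consensus(
    labels: Iterable[int],
    n_classes: int = 22,       # 22 species, as in the abstract
    accuracy: float = 0.7,     # assumed shared worker accuracy (hypothetical)
    threshold: float = 0.95,   # assumed stopping confidence (hypothetical)
    max_labels: int = 5,       # assumed cap on crowd size (hypothetical)
) -> Tuple[int, float, int]:
    """Return (consensus class, its posterior probability, labels used)."""
    # Uniform prior over all classes.
    posterior = [1.0 / n_classes] * n_classes
    # Probability that a worker reports one specific wrong class.
    wrong = (1.0 - accuracy) / (n_classes - 1)
    used = 0
    for label in labels:
        if used >= max_labels:
            break
        # Bayes update: multiply by the likelihood of the observed label
        # under each candidate true class, then renormalize.
        posterior = [
            p * (accuracy if c == label else wrong)
            for c, p in enumerate(posterior)
        ]
        total = sum(posterior)
        posterior = [p / total for p in posterior]
        used += 1
        # Stop soliciting labels once the consensus is confident enough.
        if max(posterior) >= threshold:
            break
    best = max(range(n_classes), key=lambda c: posterior[c])
    return best, posterior[best], used


if __name__ == "__main__":
    # Two agreeing labels already exceed the assumed threshold.
    print(incremental_consensus([3, 3, 3, 3, 3]))
    # One dissenting label delays the decision to the third label.
    print(incremental_consensus([3, 5, 3, 3, 3]))
```

Under these assumptions, agreement among the first few workers ends the solicitation early, while a disagreement triggers further requests. A more faithful model would estimate per-worker, per-species confusion rather than one shared accuracy.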
Wireless Communications in the Era of Big Data
The rapidly growing wave of wireless data service is pushing against the
boundary of our communication network's processing power. The pervasive and
exponentially increasing data traffic presents imminent challenges to all
aspects of wireless system design, such as spectrum efficiency, computing
capabilities and fronthaul/backhaul link capacity. In this article, we discuss
the challenges and opportunities in the design of scalable wireless systems to
embrace such a "bigdata" era. On one hand, we review the state-of-the-art
networking architectures and signal processing techniques adaptable for
managing the bigdata traffic in wireless networks. On the other hand, instead
of viewing mobile bigdata as an unwanted burden, we introduce methods to
capitalize on the vast data traffic, for building a bigdata-aware wireless
network with better wireless service quality and new mobile applications. We
highlight several promising future research directions for wireless
communications in the mobile bigdata era.
Comment: This article has been accepted and is to appear in IEEE Communications Magazine.
Considering Human Aspects on Strategies for Designing and Managing Distributed Human Computation
A human computation system can be viewed as a distributed system in which the
processors are humans, called workers. Such systems harness the cognitive power
of a group of workers connected to the Internet to execute relatively simple
tasks, whose solutions, once grouped, solve a problem that systems equipped
with only machines could not solve satisfactorily. Examples of such systems are
Amazon Mechanical Turk and the Zooniverse platform. A human computation
application comprises a group of tasks, each of which can be performed by one
worker. Tasks might have dependencies among each other. In this study, we
propose a theoretical framework to analyze this type of application from a
distributed systems point of view. Our framework is established on three
dimensions that represent different perspectives in which human computation
applications can be approached: quality-of-service requirements, design and
management strategies, and human aspects. By using this framework, we review
human computation from the perspective of programmers seeking to improve the
design of human computation applications and managers seeking to increase the
effectiveness of human computation infrastructures in running such
applications. In doing so, besides integrating and organizing what has been
done in this direction, we also put into perspective the fact that the human
aspects of the workers in such systems introduce new challenges in terms of,
for example, task assignment, dependency management, and fault prevention and
tolerance. We discuss how they are related to distributed systems and other
areas of knowledge.
Comment: 3 figures, 1 table
Challenges in Complex Systems Science
FuturICT foundations are social science, complex systems science, and ICT.
The main concerns and challenges in the science of complex systems in the
context of FuturICT are laid out in this paper with special emphasis on the
Complex Systems route to Social Sciences. This includes complex systems having:
many heterogeneous interacting parts; multiple scales; complicated transition
laws; unexpected or unpredicted emergence; sensitive dependence on initial
conditions; path-dependent dynamics; networked hierarchical connectivities;
interaction of autonomous agents; self-organisation; non-equilibrium dynamics;
combinatorial explosion; adaptivity to changing environments; co-evolving
subsystems; ill-defined boundaries; and multilevel dynamics. In this context,
science is seen as the process of abstracting the dynamics of systems from
data. This presents many challenges including: data gathering by large-scale
experiment, participatory sensing and social computation, managing huge
distributed dynamic and heterogeneous databases; moving from data to dynamical
models, going beyond correlations to cause-effect relationships, understanding
the relationship between simple and comprehensive models with appropriate
choices of variables, ensemble modeling and data assimilation, modeling systems
of systems of systems with many levels between micro and macro; and formulating
new approaches to prediction, forecasting, and risk, especially in systems that
can reflect on and change their behaviour in response to predictions, and
systems whose apparently predictable behaviour is disrupted by apparently
unpredictable rare or extreme events. These challenges are part of the FuturICT
agenda.