The paradigm-shift of social spambots: Evidence, theories, and tools for the arms race
Recent studies in social media spam and automation provide anecdotal evidence
of the rise of a new generation of spambots, so-called social
spambots. Here, for the first time, we extensively study this novel phenomenon
on Twitter and we provide quantitative evidence that a paradigm-shift exists in
spambot design. First, we measure Twitter's current capabilities of detecting
the new social spambots. Next, we assess human performance in
discriminating between genuine accounts, social spambots, and traditional
spambots. Then, we benchmark several state-of-the-art techniques proposed by
the academic literature. Results show that neither Twitter, nor humans, nor
cutting-edge applications are currently capable of accurately detecting the new
social spambots. Our results call for new approaches capable of turning the
tide in the fight against this rising phenomenon. We conclude by reviewing the
latest literature on spambot detection and highlight an emerging common
research trend based on the analysis of collective behaviors. Insights derived
from both our extensive experimental campaign and survey shed light on the most
promising directions of research and lay the foundations for the arms race
against the novel social spambots. Finally, to foster research on this novel
phenomenon, we make publicly available to the scientific community all the
datasets used in this study.
Comment: To appear in Proc. 26th WWW, 2017, Companion Volume (Web Science
Track, Perth, Australia, 3-7 April, 2017).
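The collective-behavior trend mentioned in the abstract can be illustrated with a toy sketch (the function names, action encoding, and 0.9 threshold below are all hypothetical illustrations, not the paper's method): encode each account's timeline as a string of action types and flag groups of accounts whose behavioral sequences are near-identical, a pattern genuine users rarely exhibit.

```python
from difflib import SequenceMatcher

def behavior_string(actions):
    """Encode a timeline as one character per action type."""
    codes = {"tweet": "T", "retweet": "R", "reply": "P"}
    return "".join(codes[a] for a in actions)

def flag_coordinated(accounts, threshold=0.9):
    """Flag accounts whose behavioral sequences are near-identical.

    `accounts` maps an account id to its list of actions; the 0.9
    similarity threshold is an arbitrary illustrative choice.
    """
    encoded = {aid: behavior_string(acts) for aid, acts in accounts.items()}
    ids = sorted(encoded)
    flagged = set()
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            sim = SequenceMatcher(None, encoded[a], encoded[b]).ratio()
            if sim >= threshold:
                flagged.update({a, b})
    return flagged

# Two bots replay the same scripted timeline; the human's is idiosyncratic.
bots = {
    "bot1": ["tweet", "retweet", "retweet", "reply"] * 5,
    "bot2": ["tweet", "retweet", "retweet", "reply"] * 5,
    "human": ["reply", "tweet", "reply", "retweet", "tweet",
              "tweet", "reply", "retweet", "reply", "tweet"] * 2,
}
print(flag_coordinated(bots))  # {'bot1', 'bot2'}
```

The point of the sketch is the shift in unit of analysis: the signal lives in the similarity *between* accounts, not in any single account's features.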
Certified Computation from Unreliable Datasets
A wide range of learning tasks require human input in labeling massive data.
The collected data, though, are usually of low quality and contain inaccuracies
and errors. As a result, modern science and business face the problem of
learning from unreliable datasets.
In this work, we provide a generic approach based on \textit{verification} of
only a few records of the dataset to guarantee high-quality learning outcomes
for various optimization objectives. Our method identifies small sets of
critical records and verifies their validity. We show that many problems need
only a small number of verifications to ensure that the output of the
computation is within a small multiplicative factor of the truth. For any
given instance, we provide an
\textit{instance optimal} solution that verifies the minimum possible number of
records to approximately certify correctness. Then, using this instance-optimal
formulation of the problem, we prove our main result: "every function that
satisfies some Lipschitz continuity condition can be certified with a small
number of verifications". We show that the required Lipschitz continuity
condition is satisfied even by some NP-complete problems, which illustrates the
generality and importance of this theorem.
If the certification step fails, an invalid record is identified. Removing
these records and repeating until success guarantees that the result is
accurate and depends only on the verified records. Surprisingly, as
we show, for several computation tasks more efficient methods are possible.
These methods always guarantee that the produced result is not affected by the
invalid records, since any invalid record that affects the output will be
detected and verified.
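The verify-and-repeat loop described above can be sketched for a concrete task, mean estimation (a minimal illustration under stated assumptions: the "critical records are the ones pulling the mean hardest" heuristic and the `verify` interface below are hypothetical, not taken from the paper):

```python
def certify_mean(records, verify, max_rounds=100):
    """Repeatedly verify the records that most influence the mean,
    dropping any that turn out to be invalid, until a round passes
    with no invalid record found.

    `verify(r)` is assumed to return the record's true value, or
    None if the record is invalid and must be discarded.
    """
    data = list(records)
    for _ in range(max_rounds):
        mean = sum(data) / len(data)
        # Critical records: those pulling the mean hardest (illustrative choice).
        critical = sorted(data, key=lambda r: abs(r - mean), reverse=True)[:3]
        invalid = [r for r in critical if verify(r) is None]
        if not invalid:
            return mean  # certification succeeded
        data = [r for r in data if r not in invalid]
    raise RuntimeError("certification did not converge")

# Five honest measurements plus one corrupted outlier.
truth = [10, 11, 9, 10, 12]
corrupted = truth + [1000]
verify = lambda r: None if r == 1000 else r
print(certify_mean(corrupted, verify))  # 10.4
```

The corrupted record dominates the first mean, so it is selected for verification, exposed as invalid, and removed; the second round certifies the mean of the remaining records without ever verifying most of them.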
Time Critical Social Mobilization: The DARPA Network Challenge Winning Strategy
It is now commonplace to see the Web as a platform that can harness the
collective abilities of large numbers of people to accomplish tasks with
unprecedented speed, accuracy and scale. To push this idea to its limit, DARPA
launched its Network Challenge, which aimed to "explore the roles the Internet
and social networking play in the timely communication, wide-area
team-building, and urgent mobilization required to solve broad-scope,
time-critical problems." The challenge required teams to provide coordinates of
ten red weather balloons placed at different locations in the continental
United States. This large-scale mobilization required the ability to spread
information about the tasks widely and quickly, and to incentivize individuals
to act. We report on the winning team's strategy, which utilized a novel
recursive incentive mechanism to find all balloons in under nine hours. We
analyze the theoretical properties of the mechanism, and present data about its
performance in the challenge.
Comment: 25 pages, 6 figures.
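The recursive incentive mechanism can be illustrated in a few lines. The winning team promised the finder of each balloon a reward and gave each person up the recruitment chain half of what the person they recruited received, so the total payout per balloon is bounded even for arbitrarily long chains (the function and chain below are an illustrative sketch, not the team's code):

```python
def payouts(chain, base=2000.0):
    """Payouts along a recruitment chain for one balloon.

    `chain` lists people from the balloon finder up to the root
    recruiter; each person receives half of what the person they
    recruited (the previous entry in the list) received.
    """
    rewards = {}
    amount = base
    for person in chain:
        rewards[person] = amount
        amount /= 2
    return rewards

# Alice found the balloon; Bob recruited Alice; Carol recruited Bob.
print(payouts(["alice", "bob", "carol"]))
# {'alice': 2000.0, 'bob': 1000.0, 'carol': 500.0}
```

Because the per-level rewards form a geometric series, the total payout per balloon never exceeds twice the base prize, yet every participant has a direct incentive both to search and to recruit more searchers.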
An Abstract Formal Basis for Digital Crowds
Crowdsourcing, together with its related approaches, has become very popular
in recent years. All crowdsourcing processes involve the participation of a
digital crowd, a large number of people that access a single Internet platform
or shared service. In this paper we explore the possibility of applying formal
methods, typically used for the verification of software and hardware systems,
in analysing the behaviour of a digital crowd. More precisely, we provide a
formal description language for specifying digital crowds. We represent digital
crowds in which the agents do not directly communicate with each other. We
further show how this specification can provide the basis for sophisticated
formal methods, in particular formal verification.
Comment: 32 pages, 4 figures.
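As a toy illustration of the style of analysis the abstract has in mind (the model below is entirely hypothetical and is not the paper's specification language): agents that never communicate directly but all act through one shared platform can be checked by exhaustively exploring the interleavings of their actions and asserting a safety property over every reachable state.

```python
from itertools import permutations

def reachable_states(n_agents, capacity):
    """Explore all interleavings of n agents each posting once to a
    shared platform with a bounded queue (posts are dropped when
    the platform is full)."""
    states = set()
    for order in permutations(range(n_agents)):
        platform = []
        states.add(tuple(platform))
        for agent in order:
            if len(platform) < capacity:  # platform accepts the post
                platform.append(agent)
            states.add(tuple(platform))
    return states

def check_safety(states, capacity):
    """Safety property: the platform never holds more than `capacity` posts."""
    return all(len(s) <= capacity for s in states)

states = reachable_states(n_agents=3, capacity=2)
print(check_safety(states, capacity=2))  # True
```

The agents interact only through the shared `platform` state, mirroring the paper's setting in which crowd members do not address each other directly; scaling this naive enumeration to realistic crowd sizes is exactly the kind of problem that motivates dedicated formal-verification machinery.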