
    The paradigm-shift of social spambots: Evidence, theories, and tools for the arms race

    Recent studies in social media spam and automation provide anecdotal evidence of the rise of a new generation of spambots, so-called social spambots. Here, for the first time, we extensively study this novel phenomenon on Twitter and provide quantitative evidence that a paradigm-shift exists in spambot design. First, we measure Twitter's current capabilities for detecting the new social spambots. Next, we assess human performance in discriminating between genuine accounts, social spambots, and traditional spambots. Then, we benchmark several state-of-the-art techniques proposed in the academic literature. Results show that neither Twitter, nor humans, nor cutting-edge applications are currently capable of accurately detecting the new social spambots. Our results call for new approaches capable of turning the tide in the fight against this rising phenomenon. We conclude by reviewing the latest literature on spambot detection and highlight an emerging common research trend based on the analysis of collective behaviors. Insights derived from both our extensive experimental campaign and our survey shed light on the most promising directions of research and lay the foundations for the arms race against the novel social spambots. Finally, to foster research on this novel phenomenon, we make all the datasets used in this study publicly available to the scientific community. Comment: To appear in Proc. 26th WWW, 2017, Companion Volume (Web Science Track, Perth, Australia, 3-7 April 2017).
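    The survey's closing observation, that the most promising detection approaches analyse collective behaviour rather than individual profiles, can be illustrated with a small, purely hypothetical sketch. The snippet below is not the paper's method: it assumes each account's recent timeline is encoded as a short string of made-up action codes, and it flags pairs of accounts whose behavioural strings share an unusually long common substring, on the intuition that coordinated spambots act near-identically.

```python
# Hypothetical illustration of collective-behavior analysis for spambot
# detection: accounts whose action sequences are nearly identical are
# paired and flagged. The action codes, example accounts, and similarity
# threshold are assumptions for this sketch, not values from the paper.
from itertools import combinations

# Each account's timeline encoded as a string of action codes,
# e.g. T = tweet, R = retweet, L = like, F = follow.
accounts = {
    "alice":  "TLFRTLLFTR",
    "bob":    "FTRLTFLRTT",
    "bot_01": "TTRRTTRRTTRR",
    "bot_02": "TTRRTTRRTTRR",
    "bot_03": "TTRRTTRRTTRL",
}

def longest_common_substring(a: str, b: str) -> int:
    """Length of the longest common substring, via dynamic programming."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0] * (len(b) + 1)
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

# Flag pairs whose shared behavior covers most of their timelines.
THRESHOLD = 0.8  # assumed fraction; would be tuned on labeled data
for (u, su), (v, sv) in combinations(accounts.items(), 2):
    lcs = longest_common_substring(su, sv)
    if lcs / min(len(su), len(sv)) >= THRESHOLD:
        print(f"{u} and {v} share a {lcs}-action behavioral substring")
```

    In this toy data, only the three bot accounts are flagged, while the two genuine accounts, whose behaviour overlaps only briefly, stay below the threshold.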

    Certified Computation from Unreliable Datasets

    A wide range of learning tasks require human input in labeling massive data. The collected data, however, are usually of low quality and contain inaccuracies and errors. As a result, modern science and business face the problem of learning from unreliable data sets. In this work, we provide a generic approach that is based on \textit{verification} of only a few records of the data set to guarantee high-quality learning outcomes for various optimization objectives. Our method identifies small sets of critical records and verifies their validity. We show that many problems need only $\text{poly}(1/\varepsilon)$ verifications to ensure that the output of the computation is at most a factor of $(1 \pm \varepsilon)$ away from the truth. For any given instance, we provide an \textit{instance optimal} solution that verifies the minimum possible number of records to approximately certify correctness. Then, using this instance-optimal formulation of the problem, we prove our main result: "every function that satisfies some Lipschitz continuity condition can be certified with a small number of verifications". We show that the required Lipschitz continuity condition is satisfied even by some NP-complete problems, which illustrates the generality and importance of this theorem. If the certification step fails, an invalid record is identified. Removing such records and repeating until success guarantees that the result is accurate and depends only on the verified records. Surprisingly, as we show, more efficient methods are possible for several computation tasks. These methods always guarantee that the produced result is not affected by the invalid records, since any invalid record that affects the output will be detected and verified.
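    The verify-and-repeat strategy described above can be made concrete with a toy example. The sketch below is only a rough illustration under assumed names (a hypothetical `verify_record` oracle and an influence heuristic chosen for this sketch), not the paper's instance-optimal algorithm: it verifies the records that most influence a simple mean estimate, drops any record that fails verification, and repeats until all verified critical records pass.

```python
# Minimal sketch of a verify-remove-repeat certification loop, assuming a
# toy task (estimating a mean) and a hypothetical oracle `verify_record`
# that checks whether a record is valid. Not the paper's algorithm.
from statistics import mean

def certify_mean(records, verify_record, budget=5):
    """Return a mean computed only from records that survive verification.

    records       -- list of numeric values collected from the crowd
    verify_record -- callable(value) -> bool, the (expensive) ground-truth check
    budget        -- how many records to verify per round (assumed heuristic)
    """
    active = list(records)
    while active:
        estimate = mean(active)
        # Heuristic: records farthest from the current estimate have the
        # largest influence on it, so verify those first.
        critical = sorted(active, key=lambda x: abs(x - estimate), reverse=True)[:budget]
        invalid = [x for x in critical if not verify_record(x)]
        if not invalid:
            return estimate            # certification succeeded
        for x in invalid:              # certification failed: drop bad records
            active.remove(x)
    raise ValueError("no valid records remain")

# Toy usage: true values near 10, with two corrupted outliers.
data = [9.8, 10.1, 10.0, 9.9, 120.0, -50.0, 10.2]
print(certify_mean(data, verify_record=lambda x: 5 < x < 15))
```

    On this toy data the two outliers are the most influential records, so they are verified first, rejected, and removed; the second round certifies the remaining records and returns a mean of 10.0.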

    Time Critical Social Mobilization: The DARPA Network Challenge Winning Strategy

    It is now commonplace to see the Web as a platform that can harness the collective abilities of large numbers of people to accomplish tasks with unprecedented speed, accuracy and scale. To push this idea to its limit, DARPA launched its Network Challenge, which aimed to "explore the roles the Internet and social networking play in the timely communication, wide-area team-building, and urgent mobilization required to solve broad-scope, time-critical problems." The challenge required teams to provide the coordinates of ten red weather balloons placed at different locations in the continental United States. This large-scale mobilization required the ability to spread information about the task widely and quickly, and to incentivize individuals to act. We report on the winning team's strategy, which utilized a novel recursive incentive mechanism to find all balloons in under nine hours. We analyze the theoretical properties of the mechanism, and present data about its performance in the challenge. Comment: 25 pages, 6 figures.
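    As reported by the winning team, the recursive incentive mechanism split each balloon's prize along the referral chain: the finder received the largest share and each recruiter further up the chain received half of the amount paid one step below, so the total cost per balloon is bounded by a geometric series. The sketch below illustrates that payout rule; the $2,000 finder amount and the example chain are illustrative assumptions rather than figures stated in this abstract.

```python
# Sketch of a recursive incentive payout: each balloon's reward halves as
# it moves up the referral chain (finder -> recruiter -> recruiter's
# recruiter -> ...). The per-balloon figure and the chain are illustrative.
def recursive_payouts(referral_chain, finder_reward=2000.0):
    """Compute payouts for one found balloon.

    referral_chain -- participants from the finder up to the root recruiter
    finder_reward  -- amount paid to the finder; each step up the chain
                      receives half of the previous amount
    """
    payouts = {}
    reward = finder_reward
    for person in referral_chain:
        payouts[person] = reward
        reward /= 2.0
    return payouts

chain = ["Dana (finder)", "Carol", "Bob", "Alice"]
for person, amount in recursive_payouts(chain).items():
    print(f"{person}: ${amount:,.2f}")
# Total paid out is always less than 2 * finder_reward, so the cost per
# balloon stays bounded no matter how long the referral chain grows.
```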

    An Abstract Formal Basis for Digital Crowds

    Crowdsourcing, together with its related approaches, has become very popular in recent years. All crowdsourcing processes involve the participation of a digital crowd, a large number of people that access a single Internet platform or shared service. In this paper we explore the possibility of applying formal methods, typically used for the verification of software and hardware systems, to the analysis of the behaviour of a digital crowd. More precisely, we provide a formal description language for specifying digital crowds. We represent digital crowds in which the agents do not communicate directly with each other. We further show how this specification can provide the basis for sophisticated formal methods, in particular formal verification. Comment: 32 pages, 4 figures.
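    As a loose illustration of what verifying a digital crowd can look like (not the specification language proposed in the paper), the sketch below models each agent as a tiny independent transition system, enumerates the reachable joint states of a small crowd with no inter-agent communication, and exhaustively checks a simple invariant, reporting a counterexample state when the invariant fails.

```python
# Toy illustration of model-checking a crowd of non-communicating agents.
# Each agent is an independent transition system over the states
# idle -> working -> done; the property checked is a simple invariant.
# The agent model and property are illustrative assumptions, not the
# specification language proposed in the paper.
from itertools import product

TRANSITIONS = {
    "idle":    ["idle", "working"],
    "working": ["working", "done"],
    "done":    ["done"],
}

def successors(crowd_state):
    """All joint successor states; agents move independently (no communication)."""
    return [tuple(s) for s in product(*(TRANSITIONS[a] for a in crowd_state))]

def check_invariant(n_agents, invariant, max_depth=10):
    """Breadth-first exploration of reachable joint states up to max_depth."""
    initial = tuple(["idle"] * n_agents)
    seen, frontier = {initial}, [initial]
    for _ in range(max_depth):
        nxt = []
        for state in frontier:
            for succ in successors(state):
                if succ not in seen:
                    if not invariant(succ):
                        return False, succ       # counterexample found
                    seen.add(succ)
                    nxt.append(succ)
        frontier = nxt
    return True, None

# Invariant to check: at no point is every agent simultaneously "working".
ok, counterexample = check_invariant(3, lambda s: s.count("working") < len(s))
print("invariant holds" if ok else f"violated in state {counterexample}")
```

    In this toy crowd the invariant fails, and the exhaustive search returns the violating joint state, which is exactly the kind of counterexample evidence formal verification is meant to provide.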