Disruption and Deception in Crowdsourcing: Towards a Crowdsourcing Risk Framework
While crowdsourcing has become increasingly popular among organizations, it has also become increasingly susceptible to unethical and malicious activities. This paper discusses recent examples of disruptive and deceptive efforts on crowdsourcing sites, which impacted the confidentiality, integrity, and availability of the crowdsourcing efforts’ service, stakeholders, and data. From these examples, we derive an organizing framework of risk types associated with disruption and deception in crowdsourcing, based on commonalities among incidents. The framework includes prank activities, the intentional placement of false information, hacking attempts, DDoS attacks, botnet attacks, privacy violation attempts, and data breaches. Finally, we discuss example controls that can assist in identifying and mitigating disruption and deception risks in crowdsourcing.
Online Deception Detection Refueled by Real World Data Collection
The lack of large realistic datasets presents a bottleneck in online
deception detection studies. In this paper, we apply a data collection method
based on social network analysis to quickly identify high-quality deceptive and
truthful online reviews from Amazon. The dataset contains more than 10,000
deceptive reviews and is diverse in product domains and reviewers. Using this
dataset, we explore effective general features for online deception detection
that perform well across domains. We demonstrate that, with generalized
features (advertising speak and writing complexity scores), deception
detection performance can be further improved by adding deceptive reviews
from assorted domains to the training data. Finally, reviewer-level
evaluation gives an interesting insight into different deceptive reviewers' writing styles.
Comment: 10 pages, Accepted to Recent Advances in Natural Language Processing (RANLP) 201
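The two generalized features named in the abstract can be approximated with simple proxies. This is a minimal sketch only: the advertising-speak lexicon and the complexity measure below are illustrative stand-ins, not the paper's actual definitions.

```python
import re

# Hypothetical "advertising speak" lexicon; the paper's actual lexicon
# is not given in the abstract.
AD_SPEAK = {"amazing", "must-have", "best", "perfect", "incredible"}

def ad_speak_score(text: str) -> float:
    """Fraction of tokens drawn from the advertising-speak lexicon."""
    tokens = re.findall(r"[a-z'-]+", text.lower())
    if not tokens:
        return 0.0
    return sum(t in AD_SPEAK for t in tokens) / len(tokens)

def complexity_score(text: str) -> float:
    """Crude writing-complexity proxy: mean sentence length in tokens."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[a-z'-]+", text.lower())
    return len(tokens) / max(len(sentences), 1)

review = "This is an amazing, must-have product. Best purchase ever!"
ad = ad_speak_score(review)        # 3 of 9 tokens are ad speak
cx = complexity_score(review)      # 9 tokens over 2 sentences
```

In practice, the two scores would feed a classifier alongside other features; the point of such generalized features is that they transfer across product domains.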
Stateless Puzzles for Real Time Online Fraud Preemption
The profitability of fraud in online systems such as app markets and social
networks marks the failure of existing defense mechanisms. In this paper, we
propose FraudSys, a real-time fraud preemption approach that imposes
Bitcoin-inspired computational puzzles on the devices that post online system
activities, such as reviews and likes. We introduce and leverage several novel
concepts that include (i) stateless, verifiable computational puzzles, that
impose minimal performance overhead, but enable the efficient verification of
their authenticity, (ii) a real-time, graph-based solution to assign fraud
scores to user activities, and (iii) mechanisms to dynamically adjust puzzle
difficulty levels based on fraud scores and the computational capabilities of
devices. FraudSys does not alter the experience of users in online systems, but
delays fraudulent actions and consumes significant computational resources of
the fraudsters. Using real datasets from Google Play and Facebook, we
demonstrate the feasibility of FraudSys by showing that the devices of honest
users are minimally impacted, while fraudster controlled devices receive daily
computational penalties of up to 3,079 hours. In addition, we show that with
FraudSys, fraud does not pay off: a user equipped with mining hardware
(e.g., AntMiner S7) will earn less than half as much through fraud as
through honest Bitcoin mining.
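The stateless, verifiable puzzle idea can be sketched roughly as follows. This is an assumption-laden illustration, not FraudSys's actual design: the key derivation, difficulty encoding, and names are invented for the sketch. Statelessness comes from deriving the challenge from the activity itself with a server-side secret, so the server stores nothing per puzzle; verification is one hash, while solving requires brute force.

```python
import hashlib
import hmac
import time

SERVER_KEY = b"server-secret"  # illustrative; known only to the server

def issue_puzzle(activity, difficulty, ts=None):
    """Server derives the challenge from the activity record itself,
    so no per-puzzle state needs to be stored."""
    ts = int(time.time()) if ts is None else ts
    challenge = hmac.new(SERVER_KEY, f"{activity}|{ts}".encode(),
                         hashlib.sha256).hexdigest()
    return challenge, ts

def solve(challenge, difficulty):
    """Client brute-forces a nonce whose hash has `difficulty` leading
    zero hex digits; difficulty scales the expected work."""
    nonce = 0
    target = "0" * difficulty
    while True:
        h = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if h.startswith(target):
            return nonce
        nonce += 1

def verify(activity, ts, difficulty, nonce):
    """Server re-derives the challenge and checks the solution with a
    single hash: verification is cheap, solving is not."""
    challenge = hmac.new(SERVER_KEY, f"{activity}|{ts}".encode(),
                         hashlib.sha256).hexdigest()
    h = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return h.startswith("0" * difficulty)

challenge, ts = issue_puzzle("review:device42:app7", difficulty=3)
nonce = solve(challenge, 3)
assert verify("review:device42:app7", ts, 3, nonce)
```

Raising `difficulty` for activities with high fraud scores is what turns the puzzle into a per-action computational penalty.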
Uncovering Download Fraud Activities in Mobile App Markets
Download fraud is a prevalent threat in mobile App markets, where fraudsters
manipulate the number of downloads of Apps via various cheating approaches.
Purchased fake downloads can mislead recommendation and search algorithms and
further lead to bad user experience in App markets. In this paper, we
investigate the download fraud problem based on a company's App Market, which is
one of the most popular Android App markets. We release a honeypot App on the
App Market and purchase fake downloads from fraudster agents to track fraud
activities in the wild. Based on our interaction with the fraudsters, we
categorize download fraud activities into three types according to their
intentions: boosting front end downloads, optimizing App search ranking, and
enhancing user acquisition and retention rates. For the download fraud aimed at
optimizing App search ranking, we select, evaluate, and validate several
features in identifying fake downloads based on billions of download data. To
get a comprehensive understanding of download fraud, we further gather stances
of App marketers, fraudster agencies, and market operators on download fraud.
The ensuing analysis and suggestions shed light on ways to mitigate
download fraud in App markets and other social platforms. To the best of our
knowledge, this is the first work that investigates the download fraud problem
in mobile App markets.
Comment: Published as a conference paper in IEEE/ACM ASONAM 201
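As a toy illustration of feature-based fake-download detection (the paper's real features are built from billions of production download records and are not given in the abstract), one might flag device/app pairs with abnormal repetition or volume; the thresholds and feature names here are invented for the sketch.

```python
from collections import Counter

def device_features(events):
    """events: list of (device_id, app_id, hour) download records.
    Returns illustrative per-(device, app) features."""
    per_device = Counter(d for d, _, _ in events)
    per_pair = Counter((d, a) for d, a, _ in events)
    return {
        (d, a): {
            "repeat_downloads": n,           # same app downloaded again
            "device_volume": per_device[d],  # total downloads by device
        }
        for (d, a), n in per_pair.items()
    }

def looks_fake(f, repeat_thresh=3, volume_thresh=20):
    """Flag pairs exceeding either illustrative threshold."""
    return (f["repeat_downloads"] >= repeat_thresh
            or f["device_volume"] >= volume_thresh)
```

Real systems would combine many such signals (device reputation, install-to-retention ratios, temporal bursts) rather than rely on any single threshold.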
Crowd and AI Powered Manipulation: Characterization and Detection
User reviews are ubiquitous. They power online review aggregators that influence our everyday decisions, from what products to purchase (e.g., Amazon), movies to view (e.g., Netflix,
HBO, Hulu), restaurants to patronize (e.g., Yelp), and hotels to book (e.g., TripAdvisor, Airbnb).
In addition, policy makers rely on online commenting platforms like Regulations.gov and FCC.gov as a means for citizens to voice their opinions about public policy issues. However, showcasing the opinions of fellow users has a dark side as these reviews and comments are vulnerable to manipulation. And as advances in AI continue, fake reviews generated by AI agents rather than users pose even more scalable and dangerous manipulation attacks. These attacks on online discourse can sway ratings of products, manipulate opinions and perceived support of key issues, and degrade our trust in online platforms. Previous efforts have mainly focused on highly visible anomaly behaviors captured by statistical modeling or clustering algorithms. While detection of such anomalous behaviors helps to improve the reliability of online interactions, it misses subtle and difficult-to-detect behaviors.
This research investigates two major research thrusts centered around manipulation strategies.
In the first thrust, we study crowd-based manipulation strategies wherein crowds of paid workers organize to spread fake reviews. In the second thrust, we explore AI-based manipulation strategies, where crowd workers are replaced by scalable and potentially undetectable generative models of fake reviews. In particular, one key aspect of this work is to address the research gap in previous anomaly-detection efforts where ground truth data is missing (and hence evaluation can be challenging). In addition, this work studies the capabilities and impact of model-based attacks as the next generation of online threats. We propose inter-related methods for collecting evidence of these attacks, and create new countermeasures for defending against them. The performance of the proposed methods is compared against other state-of-the-art approaches in the literature. We find that although crowd campaigns do not show obvious anomalous behavior, they can be detected
given a careful formulation of their behaviors. And although model-generated fake reviews may appear on the surface to be legitimate, we find that they do not completely mimic the underlying distribution of human-written reviews, so we can leverage this signal to detect them.
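The distributional signal described above can be illustrated with a toy unigram language model: text that does not fully mimic the human word distribution scores lower under a model fit on human-written reviews. A real detector would use a far stronger model; the corpus and scoring here are made up for illustration.

```python
import math
from collections import Counter

def unigram_model(corpus):
    """Fit a Laplace-smoothed unigram model on human-written reviews."""
    counts = Counter(w for doc in corpus for w in doc.lower().split())
    total = sum(counts.values())
    vocab = len(counts)
    # Smoothing gives unseen words a small nonzero probability.
    return lambda w: (counts[w] + 1) / (total + vocab + 1)

def avg_logprob(model, text):
    """Per-word average log-probability of a candidate review."""
    words = text.lower().split()
    return sum(math.log(model(w)) for w in words) / max(len(words), 1)

human_corpus = ["great phone good battery", "good phone great screen"]
model = unigram_model(human_corpus)
# In-distribution text scores higher than off-distribution text.
in_dist = avg_logprob(model, "good phone")
off_dist = avg_logprob(model, "zxq qqq")
```

The detection idea is precisely this gap: generated text that deviates from the human distribution separates from genuine reviews under such a score, even when it looks fluent on the surface.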
Crowdsourcing geospatial data for Earth and human observations: a review
The transformation from authoritative to user-generated data landscapes has garnered considerable attention, notably with the proliferation of crowdsourced geospatial data. Facilitated by advancements in digital technology and high-speed communication, this paradigm shift has democratized data collection, removing traditional barriers between data producers and users. While previous literature has compartmentalized this subject into distinct platforms and application domains, this review offers a holistic examination of crowdsourced geospatial data. Employing a narrative review approach due to the interdisciplinary nature of the topic, we investigate both human and Earth observations through crowdsourced initiatives. This review categorizes the diverse applications of these data and rigorously examines specific platforms and paradigms pertinent to data collection. Furthermore, it addresses salient challenges, encompassing data quality, inherent biases, and ethical dimensions. We contend that this thorough analysis will serve as an invaluable scholarly resource, encapsulating the current state of the art in crowdsourced geospatial data and offering strategic directions for future interdisciplinary research and applications across various sectors.
Marketing Intelligence: Boom or Bust of Service Marketing?
Marketing intelligence fosters two major developments within digital service marketing. On the one hand, a boom of services seems to have evolved, accelerated by the opportunities of marketing intelligence. It has contributed to the optimization of customer experiences, e.g., supported by mobile, personalized, and customized marketing services. On the other hand, (digital) self-services are likely to pervert the term “service”. Lifecycle marketing, including annoying real-time marketing communication, automated price adjustment, and programmatic advertising based on artificial intelligence, affects the vision of fully standardized marketing automation. Additionally, there are incentives to pollute digital information in order to manufacture opinions; fake news is one popular example. This leads to the (open) question of whether marketing intelligence means a service boom or a bust for marketing. This contribution aims to elaborate the boom-and-bust aspects of marketing intelligence and suggests a trade-off. The method applied in this paper is a descriptive and conceptual literature review, through which paradigmatic thoughts are juxtaposed from the perspective of service.