Mitigating Colluding Attacks in Online Social Networks and Crowdsourcing Platforms
Online Social Networks (OSNs) have created new ways for people to communicate and for companies to engage their customers. With these new avenues for communication come new vulnerabilities that attackers can exploit. This dissertation investigates two attack models: Identity Clone Attacks (ICA) and Reconnaissance Attacks (RA). During an ICA, attackers impersonate users in a network and attempt to infiltrate social circles and extract confidential information. In an RA, attackers gather information on a target's resources, employees, and relationships with other entities over public venues such as OSNs and company websites. RAs are made more efficient by the fact that well-known social networks, such as Facebook, require people to use their real identities for their accounts. The goal of our research is to provide mechanisms to defend against colluding attackers in the presence of ICA and RA collusion attacks. In this work, we consider a scenario not addressed by previous works, wherein multiple attackers collude against the network, and propose defense mechanisms for such an attack. We take into account the asymmetric nature of social networks and include the case where colluders could add or modify some attributes of their clones. We also consider the case where attackers send few friend requests to uncover their targets.
To detect fake reviews and uncover colluders in crowdsourcing, we propose a semantic similarity measurement between reviews and a community detection algorithm to overcome the non-adversarial attack. ICA in a colluding attack may become stronger and more sophisticated than in a single attack. We introduce a token-based comparison and a friend list structure-matching approach, resulting in stronger identifiers even in the presence of attackers who could add or modify some attributes on the clone. We also propose a stronger RA collusion mechanism in which colluders build their own legitimacy by considering asymmetric relationships among users and, while having partial information of the networks, avoid recreating social circles around their targets. Finally, we propose a defense mechanism against colluding RA which uses the weakest person (e.g., the potential victim willing to accept friend requests) to reach their target
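As a rough illustration of what a token-based comparison between a profile and a suspected clone might look like, the sketch below averages Jaccard similarity over the attributes two profiles share. The attribute names, the sample profiles, and the choice of Jaccard similarity are assumptions made for illustration; the abstract does not specify the dissertation's actual measure.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase a string and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|; two empty sets score 0.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def profile_similarity(p: dict[str, str], q: dict[str, str]) -> float:
    """Average token-level Jaccard similarity over attributes present in both profiles."""
    shared = [k for k in p if k in q]
    if not shared:
        return 0.0
    return sum(jaccard(tokens(p[k]), tokens(q[k])) for k in shared) / len(shared)

# Hypothetical profiles: the clone drops a middle initial and adds a token to "city".
victim = {"name": "Alice B. Smith", "employer": "Acme Corp", "city": "Miami"}
clone = {"name": "Alice Smith", "employer": "Acme Corp", "city": "Miami FL"}
score = profile_similarity(victim, clone)
```

Comparing token sets rather than whole strings keeps the score high when a clone adds or alters a few words in an attribute, which is the situation the abstract describes.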
Search Rank Fraud De-Anonymization in Online Systems
We introduce the fraud de-anonymization problem, which goes beyond fraud
detection, to unmask the human masterminds responsible for posting search rank
fraud in online systems. We collect and study search rank fraud data from
Upwork, and survey the capabilities and behaviors of 58 search rank fraudsters
recruited from 6 crowdsourcing sites. We propose Dolos, a fraud
de-anonymization system that leverages traits and behaviors extracted from
these studies, to attribute detected fraud to crowdsourcing site fraudsters,
thus to real identities and bank accounts. We introduce MCDense, a min-cut
dense component detection algorithm to uncover groups of user accounts
controlled by different fraudsters, and leverage stylometry and deep learning
to attribute them to crowdsourcing site profiles. Dolos correctly identified
the owners of 95% of fraudster-controlled communities, and uncovered fraudsters
who promoted as many as 97.5% of fraud apps we collected from Google Play. When
evaluated on 13,087 apps (820,760 reviews), which we monitored over more than 6
months, Dolos identified 1,056 apps with suspicious reviewer groups. We report
orthogonal evidence of their fraud, including fraud duplicates and fraud
re-posts. Comment: The 29th ACM Conference on Hypertext and Social Media, July 201
Adversarial attacks on crowdsourcing quality control
Crowdsourcing is a popular methodology to collect manual labels at scale. Such labels are often used to train AI models and, thus, quality control is a key aspect in the process. One of the most popular quality assurance mechanisms in paid micro-task crowdsourcing is based on gold questions: the use of a small set of tasks for which the requester knows the correct answer and, thus, is able to directly assess crowd work quality. In this paper, we show that such a mechanism is prone to an attack carried out by a group of colluding crowd workers that is easy to implement and deploy: the inherent size limit of the gold set can be exploited by building an inferential system to detect which parts of the job are more likely to be gold questions. The described attack is robust to various forms of randomisation and programmatic generation of gold questions. We present the architecture of the proposed system, composed of a browser plug-in and an external server used to share information, and briefly introduce its potential evolution to a decentralised implementation. We implement and experimentally validate the gold detection system, using real-world data from a popular crowdsourcing platform. Our experimental results show that crowd workers using the proposed system spend more time on signalled gold questions but do not neglect the others thus achieving an increased overall work quality. Finally, we discuss the economic and sociological implications of this kind of attack
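The intuition behind exploiting a small gold set can be sketched with a simple frequency count: because few gold questions are reused across many workers, tasks reported by a large share of workers are more likely to be gold. The heuristic, the function names, and the `min_share` threshold below are assumptions for illustration only, not the paper's inferential system. Exact-text fingerprinting would also not survive the randomisation the abstract says the real attack tolerates.

```python
from collections import Counter
import hashlib

def fingerprint(task_text: str) -> str:
    """Hash a task after normalising case and whitespace."""
    normalized = " ".join(task_text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def likely_gold(reports: dict[str, list[str]], min_share: float = 0.5) -> set[str]:
    """Return fingerprints of tasks seen by at least min_share of reporting workers.

    reports maps a (hypothetical) worker id to the task texts that worker saw.
    """
    seen = Counter()
    for tasks in reports.values():
        # Count each task at most once per worker.
        seen.update({fingerprint(t) for t in tasks})
    threshold = min_share * len(reports)
    return {fp for fp, n in seen.items() if n >= threshold}

# Hypothetical reports: one task recurs for every worker, the rest are unique.
reports = {
    "w1": ["Is this a cat?", "Label image A"],
    "w2": ["is this a  cat?", "Label image B"],
    "w3": ["Is this a cat?", "Label image C"],
}
gold = likely_gold(reports, min_share=0.6)
```

Seen from the requester's side, the same count shows why a larger or less frequently reused gold set lowers the signal available to colluders.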
Quality Control in Crowdsourcing: A Survey of Quality Attributes, Assessment Techniques and Assurance Actions
Crowdsourcing enables one to leverage the intelligence and wisdom of
potentially large groups of individuals toward solving problems. Common
problems approached with crowdsourcing are labeling images, translating or
transcribing text, providing opinions or ideas, and similar - all tasks that
computers are not good at or where they may even fail altogether. The
introduction of humans into computations and/or everyday work, however, also
poses critical, novel challenges in terms of quality control, as the crowd is
typically composed of people with unknown and very diverse abilities, skills,
interests, personal objectives and technological resources. This survey studies
quality in the context of crowdsourcing along several dimensions, so as to
define and characterize it and to understand the current state of the art.
Specifically, this survey derives a quality model for crowdsourcing tasks,
identifies the methods and techniques that can be used to assess the attributes
of the model, and the actions and strategies that help prevent and mitigate
quality problems. An analysis of how these features are supported by the state
of the art further identifies open issues and informs an outlook on hot future
research directions. Comment: 40 pages main paper, 5 pages appendi
A Dataset on Malicious Paper Bidding in Peer Review
In conference peer review, reviewers are often asked to provide "bids" on
each submitted paper that express their interest in reviewing that paper. A
paper assignment algorithm then uses these bids (along with other data) to
compute a high-quality assignment of reviewers to papers. However, this process
has been exploited by malicious reviewers who strategically bid in order to
unethically manipulate the paper assignment, crucially undermining the peer
review process. For example, these reviewers may aim to get assigned to a
friend's paper as part of a quid-pro-quo deal. A critical impediment towards
creating and evaluating methods to mitigate this issue is the lack of any
publicly-available data on malicious paper bidding. In this work, we collect
and publicly release a novel dataset to fill this gap, collected from a mock
conference activity where participants were instructed to bid either honestly
or maliciously. We further provide a descriptive analysis of the bidding
behavior, including our categorization of different strategies employed by
participants. Finally, we evaluate the ability of each strategy to manipulate
the assignment, and also evaluate the performance of some simple algorithms
meant to detect malicious bidding. The performance of these detection
algorithms can be taken as a baseline for future research on detecting
malicious bidding
Location reliability and gamification mechanisms for mobile crowd sensing
People-centric sensing with smart phones can be used for large scale sensing of the physical world by leveraging the sensors on the phones. This new type of sensing can be a scalable and cost-effective alternative to deploying static wireless sensor networks for dense sensing coverage across large areas. However, mobile people-centric sensing has two main issues: 1) Data reliability in sensed data and 2) Incentives for participants. To study these issues, this dissertation designs and develops McSense, a mobile crowd sensing system which provides monetary and social incentives to users.
This dissertation proposes and evaluates two protocols for location reliability as a step toward achieving data reliability in sensed data, namely, ILR (Improving Location Reliability) and LINK (Location authentication through Immediate Neighbors Knowledge). ILR is a scheme which improves the location reliability of mobile crowd sensed data with minimal human effort, based on location validation using photo tasks and expanding the trust to nearby data points using periodic Bluetooth scanning. LINK is a location authentication protocol that works independently of wireless carriers, in which nearby users help authenticate each other’s location claims using Bluetooth communication. The results of experiments done on Android phones show that the proposed protocols are capable of detecting a significant percentage of the malicious users claiming false locations. Furthermore, simulations with the LINK protocol demonstrate that LINK can effectively thwart a number of colluding user attacks.
This dissertation also proposes a mobile sensing game which helps collect crowd sensing data by incentivizing smart phone users to play sensing games on their phones. We design and implement a first person shooter sensing game, “Alien vs. Mobile User”, which employs techniques to attract users to unpopular regions. The user study results show that mobile gaming can be a successful alternative to micro-payments for fast and efficient area coverage in crowd sensing. It is observed that the proposed game design succeeds in achieving good player engagement
Are We All in a Truman Show? Spotting Instagram Crowdturfing through Self-Training
Influencer Marketing generated $16 billion in 2022. Usually, the more popular
influencers are paid more for their collaborations. Thus, many services were
created to boost profiles' popularity metrics through bots or fake accounts.
However, real people recently started participating in such boosting activities
using their real accounts for monetary rewards, generating ungenuine content
that is extremely difficult to detect. To date, no works have attempted to
detect this new phenomenon, known as crowdturfing (CT), on Instagram.
In this work, we propose the first Instagram CT engagement detector. Our
algorithm leverages profiles' characteristics through semi-supervised learning
to spot accounts involved in CT activities. Compared to the supervised
approaches used so far to identify fake accounts, semi-supervised models can
exploit huge quantities of unlabeled data to increase performance. We purchased
and studied 1293 CT profiles from 11 providers to build our self-training
classifier, which reached 95% F1-score. We tested our model in the wild by
detecting and analyzing CT engagement from 20 mega-influencers (i.e., with more
than one million followers), and discovered that more than 20% was artificial.
We analyzed the CT profiles and comments, showing that it is difficult to
detect these activities based solely on their generated content
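The self-training idea mentioned above can be sketched in a few lines: a classifier trained on a small labeled set assigns pseudo-labels to the unlabeled examples it is confident about, adds them to the training set, and repeats. The nearest-centroid classifier, the `margin` confidence rule, and the two-dimensional feature vectors below are assumptions chosen to keep the example dependency-free; the abstract does not state which base model or profile features the authors used.

```python
import math

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def self_train(labeled, unlabeled, margin=1.0, max_rounds=10):
    """Self-training with a nearest-centroid classifier (labels 0 and 1).

    labeled: list of (point, label); must contain both classes.
    unlabeled: list of points.
    Returns (labeled set including pseudo-labels, points left unlabeled).
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        c0 = centroid([p for p, y in labeled if y == 0])
        c1 = centroid([p for p, y in labeled if y == 1])
        confident, rest = [], []
        for p in pool:
            d0, d1 = math.dist(p, c0), math.dist(p, c1)
            # Pseudo-label only when one centroid is clearly closer.
            if abs(d0 - d1) >= margin:
                confident.append((p, 0 if d0 < d1 else 1))
            else:
                rest.append(p)
        if not confident:
            break  # nothing new learned; stop early
        labeled.extend(confident)
        pool = rest
    return labeled, pool

# Hypothetical 2-D profile features; label 1 stands for a CT profile.
seed = [((0.0, 0.0), 0), ((10.0, 10.0), 1)]
unlabeled = [(1.0, 1.0), (9.0, 9.0), (5.0, 5.0)]
final_labeled, still_unlabeled = self_train(seed, unlabeled)
```

In this toy run the two points near a seed example receive pseudo-labels, while the point equidistant from both centroids stays unlabeled, which shows how the confidence rule limits error propagation.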