4,346 research outputs found
Automated Crowdturfing Attacks and Defenses in Online Review Systems
Malicious crowdsourcing forums are gaining traction as sources of spreading
misinformation online, but are limited by the costs of hiring and managing
human workers. In this paper, we identify a new class of attacks that leverage
deep learning language models (Recurrent Neural Networks or RNNs) to automate
the generation of fake online reviews for products and services. Not only are
these attacks cheap and therefore more scalable, but they can control rate of
content output to eliminate the signature burstiness that makes crowdsourced
campaigns easy to detect.
Using Yelp reviews as an example platform, we show how a two phased review
generation and customization attack can produce reviews that are
indistinguishable by state-of-the-art statistical detectors. We conduct a
survey-based user study to show these reviews not only evade human detection,
but also score high on "usefulness" metrics by users. Finally, we develop novel
automated defenses against these attacks, by leveraging the lossy
transformation introduced by the RNN training and generation cycle. We consider
countermeasures against our mechanisms, show that they produce unattractive
cost-benefit tradeoffs for attackers, and that they can be further curtailed by
simple constraints imposed by online service providers
Unsupervised Anomaly-based Malware Detection using Hardware Features
Recent works have shown promise in using microarchitectural execution
patterns to detect malware programs. These detectors belong to a class of
detectors known as signature-based detectors as they catch malware by comparing
a program's execution pattern (signature) to execution patterns of known
malware programs. In this work, we propose a new class of detectors -
anomaly-based hardware malware detectors - that do not require signatures for
malware detection, and thus can catch a wider range of malware including
potentially novel ones. We use unsupervised machine learning to build profiles
of normal program execution based on data from performance counters, and use
these profiles to detect significant deviations in program behavior that occur
as a result of malware exploitation. We show that real-world exploitation of
popular programs such as IE and Adobe PDF Reader on a Windows/x86 platform can
be detected with nearly perfect certainty. We also examine the limits and
challenges in implementing this approach in face of a sophisticated adversary
attempting to evade anomaly-based detection. The proposed detector is
complementary to previously proposed signature-based detectors and can be used
together to improve security.Comment: 1 page, Latex; added description for feature selection in Section 4,
results unchange
Predicting the performance of users as human sensors of security threats in social media
While the human as a sensor concept has been utilised extensively for the detection of threats to safety and security in physical space, especially in emergency response and crime reporting, the concept is largely unexplored in the area of cyber security. Here, we evaluate the potential of utilising users as human sensors for the detection of cyber threats, specifically on social media. For this, we have conducted an online test and accompanying questionnaire-based survey, which was taken by 4,457 users. The test included eight realistic social media scenarios (four attack and four non-attack) in the form of screenshots, which the participants were asked to categorise as “likely attack” or “likely not attack”. We present the overall performance of human sensors in our experiment for each exhibit, and also apply logistic regression and Random Forest classifiers to evaluate the feasibility of predicting that performance based on different characteristics of the participants. Such prediction would be useful where accuracy of human sensors in detecting and reporting social media security threats is important. We identify features that are good predictors of a human sensor’s performance and evaluate them in both a theoretical ideal case and two more realistic cases, the latter corresponding to limited access to a user’s characteristics
- …