Automated Crowdturfing Attacks and Defenses in Online Review Systems
Malicious crowdsourcing forums are gaining traction as sources of spreading
misinformation online, but are limited by the costs of hiring and managing
human workers. In this paper, we identify a new class of attacks that leverage
deep learning language models (Recurrent Neural Networks or RNNs) to automate
the generation of fake online reviews for products and services. Not only are
these attacks cheap and therefore more scalable, but they can control the rate of
content output to eliminate the signature burstiness that makes crowdsourced
campaigns easy to detect.
Using Yelp reviews as an example platform, we show how a two-phased review
generation and customization attack can produce reviews that
state-of-the-art statistical detectors cannot distinguish from genuine ones. We conduct a
survey-based user study to show these reviews not only evade human detection,
but also score high on "usefulness" metrics by users. Finally, we develop novel
automated defenses against these attacks, by leveraging the lossy
transformation introduced by the RNN training and generation cycle. We consider
countermeasures against our mechanisms, show that they produce unattractive
cost-benefit tradeoffs for attackers, and that they can be further curtailed by
simple constraints imposed by online service providers.
Image-based Text Classification using 2D Convolutional Neural Networks
We propose a new approach to text classification
in which we consider the input text as an image and apply
2D Convolutional Neural Networks to learn the local and
global semantics of the sentences from the variations of the
visual patterns of words. Our approach demonstrates that
it is possible to get semantically meaningful features from
images with text without using optical character recognition
and sequential processing pipelines, techniques that traditional
natural language processing algorithms require. To validate
our approach, we present results for two applications: text
classification and dialog modeling. Using a 2D Convolutional
Neural Network, we were able to outperform the state-of-the-art
accuracy results for a Chinese text classification task and
achieved promising results for seven English text classification
tasks. Furthermore, our approach outperformed the memory
networks without match types when using out-of-vocabulary
entities from Task 4 of the bAbI dialog dataset.
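As a rough illustration of the core operation behind this idea, the sketch below slides a small filter over a 2D raster of text. It is not the authors' model: the function name `conv2d_valid`, the 5x5 "T"-shaped glyph, and the vertical-stroke filter are all hypothetical, and a real 2D CNN would learn many such filters from rendered text images.

```python
def conv2d_valid(image, kernel):
    """Valid-mode 2D cross-correlation (the operation CNN layers compute), no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 5x5 binary raster of a "T"-like glyph (1 = ink, 0 = background).
glyph = [
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

# Hypothetical 3x1 filter that responds to vertical strokes.
vertical_stroke = [[1], [1], [1]]

feature_map = conv2d_valid(glyph, vertical_stroke)
for row in feature_map:
    print(row)
```

The largest responses line up with the glyph's vertical stem, which is the kind of visual pattern such a filter would pick up without any character recognition step.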
Guess who? Multilingual approach for the automated generation of author-stylized poetry
This paper addresses the problem of stylized text generation in a
multilingual setup. A version of a language model based on a long short-term
memory (LSTM) artificial neural network with extended phonetic and semantic
embeddings is used for stylized poetry generation. The quality of the resulting
poems generated by the network is estimated through bilingual evaluation
understudy (BLEU), a survey and a new cross-entropy based metric that is
suggested for the problems of such type. The experiments show that the proposed
model consistently outperforms random sample and vanilla-LSTM baselines; humans
also tend to associate machine-generated texts with the target author.
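The abstract does not define its cross-entropy based metric, so the following is only a generic sketch of the underlying quantity: the average negative log-probability a language model assigns to a text. It assumes a character-level unigram model with add-one smoothing as a stand-in for the paper's LSTM; `train_unigram` and `cross_entropy` are hypothetical helper names.

```python
import math
from collections import Counter


def train_unigram(corpus, alphabet):
    """Add-one smoothed character probabilities estimated from `corpus`."""
    counts = Counter(ch for ch in corpus if ch in alphabet)
    total = sum(counts.values()) + len(alphabet)
    return {ch: (counts[ch] + 1) / total for ch in alphabet}


def cross_entropy(text, model):
    """Average negative log2-probability per character of `text` under `model`."""
    chars = [ch for ch in text if ch in model]
    if not chars:
        raise ValueError("text has no characters covered by the model")
    return -sum(math.log2(model[ch]) for ch in chars) / len(chars)


# Toy stand-in for an author-specific model (assumption: a two-letter alphabet).
author_model = train_unigram("aaaa", {"a", "b"})

# Text resembling the training data scores lower (better) than dissimilar text.
print(cross_entropy("aa", author_model) < cross_entropy("bb", author_model))
```

Under this reading, a generated poem with lower cross-entropy against a model of the target author would count as closer to that author's style.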
Predicting Second and Third Graders' Reading Comprehension Gains: Observing Students' and Classmates Talk during Literacy Instruction using COLT.
This paper introduces a new observation system that is designed to investigate students' and teachers' talk during literacy instruction, Creating Opportunities to Learn from Text (COLT). Using video-recorded observations of 2nd-3rd grade literacy instruction (N=51 classrooms, 337 students, 151 observations), we found that nine types of student talk ranged from using non-verbal gestures to generating new ideas. The more a student talked, the greater were his/her reading comprehension (RC) gains. Classmate talk also predicted RC outcomes (total effect size=0.27). We found that 11 types of teacher talk ranged from asking simple questions to encouraging students' thinking and reasoning. Teacher talk predicted student talk but did not predict students' RC gains directly. Findings highlight the importance of each student's discourse during literacy instruction, how classmates' talk contributes to the learning environments that each student experiences, and how this affects RC gains, with implications for improving the effectiveness of literacy instruction.