Detecting Hate Speech in Social Media
In this paper we examine methods to detect hate speech in social media, while
distinguishing this from general profanity. We aim to establish lexical
baselines for this task by applying supervised classification methods using a
recently released dataset annotated for this purpose. As features, our system
uses character n-grams, word n-grams and word skip-grams. We obtain results of
78% accuracy in identifying posts across three classes. Results demonstrate
that the main challenge lies in discriminating profanity and hate speech from
each other. A number of directions for future work are discussed.
Comment: Proceedings of Recent Advances in Natural Language Processing
(RANLP), pp. 467-472. Varna, Bulgaria.
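The lexical baseline described above can be sketched with off-the-shelf tools: character n-grams and word n-grams feeding a linear classifier. The toy corpus, labels, and n-gram ranges below are illustrative assumptions, not the paper's dataset or exact configuration, and the skip-gram features are omitted since scikit-learn has no built-in skip-gram analyzer.

```python
# A minimal sketch of a three-class lexical baseline (none / profanity /
# hate) using word and character n-grams with a linear SVM.
# The texts and labels are illustrative placeholders only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import LinearSVC

texts = [
    "you are a wonderful person",
    "have a great day everyone",
    "that movie was damn awful",
    "what the hell is this nonsense",
    "that group should be banned from here",
    "people like them do not belong in this country",
]
labels = ["none", "none", "profanity", "profanity", "hate", "hate"]

features = FeatureUnion([
    # word unigrams and bigrams
    ("word", TfidfVectorizer(analyzer="word", ngram_range=(1, 2))),
    # character 2- to 4-grams, robust to spelling variation
    ("char", TfidfVectorizer(analyzer="char", ngram_range=(2, 4))),
])

clf = Pipeline([("features", features), ("svm", LinearSVC())])
clf.fit(texts, labels)
preds = clf.predict(texts)
print(list(preds))
```

Combining word- and character-level views in one feature union is a common way to capture both vocabulary choice and obfuscated spellings of offensive terms.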
PRNU-based image classification of origin social network with CNN
A huge number of images is shared on social networks (SNs) daily and, in most cases, it is very difficult to reliably establish the SN of provenance of an image recovered from a hard disk, an SD card, or a smartphone memory. During an investigation, it can be crucial to distinguish images coming directly from a photo camera from those downloaded from a social network and, in the latter case, to determine which SN it was among a defined group. It is well known that each SN leaves peculiar traces on every content item during the upload-download process; such traces can be exploited for image classification. In this work, the idea is to use the PRNU (photo-response non-uniformity), embedded in every acquired image, as the "carrier" of the particular SN traces that diversely modulate the PRNU. We demonstrate in this paper that the SN-modulated noise residual can be adopted as a feature to detect the social network of origin by means of a trained convolutional neural network (CNN).
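The noise-residual idea above can be sketched in a few lines: subtract a denoised version of each image from the image itself, then average the residuals so the sensor pattern reinforces while scene noise cancels. This is a minimal sketch assuming a 3x3 mean filter as a stand-in for the wavelet denoisers typically used in PRNU work, and random arrays as stand-ins for real images; the paper's pipeline feeds such residuals to a CNN.

```python
# A minimal sketch of PRNU-style noise-residual extraction.
# box_blur is a placeholder denoiser; real PRNU pipelines use
# wavelet-based filters. The "images" here are synthetic arrays.
import numpy as np

def box_blur(img: np.ndarray) -> np.ndarray:
    """3x3 mean filter used here as a simple denoiser."""
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

def noise_residual(img: np.ndarray) -> np.ndarray:
    """Residual = image minus its denoised version; sensor noise survives."""
    return img - box_blur(img)

def prnu_estimate(images) -> np.ndarray:
    """Average residuals from one source: the shared sensor pattern
    reinforces while scene-dependent noise averages out."""
    return np.mean([noise_residual(im) for im in images], axis=0)

rng = np.random.default_rng(0)
imgs = [rng.random((64, 64)) for _ in range(8)]
fingerprint = prnu_estimate(imgs)
print(fingerprint.shape)
```

A sanity check on the design: a perfectly flat image has no noise, so its residual should be (near) zero, which holds for the mean filter with edge padding.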
Comparative Studies of Detecting Abusive Language on Twitter
The context-dependent nature of online aggression makes annotating large
collections of data extremely difficult. Previously studied datasets in abusive
language detection have been insufficient in size to efficiently train deep
learning models. Recently, Hate and Abusive Speech on Twitter, a dataset much
greater in size and reliability, has been released. However, this dataset has
not been comprehensively studied to its potential. In this paper, we conduct
the first comparative study of various learning models on Hate and Abusive
Speech on Twitter, and discuss the possibility of using additional features and
context data for improvements. Experimental results show that a
bidirectional GRU network trained on word-level features, with Latent Topic
Clustering modules, is the most accurate model, scoring 0.805 F1.
Comment: ALW2: 2nd Workshop on Abusive Language Online, to be held at EMNLP
2018 (Brussels, Belgium), October 31st, 2018
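The bidirectional GRU encoder at the heart of the reported best model can be sketched as a plain forward pass: each direction applies the standard update/reset/candidate gates over the sequence, and the two final hidden states are concatenated. The dimensions, random weights, and final-state pooling below are illustrative assumptions, not the paper's exact configuration, and the Latent Topic Clustering module is omitted.

```python
# A minimal numpy sketch of a bidirectional GRU forward pass over
# word-level feature vectors. Weights are random placeholders.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_pass(xs, W, U, b):
    """Single-direction GRU; W, U, b each stack the z, r, h gate weights."""
    Wz, Wr, Wh = W
    Uz, Ur, Uh = U
    bz, br, bh = b
    h = np.zeros(Uz.shape[0])
    for x in xs:
        z = sigmoid(Wz @ x + Uz @ h + bz)              # update gate
        r = sigmoid(Wr @ x + Ur @ h + br)              # reset gate
        h_tilde = np.tanh(Wh @ x + Uh @ (r * h) + bh)  # candidate state
        h = (1 - z) * h + z * h_tilde
    return h

def bigru_encode(xs, params_fwd, params_bwd):
    """Concatenate the final forward and backward hidden states."""
    h_f = gru_pass(xs, *params_fwd)
    h_b = gru_pass(xs[::-1], *params_bwd)
    return np.concatenate([h_f, h_b])

rng = np.random.default_rng(0)
d_in, d_hid, seq_len = 8, 16, 5

def make_params():
    W = [rng.normal(size=(d_hid, d_in)) * 0.1 for _ in range(3)]
    U = [rng.normal(size=(d_hid, d_hid)) * 0.1 for _ in range(3)]
    b = [np.zeros(d_hid) for _ in range(3)]
    return W, U, b

xs = [rng.normal(size=d_in) for _ in range(seq_len)]
enc = bigru_encode(xs, make_params(), make_params())
print(enc.shape)
```

Because each hidden state is a convex combination of the previous state (initially zero) and a tanh candidate, every component of the encoding stays strictly inside (-1, 1).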
Online Social Network Bullying Detection Using Intelligence Techniques
Social networking sites (SNSs) have grown rapidly in recent years, providing a platform for people all over the world to connect and share their interests. However, social networking sites also provide opportunities for cyberbullying activities. Cyberbullying is harassing or insulting a person by sending hurtful or threatening messages via electronic communication, and it poses a significant threat to the physical and mental health of victims. Detecting cyberbullying and providing subsequent preventive measures are the main courses of action to combat it. The proposed method is an effective way to detect cyberbullying activities on social media: it identifies the presence of cyberbullying terms and classifies cyberbullying activities in a social network into categories such as Flaming, Harassment, Racism and Terrorism, using fuzzy logic and a genetic algorithm. The effectiveness of the system is increased by using a fuzzy rule set to retrieve relevant data for classification from the input. A genetic algorithm is also used to optimise the parameters and obtain precise output.
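The fuzzy-logic component described above can be illustrated with a small sketch: term density is mapped onto overlapping fuzzy sets (low / medium / high) via triangular membership functions, and a rule fires for the strongest matching set. The lexicon, membership breakpoints, and output scores below are all illustrative assumptions; in the paper's setting, the genetic algorithm would tune exactly these kinds of parameters.

```python
# A minimal sketch of fuzzy-rule scoring over a tiny hand-made lexicon.
# All terms, breakpoints, and scores are illustrative placeholders.
def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

LEXICON = {
    "harassment": {"stupid", "loser", "ugly"},
    "flaming": {"shut", "idiot"},
}

def bullying_score(text):
    words = text.lower().split()
    hits = sum(1 for w in words for terms in LEXICON.values() if w in terms)
    density = hits / max(len(words), 1)
    # fuzzy sets over the density of cyberbullying terms
    low = tri(density, -0.2, 0.0, 0.2)
    medium = tri(density, 0.1, 0.3, 0.5)
    high = tri(density, 0.4, 0.7, 1.0)
    # rule: the score follows the strongest matching set (crude defuzzification)
    return max((low, 0.1), (medium, 0.5), (high, 0.9), key=lambda t: t[0])[1]

print(bullying_score("you are a stupid ugly loser"))
```

The overlapping sets are the point of the fuzzy approach: a post is not forced into a hard bullying/non-bullying split at a single threshold, and the set boundaries are natural targets for genetic-algorithm optimisation.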
Automatic Detection of Cyberbullying in Social Media Text
While social media offer great communication opportunities, they also
increase the vulnerability of young people to threatening situations online.
Recent studies report that cyberbullying constitutes a growing problem among
youngsters. Successful prevention depends on the adequate detection of
potentially harmful messages and the information overload on the Web requires
intelligent systems to identify potential risks automatically. The focus of
this paper is on automatic cyberbullying detection in social media text by
modelling posts written by bullies, victims, and bystanders of online bullying.
We describe the collection and fine-grained annotation of a training corpus for
English and Dutch and perform a series of binary classification experiments to
determine the feasibility of automatic cyberbullying detection. We make use of
linear support vector machines exploiting a rich feature set and investigate
which information sources contribute the most for this particular task.
Experiments on a holdout test set reveal promising results for the detection of
cyberbullying-related posts. After optimisation of the hyperparameters, the
classifier yields an F1-score of 64% and 61% for English and Dutch
respectively, and considerably outperforms baseline systems based on keywords
and word unigrams.
Comment: 21 pages, 9 tables, under review.
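The word-unigram baseline that the paper's system is compared against can be sketched as a bag-of-words linear SVM evaluated by F1 on a holdout split. The toy corpus, labels, and holdout texts below are illustrative placeholders, not the paper's annotated English/Dutch data.

```python
# A minimal sketch of a word-unigram SVM baseline for binary
# cyberbullying detection, scored with F1 on a small holdout set.
# All texts and labels are illustrative placeholders.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

train_texts = [
    "nobody likes you just leave the forum",
    "you are pathetic and everyone knows it",
    "we should all ignore that worthless account",
    "thanks for sharing this helpful guide",
    "great game last night congrats to the team",
    "see you all at the meetup tomorrow",
]
train_labels = [1, 1, 1, 0, 0, 0]  # 1 = bullying-related, 0 = neutral

test_texts = ["you are worthless just leave", "congrats on the helpful post"]
test_labels = [1, 0]

model = make_pipeline(CountVectorizer(), LinearSVC())
model.fit(train_texts, train_labels)
preds = model.predict(test_texts)
score = f1_score(test_labels, preds)
print(score)
```

F1 rather than accuracy is the natural metric here because bullying posts are usually a small minority class, where accuracy alone can look deceptively high.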