Automatic Detection of Cyberbullying in Social Media Text

Author: B Sri Nandhini
Bart Desmet
Ben Verhoeven
C Cortes
C Salmivalli
CC Chang
CE Osgood
Chris Emmery
Cynthia Van Hee
D Olweus
DM Blei
EF Gross
Els Lefever
F Dehue
F Pedregosa
Gilles Jacobs
Guy De Pauw
H Cowie
H He
H Vandebosch
H Zijlstra
Hussein Suleman
J Cohen
J Juvonen
JJ Dooley
JL Fleiss
K Van Royen
KY Mckenna
M Fekkes
M O’Moore
M Price
M van de Kauter
MA Al-garadi
ML McHugh
NE Willard
NV Chawla
P Galán-García
PB O’Sullivan
PJ Stone
PK Smith
R Slonje
R Zhao
RE Fan
RS Tokunaga
S Bastiaensens
S Deerwester
S Hinduja
T Fawcett
V Nahar
Véronique Hoste
Walter Daelemans
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2018

While social media offer great communication opportunities, they also increase the vulnerability of young people to threatening situations online. Recent studies report that cyberbullying is a growing problem among young people. Successful prevention depends on adequate detection of potentially harmful messages, and the information overload on the Web requires intelligent systems that identify potential risks automatically. This paper focuses on automatic cyberbullying detection in social media text by modelling posts written by bullies, victims, and bystanders of online bullying. We describe the collection and fine-grained annotation of a training corpus for English and Dutch and perform a series of binary classification experiments to determine the feasibility of automatic cyberbullying detection. We use linear support vector machines exploiting a rich feature set and investigate which information sources contribute most to this task. Experiments on a holdout test set reveal promising results for the detection of cyberbullying-related posts. After optimisation of the hyperparameters, the classifier yields an F1-score of 64% for English and 61% for Dutch, and considerably outperforms baseline systems based on keywords and word unigrams.

Comment: 21 pages, 9 tables, under review
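The abstract compares the classifier against a keyword baseline and reports results as F1-scores. The sketch below illustrates what such a baseline and metric could look like on a toy sample. The keyword list, the example posts, and the function names `keyword_baseline` and `f1_score` are assumptions made for illustration; they are not the lexicon, data, or code used in the paper.

```python
from typing import Iterable, List

# Hypothetical keyword list, for illustration only (not the paper's lexicon).
KEYWORDS = {"stupid", "loser", "ugly", "hate"}


def keyword_baseline(post: str, keywords: Iterable[str] = KEYWORDS) -> int:
    """Return 1 (bullying-related) if the post contains any keyword, else 0."""
    # Lowercase each whitespace-separated token and strip surrounding punctuation.
    tokens = {tok.strip(".,!?;:\"'").lower() for tok in post.split()}
    return int(any(keyword in tokens for keyword in keywords))


def f1_score(gold: List[int], pred: List[int]) -> float:
    """F1 for the positive class: harmonic mean of precision and recall."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy posts with made-up gold labels (1 = bullying-related, 0 = not).
posts = [
    "You are such a loser!",
    "Nobody likes you, just leave",
    "I hate Mondays",
    "Great game last night",
]
gold = [1, 1, 0, 0]

pred = [keyword_baseline(post) for post in posts]
score = f1_score(gold, pred)
print(pred, score)
```

On this toy sample the baseline misses the second post, which contains no listed keyword, and wrongly flags the third, which uses "hate" in a harmless sense. Errors of both kinds are one plausible reason a keyword baseline could trail a classifier trained on richer features.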

arXiv.org e-Print Archive

Crossref

Ghent University Academic Bibliography

Directory of Open Access Journals

Institutional Repository Universiteit Antwerpen

Tilburg University Repository