Evaluation of Deep Learning Models for Hostility Detection in Hindi Text
Social media platforms are a convenient medium to express personal thoughts
and share useful information. They are fast, concise, and able to reach
millions. They are an effective place to archive thoughts, share artistic content,
receive feedback, promote products, etc. Despite having numerous advantages,
these platforms have given a boost to hostile posts. Hate speech and derogatory
remarks are being posted for personal satisfaction or political gain. The
hostile posts can have a bullying effect, rendering the entire platform
experience hostile. Therefore, detection of hostile posts is important to
maintain social media hygiene. The problem is more pronounced in languages like
Hindi, which are low in resources. In this work, we present approaches for
hostile text detection in the Hindi language. The proposed approaches are
evaluated on the Constraint@AAAI 2021 Hindi hostility detection dataset. The
dataset consists of hostile and non-hostile texts collected from social media
platforms. The hostile posts are further segregated into overlapping classes of
fake, offensive, hate, and defamation. We evaluate a host of deep learning
approaches based on CNN, LSTM, and BERT for this multi-label classification
problem. The pre-trained Hindi fastText word embeddings by IndicNLP and
Facebook are used in conjunction with CNN and LSTM models. Two variations of
pre-trained multilingual transformer language models, mBERT and IndicBERT, are
used. We show that BERT-based models perform best. Moreover, CNN
and LSTM models also perform competitively with BERT-based models.

Comment: Accepted at IEEE I2CT 202
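
The multi-label setup described in the abstract, where a hostile post can belong to several of the fake, offensive, hate, and defamation classes at once, can be illustrated with a small sketch. This is not the authors' implementation: the class order, the "non-hostile" tag string, the 0.5 threshold, and the function names encode_labels and decode_scores are all assumptions made for illustration.

```python
# Illustrative sketch only; the class order below is an assumption.
# The four fine-grained hostile classes are the ones named in the abstract.
CLASSES = ["fake", "offensive", "hate", "defamation"]
NON_HOSTILE = "non-hostile"  # assumed tag string for posts with no hostile class


def encode_labels(tags):
    """Map a post's tags to a multi-hot vector over CLASSES.

    A non-hostile post maps to all zeros; overlapping classes set several bits.
    """
    tag_set = {t.strip().lower() for t in tags}
    unknown = tag_set - set(CLASSES) - {NON_HOSTILE}
    if unknown:
        raise ValueError(f"unknown tags: {sorted(unknown)}")
    return [1 if c in tag_set else 0 for c in CLASSES]


def decode_scores(scores, threshold=0.5):
    """Turn per-class scores (e.g. sigmoid outputs) back into tags.

    Each class is thresholded independently, which is what makes the task
    multi-label rather than multi-class. The threshold value is an assumption.
    """
    tags = [c for c, s in zip(CLASSES, scores) if s >= threshold]
    return tags or [NON_HOSTILE]


if __name__ == "__main__":
    print(encode_labels(["hate", "offensive"]))
    print(decode_scores([0.9, 0.1, 0.7, 0.2]))
```

Any of the model families mentioned (CNN, LSTM, or BERT-based) could in principle feed such a decoding step by emitting one independent score per class.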