A comparative study of different state-of-the-art hate speech detection methods in Hindi-English code-mixed data
Hate speech detection in social media communication has become a primary concern in efforts to avoid conflicts and curb undesired activities. In an environment where multilingual speakers switch among multiple languages, hate speech detection becomes a challenging task for methods that are designed for monolingual corpora. In our work, we attempt to analyze and detect hate speech in code-mixed social media text, and we provide a comparative study of detection methods. We also provide a Hindi-English code-mixed data set consisting of Facebook and Twitter posts and comments. Our experiments show that deep learning models trained on this code-mixed corpus perform better.
This publication has emanated from research supported in part by a research grant from Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289 (Insight), SFI/12/RC/2289 P2 (Insight 2), & SFI/18/CRT/6223 (CRT-Centre for Research Training in Artificial Intelligence) co-funded by the European Regional Development Fund as well as by the EU H2020 programme under grant agreements 731015 (ELEXIS-European Lexical Infrastructure), 825182 (Prêt-à-LLOD), and Irish Research Council grant IRCLA/2017/129 (CARDAMOM-Comparative Deep Models of Language for Minority and Historical Languages). The authors are grateful to Ajay Bohra and his team for sharing their data set and for their support. We would also like to thank our annotators for their contribution and for lending us their precious time.
non-peer-reviewed