We study Socially Unacceptable Discourse (SUD) characterization and detection
in online text. We first build and present a novel corpus that contains a large
variety of manually annotated texts from the different online sources used so far
in state-of-the-art machine learning (ML) solutions for SUD detection. This global
context allows us to test the generalization ability of SUD classifiers that
acquire knowledge of the same SUD categories from different contexts.
From this perspective, we analyze how (possibly) divergent annotation
modalities influence SUD learning, and we discuss the resulting open challenges
and research directions. We also provide several data insights that can support
domain experts in the annotation task.
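To make the cross-context generalization setting concrete, a leave-one-source-out evaluation could look like the sketch below. This is an illustrative assumption, not the paper's actual pipeline: the toy corpus, the scikit-learn TF-IDF plus logistic regression model, and all field names are hypothetical placeholders.

# Illustrative sketch (assumed, not the paper's method): train a SUD
# classifier on all sources but one, then test on the held-out source
# to probe how well the learned categories transfer across contexts.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline

# Hypothetical corpus: (text, label, source) triples; 1 = SUD, 0 = acceptable.
corpus = [
    ("example hateful post", 1, "forum"),
    ("ordinary friendly comment", 0, "forum"),
    ("aggressive insulting reply", 1, "twitter"),
    ("neutral remark about the news", 0, "twitter"),
    ("toxic thread opener", 1, "reddit"),
    ("benign clarifying question", 0, "reddit"),
]

for held_out in sorted({src for _, _, src in corpus}):
    # Split by source: everything except the held-out source is training data.
    train = [(t, y) for t, y, s in corpus if s != held_out]
    test = [(t, y) for t, y, s in corpus if s == held_out]
    X_train, y_train = zip(*train)
    X_test, y_test = zip(*test)
    clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    clf.fit(X_train, y_train)
    preds = clf.predict(X_test)
    print(f"held-out source = {held_out}: F1 = {f1_score(y_test, preds):.2f}")

A per-source F1 that drops sharply relative to in-source performance would signal the kind of limited generalization, possibly driven by differing annotation modalities, that the corpus is designed to expose.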