7 research outputs found

    'The enemy among us': detecting cyber hate speech with threats-based othering language embeddings

    Offensive or antagonistic language targeted at individuals and social groups based on their personal characteristics (also known as cyber hate speech or cyberhate) has been frequently posted and widely circulated via the World Wide Web. This can be considered as a key risk factor for individual and societal tension surrounding regional instability. Automated Web-based cyberhate detection is important for observing and understanding community and regional societal tension - especially in online social networks where posts can be rapidly and widely viewed and disseminated. While previous work has involved using lexicons, bags-of-words or probabilistic language parsing approaches, they often suffer from a similar issue which is that cyberhate can be subtle and indirect - thus depending on the occurrence of individual words or phrases can lead to a significant number of false negatives, providing inaccurate representation of the trends in cyberhate. This problem motivated us to challenge thinking around the representation of subtle language use, such as references to perceived threats from ‘the other’ including immigration or job prosperity in a hateful context. We propose a novel ‘othering’ feature set that utilises language use around the concept of ‘othering’ and intergroup threat theory to identify these subtleties, and we implement a wide range of classification methods using embedding learning to compute semantic distances between parts of speech considered to be part of an ‘othering’ narrative. To validate our approach we conducted two sets of experiments. The first involved comparing the results of our novel method with state of the art baseline models from the literature. Our approach outperformed all existing methods. The second tested the best performing models from the first phase on unseen datasets for different types of cyberhate, namely religion, disability, race and sexual orientation. 
The results showed F-measure scores for classifying hateful instances, obtained by applying our model, of 0.81, 0.71, 0.89 and 0.72 respectively, demonstrating that the ‘othering’ narrative is an important part of model generalisation.

    Mowa nienawiści jako przedmiot badań. Praktyki komunikacyjne nacechowane nienawiścią w dyskursie medialnym

    Hate Speech as an Object of Research: Hate-Fuelled Communication Practices in Media Discourse. Hate speech is currently an issue discussed in many scientific disciplines and is one of the threads of linguistic research conducted at the Department of Intercultural Glottopedagogy at the Institute of Applied Linguistics, Adam Mickiewicz University in Poznań. Since 2014, the Department has been involved in the RADAR project (Regulating Anti-Discrimination and Anti-Racism), conducted in cooperation with other European universities and public institutions, and co-financed by the European Commission. The aim of this article is to present partial results of empirical research conducted under the project, including a catalogue of hate-oriented communication practices in media discourse.

    Hate speech and offensive language detection: a new feature set with filter-embedded combining feature selection

    Social media has changed the world and plays an important role in people's lives. Social media platforms like Twitter, Facebook and YouTube create a new dimension of communication by providing channels to express and exchange ideas freely. Although this evolution brings numerous benefits, the dynamic environment and the allowance of anonymous posts can expose the uglier side of humanity. Irresponsible people abuse the freedom of speech by aggressively expressing opinions or ideas that incite hatred. This study performs hate speech and offensive language detection. The difficulty of this task is that there is no clear boundary between hate speech and offensive language. In this study, a new selected feature set is proposed for detecting hate speech and offensive language. Using a Twitter dataset, the experiments are performed by considering the combination of word n-grams and enhanced syntactic n-grams. To reduce the feature set, filter-embedded combining feature selection is used. The experimental results indicate that combining word n-grams and enhanced syntactic n-grams with feature selection to classify the data into three classes (hate speech, offensive language, or neither) gives good performance. The result reaches 91% for accuracy and for the averages of precision, recall and F1.

    The Buddhist nuns and dialogue in wartime Myanmar:Understanding the 'Banality of Othering'

    This paper contends that dialogue must be understood dispassionately with the aim to appreciate what David Bohm (2013) called ‘incoherence’, and the need to embrace multiplicity in narratives, even if that implies incongruence in the understanding of dialogue. Using ethnographic methods and findings, I situate the politics of self and the other, and argue that determining the other and acknowledging the ‘banality of othering’ need to be examined in discussions around dialogue. I present a background of the interfaith tensions between the Buddhists and the Muslim-Other in Myanmar and, by means of ethnographic anecdotes, unpack the underplayed importance of determining the other within one’s own faith tradition and emphasise the needs and possibilities of engaging with them. Female religious leaders are often the innate other in many religious traditions, and their stories, experiences, and recommendations are disproportionately discounted, which necessitates redress. In a first, this study reports the role of Buddhist nuns, or the lack of it, in transitional Myanmar in the belief, practice, and scholarship of dialogue, and emphasises the need for their meaningful involvement.

    "HOT" ChatGPT: The promise of ChatGPT in detecting and discriminating hateful, offensive, and toxic comments on social media

    Harmful content is pervasive on social media, poisoning online communities and negatively impacting participation. A common approach to address this issue is to develop detection models that rely on human annotations. However, the tasks required to build such models expose annotators to harmful and offensive content and may require significant time and cost to complete. Generative AI models have the potential to understand and detect harmful content. To investigate this potential, we used ChatGPT and compared its performance with MTurker annotations for three frequently discussed concepts related to harmful content: Hateful, Offensive, and Toxic (HOT). We designed five prompts to interact with ChatGPT and conducted four experiments eliciting HOT classifications. Our results show that ChatGPT can achieve an accuracy of approximately 80% when compared to MTurker annotations. Specifically, the model displays a more consistent classification for non-HOT comments than HOT comments compared to human annotations. Our findings also suggest that ChatGPT classifications align with provided HOT definitions, but ChatGPT classifies "hateful" and "offensive" as subsets of "toxic." Moreover, the choice of prompts used to interact with ChatGPT impacts its performance. Based on these insights, our study provides several meaningful implications for employing ChatGPT to detect HOT content, particularly regarding the reliability and consistency of its performance, its understanding and reasoning of the HOT concept, and the impact of prompts on its performance. Overall, our study provides guidance about the potential of using generative AI models to moderate large volumes of user-generated content on social media.

    Doing identity on Facebook: A discourse analytic study of posts shared among older Greek-Cypriot users

    While older adult users of the internet account for a significant and steadily increasing proportion of Facebook’s user base, our understanding of how these users communicate on their Facebook timelines and construct identities online remains limited. The broader objective of this thesis is to advance our knowledge on this topic and, in doing so, to contribute to the growing body of research on digital media and identity performance, which has so far focused predominantly on younger adult users. More specifically, this thesis aims to examine: (i) what identity aspects older users project through their Facebook wall posts, and (ii) how such identities are projected through a range of linguistic and other semiotic resources. The data collection and analysis of the posts follows the broader framework of digital discourse analysis (Vásquez, 2022). For the purposes of this study, 2845 Facebook posts from 13 Greek-Cypriot Facebook users aged over 45 were collected over a six-month period in 2018. Drawing on content analysis, this study initially analyses the posts in terms of the communicative functions they fulfil and their potential for identity construction in the context of Facebook timelines. This analysis has revealed that the majority of the posts were used by the participants either to express humour or to communicate an opinion, highlighting the significance of humour and opinion-giving in identity construction by older users on Facebook. For this reason, the study undertakes a more detailed qualitative discourse analysis of posts expressing humour and opinion, with an emphasis on the linguistic and other semiotic strategies deployed by the participants in these messages. The findings of the study foreground the use of several discursive, linguistic and other semiotic tools for identity purposes, especially the strategic use of language and script choice, (in)directness, pronouns, storytelling, polyphony, non-standard punctuation markers and emoticons. 
With respect to the range of identities identified, hetero-normative gender identities and place identities are particularly prevalent in the sample and are discussed in more detail and in relation to the concept of ‘age’. The study also contributes to the wider body of research arguing that the online sphere and any practices developed there are not separate from offline discourses, practices and communication.