143 research outputs found
Recommended from our members
Contextual Semantics for Radicalisation Detection on Twitter
Much research aims to detect online radical content mainly using radicalisation glossaries, i.e., by looking for terms and expressions associated with religion, war, offensive language, etc. However, such crude methods are highly inaccurate towards content that uses radicalisation terminology to simply report on current events, to share harmless religious rhetoric, or even to counter extremism.
Language is complex and the context in which particular terms are used should not be disregarded. In this paper, we propose an approach for building a representation of the semantic context of the terms that are linked to radicalised rhetoric. We use this approach to analyse over 114K tweets that contain radicalisation-terms (around 17K posted by pro-ISIS users, and 97k posted by “general” Twitter users).
We report on how the contextual information differs for the same radicalisation terms in the two datasets, which indicate that contextual semantics can help to better discriminate radical content from content that only uses radical terminology.The classifiers we built to test this hypothesis outperform those that disregard contextual informatio
Recommended from our members
On the Role of Semantics for Detecting pro-ISIS Stances on Social Media
From its start, the so-called Islamic State of Iraq and the Levant (ISIL/ISIS) has been successfully exploiting social media networks, most notoriously Twitter, to promote its propaganda and recruit new members, resulting in thousands of social media users adopting pro ISIS stance every year. Automatic identification of pro-ISIS users on social media has, thus, become the centre of interest for various governmental and research organisations. In this paper we propose a semantic-based approach for radicalisation detection on Twitter. Unlike most previous works, which mainly rely on the lexical and contextual representation of the content published by Twitter users, our approach extracts and makes use of the underlying semantics of words exhibited by these users to identify their pro/anti-ISIS stances. Our results show that classifiers trained from words’ semantics outperform those trained from lexical and network features by 2% on average F1-measure
Recommended from our members
Artificial Intelligence and Online Extremism: Challenges and Opportunities
Radicalisation is a process that historically used to be triggered mainly through social interactions in places of worship, religious schools, prisons, meeting venues, etc. Today, this process is often initiated on the Internet, where radicalisation content is easily shared, and potential candidates are reached more easily, rapidly, and at an unprecedented scale (Edwards and Gribbon, 2013; Von Behr et al., 2013).
In recent years, some terrorist organisations succeeded in leveraging the power of social media to recruit individuals to their cause and ideology (Farwell, 2014). It is often the case that such recruitment attempts are initiated on open social media platforms (e.g., Twitter, Facebook, Tumblr, YouTube) but then move onto private messages and/or encrypted platforms (e.g., WhatsApp, Telegram). Such encrypted communication channels have also been used by terrorist cells and networks to plan their operations (Gartenstein-Ross and Barr).
To counteract the activities of such organisations, and to halt the spread of radicalisation content, some governments, social media platforms, and counter-extremism agencies are investing in the creation of advanced information technologies to identify and counter extremism through the development of Artificial Intelligent (AI) solutions (Correa and Sureka, 2013; Agarwal and Sureka 2015a; Scrivens and Davies, 2018).
These solutions have three main objectives: (i) understanding the phenomena behind online extremism (the communication flow, the use of propaganda, the different stages of the radicalisation process, the variety of radicalisation channels, etc.), (ii) automatically detecting radical users and content, and (iii) predicting the adoption and spreading of extremist ideas.
Despite current advancements in the area, multiple challenges still exist, including: (i) the lack of a common definition of prohibited radical and extremist internet activity, (ii) the lack of solid verification of the datasets collected to develop detection and prediction models, (iii) the lack of cooperation across research fields, since most of the developed technological solutions are neither based on, nor do they take advantage of, existing social theories and studies of radicalisation, (iv) the constant evolution of behaviours associated with online extremism in order to avoid being detected by the developed algorithms (changes in terminology, creation of new accounts, etc.) and, (v) the development of ethical guidelines and legislation to regulate the design and development of AI technology to counter radicalisation.
In this book chapter we provide an overview of the current technological advancements towards addressing the problem of online extremism (with a particular focus on Jihadism). We identify some of the limitations of current technologies, and highlight some of the potential opportunities. Our aim is to reflect on the current state of the art and to stimulate discussions on the future design and development of AI technology to target the problem of online extremism
Semantic Wide and Deep Learning for Detecting Crisis-Information Categories on Social Media
When crises hit, many flog to social media to share or consume information related to the event. Social media posts during crises tend to provide valuable reports on affected people, donation offers, help requests, advice provision, etc. Automatically identifying the category of information (e.g., reports on affected individuals, donations and volunteers) contained in these posts is vital for their efficient handling and consumption by effected communities and concerned organisations. In this paper, we introduce Sem-CNN; a wide and deep Convolutional Neural Network (CNN) model designed for identifying the category of information contained in crisis-related social media content. Unlike previous models, which mainly rely on the lexical representations of words in the text, the proposed model integrates an additional layer of semantics that represents the named entities in the text, into a wide and deep CNN network. Results show that the Sem-CNN model consistently outperforms the baselines which consist of
statistical and non-semantic deep learning models
#ISIS vs #ActionCountersTerrorism: A Computational Analysis of Extremist and Counter-extremist Twitter Narratives
The rapid expansion of cyberspace has greatly facilitated the strategic shift of traditional crimes to online platforms. This has included malicious actors, such as extremist organisations, making use of online networks to disseminate propaganda and incite violence through radicalising individuals. In this article, we seek to advance current research by exploring how supporters of extremist organisations craft and disseminate their content, and how posts from counter-extremism agencies compare to them. In particular, this study will apply computational techniques to analyse the narratives of various pro-extremist and counter-extremist Twitter accounts, and investigate how the psychological motivation behind the messages compares between pro-ISIS and counter-extremism narratives. Our findings show that pro-extremist accounts often use different strategies to disseminate content (such as the types of hashtags used) when compared to counter-extremist accounts across different types of organisations, including accounts of governments and NGOs. Through this study, we provide unique insights into both extremist and counter-extremist narratives on social media platforms. Furthermore, we define several avenues for discussion regarding the extent to which counter-messaging may be effective at diminishing the online influence of extremist and other criminal organisations
Modeling Islamist Extremist Communications on Social Media using Contextual Dimensions: Religion, Ideology, and Hate
Terror attacks have been linked in part to online extremist content. Although
tens of thousands of Islamist extremism supporters consume such content, they
are a small fraction relative to peaceful Muslims. The efforts to contain the
ever-evolving extremism on social media platforms have remained inadequate and
mostly ineffective. Divergent extremist and mainstream contexts challenge
machine interpretation, with a particular threat to the precision of
classification algorithms. Our context-aware computational approach to the
analysis of extremist content on Twitter breaks down this persuasion process
into building blocks that acknowledge inherent ambiguity and sparsity that
likely challenge both manual and automated classification. We model this
process using a combination of three contextual dimensions -- religion,
ideology, and hate -- each elucidating a degree of radicalization and
highlighting independent features to render them computationally accessible. We
utilize domain-specific knowledge resources for each of these contextual
dimensions such as Qur'an for religion, the books of extremist ideologues and
preachers for political ideology and a social media hate speech corpus for
hate. Our study makes three contributions to reliable analysis: (i) Development
of a computational approach rooted in the contextual dimensions of religion,
ideology, and hate that reflects strategies employed by online Islamist
extremist groups, (ii) An in-depth analysis of relevant tweet datasets with
respect to these dimensions to exclude likely mislabeled users, and (iii) A
framework for understanding online radicalization as a process to assist
counter-programming. Given the potentially significant social impact, we
evaluate the performance of our algorithms to minimize mislabeling, where our
approach outperforms a competitive baseline by 10.2% in precision.Comment: 22 page
Defining and Detecting Toxicity on Social Media: Context and Knowledge are Key
As the role of online platforms has become increasingly prominent for communication, toxic behaviors, such as cyberbullying and harassment, have been rampant in the last decade. On the other hand, online toxicity is multi-dimensional and sensitive in nature, which makes its detection challenging. As the impact of exposure to online toxicity can lead to serious implications for individuals and communities, reliable models and algorithms are required for detecting and understanding such communications. In this paper We define toxicity to provide a foundation drawing social theories. Then, we provide an approach that identifies multiple dimensions of toxicity and incorporates explicit knowledge in a statistical learning algorithm to resolve ambiguity across such dimensions
A survey on extremism analysis using natural language processing: definitions, literature review, trends and challenges
Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.Extremism has grown as a global problem for society in recent years, especially after the apparition of movements such as
jihadism. This and other extremist groups have taken advantage of different approaches, such as the use of Social Media, to
spread their ideology, promote their acts and recruit followers. The extremist discourse, therefore, is reflected on the language
used by these groups. Natural language processing (NLP) provides a way of detecting this type of content, and several authors
make use of it to describe and discriminate the discourse held by these groups, with the final objective of detecting and
preventing its spread. Following this approach, this survey aims to review the contributions of NLP to the field of extremism
research, providing the reader with a comprehensive picture of the state of the art of this research area. The content includes
a first conceptualization of the term extremism, the elements that compose an extremist discourse and the differences with
other terms. After that, a review description and comparison of the frequently used NLP techniques is presented, including
how they were applied, the insights they provided, the most frequently used NLP software tools, descriptive and classification
applications, and the availability of datasets and data sources for research. Finally, research questions are approached
and answered with highlights from the review, while future trends, challenges and directions derived from these highlights
are suggested towards stimulating further research in this exciting research area.CRUE-CSIC agreementSpringer Natur
- …