18 research outputs found
Automatic Detection of Online Jihadist Hate Speech
We have developed a system that automatically detects online jihadist hate
speech with over 80% accuracy, by using techniques from Natural Language
Processing and Machine Learning. The system is trained on a corpus of 45,000
subversive Twitter messages collected from October 2014 to December 2016. We
present a qualitative and quantitative analysis of the jihadist rhetoric in the
corpus, examine the network of Twitter users, outline the technical procedure
used to train the system, and discuss examples of use. (Comment: 31 pages)
Understanding the Roots of Radicalisation on Twitter
In an increasingly digital world, identifying signs of online extremism sits at the top of the priority list for counter-extremist agencies. Researchers and governments are investing in the creation of advanced information technologies to identify and counter extremism through intelligent large-scale analysis of online data. However, to the best of our knowledge, these technologies are neither based on, nor do they take advantage of, the existing theories and studies of radicalisation. In this paper we propose a computational approach for detecting and predicting the radicalisation influence a user is exposed to, grounded on the notion of ‘roots of radicalisation’ from social science models. This approach has been applied to analyse and compare the radicalisation level of 112 pro-ISIS vs. 112 “general” Twitter users. Our results show the effectiveness of our proposed algorithms in detecting and predicting radicalisation influence, obtaining up to 0.9 F1-measure for detection and between 0.7 and 0.8 precision for prediction. While this is an initial attempt towards the effective combination of social and computational perspectives, more work is needed to bridge these disciplines, and to build on their strengths to target the problem of online radicalisation.
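The detection and prediction scores quoted above are the standard precision, recall, and F1 metrics; a quick reminder of how they are computed from confusion-matrix counts (the 90/10/10 split below is purely illustrative):

```python
# Standard classification metrics from true-positive (tp),
# false-positive (fp), and false-negative (fn) counts.
def precision(tp, fp): return tp / (tp + fp)
def recall(tp, fn): return tp / (tp + fn)
def f1(p, r): return 2 * p * r / (p + r)

# Illustrative counts: 90 true positives, 10 false positives, 10 false negatives.
p, r = precision(tp=90, fp=10), recall(tp=90, fn=10)
print(round(f1(p, r), 2))  # 0.9
```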
Detecting Textual Propaganda Using Machine Learning Techniques
Social networking has dominated the whole world by providing a platform for information dissemination. People usually share information without knowing its truthfulness. Nowadays, social networks are used to gain influence in many fields, such as elections and advertising, and it is not surprising that social media has become a weapon for manipulating sentiment by spreading disinformation. Propaganda is a systematic and deliberate attempt to influence people for political or religious gain. In this research paper, efforts were made to classify propagandist text from non-propagandist text using supervised machine learning algorithms. Data was collected from news sources from July 2018 to August 2018. After annotating the text, feature engineering was performed using techniques such as term frequency/inverse document frequency (TF/IDF) and bag of words (BOW). The relevant features were supplied to support vector machine (SVM) and Multinomial Naïve Bayes (MNB) classifiers. The SVM was fine-tuned over the Linear, Poly, and RBF kernels. SVM showed better results than MNB, with precision of 70%, recall of 76.5%, F1 score of 69.5%, and overall accuracy of 69.2%.
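The TF-IDF-plus-classifier pipeline described above can be sketched as follows. This is a minimal illustration using scikit-learn; the four-document dataset is invented, since the paper's news corpus is not reproduced here:

```python
# Illustrative sketch of a TF-IDF + SVM / Multinomial NB text-classification
# pipeline. Texts and labels are hypothetical placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import f1_score

texts = ["join our glorious cause today",
         "city council approves new budget",
         "the enemy spreads lies about us",
         "local team wins the championship"]
labels = [1, 0, 1, 0]  # 1 = propagandist, 0 = non-propagandist

vec = TfidfVectorizer()
X = vec.fit_transform(texts)

# Compare a linear-kernel SVM against Multinomial Naive Bayes,
# as in the comparison reported above.
for clf in (SVC(kernel="linear"), MultinomialNB()):
    clf.fit(X, labels)
    preds = clf.predict(X)
    print(type(clf).__name__, f1_score(labels, preds))
```

In practice the kernel choice (linear, polynomial, RBF) would be tuned on held-out data rather than evaluated on the training set as done in this toy example.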
Organized Behavior Classification of Tweet Sets using Supervised Learning Methods
During the 2016 US elections Twitter experienced unprecedented levels of
propaganda and fake news through the collaboration of bots and hired persons,
the ramifications of which are still being debated. This work proposes an
approach to identify the presence of organized behavior in tweets. The Random
Forest, Support Vector Machine, and Logistic Regression algorithms are each
used to train a model with a data set of 850 records consisting of 299 features
extracted from tweets gathered during the 2016 US presidential election. The
features represent user and temporal synchronization characteristics to capture
coordinated behavior. These models are trained to classify tweet sets among the
categories: organic vs organized, political vs non-political, and pro-Trump vs
pro-Hillary vs neither. The Random Forest algorithm performs best, with greater than 95% average accuracy and F-measure scores for each category. The most valuable features for classification are identified as user-based features, with media use and marking tweets as favorites being the most dominant. (Comment: 51 pages, 5 figures)
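The training setup described above (850 records, 299 features, a Random Forest classifier) can be sketched as follows. The features and labels are synthetic stand-ins, since the election dataset is not reproduced here:

```python
# Hypothetical sketch of Random Forest classification of tweet sets.
# Dimensions match the paper (850 records, 299 features); the data itself
# is randomly generated for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_records, n_features = 850, 299
X = rng.random((n_records, n_features))
# Toy label rule: "organized" when synchronization-like features dominate.
y = (X[:, :10].mean(axis=1) > 0.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
acc = accuracy_score(y_te, model.predict(X_te))
print("test accuracy:", acc)
```

The `feature_importances_` attribute of the fitted model is the usual way to identify the most valuable features, as the paper does for its user-based features.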
(De)constructing difference: a qualitative review of the ‘othering’ of UK Muslim communities, extremism, soft harms, and Twitter analytics
There is some evidence that, in the UK, current counter-terrorism initiatives reproduce and amplify both real and imagined differences between Muslim and anti-Muslim groups, leading in turn to social and community polarisation and isolation. It is far from clear whether these changing perceptions always lead to increased ethnic and religious violence or increased radicalisation. More worrying, however, is the potential for the development of ‘soft harms’ among those ‘suspect communities’: for example, reduced social integration, withdrawal from British cultural life, hate crime, forced marriage, and domestic violence. There has to date been little interrogation of the scale of ‘soft harm’ among Muslim communities. Within this paper, the author offers a qualitative review of how the Muslim ‘other’ has become an ascribed category reproduced through an endemic ‘Muslim common sense’. Following that, the author suggests that Twitter analytics may be harnessed to analyse the attitudes, current condition, and reactions of suspect other communities through the tweeting of everyday events. The aim in doing so is to develop a series of proposals to counter the ideological underpinnings of difference and contribute to current debates on counter-terrorism policy in the UK.
Leveraging Natural Language Processing to Analyse the Temporal Behavior of Extremists on Social Media
Aiming at achieving sustainability and quality of life for citizens, future smart cities adopt a data-centric approach to decision making in which assets, people, and events are constantly monitored to inform decisions. Public opinion monitoring is of particular importance to governments and intelligence agencies, who seek to monitor extreme views and attempts at radicalizing individuals in society. While social media platforms provide increased visibility and a platform to express public views freely, such platforms can also be used to manipulate public opinion, spread hate speech, and radicalize others. Natural language processing and data mining techniques have gained popularity for the analysis of social media content and the detection of extremists and radical views expressed online. However, existing approaches simplify the concept of radicalization to a binary problem in which individuals are classified as extremists or non-extremists. Such binary approaches do not capture the radicalization process's complexity, which is influenced by many aspects such as social interactions, the impact of opinion leaders, and peer pressure. Moreover, the longitudinal analysis of users' interactions and profile evolution over time is lacking in the literature. Aiming at addressing those limitations, this work proposes a sophisticated framework for the analysis of the temporal behavior of extremists on social media platforms. Far-right extremism during the Trump presidency was used as a case study, and a large dataset of over 259,000 tweets was collected to train and test our models. The results obtained are very promising and encourage the use of advanced social media analytics in the support of effective and timely decision-making.
Analisis Sentimen Konten Radikal Melalui Dokumen Twitter Menggunakan Metode Backpropagation
Twitter is a social networking service where users can post and interact with messages known as "tweets". Twitter is also used by some people to voice their opinions on an issue, but these opinions are sometimes excessive, and tweets with radical overtones are occasionally found. Radical acts on social media are usually referred to as radical content. Radical content on social media can certainly harm some parties, and there are also certain parties who exploit radical content to achieve particular goals. Therefore, this study attempts to analyse Indonesian-language tweets containing radical words and to determine whether they constitute positive or negative radical content. Tweets obtained from Twitter that contain public opinions leaning towards radical content are classified. These tweets, which can be called documents or data, first go through a preprocessing stage. Each document is then broken down into six word types, including nouns, verbs, and adjectives, where each word type is further divided into positive and negative. After this split, the number of each word type in each document is counted, so that the document can be converted into numbers which can then be fed into the formula of the algorithm
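The feature scheme described above reduces each document to six counts (nouns, verbs, and adjectives, each split into positive and negative) that are fed to a neural network trained by backpropagation. A minimal sketch, with invented counts and labels, using scikit-learn's backpropagation-trained MLP:

```python
# Minimal sketch of the six-feature backpropagation setup described above.
# Each row is a document's word-type counts; data and labels are invented.
import numpy as np
from sklearn.neural_network import MLPClassifier  # trained via backpropagation

# Columns: [noun+, noun-, verb+, verb-, adj+, adj-] counts per document.
X = np.array([[3, 0, 2, 0, 1, 0],
              [0, 4, 1, 2, 0, 3],
              [2, 1, 1, 0, 2, 0],
              [0, 3, 0, 2, 1, 2]])
y = [1, 0, 1, 0]  # 1 = positive radical content, 0 = negative

clf = MLPClassifier(hidden_layer_sizes=(4,), max_iter=2000, random_state=0)
clf.fit(X, y)
print(clf.predict(X))
```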
The Potential Impact of Big Data in International Development and Humanitarian Aid
Honors (Bachelor's), International Studies, University of Michigan. https://deepblue.lib.umich.edu/bitstream/2027.42/139612/1/emjabs.pd
Artificial Intelligence and Online Extremism: Challenges and Opportunities
Radicalisation is a process that historically used to be triggered mainly through social interactions in places of worship, religious schools, prisons, meeting venues, etc. Today, this process is often initiated on the Internet, where radicalisation content is easily shared, and potential candidates are reached more easily, rapidly, and at an unprecedented scale (Edwards and Gribbon, 2013; Von Behr et al., 2013).
In recent years, some terrorist organisations succeeded in leveraging the power of social media to recruit individuals to their cause and ideology (Farwell, 2014). It is often the case that such recruitment attempts are initiated on open social media platforms (e.g., Twitter, Facebook, Tumblr, YouTube) but then move onto private messages and/or encrypted platforms (e.g., WhatsApp, Telegram). Such encrypted communication channels have also been used by terrorist cells and networks to plan their operations (Gartenstein-Ross and Barr).
To counteract the activities of such organisations, and to halt the spread of radicalisation content, some governments, social media platforms, and counter-extremism agencies are investing in the creation of advanced information technologies to identify and counter extremism through the development of Artificial Intelligence (AI) solutions (Correa and Sureka, 2013; Agarwal and Sureka, 2015a; Scrivens and Davies, 2018).
These solutions have three main objectives: (i) understanding the phenomena behind online extremism (the communication flow, the use of propaganda, the different stages of the radicalisation process, the variety of radicalisation channels, etc.), (ii) automatically detecting radical users and content, and (iii) predicting the adoption and spreading of extremist ideas.
Despite current advancements in the area, multiple challenges still exist, including: (i) the lack of a common definition of prohibited radical and extremist internet activity, (ii) the lack of solid verification of the datasets collected to develop detection and prediction models, (iii) the lack of cooperation across research fields, since most of the developed technological solutions are neither based on, nor do they take advantage of, existing social theories and studies of radicalisation, (iv) the constant evolution of behaviours associated with online extremism in order to avoid being detected by the developed algorithms (changes in terminology, creation of new accounts, etc.) and, (v) the development of ethical guidelines and legislation to regulate the design and development of AI technology to counter radicalisation.
In this book chapter we provide an overview of the current technological advancements towards addressing the problem of online extremism (with a particular focus on Jihadism). We identify some of the limitations of current technologies, and highlight some of the potential opportunities. Our aim is to reflect on the current state of the art and to stimulate discussions on the future design and development of AI technology to target the problem of online extremism.
Modeling Islamist Extremist Communications on Social Media using Contextual Dimensions: Religion, Ideology, and Hate
Terror attacks have been linked in part to online extremist content. Although
tens of thousands of Islamist extremism supporters consume such content, they
are a small fraction relative to peaceful Muslims. The efforts to contain the
ever-evolving extremism on social media platforms have remained inadequate and
mostly ineffective. Divergent extremist and mainstream contexts challenge
machine interpretation, with a particular threat to the precision of
classification algorithms. Our context-aware computational approach to the
analysis of extremist content on Twitter breaks down this persuasion process
into building blocks that acknowledge inherent ambiguity and sparsity that
likely challenge both manual and automated classification. We model this
process using a combination of three contextual dimensions -- religion,
ideology, and hate -- each elucidating a degree of radicalization and
highlighting independent features to render them computationally accessible. We
utilize domain-specific knowledge resources for each of these contextual
dimensions such as Qur'an for religion, the books of extremist ideologues and
preachers for political ideology and a social media hate speech corpus for
hate. Our study makes three contributions to reliable analysis: (i) Development
of a computational approach rooted in the contextual dimensions of religion,
ideology, and hate that reflects strategies employed by online Islamist
extremist groups, (ii) An in-depth analysis of relevant tweet datasets with
respect to these dimensions to exclude likely mislabeled users, and (iii) A
framework for understanding online radicalization as a process to assist
counter-programming. Given the potentially significant social impact, we
evaluate the performance of our algorithms to minimize mislabeling, where our
approach outperforms a competitive baseline by 10.2% in precision. (Comment: 22 pages)