11 research outputs found

    Identifying Misinformation on YouTube through Transcript Contextual Analysis with Transformer Models

    Full text available
    Misinformation on YouTube is a significant concern, necessitating robust detection strategies. In this paper, we introduce a novel methodology for video classification, focusing on the veracity of the content. We convert the conventional video classification task into a text classification task by leveraging the textual content derived from the video transcripts. We employ advanced machine learning techniques such as transfer learning to solve the classification challenge. Our approach incorporates two forms of transfer learning: (a) fine-tuning base transformer models such as BERT, RoBERTa, and ELECTRA, and (b) few-shot learning using the sentence-transformers MPNet and RoBERTa-large. We apply the trained models to three datasets: (a) a YouTube vaccine-misinformation video dataset, (b) a YouTube pseudoscience video dataset, and (c) a Fake-News dataset (a collection of articles). Including the Fake-News dataset extended the evaluation of our approach beyond YouTube videos. Using these datasets, we evaluated the models' ability to distinguish valid information from misinformation. The fine-tuned models yielded a Matthews Correlation Coefficient > 0.81, accuracy > 0.90, and F1 score > 0.90 on two of the three datasets. Interestingly, the few-shot models outperformed the fine-tuned ones by 20% in both accuracy and F1 score on the YouTube pseudoscience dataset, highlighting the potential utility of this approach, especially in the context of limited training data.
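
    As an illustration of the fine-tuning route described above, here is a minimal sketch of binary text classification over transcript snippets, assuming the Hugging Face transformers and datasets libraries; the checkpoint, toy examples, and hyperparameters are illustrative and not taken from the paper.

        from datasets import Dataset
        from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                                  Trainer, TrainingArguments)

        # Hypothetical toy data: transcript text paired with a binary veracity label
        # (0 = valid information, 1 = misinformation).
        data = Dataset.from_dict({
            "text": ["example transcript claiming a miracle cure ...",
                     "example transcript summarizing official guidance ..."],
            "label": [1, 0],
        })

        model_name = "roberta-base"  # any BERT/RoBERTa/ELECTRA-style checkpoint
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

        def tokenize(batch):
            return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

        data = data.map(tokenize, batched=True)

        trainer = Trainer(
            model=model,
            args=TrainingArguments(output_dir="out", num_train_epochs=3,
                                   per_device_train_batch_size=8),
            train_dataset=data,
        )
        trainer.train()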

    A graph exploration method for identifying influential spreaders in complex networks

    No full text
    The problem of identifying influential spreaders - the important nodes - in a real-world network is of high importance due to its theoretical interest as well as its practical applications, such as accelerating information diffusion, controlling the spread of a disease, and improving the resilience of networks to external attacks. In this paper, we propose a graph exploration sampling method that accurately identifies the influential spreaders in a complex network, without any prior knowledge of the original graph apart from the collected samples/subgraphs. The method explores the graph following a deterministic selection rule and outputs a graph sample - the set of edges that have been crossed. The proposed method is based on a version of the Rank Degree graph sampling algorithm. We conduct extensive experiments on eight real-world networks by simulating the susceptible-infected-recovered (SIR) and susceptible-infected-susceptible (SIS) epidemic models, which serve as ground-truth identifiers of nodes' spreading efficiency. Experimentally, we show that by exploring only 20% of the network and using degree centrality as well as the k-core measure, we are able to identify the influential spreaders with at least the same accuracy as in the full-information case, namely, the case where we have access to the original graph and compute the centrality measures on it. Finally, and more importantly, we present strong evidence that degree centrality - the degree of nodes in the collected samples - is almost as accurate as the k-core values obtained from the original graph.
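
    The deterministic exploration idea can be pictured with a short sketch: starting from a seed node, repeatedly expand the highest-degree frontier node, record the crossed edges, and rank nodes by their degree in the collected sample. This is a simplified stand-in for the Rank Degree-based method, not the authors' exact algorithm; the graph, seed, and budget are hypothetical, and networkx is assumed.

        import networkx as nx

        def explore_sample(G, seed, budget):
            """Deterministically explore G from `seed`, always expanding the
            highest-degree frontier node, until `budget` edges have been crossed.
            (A simplification of Rank Degree-style exploration, for illustration.)"""
            sampled_edges, visited, frontier = set(), {seed}, {seed}
            while frontier and len(sampled_edges) < budget:
                u = max(frontier, key=G.degree)      # deterministic selection rule
                frontier.remove(u)
                for v in G.neighbors(u):
                    sampled_edges.add(tuple(sorted((u, v))))
                    if v not in visited:
                        visited.add(v)
                        frontier.add(v)
                    if len(sampled_edges) >= budget:
                        break
            return sampled_edges

        G = nx.barabasi_albert_graph(1000, 3)                  # hypothetical network
        budget = int(0.2 * G.number_of_edges())                # explore ~20% of the edges
        sample = nx.from_edgelist(explore_sample(G, seed=0, budget=budget))

        # Rank nodes by their degree *in the sample* as a proxy for spreading efficiency.
        top_spreaders = sorted(sample.degree, key=lambda kv: kv[1], reverse=True)[:10]
        print(top_spreaders)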

    Privacy-Preserving Online Content Moderation with Federated Learning

    No full text
    Users are exposed to a large volume of harmful content that appears daily on various social network platforms. One solution for protecting users is to develop online moderation tools that use Machine Learning (ML) techniques for automatic detection or content filtering. On the other hand, the processing of user data requires compliance with privacy policies. This paper proposes a privacy-preserving Federated Learning (FL) framework for online content moderation that incorporates Central Differential Privacy (CDP). We simulate the FL training of a classifier for detecting tweets with harmful content, and we show that the performance of the FL framework can be close to that of the centralized approach. Moreover, it maintains high performance even if only a small number of clients (each with a small number of tweets) is available for the FL training. When reducing the number of clients (from fifty to ten) or the tweets per client (from 1K to 100), the classifier can still achieve a high AUC. Furthermore, we extend the evaluation to four other Twitter datasets that capture different types of user misbehavior and still obtain a promising performance (61% - 80% AUC).
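
    To make the simulated FL setup concrete, below is a minimal federated averaging (FedAvg) sketch with synthetic clients, assuming PyTorch; the linear model, feature size, and client data are placeholders rather than the paper's classifier, and the CDP noise step is omitted here (a sketch of it follows the next entry).

        import torch
        from torch import nn

        def local_update(global_state, data, epochs=1, lr=0.1):
            """Train a copy of the global model on one client's private (features, labels)."""
            model = nn.Linear(16, 2)
            model.load_state_dict(global_state)
            opt = torch.optim.SGD(model.parameters(), lr=lr)
            loss_fn = nn.CrossEntropyLoss()
            X, y = data
            for _ in range(epochs):
                opt.zero_grad()
                loss_fn(model(X), y).backward()
                opt.step()
            return model.state_dict()

        # Hypothetical clients: each holds a small set of tweet feature vectors and labels.
        clients = [(torch.randn(100, 16), torch.randint(0, 2, (100,))) for _ in range(10)]

        global_model = nn.Linear(16, 2)
        for rnd in range(5):                                   # federated rounds
            updates = [local_update(global_model.state_dict(), c) for c in clients]
            avg = {k: torch.stack([u[k] for u in updates]).mean(dim=0)
                   for k in updates[0]}                        # FedAvg aggregation
            global_model.load_state_dict(avg)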

    Privacy-Preserving Online Content Moderation: A Federated Learning Use Case

    No full text
    Users are exposed to a large volume of harmful content that appears daily on various social network platforms. One solution for protecting users is to develop online moderation tools that use Machine Learning (ML) techniques for automatic detection or content filtering. On the other hand, the processing of user data requires compliance with privacy policies. In this paper, we propose a framework for developing content moderation tools in a privacy-preserving manner, where sensitive information stays on the users' devices. For this purpose, we apply Differentially Private Federated Learning (DP-FL), where the training of ML models is performed locally on the users' devices and only the model updates are shared with a central entity. To demonstrate the utility of our approach, we simulate harmful text classification on Twitter data in a distributed FL fashion, but the overall concept can be generalized to other types of misbehavior, data, and platforms. We show that the performance of the proposed FL framework can be close to that of the centralized approach, for both DP-FL and non-DP FL. Moreover, it maintains high performance even if only a small number of clients (each with a small number of tweets) is available for the FL training. When reducing the number of clients (from fifty to ten) or the tweets per client (from 1K to 100), the classifier can still achieve a high AUC. Furthermore, we extend the evaluation to four other Twitter datasets that capture different types of user misbehavior and still obtain a promising performance (61% - 80% AUC). Finally, we explore the overhead on the users' devices during the FL training phase and show that the local training does not introduce excessive CPU utilization or memory consumption overhead.
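
    Building on the FedAvg sketch above, one common way to realize central differential privacy is for the server to clip each client's contribution and add Gaussian noise to the aggregate. The sketch below illustrates that aggregation step only, assuming PyTorch; the clip norm and noise multiplier are placeholders, privacy accounting is omitted, and this is not claimed to be the paper's exact mechanism.

        import torch

        def dp_aggregate(updates, clip_norm=1.0, noise_multiplier=1.0):
            """Clip each client's (state_dict-shaped) update and add Gaussian noise to the
            average - a simplified central-DP aggregation step. In practice the updates
            would be deltas from the global model, and the noise scale would come from a
            privacy accountant."""
            clipped = []
            for u in updates:
                flat = torch.cat([p.flatten() for p in u.values()])
                scale = min(1.0, clip_norm / (flat.norm().item() + 1e-12))
                clipped.append({k: p * scale for k, p in u.items()})
            n = len(updates)
            noisy = {}
            for k in clipped[0]:
                mean = torch.stack([c[k] for c in clipped]).mean(dim=0)
                noise = torch.normal(0.0, noise_multiplier * clip_norm / n, mean.shape)
                noisy[k] = mean + noise
            return noisy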

    Did State-Sponsored Trolls Shape the 2016 US Presidential Election Discourse? Quantifying Influence on Twitter

    No full text
    It is a widely accepted fact that state-sponsored Twitter accounts operated during the 2016 US presidential election, spreading millions of tweets with misinformation and inflammatory political content. Whether these social media campaigns of the so-called “troll” accounts were able to manipulate public opinion is still in question. Here, we quantify the influence of troll accounts on Twitter by analyzing 152.5 million tweets (by 9.9 million users) from that period. The data contain original tweets from 822 troll accounts identified as such by Twitter. We construct and analyze a very large interaction graph of 9.3 million nodes and 169.9 million edges using graph analysis techniques and a game-theoretic centrality measure. Then, we quantify the influence of all Twitter accounts on the overall information exchange as defined by the retweet cascades. We provide a global influence ranking of all Twitter accounts and find that one troll account appears in the top-100 and four in the top-1000. This, combined with other findings presented in this paper, constitutes evidence that the driving force of virality and influence in the network came from regular users - users who have not been classified as trolls by Twitter. On the other hand, we find that, on average, troll accounts were tens of times more influential than regular users. Moreover, 23% and 22% of regular accounts in the top-100 and top-1000, respectively, have since been suspended by Twitter. This raises questions about their authenticity and practices during the 2016 US presidential election.
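
    As a toy illustration of ranking accounts in a retweet interaction graph, the sketch below uses one classic game-theoretic centrality - the Shapley value of the coverage game v(S) = |S ∪ N(S)|, which has a simple closed form - on a handful of invented retweet edges. It is not necessarily the centrality measure used in the paper, the usernames are hypothetical, and networkx is assumed.

        import networkx as nx

        def shapley_coverage_centrality(G):
            """Shapley value of the coverage game v(S) = |S ∪ N(S)| (Michalak et al.),
            computable in closed form: SV(i) = sum over u in {i} ∪ N(i) of 1/(1 + deg(u))."""
            return {i: sum(1.0 / (1 + G.degree(u)) for u in set(G[i]) | {i}) for i in G}

        # Hypothetical interaction graph: an edge (a, b) when account a retweeted account b;
        # the real graph would be built from the 152.5M-tweet collection.
        retweets = [("alice", "troll_1"), ("bob", "troll_1"), ("carol", "alice"),
                    ("dave", "alice"), ("erin", "bob")]
        G = nx.Graph(retweets)

        # Global influence ranking over all accounts in the toy graph.
        ranking = sorted(shapley_coverage_centrality(G).items(),
                         key=lambda kv: kv[1], reverse=True)
        print(ranking)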

    A Unified Graph-Based Approach to Disinformation Detection using Contextual and Semantic Relations

    No full text
    As recent events have demonstrated, disinformation spread through social networks can have dire political, economic, and social consequences. Detecting disinformation must inevitably rely on the structure of the network, on users' particularities, and on event occurrence patterns. We present a graph data structure, which we denote as a meta-graph, that combines underlying users' relational event information as well as semantic and topical modeling. We detail the construction of an example meta-graph using Twitter data covering the 2016 US election campaign and then compare the detection of disinformation at the cascade level, using well-known graph neural network algorithms, to the same algorithms applied to the meta-graph nodes. The comparison shows a consistent 3%-4% improvement in accuracy when using the meta-graph, over all considered algorithms, compared to basic cascade classification, and a further 1% increase when topic modeling and sentiment analysis are considered. We carry out the same experiment on two other datasets, HealthRelease and HealthStory, part of the FakeHealth dataset repository, with consistent results. Finally, we discuss further advantages of our approach, such as the ability to augment the graph structure using external data sources and the ease with which multiple meta-graphs can be combined, and we compare our method to other graph-based disinformation detection frameworks.
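
    For readers unfamiliar with GNN-based cascade classification, the following sketch labels meta-graph nodes (one node per cascade, with combined relational/semantic features) using a two-layer GCN, assuming PyTorch Geometric; the random features, edges, and labels are placeholders, not the paper's data or exact architecture.

        import torch
        import torch.nn.functional as F
        from torch_geometric.data import Data
        from torch_geometric.nn import GCNConv

        # Hypothetical meta-graph: each node is a retweet cascade described by a feature
        # vector (user, semantic, and topical features); edges connect related cascades.
        num_nodes, num_feats = 200, 32
        data = Data(x=torch.randn(num_nodes, num_feats),
                    edge_index=torch.randint(0, num_nodes, (2, 800)),
                    y=torch.randint(0, 2, (num_nodes,)))       # 0 = reliable, 1 = disinformation

        class GCN(torch.nn.Module):
            def __init__(self):
                super().__init__()
                self.conv1 = GCNConv(num_feats, 64)
                self.conv2 = GCNConv(64, 2)

            def forward(self, x, edge_index):
                x = F.relu(self.conv1(x, edge_index))
                return self.conv2(x, edge_index)

        model = GCN()
        opt = torch.optim.Adam(model.parameters(), lr=0.01)
        for epoch in range(100):
            opt.zero_grad()
            loss = F.cross_entropy(model(data.x, data.edge_index), data.y)
            loss.backward()
            opt.step()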

    A Qualitative Analysis of Illicit Arms Trafficking on Darknet Marketplaces

    No full text
    During the last decade, the dark web has become a playground for criminal and underground activities, such as marketplaces for drugs and guns, as well as illegal content sharing. The dark web is one of the top crime environments highlighted in EUROPOL's Internet Organised Crime Threat Assessment 2021. This paper provides a qualitative study of darknet marketplaces involved in illegal arms trafficking. For this purpose, we implemented a crawler based on the ACHE Python library to collect hidden web pages (onion services) on the Tor network. We gathered data from ten marketplaces recommended by dark web search engines - Ahmia, Deep Search, and Onion Land Search. We provide a first report on the overall landscape of illicit arms trafficking, discussing the range of weapons offered, such as military drones, explosives, and other related products, together with the payment and shipping methods provided by the vendors. The findings verify previous reports from reputable institutions (the United Nations and RAND Europe). Most of these illicit marketplaces are easily accessible to the average user; they are well organized, offer a large variety of firearms, and also provide extensive customer support.
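
    For context on how onion services are reached programmatically, here is a minimal sketch that fetches a single hidden-service page through a locally running Tor client using the requests library; it is not the ACHE-based crawler used in the study, and the onion address is a placeholder.

        import requests

        # Route traffic through Tor's local SOCKS proxy (default port 9050); the
        # 'socks5h' scheme lets Tor resolve the .onion hostname. Requires a running
        # Tor client and the requests[socks] extra (PySocks) to be installed.
        TOR_PROXIES = {"http": "socks5h://127.0.0.1:9050",
                       "https": "socks5h://127.0.0.1:9050"}

        def fetch_onion(url, timeout=60):
            """Fetch one hidden-service page through Tor and return its HTML."""
            resp = requests.get(url, proxies=TOR_PROXIES, timeout=timeout)
            resp.raise_for_status()
            return resp.text

        # Placeholder address; real seeds came from dark web search engines
        # (Ahmia, Deep Search, Onion Land Search).
        html = fetch_onion("http://exampleonionaddress.onion/")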

    HyperGraphDis: Leveraging Hypergraphs for Contextual and Social-Based Disinformation Detection

    No full text
    In light of the growing impact of disinformation on social, economic, and political landscapes, accurate and efficient identification methods are increasingly critical. This paper introduces HyperGraphDis, a novel approach for detecting disinformation on Twitter that employs a hypergraph-based representation to capture (i) the intricate social structures arising from retweet cascades, (ii) relational features among users, and (iii) semantic and topical nuances. Evaluated on four Twitter datasets - focusing on the 2016 U.S. presidential election and the COVID-19 pandemic - HyperGraphDis outperforms existing methods in both accuracy and computational efficiency, underscoring its effectiveness and scalability for tackling the challenges posed by disinformation dissemination. HyperGraphDis displays exceptional performance on a COVID-19-related dataset, achieving a weighted F1 score of approximately 89.5%, a notable improvement of around 4% over the other state-of-the-art methods. Significant reductions in computation time are also observed for both model training and inference: training completes 2.3 to 7.6 times faster than the second-best method across the four datasets, and inference is 1.3 to 6.8 times faster than the state of the art.
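
    To illustrate what a hypergraph representation of retweet cascades looks like in code, the sketch below builds a user-by-cascade incidence matrix from a few invented cascades, assuming NumPy; this shows only the data-structure idea, not the HyperGraphDis model itself.

        import numpy as np

        # Hypothetical hypergraph: each hyperedge is a retweet cascade containing the
        # users who took part in it; users shared across cascades tie the cascades together.
        cascades = {
            "cascade_1": ["alice", "bob", "carol"],
            "cascade_2": ["bob", "dave"],
            "cascade_3": ["carol", "dave", "erin"],
        }

        users = sorted({u for members in cascades.values() for u in members})
        user_idx = {u: i for i, u in enumerate(users)}

        # Incidence matrix H: rows = users, columns = cascades, H[i, j] = 1 when user i
        # participated in cascade j; a model would combine this structure with node features.
        H = np.zeros((len(users), len(cascades)))
        for j, members in enumerate(cascades.values()):
            for u in members:
                H[user_idx[u], j] = 1

        # Cascade-cascade overlap via shared users (a simple clique-expansion view).
        print(H.T @ H)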