197 research outputs found

    Exposing and explaining fake news on-the-fly

    Social media platforms enable the rapid dissemination and consumption of information. However, users instantly consume such content regardless of the reliability of the shared data. Consequently, this crowdsourcing model is exposed to manipulation. This work contributes an explainable, online classification method to recognize fake news in real time. The proposed method combines unsupervised and supervised Machine Learning approaches with lexica created online. Profiling is built on creator-, content- and context-based features using Natural Language Processing techniques. The explainable classification mechanism displays in a dashboard the features selected for classification and the prediction confidence. The performance of the proposed solution has been validated with real data sets from Twitter, and the results attain 80% accuracy and macro F-measure. This proposal is the first to jointly provide data stream processing, profiling, classification and explainability. Ultimately, the proposed early detection, isolation and explanation of fake news contribute to increasing the quality and trustworthiness of social media content.
    Funding: Xunta de Galicia | Ref. ED481B-2021-118; Xunta de Galicia | Ref. ED481B-2022-093; Fundação para a Ciência e a Tecnologia | Ref. UIDB/50014/2020; Universidade de Vigo/CISU
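
The abstract above reports accuracy and macro F-measure. As a reminder of what the second metric measures, the sketch below computes both from scratch: macro F-measure is the unweighted mean of the per-class F1 scores, so a minority class counts as much as a majority class. The function names and the toy labels are illustrative assumptions, not data or code from the paper.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels (not taken from the paper's Twitter data sets).
y_true = ["fake", "fake", "real", "real", "real"]
y_pred = ["fake", "real", "real", "real", "fake"]

print(accuracy(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

On imbalanced data the two numbers can diverge noticeably, which is why abstracts in this area often report both.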

    Anomaly Detection on Dynamic Graph


    Combating Fake News on Social Media: A Framework, Review, and Future Opportunities

    Social media platforms facilitate the sharing of vast amounts of information among users within seconds. However, some false information, generally referred to as “fake news”, is also widely spread. This can have major negative impacts on individuals and societies. Unfortunately, people are often unable to distinguish fake news from the truth. Therefore, there is an urgent need to find effective mechanisms to fight fake news on social media. To this end, this paper adapts the Straub Model of Security Action Cycle to the context of combating fake news on social media. It uses the adapted framework to classify the vast literature on fake news into action cycle phases (i.e., deterrence, prevention, detection, and mitigation/remedy). Based on a systematic and interdisciplinary review of the relevant literature, we analyze the status and challenges in each stage of combating fake news, and then introduce future research directions. These efforts allow the development of a holistic view of the research frontier on fighting fake news online. We conclude that this is a multidisciplinary issue and, as such, a collaborative effort from different fields is needed to address it effectively.

    State of the art 2015: a literature review of social media intelligence capabilities for counter-terrorism

    Overview: This paper is a review of how information and insight can be drawn from open social media sources. It focuses on the specific research techniques that have emerged, the capabilities they provide, the possible insights they offer, and the ethical and legal questions they raise. These techniques are considered relevant and valuable in so far as they can help to maintain public safety by preventing terrorism, preparing for it, protecting the public from it and pursuing its perpetrators. The report also considers how far this can be achieved against the backdrop of radically changing technology and public attitudes towards surveillance. This is an updated version of a 2013 paper on the same subject, State of the Art. Since 2013, there have been significant changes in social media, how it is used by terrorist groups, and the methods being developed to make sense of it. The paper is structured as follows: Part 1 is an overview of social media use, focused on how it is used by groups of interest to those involved in counter-terrorism. This includes new sections on trends of social media platforms and a new section on Islamic State (IS). Part 2 provides an introduction to the key approaches of social media intelligence (henceforth ‘SOCMINT’) for counter-terrorism. Part 3 sets out a series of SOCMINT techniques. For each technique, a series of capabilities and insights is considered, the validity and reliability of the method are assessed, and possible applications to counter-terrorism work are explored. Part 4 outlines a number of important legal, ethical and practical considerations when undertaking SOCMINT work.

    Can we predict a riot? Disruptive event detection using Twitter

    In recent years, there has been increased interest in real-world event detection using publicly accessible data made available through Internet technology such as Twitter, Facebook, and YouTube. In these highly interactive systems, the general public are able to post real-time reactions to “real world” events, thereby acting as social sensors of terrestrial activity. Automatically detecting and categorizing events, particularly small-scale incidents, using streamed data is a non-trivial task but would be of high value to public safety organisations such as local police, who need to respond accordingly. To address this challenge, we present an end-to-end integrated event detection framework that comprises five main components: data collection, pre-processing, classification, online clustering, and summarization. The integration between classification and clustering enables events to be detected, as well as related smaller-scale “disruptive events,” smaller incidents that threaten social safety and security or could disrupt social order. We present an evaluation of the effectiveness of detecting events using a variety of features derived from Twitter posts, namely temporal, spatial, and textual content. We evaluate our framework on a large-scale, real-world dataset from Twitter. Furthermore, we apply our event detection system to a large corpus of tweets posted during the August 2011 riots in England. We use ground-truth data based on intelligence gathered by the London Metropolitan Police Service, which provides a record of actual terrestrial events and incidents during the riots, and show that our system can perform as well as terrestrial sources, and even better in some cases

    Sequential path signature networks for personalised longitudinal language modeling

    Longitudinal user modeling can provide a strong signal for various downstream tasks. Despite the rapid progress in representation learning, dynamic aspects of modelling individuals’ language have only been sparsely addressed. We present a novel extension of neural sequential models using the notion of path signatures from rough path theory, which constitute graduated summaries of continuous paths and have the ability to capture non-linearities in trajectories. By combining path signatures of users’ history with contextual neural representations and recursive neural networks we can produce compact time-sensitive user representations. Given the magnitude of mental health conditions with symptoms manifesting in language, we show the applicability of our approach on the task of identifying changes in individuals’ mood by analysing their online textual content. By directly integrating signature transforms of users’ history in the model architecture we jointly address the two most important aspects of the task, namely sequentiality and temporality. Our approach achieves state-of-the-art performance on macro-average F1 score on the two available datasets for the task, outperforming or performing on-par with state-of-the-art models utilising only historical posts and even outperforming prior models which also have access to future posts of users
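
To make the notion of a path signature in the abstract above more concrete, here is a minimal sketch that computes the first two signature levels of a piecewise-linear path in plain Python. It is not the authors' architecture: the function name, the list-of-points input format and the truncation at depth 2 are assumptions chosen for illustration. Level 1 is the total increment of the path; level 2 holds the iterated integrals, accumulated segment by segment with Chen's identity.

```python
def signature_depth2(path):
    """Return (level1, level2) signature terms of a piecewise-linear path.

    `path` is a list of points, each a list of d floats (hypothetical format).
    level1[i]    = total increment in coordinate i
    level2[i][j] = iterated integral of dX^i then dX^j along the path
    """
    d = len(path[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for prev, curr in zip(path, path[1:]):
        step = [c - p for p, c in zip(prev, curr)]
        # Chen's identity for appending one straight segment:
        # S2 <- S2 + S1 (x) step + 0.5 * step (x) step
        # (level2 must be updated before level1).
        for i in range(d):
            for j in range(d):
                level2[i][j] += level1[i] * step[j] + 0.5 * step[i] * step[j]
        for i in range(d):
            level1[i] += step[i]
    return level1, level2


# Toy path: one unit step right, then one unit step up.
level1, level2 = signature_depth2([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
print(level1)
print(level2)
```

The antisymmetric part of level 2, `0.5 * (level2[0][1] - level2[1][0])`, is the signed area enclosed between the path and its chord. It depends on the order in which the moves happen, which is the kind of sequential information the abstract refers to.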

    An Ensemble Classification and Hybrid Feature Selection Approach for Fake News Stance Detection

    Developments in the Internet and social media have revolutionised how news is represented and disseminated. News spreads quickly and cheaply on social media. Amid this rapid distribution, dangerous or seductive information such as user-generated false news spreads just as quickly. Distinguishing true incidents from false news stories is a key challenge. This study proposes applying dimensionality reduction to the feature vectors before sending them to the classifier. These methods do not significantly affect the result, while significantly reducing the time needed to complete a prediction. This paper presents a hybrid feature selection method to overcome the above-mentioned issues. Fake news is classified with ensembles that identify connections between the stories and headlines of news items. First, data is pre-processed to transform unstructured data into a structured form for ease of processing. In the second step, latent features of false news are extracted from the diverse connections among news articles using PCA (Principal Component Analysis). In the third step, FPSO (Fuzzy Particle Swarm Optimization) selects features for the feature reduction procedure. To efficiently learn how news items are represented and to spot fake news, this study builds ELMs (Ensemble Learning Models). The study uses a dataset obtained from Kaggle. Four assessment metrics are used to evaluate the performance of the classification models.

    Efficient Integration of Multi-Order Dynamics and Internal Dynamics in Stock Movement Prediction

    Advances in deep neural network (DNN) architectures have enabled new prediction techniques for stock market data. Unlike other multivariate time-series data, stock markets show two unique characteristics: (i) multi-order dynamics, as stock prices are affected by strong non-pairwise correlations (e.g., within the same industry); and (ii) internal dynamics, as each individual stock shows some particular behaviour. Recent DNN-based methods capture multi-order dynamics using hypergraphs, but rely on the Fourier basis in the convolution, which is both inefficient and ineffective. In addition, they largely ignore internal dynamics by adopting the same model for each stock, which implies a severe information loss. In this paper, we propose a framework for stock movement prediction to overcome the above issues. Specifically, the framework includes temporal generative filters that implement a memory-based mechanism onto an LSTM network in an attempt to learn individual patterns per stock. Moreover, we employ hypergraph attentions to capture the non-pairwise correlations. Here, using the wavelet basis instead of the Fourier basis enables us to simplify the message passing and focus on the localized convolution. Experiments with US market data over six years show that our framework outperforms state-of-the-art methods in terms of profit and stability. Our source code and data are available at https://github.com/thanhtrunghuynh93/estimate.
    Comment: Technical report for accepted paper at WSDM 202

    Event Detection and Tracking Detection of Dangerous Events on Social Media

    Online social media platforms have become essential tools for communication and information exchange in our lives. They are used for connecting with people and sharing information. This phenomenon has been intensively studied in the past decade to investigate users’ sentiments for different scenarios and purposes. As the technology advanced and its popularity increased, different terms came to be used for similar topics, which often results in confusion. We study such trends and intend to propose a uniform solution that deals with the subject clearly. We gather all these ambiguous terms under the umbrella of the most recent and popular terms to reach a concise verdict. Many events have been addressed in recent works that cover only specific types and domains of events. For the sake of keeping things simple and practical, the events that are extreme, negative, and dangerous are grouped under the name Dangerous Events (DE). These dangerous events are further divided into three main categories of action-based, scenario-based, and sentiments-based dangerous events to specify their characteristics. We then propose deep-learning-based models to detect events that are dangerous in nature. The deep-learning models, which include BERT, RoBERTa, and XLNet, provide valuable results that can effectively help solve the issue of detecting dangerous events using various dimensions. Even though the models perform well, their performance is constrained by the scarcity of available event datasets and the lower quality of the data for certain events, a constraint that can be tackled by addressing these data issues.
    Online social media platforms have become essential tools for communication, connecting with others, and exchanging information in our lives. This phenomenon has been intensively studied in the past decade to investigate users’ sentiments in different scenarios and for various purposes.
    However, the use of social media has become more complex and has turned into a broader phenomenon owing to the involvement of multiple actors, such as companies, groups, and other organisations. As the technology advanced and its popularity increased, the use of different terms referring to similar topics created confusion. In other words, models are trained on information from specific terms and scopes. Standardisation is therefore imperative. The aim of this work is to unite the different terms in use under broader, standardised terms. Danger can be a threat such as social violence, natural disasters, intellectual or community harm, contagion, social unrest, economic loss, or simply the spread of hateful and violent ideologies. We study these different events and classify them into topics so that a topic-based detection technique can be designed and integrated under the term Dangerous Events (DE). Consequently, we define the proposed term “Dangerous Events” and divide it into three main categories in order to specify their characteristics. These are named Dangerous Events, top-level Dangerous Events, and lower-level Dangerous Events. The MAVEN dataset was used to obtain the datasets for the experiment. These datasets are manually filtered by event type to separate dangerous events from general events. The transformer models BERT, RoBERTa, and XLNet were used to classify text data according to the respective Dangerous Events category. The results showed that BERT outperforms the other models and can be used effectively for the task of detecting Dangerous Events. Notably, the approach of splitting the datasets significantly increased the performance of the models.
    Several methods have been proposed for event detection. Event detection (ED) methods are mostly classified as supervised or unsupervised. Supervised methods include support vector machine (SVM), conditional random field (CRF), decision tree (DT), and naive Bayes (NB), among others, while the unsupervised category includes query-based, statistical-based, probabilistic-based, clustering-based, and graph-based methods. The two approaches in use in event detection are called document-pivot and feature-pivot. They differ mostly in the clustering approach, in the way documents are used to build feature vectors, and in the similarity metric used to decide whether two documents correspond to the same event. Besides event detection, event prediction is an important but complicated problem that spans several dimensions. Many of these events are difficult to predict before they become visible and occur. For example, it is impossible to anticipate natural disasters, which are detectable only after they happen. Resources in terms of event datasets are limited. ACE 2005, MAVEN, and EVIN are some examples of datasets available for event detection. Recent work has shown that Transformer-based pre-trained models (PTMs) can achieve state-of-the-art performance on various NLP tasks. These models are pre-trained on large amounts of text. They learn embeddings, or vector representations, for the words of the language such that related words cluster together in the vector space. Three different transformers, namely BERT, RoBERTa, and XLNet, are used to conduct the experiment and to draw conclusions by comparing these models.
    The Transformer-based models are fine-tuned using a 70/30 split of the datasets for training and testing/validation. The hyperparameter settings include 10 epochs, a batch size of 16, and the AdamW optimizer with a learning rate of 2e-5 for BERT and RoBERTa and 3e-5 for XLNet. For dangerous events, BERT yields 60%, RoBERTa 59%, and XLNet only 54% overall accuracy. In the other experimental setup, with top-level events, BERT and XLNet achieve 71% and 70%, while RoBERTa outperforms the other models with 74% accuracy. For action-based, scenario-based, and sentiment-based DE, BERT achieves 62%, 85%, and 81%, respectively; RoBERTa 61%, 83%, and 71%; and XLNet 52%, 81%, and 77% accuracy. There is a need to clarify the ambiguity between different works that address similar problems using different terms. The proposed idea of referring to specific events as dangerous events makes the problem at hand easier to address. However, the scarcity of event datasets limits model performance and progress on detection tasks. The availability of more information related to dangerous events could improve the performance of the existing model. It is clear that deep-learning models such as BERT, RoBERTa, and XLNet can help detect and classify dangerous events efficiently. Overall, BERT outperforms RoBERTa and XLNet in detecting dangerous events. It is equally important to track events after their detection.
    Therefore, for future work, we propose implementing techniques that deal with space and time in order to monitor how events emerge over time.
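
The fine-tuning setup stated in the abstract above (70/30 split, 10 epochs, batch size 16, AdamW, learning rate 2e-5 for BERT and RoBERTa and 3e-5 for XLNet) can be written down as a small configuration sketch. Only the numeric values come from the abstract. The class name, field names, dictionary keys, and the shuffled split with a fixed seed are assumptions made for illustration, and no model is trained here.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters as reported in the abstract (names are hypothetical)."""
    model_name: str
    learning_rate: float
    epochs: int = 10
    batch_size: int = 16
    optimizer: str = "AdamW"
    train_fraction: float = 0.7


# Keys are illustrative labels, not checkpoint identifiers.
CONFIGS = {
    "BERT": FineTuneConfig("BERT", 2e-5),
    "RoBERTa": FineTuneConfig("RoBERTa", 2e-5),
    "XLNet": FineTuneConfig("XLNet", 3e-5),
}


def split_dataset(examples, train_fraction=0.7, seed=0):
    """Shuffle and split into training and test/validation parts.

    Shuffling and the fixed seed are assumptions; the abstract only
    states the 70/30 proportion.
    """
    items = list(examples)
    random.Random(seed).shuffle(items)
    cut = int(round(len(items) * train_fraction))
    return items[:cut], items[cut:]


train, held_out = split_dataset(range(10), CONFIGS["BERT"].train_fraction)
print(len(train), len(held_out))
```

Keeping the per-model learning rate in one place makes the comparison between the three models easier to reproduce, since everything else is shared.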