4 research outputs found

    Robust Spammer Detection Using Collaborative Neural Network in Internet of Thing Applications

    Spamming is emerging as a key threat to Internet of Things (IoT)-based social media applications and poses serious security risks to IoT cyberspace. To this end, artificial intelligence-based detection and identification techniques have been widely investigated. Existing work on IoT cyberspace falls into two categories: 1) behavior pattern-based approaches; and 2) semantic pattern-based approaches. However, these approaches cannot effectively handle concealed, complicated, and changing spamming activities, especially in the highly uncertain environment of the IoT. To address this challenge, in this paper, we exploit the collaborative awareness of both patterns and propose a Collaborative neural network-based Spammer detection mechanism (Co-Spam) for social media applications. In particular, it introduces multi-source information fusion by collaboratively encoding long-term behavioral and semantic patterns. Hence, a more comprehensive representation of the feature space can be captured for further spammer detection. Empirically, we conduct a series of experiments on two real-world datasets under different scenarios and parameter settings. The efficiency of the proposed Co-Spam is compared with five baselines with respect to several evaluation metrics. The experimental results indicate that Co-Spam achieves an average performance improvement of approximately 5% over the baselines.

    Online Social Deception and Its Countermeasures for Trustworthy Cyberspace: A Survey

    We are living in an era when online communication over social network services (SNSs) has become an indispensable part of people's everyday lives. As a consequence, online social deception (OSD) in SNSs has emerged as a serious threat in cyberspace, particularly for users vulnerable to such cyberattacks. Cyber attackers have exploited the sophisticated features of SNSs to carry out harmful OSD activities, such as financial fraud, privacy threats, or sexual/labor exploitation. Therefore, it is critical to understand OSD and develop effective countermeasures against it in order to build trustworthy SNSs. In this paper, we conduct an extensive survey covering (i) the multidisciplinary concepts of social deception; (ii) types of OSD attacks and their unique characteristics compared to other social network attacks and cybercrimes; (iii) comprehensive defense mechanisms embracing prevention, detection, and response (or mitigation) against OSD attacks, along with their pros and cons; (iv) datasets/metrics used for validation and verification; and (v) legal and ethical concerns related to OSD research. Based on this survey, we provide insights into the effectiveness of countermeasures and the lessons from existing literature. We conclude this survey paper with an in-depth discussion of the limitations of the state of the art and recommend future research directions in this area.
    Comment: 35 pages, 8 figures, submitted to ACM Computing Survey

    Convolution-deconvolution word embedding: an end-to-end multi-prototype fusion embedding method for natural language processing

    Existing unsupervised word embedding methods have proved effective at capturing latent semantic information on various Natural Language Processing (NLP) tasks. However, existing word representation methods are incapable of tackling both the polysemous-unaware and task-unaware problems that are common in NLP tasks. In this work, we present a novel Convolution-Deconvolution Word Embedding (CDWE), an end-to-end multi-prototype fusion embedding that fuses context-specific information and task-specific information. To the best of our knowledge, we are the first to extend deconvolution (i.e., transposed convolution), which has been widely used in computer vision, to word embedding generation. We empirically demonstrate the efficiency and generalization ability of CDWE by applying it to two representative tasks in NLP: text classification and machine translation. The CDWE models significantly outperform the baselines and achieve state-of-the-art results on both tasks. To validate the efficiency of CDWE further, we demonstrate how CDWE solves the polysemous-unaware and task-unaware problems by analyzing the Text Deconvolution Saliency, which is an existing strategy for evaluating the outputs of deconvolution.

    Exploring Novel Datasets and Methods for the Study of False Information

    False information has increasingly become a subject of much discussion. Recently, disinformation has been linked to causing massive social harm, leading to the decline of democracy, and hindering global efforts in an international health crisis. In computing, and specifically Natural Language Processing (NLP), much effort has been put into tackling this problem. This has led to an increase in research on automated fact-checking and the language of disinformation. However, current research suffers from looking at a limited variety of sources. Much focus has, understandably, been given to platforms such as Twitter, Facebook and WhatsApp, as well as to traditional news articles online. Few works in NLP have looked at the specific communities where false information ferments. There has also been something of a topical constraint, with most examples of “Fake News” relating to current political issues. This thesis contributes to this rapidly growing research area by looking wider for new sources of data, and developing methods to analyse them. Specifically, it introduces two new datasets to the field and performs analyses on both. The first of these, a corpus of April Fools hoaxes, is analysed with a feature-driven approach to examine the generalisability of different features in the classification of false information. This is the first corpus of April Fools news articles, and is publicly available for researchers. The second dataset, a corpus of online Flat Earth communities, is also the first of its kind. In addition to performing the first NLP analysis of the language of Flat Earth fora, an exploration is performed to look for the existence of sub-groups within these communities, as well as an analysis of language change. To support this analysis, language change methods are surveyed, and a new method for comparing the language change of groups over time is developed. The methods used, brought together from both NLP and Corpus Linguistics, provide new insight into the language of false information, and the way communities discuss it.