
    A systematic literature review on spam content detection and classification

    The presence of spam content in social media is tremendously increasing, and therefore the detection of spam has become vital. The spam contents increase as people extensively use social media, i.e., Facebook, Twitter, YouTube, and E-mail. The time spent by people using social media is growing rapidly, especially in the time of the pandemic. Users get a lot of text messages through social media, and they cannot recognize the spam content in these messages. Spam messages contain malicious links, apps, fake accounts, fake news, reviews, rumors, etc. To improve social media security, the detection and control of spam text are essential. This paper presents a detailed survey on the latest developments in spam text detection and classification in social media. The various techniques involved in spam detection and classification involving Machine Learning, Deep Learning, and text-based approaches are discussed in this paper. We also present the challenges encountered in the identification of spam with its control mechanisms and datasets used in existing works involving spam detection.

    Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning

    Learning-based pattern classifiers, including deep networks, have shown impressive performance in several application domains, ranging from computer vision to cybersecurity. However, it has also been shown that adversarial input perturbations carefully crafted either at training or at test time can easily subvert their predictions. The vulnerability of machine learning to such wild patterns (also referred to as adversarial examples), along with the design of suitable countermeasures, has been investigated in the research field of adversarial machine learning. In this work, we provide a thorough overview of the evolution of this research area over the last ten years and beyond, starting from pioneering, earlier work on the security of non-deep learning algorithms up to more recent work aimed to understand the security properties of deep learning algorithms, in the context of computer vision and cybersecurity tasks. We report interesting connections between these apparently different lines of work, highlighting common misconceptions related to the security evaluation of machine-learning algorithms. We review the main threat models and attacks defined to this end, and discuss the main limitations of current work, along with the corresponding future challenges towards the design of more secure learning algorithms. Comment: Accepted for publication on Pattern Recognition, 201

    A Late Multi-Modal Fusion Model for Detecting Hybrid Spam E-mail

    In recent years, spammers have tried to obfuscate their intent by introducing hybrid spam e-mail that combines both image and text parts, which is more challenging to detect than e-mails containing text or image only. The motivation behind this research is to design an effective approach for filtering out hybrid spam e-mails, to avoid situations where traditional text-only or image-only filters fail to detect them. To the best of our knowledge, only a few studies have been conducted with the goal of detecting hybrid spam e-mails. Ordinarily, Optical Character Recognition (OCR) technology is used to eliminate the image parts of spam by transforming images into text. However, although OCR scanning is a very successful technique for processing text-and-image hybrid spam, it is not an effective solution for dealing with huge quantities, due to the CPU power required and the execution time it takes to scan e-mail files. Moreover, OCR techniques are not always reliable in the transformation process. To address such problems, we propose new late multi-modal fusion training frameworks for a text-and-image hybrid spam e-mail filtering system, compared to the classical early fusion detection frameworks based on the OCR method. Convolutional Neural Network (CNN) and Continuous Bag of Words were implemented to extract features from the image and text parts of hybrid spam respectively, and the generated features were fed to a sigmoid layer and Machine Learning based classifiers including Random Forest (RF), Decision Tree (DT), Naive Bayes (NB) and Support Vector Machine (SVM) to determine whether the e-mail is ham or spam. Comment: Accepted by 2023 the 2nd International Conference on Mechatronics and Electrical Engineering (MEEE 2023
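
The general idea of late (decision-level) fusion mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' architecture: it assumes each modality already has its own model that outputs a spam probability, and it combines the two scores with a weighted average. The function names, the weight, and the threshold are all hypothetical choices made for illustration.

```python
def late_fusion(text_prob: float, image_prob: float, text_weight: float = 0.5) -> float:
    """Combine per-modality spam probabilities with a weighted average.

    text_prob and image_prob are assumed to come from two separately
    trained models (hypothetical here); text_weight is an illustrative knob.
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must be in [0, 1]")
    return text_weight * text_prob + (1.0 - text_weight) * image_prob


def classify(text_prob: float, image_prob: float,
             threshold: float = 0.5, text_weight: float = 0.5) -> str:
    """Label an e-mail as 'spam' or 'ham' from the fused score."""
    fused = late_fusion(text_prob, image_prob, text_weight)
    return "spam" if fused >= threshold else "ham"


if __name__ == "__main__":
    # Text looks benign, but the image part looks strongly like spam.
    print(classify(text_prob=0.2, image_prob=0.9))
    # Both parts look mostly benign.
    print(classify(text_prob=0.2, image_prob=0.6))
```

The point of the sketch is that each modality is scored independently and only the scores are merged, so an obfuscated image part can still tip the decision even when the text part looks harmless. An early-fusion design would instead merge the raw inputs or features before a single classifier sees them.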

    An Expert System Technique for Sentiment Analysis of Opinions

    To help users and product owners, it is necessary to extract aspects from online reviews, their sentiment polarities, and the associations between them. There is a great deal of work done in the field of sentiment analysis. Lexical and learning-based systems can be combined to separate the assessments from online opinions and reviews. In learning-based techniques, the Gaussian mixture model can be used for getting probabilistic results for polarities against aspects, and naïve Bayes classifiers for the problem of spam comments, which produced better and competitive results against previous techniques.

    Intelligent Computing for Big Data

    Recent advances in artificial intelligence have the potential to further develop current big data research. The Special Issue on ‘Intelligent Computing for Big Data’ highlighted a number of recent studies related to the use of intelligent computing techniques in the processing of big data for text mining, autism diagnosis, behaviour recognition, and blockchain-based storage

    Multimodal Convolutional Neural Networks to Detect Fetal Compromise During Labor and Delivery

    The gold standard to assess whether a baby is at risk of oxygen deprivation during childbirth is continuous monitoring of the fetal heart rate with cardiotocography (CTG). The aim is to identify babies that could benefit from an emergency operative delivery (e.g., Cesarean section), in order to prevent death or permanent brain injury. The long, dynamic and complex CTG patterns are poorly understood and known to have high false positive and false negative rates. Visual interpretation by clinicians is challenging, and reliable, accurate fetal monitoring in labor remains an enormous unmet medical need. In this work, we applied deep learning methods to achieve data-driven automated CTG evaluation. Multimodal Convolutional Neural Network (MCNN) and Stacked MCNN models were used to analyze the largest available database of routinely collected CTG and linked clinical data (comprising more than 35000 births). We also assessed in detail the impact of the signal quality on the MCNN performance. On a large hold-out testing set from Oxford (n = 4429 births), MCNN improved the prediction of cord acidemia at birth when compared with Clinical Practice and previous computerized approaches. On two external datasets, MCNN demonstrated better performance compared to current feature extraction-based methods. Our group is the first to apply deep learning for the analysis of CTG. We conclude that MCNN holds potential for the prediction of cord acidemia at birth and further work is warranted. Despite the advances, our deep learning models are currently not suitable for the detection of severe fetal injury in the absence of cord acidemia - a heterogeneous, small, and poorly understood group. We suggest that the most promising way forward is hybrid approaches to CTG interpretation in labor, in which different diagnostic models can estimate the risk for different types of fetal compromise, incorporating clinical knowledge with data-driven analyses.