A systematic literature review on spam content detection and classification
The presence of spam content in social media is increasing rapidly, and therefore the detection of spam has become vital. Spam content increases as people make extensive use of social media, e.g., Facebook, Twitter, YouTube, and e-mail. The time people spend on social media is growing rapidly, especially during the pandemic. Users receive many text messages through social media and often cannot recognize the spam content in them. Spam messages may contain malicious links, apps, fake accounts, fake news, reviews, rumors, etc. To improve social media security, the detection and control of spam text are essential. This paper presents a detailed survey of the latest developments in spam text detection and classification in social media. It discusses the techniques used for spam detection and classification, including Machine Learning, Deep Learning, and text-based approaches. We also present the challenges encountered in identifying spam, the corresponding control mechanisms, and the datasets used in existing work on spam detection.
Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning
Learning-based pattern classifiers, including deep networks, have shown
impressive performance in several application domains, ranging from computer
vision to cybersecurity. However, it has also been shown that adversarial input
perturbations carefully crafted either at training or at test time can easily
subvert their predictions. The vulnerability of machine learning to such wild
patterns (also referred to as adversarial examples), along with the design of
suitable countermeasures, have been investigated in the research field of
adversarial machine learning. In this work, we provide a thorough overview of
the evolution of this research area over the last ten years and beyond,
starting from pioneering, earlier work on the security of non-deep learning
algorithms up to more recent work aimed to understand the security properties
of deep learning algorithms, in the context of computer vision and
cybersecurity tasks. We report interesting connections between these
apparently-different lines of work, highlighting common misconceptions related
to the security evaluation of machine-learning algorithms. We review the main
threat models and attacks defined to this end, and discuss the main limitations
of current work, along with the corresponding future challenges towards the
design of more secure learning algorithms. Comment: Accepted for publication on Pattern Recognition, 201
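To make the idea of a test-time adversarial perturbation concrete, here is a minimal sketch on a linear classifier. It is an illustration and is not taken from the survey: the weights, bias, input, and the helper names `score`, `sign`, and `evade` are all hypothetical. Each feature is shifted by at most `eps` against the sign of its weight (a sign-gradient step, since the gradient of a linear score with respect to the input is the weight vector), which lowers the score by `eps` times the sum of the absolute weights.

```python
def score(w, b, x):
    """Linear decision score; a positive value means 'malicious' in this toy setup."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def evade(w, x, eps):
    """Shift every feature by eps against the sign of its weight.

    For a linear model the gradient of the score with respect to x is w,
    so this is a sign-gradient step that lowers the score by eps * sum(|w_i|).
    """
    return [xi - eps * sign(wi) for wi, xi in zip(w, x)]


if __name__ == "__main__":
    # Hypothetical classifier and input, chosen only for illustration.
    w, b = [1.0, -2.0, 0.5], -0.5
    x = [1.0, 0.0, 1.0]
    x_adv = evade(w, x, eps=0.4)
    print("clean score:      ", score(w, b, x))
    print("adversarial score:", score(w, b, x_adv))
```

With these toy values a bounded change of 0.4 per feature is enough to move the input across the decision boundary. Deep networks are not linear, so attacks on them are usually iterative, but the sketch conveys the basic mechanism.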
A Late Multi-Modal Fusion Model for Detecting Hybrid Spam E-mail
In recent years, spammers have tried to obfuscate their intent by
introducing hybrid spam e-mail combining both image and text parts, which is
more challenging to detect in comparison to e-mails containing text or image
only. The motivation behind this research is to design an effective approach
filtering out hybrid spam e-mails to avoid situations where traditional
text-based or image-based only filters fail to detect hybrid spam e-mails. To
the best of our knowledge, only a few studies have been conducted with the goal of
detecting hybrid spam e-mails. Ordinarily, Optical Character Recognition (OCR)
technology is used to eliminate the image parts of spam by transforming images
into text. However, although OCR scanning is a very successful technique for
processing text-and-image hybrid spam, it is not an effective solution for
large volumes of e-mail because of the CPU power and execution time required
to scan e-mail files. Moreover, OCR techniques are not always reliable in the
transformation process. To address
such problems, we propose new late multi-modal fusion training frameworks for a
text-and-image hybrid spam e-mail filtering system compared to the classical
early fusion detection frameworks based on the OCR method. Convolutional Neural
Network (CNN) and Continuous Bag of Words (CBOW) models were implemented to extract
features from the image and text parts of hybrid spam, respectively, and the
generated features were fed to a sigmoid layer and to Machine Learning based
classifiers including Random Forest (RF), Decision Tree (DT), Naive Bayes (NB) and
Support Vector Machine (SVM) to determine whether an e-mail is ham or spam. Comment: Accepted by the 2023 2nd International Conference on Mechatronics and
Electrical Engineering (MEEE 2023
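The late-fusion idea in the abstract above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes each modality has already been reduced to a single score (in the paper these would come from the CNN and CBOW feature extractors), and the function names, weights, bias, and threshold are hypothetical. The two scores are combined only at the end, in one sigmoid unit.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def late_fusion_score(image_score, text_score, w_image=1.0, w_text=1.0, bias=0.0):
    """Fuse per-modality scores in a single sigmoid unit.

    image_score and text_score stand in for the outputs of separate image
    and text models; the weights and bias are illustrative assumptions.
    """
    return sigmoid(w_image * image_score + w_text * text_score + bias)


def classify(image_score, text_score, threshold=0.5):
    """Label an e-mail 'spam' when the fused probability reaches the threshold."""
    fused = late_fusion_score(image_score, text_score)
    return "spam" if fused >= threshold else "ham"


if __name__ == "__main__":
    # Benign-looking text with a strongly spam-like image: the fusion still flags it.
    print(classify(image_score=3.0, text_score=-1.0))
```

Because each modality is scored independently, no OCR pass over the image is needed before classification, which is the cost the abstract attributes to OCR-based early fusion.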
An Expert System Technique for Sentiment Analysis of Opinions
To help users and product owners, it is necessary to extract aspects from online reviews, their sentiment polarities, and the associations between them. A great deal of work has been done in the field of sentiment analysis. Lexical and learning-based systems can be combined to separate the assessments from online opinions and reviews. Among learning-based techniques, the Gaussian mixture model can be used to obtain probabilistic polarity results for aspects, and naïve Bayes classifiers can be used for the problem of spam comments, which produced results that were better than or competitive with previous techniques
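As a rough illustration of how a naïve Bayes classifier can separate spam comments from genuine ones, the sketch below implements a multinomial naïve Bayes model with add-one (Laplace) smoothing in plain Python. It is not the system described in the abstract: the function names and the four training comments are invented for the example.

```python
import math
from collections import Counter


def train_nb(docs):
    """Count documents per class and words per class from (text, label) pairs."""
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in text.lower().split():
            counts[tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_logp = None, -math.inf
    for label, n_docs in class_counts.items():
        counts = word_counts[label]
        total_words = sum(counts.values())
        logp = math.log(n_docs / total_docs)  # class prior
        for tok in text.lower().split():
            if tok not in vocab:
                continue  # ignore words never seen in training
            logp += math.log((counts[tok] + 1) / (total_words + len(vocab)))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


# Toy, made-up training comments used only to exercise the code.
TOY_DOCS = [
    ("win free prize now", "spam"),
    ("free money click now", "spam"),
    ("great phone good battery", "ham"),
    ("battery life is good", "ham"),
]

if __name__ == "__main__":
    model = train_nb(TOY_DOCS)
    print(predict_nb(model, "free prize click"))
    print(predict_nb(model, "good battery"))
```

The smoothing term keeps a word that was never seen in one class from driving that class's probability to zero, which matters for short comments.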
Intelligent Computing for Big Data
Recent advances in artificial intelligence have the potential to further develop current big data research. The Special Issue on ‘Intelligent Computing for Big Data’ highlighted a number of recent studies related to the use of intelligent computing techniques in the processing of big data for text mining, autism diagnosis, behaviour recognition, and blockchain-based storage
Multimodal Convolutional Neural Networks to Detect Fetal Compromise During Labor and Delivery
The gold standard for assessing whether a baby is at risk of oxygen deprivation during childbirth is continuous monitoring of the fetal heart rate with cardiotocography (CTG). The aim is to identify babies that could benefit from an emergency operative delivery (e.g., Cesarean section), in order to prevent death or permanent brain injury. The long, dynamic and complex CTG patterns are poorly understood, and their interpretation is known to have high false positive and false negative rates. Visual interpretation by clinicians is challenging, and reliable, accurate fetal monitoring in labor remains an enormous unmet medical need. In this work, we applied deep learning methods to achieve data-driven automated CTG evaluation. Multimodal Convolutional Neural Network (MCNN) and Stacked MCNN models were used to analyze the largest available database of routinely collected CTG and linked clinical data (comprising more than 35000 births). We also assessed in detail the impact of the signal quality on the MCNN performance. On a large hold-out testing set from Oxford (n = 4429 births), MCNN improved the prediction of cord acidemia at birth when compared with Clinical Practice and previous computerized approaches. On two external datasets, MCNN demonstrated better performance compared to current feature extraction-based methods. Our group is the first to apply deep learning for the analysis of CTG. We conclude that MCNN holds potential for the prediction of cord acidemia at birth and further work is warranted. Despite the advances, our deep learning models are currently not suitable for the detection of severe fetal injury in the absence of cord acidemia - a heterogeneous, small, and poorly understood group. We suggest that the most promising way forward is hybrid approaches to CTG interpretation in labor, in which different diagnostic models can estimate the risk for different types of fetal compromise, incorporating clinical knowledge with data-driven analyses