1,106 research outputs found

Stochastic Methods to Find Maximum Likelihood for Spam E-mail Classification

Author: AJ Viterbi
S Roy
V Jaswal
WA Awad
Publication venue: Hosted by Utah State University Libraries
Publication date: 15/03/2019

The increasing volume of unsolicited bulk e-mail creates a need for reliable stochastic spam detection methods that classify a received sequence of e-mails. When a recipient receives a sequence of e-mails during a time period, the spam filters have already classified each one as spam or not spam. Because spam is dynamic in nature, some e-mails marked as not spam are actually spam, and vice versa. For the sake of security, it is important to be able to detect the real spam e-mails. This paper uses stochastic methods to refine the preliminary spam detection and to find the maximum likelihood classification of spam e-mails. The method is based on Bayes' theorem, the hidden Markov model (HMM), and the Viterbi algorithm.
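The refinement step described in this abstract can be illustrated with a minimal Viterbi sketch. This is not the paper's implementation: the two-state model, the names `START_P`, `TRANS_P`, `EMIT_P` and `viterbi`, and every probability value below are assumptions chosen only for illustration. The hidden states are the true labels, and the observations are the labels assigned by the preliminary filter.

```python
import math

STATES = ("ham", "spam")  # hidden true labels (assumed two-state model)

# All probabilities below are hypothetical, not taken from the paper.
START_P = {"ham": 0.6, "spam": 0.4}
TRANS_P = {"ham": {"ham": 0.8, "spam": 0.2},
           "spam": {"ham": 0.3, "spam": 0.7}}
# EMIT_P[true_state][filter_label]: how often the filter emits each label.
EMIT_P = {"ham": {"ham": 0.9, "spam": 0.1},
          "spam": {"ham": 0.2, "spam": 0.8}}


def viterbi(observations, start_p=START_P, trans_p=TRANS_P, emit_p=EMIT_P):
    """Return the most likely sequence of true labels for the filter labels."""
    if not observations:
        return []
    # Log probabilities avoid numerical underflow on long sequences.
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in STATES}]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in STATES:
            prev = max(STATES,
                       key=lambda p: best[-1][p] + math.log(trans_p[p][s]))
            scores[s] = (best[-1][prev] + math.log(trans_p[prev][s])
                         + math.log(emit_p[s][obs]))
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Backtrack from the best final state to recover the full path.
    state = max(STATES, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


filter_labels = ["spam", "spam", "ham", "spam", "spam"]
print(viterbi(filter_labels))
```

Under these assumed parameters, the isolated "ham" label in the middle of a run of spam is relabelled as spam, because staying in the spam state is more probable than switching away and back.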

Crossref

DigitalCommons@USU

Detection of Review Abuse via Semi-Supervised Binary Multi-Target Tensor Decomposition

Author: Feng S.
Hooi B.
Hu C.
Li H.
Rai P.
Ye J.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 23/05/2019

Product reviews and ratings on e-commerce websites provide customers with detailed insights about various aspects of a product, such as quality and usefulness. Since they influence customers' buying decisions, product reviews have become fertile ground for abuse by sellers (colluding with reviewers) who want to promote their own products or tarnish the reputation of competitors' products. In this paper, our focus is on detecting such abusive entities (both sellers and reviewers) by applying tensor decomposition to product review data. While tensor decomposition is mostly unsupervised, we formulate our problem as a semi-supervised binary multi-target tensor decomposition to take advantage of currently known abusive entities. We empirically show that our multi-target semi-supervised model achieves higher precision and recall in detecting abusive entities than unsupervised techniques. Finally, we show that our proposed stochastic partial natural gradient inference for our model empirically achieves faster convergence than stochastic gradient and Online-EM with sufficient statistics.

Comment: Accepted to the 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2019. Contains supplementary material. arXiv admin note: text overlap with arXiv:1804.0383

arXiv.org e-Print Archive

Crossref

TA-COS 2018 : 2nd Workshop on Text Analytics for Cybersecurity and Online Safety : Proceedings

Author: De Pauw Guy
Desmet Bart
Lefever Els
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/01/2018

Ghent University Academic Bibliography

Detecting Spam Email With Machine Learning Optimized With Bio-Inspired Metaheuristic Algorithms

Author: Gibson Simran
Issac Biju
Jacob Seibu Mary
Zhang Li
Publication venue: IEEE
Publication date: 13/10/2020

Electronic mail has eased communication for many organisations as well as individuals. Spammers exploit it for fraudulent gain by sending unsolicited emails. This article presents a method for detecting spam emails with machine learning algorithms that are optimized with bio-inspired methods. A literature review is carried out to explore the efficient methods applied to different datasets to achieve good results. Extensive research was done to implement machine learning models using Naïve Bayes, Support Vector Machine, Random Forest, Decision Tree and Multi-Layer Perceptron on seven different email datasets, along with feature extraction and pre-processing. Bio-inspired algorithms such as Particle Swarm Optimization and Genetic Algorithm were implemented to optimize the performance of the classifiers. Multinomial Naïve Bayes with Genetic Algorithm performed the best overall. Our results are also compared with other machine learning and bio-inspired models to identify the most suitable model.

Northumbria Research Link

Teesside University's Research Repository