1,106 research outputs found
Stochastic Methods to Find Maximum Likelihood for Spam E-mail Classification
The increasing volume of unsolicited bulk e-mails leads to the need for reliable stochastic spam detection methods for the classification of the received sequence of e-mails. When a sequence of emails is received by a recipient during a time period, the spam filters have already classified them as spam or not spam. Due to the dynamic nature of the spam, there might be emails marked as not spam but are actually real spams and vice versa. For the sake of security, it is important to be able to detect real spam emails. This paper utilizes stochastic methods to refine the preliminary spam detection and to find maximum likelihood for spam e-mail classification. The method is based on the Bayesian theorem, hidden Markov model (HMM), and the Viterbi algorithm
Detection of Review Abuse via Semi-Supervised Binary Multi-Target Tensor Decomposition
Product reviews and ratings on e-commerce websites provide customers with
detailed insights about various aspects of the product such as quality,
usefulness, etc. Since they influence customers' buying decisions, product
reviews have become a fertile ground for abuse by sellers (colluding with
reviewers) to promote their own products or to tarnish the reputation of
competitor's products. In this paper, our focus is on detecting such abusive
entities (both sellers and reviewers) by applying tensor decomposition on the
product reviews data. While tensor decomposition is mostly unsupervised, we
formulate our problem as a semi-supervised binary multi-target tensor
decomposition, to take advantage of currently known abusive entities. We
empirically show that our multi-target semi-supervised model achieves higher
precision and recall in detecting abusive entities as compared to unsupervised
techniques. Finally, we show that our proposed stochastic partial natural
gradient inference for our model empirically achieves faster convergence than
stochastic gradient and Online-EM with sufficient statistics.Comment: Accepted to the 25th ACM SIGKDD Conference on Knowledge Discovery and
Data Mining, 2019. Contains supplementary material. arXiv admin note: text
overlap with arXiv:1804.0383
Detecting Spam Email With Machine Learning Optimized With Bio-Inspired Metaheuristic Algorithms
Electronic mail has eased communication methods for many organisations as well as individuals. This method is exploited for fraudulent gain by spammers through sending unsolicited emails. This article aims to present a method for detection of spam emails with machine learning algorithms that are optimized with bio-inspired methods. A literature review is carried to explore the efficient methods applied on different datasets to achieve good results. An extensive research was done to implement machine learning models using Naïve Bayes, Support Vector Machine, Random Forest, Decision Tree and Multi-Layer Perceptron on seven different email datasets, along with feature extraction and pre-processing. The bio-inspired algorithms like Particle Swarm Optimization and Genetic Algorithm were implemented to optimize the performance of classifiers. Multinomial Naïve Bayes with Genetic Algorithm performed the best overall. The comparison of our results with other machine learning and bio-inspired models to show the best suitable model is also discussed
- …