1,106 research outputs found

    Stochastic Methods to Find Maximum Likelihood for Spam E-mail Classification

    Get PDF
    The increasing volume of unsolicited bulk e-mails leads to the need for reliable stochastic spam detection methods for the classification of the received sequence of e-mails. When a sequence of emails is received by a recipient during a time period, the spam filters have already classified them as spam or not spam. Due to the dynamic nature of the spam, there might be emails marked as not spam but are actually real spams and vice versa. For the sake of security, it is important to be able to detect real spam emails. This paper utilizes stochastic methods to refine the preliminary spam detection and to find maximum likelihood for spam e-mail classification. The method is based on the Bayesian theorem, hidden Markov model (HMM), and the Viterbi algorithm

    Detection of Review Abuse via Semi-Supervised Binary Multi-Target Tensor Decomposition

    Full text link
    Product reviews and ratings on e-commerce websites provide customers with detailed insights about various aspects of the product such as quality, usefulness, etc. Since they influence customers' buying decisions, product reviews have become a fertile ground for abuse by sellers (colluding with reviewers) to promote their own products or to tarnish the reputation of competitor's products. In this paper, our focus is on detecting such abusive entities (both sellers and reviewers) by applying tensor decomposition on the product reviews data. While tensor decomposition is mostly unsupervised, we formulate our problem as a semi-supervised binary multi-target tensor decomposition, to take advantage of currently known abusive entities. We empirically show that our multi-target semi-supervised model achieves higher precision and recall in detecting abusive entities as compared to unsupervised techniques. Finally, we show that our proposed stochastic partial natural gradient inference for our model empirically achieves faster convergence than stochastic gradient and Online-EM with sufficient statistics.Comment: Accepted to the 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2019. Contains supplementary material. arXiv admin note: text overlap with arXiv:1804.0383

    Detecting Spam Email With Machine Learning Optimized With Bio-Inspired Metaheuristic Algorithms

    Get PDF
    Electronic mail has eased communication methods for many organisations as well as individuals. This method is exploited for fraudulent gain by spammers through sending unsolicited emails. This article aims to present a method for detection of spam emails with machine learning algorithms that are optimized with bio-inspired methods. A literature review is carried to explore the efficient methods applied on different datasets to achieve good results. An extensive research was done to implement machine learning models using Naïve Bayes, Support Vector Machine, Random Forest, Decision Tree and Multi-Layer Perceptron on seven different email datasets, along with feature extraction and pre-processing. The bio-inspired algorithms like Particle Swarm Optimization and Genetic Algorithm were implemented to optimize the performance of classifiers. Multinomial Naïve Bayes with Genetic Algorithm performed the best overall. The comparison of our results with other machine learning and bio-inspired models to show the best suitable model is also discussed
    • …
    corecore