34 research outputs found

    A Review on mobile SMS Spam filtering techniques

    Short messaging service (SMS) spam refers to unsolicited or undesired messages received on mobile phones. SMS spam is a genuine nuisance to mobile subscribers. The practice also worries service providers because it upsets their clients and can even cause them to lose subscribers. To mitigate it, researchers have proposed several solutions for detecting and filtering SMS spam. In this paper, we present a review of the currently available methods, challenges, and future research directions for the detection, filtering, and mitigation of mobile SMS spam. The existing research literature is critically reviewed and analyzed. The most popular techniques for SMS spam detection, filtering, and mitigation are compared, including the data sets used, their findings, and their limitations, and future research directions are discussed. This review is designed to help expert researchers identify open areas that need further improvement.

    A Comparative Analysis of SMS Spam Detection employing Machine Learning Methods

    In recent times, the growth in mobile phone usage has resulted in a huge number of spam messages. Spammers continuously apply new tricks, which makes managing or preventing spam messages a challenging task. The aim of this study is to detect spam messages in order to prevent cybercrime, as spam messages have become a security threat. In this paper, several techniques are studied for the SMS spam problem with the goal of better accuracy, including Support Vector Machine, K-Nearest Neighbor, Naïve Bayes, Random Forest, Logistic Regression, and others. The results indicate that Support Vector Machine achieved the highest accuracy of 99%, suggesting it might be useful as an effective machine learning system for future research. Accepted version; peer reviewed.

    Deep learning to filter SMS spam

    The popularity of short message service (SMS) has been growing over the last decade. For businesses, these text messages are even more effective than emails: while 98% of mobile users read their SMS by the end of the day, about 80% of emails remain unopened. The popularity of SMS has also given rise to SMS spam, which refers to any irrelevant text messages delivered over mobile networks and which users find highly annoying. Most existing research on filtering SMS spam has relied on manually identified features. Extending the current literature, this paper uses deep learning to classify Spam and Not-Spam text messages. Specifically, Convolutional Neural Network and Long Short-Term Memory models were employed. The proposed models were based on text data only and extracted the feature set themselves. On a benchmark dataset consisting of 747 Spam and 4,827 Not-Spam text messages, a remarkable accuracy of 99.44% was achieved.
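The class counts quoted above are imbalanced, which affects how an accuracy figure should be read. As a minimal sketch (the function name is hypothetical and not from the paper), the following computes the accuracy a trivial classifier would reach by always predicting the larger class; only the two counts are taken from the abstract.

```python
# Class counts quoted in the abstract above.
SPAM = 747
NOT_SPAM = 4827


def majority_baseline_accuracy(n_spam: int, n_not_spam: int) -> float:
    """Accuracy of a trivial classifier that always predicts the larger class."""
    total = n_spam + n_not_spam
    return max(n_spam, n_not_spam) / total


baseline = majority_baseline_accuracy(SPAM, NOT_SPAM)
print(f"total messages: {SPAM + NOT_SPAM}")
print(f"majority-class baseline accuracy: {baseline:.3f}")
```

Under these counts the baseline is roughly 0.866, so reported accuracies on this benchmark are best compared against that floor rather than against 50%.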

    Semi-supervised novelty detection with one class SVM for SMS spam detection

    The file attached to this record is the author's final peer-reviewed version. The Publisher's final version can be found by following the DOI link. The volume of SMS messages sent daily around the world has continued to grow significantly over the past years. As a result, mobile phones are becoming increasingly vulnerable to SMS spam messages, exposing users to the risk of fraud and theft of personal data. Filtering messages to detect and eliminate SMS spam is now a critical functionality, for which different types of machine learning approaches are still being explored. In this paper, we propose a system for detecting SMS spam using a semi-supervised novelty detection approach based on a one-class SVM classifier. The system is built as an anomaly detector that learns only from normal SMS messages, which allows detection models to be implemented in the absence of labelled SMS spam training examples. We evaluated our proposed system using a benchmark dataset consisting of 747 SMS spam and 4,827 non-spam messages. The results show that our proposed method outperformed the traditional supervised machine learning approaches based on binary, frequency, or TF-IDF bag-of-words. The overall accuracy was 98%, with a 100% SMS spam detection rate and a false positive rate of only around 3%.
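The three figures reported above (overall accuracy, spam detection rate, false positive rate) are all derived from a confusion matrix. The sketch below shows the standard definitions; the `Confusion` class and the counts passed to it are illustrative assumptions and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # spam correctly flagged as spam
    fn: int  # spam missed (classified as non-spam)
    fp: int  # non-spam wrongly flagged as spam
    tn: int  # non-spam correctly passed through


def accuracy(c: Confusion) -> float:
    """Fraction of all messages classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.fp + c.tn)


def detection_rate(c: Confusion) -> float:
    """Fraction of spam messages that were detected (recall on spam)."""
    return c.tp / (c.tp + c.fn)


def false_positive_rate(c: Confusion) -> float:
    """Fraction of non-spam messages wrongly flagged as spam."""
    return c.fp / (c.fp + c.tn)


# Made-up counts for illustration only: 100 spam and 100 non-spam messages.
example = Confusion(tp=100, fn=0, fp=3, tn=97)
print(f"accuracy:            {accuracy(example):.3f}")
print(f"detection rate:      {detection_rate(example):.3f}")
print(f"false positive rate: {false_positive_rate(example):.3f}")
```

Because accuracy mixes both classes, the same detection rate and false positive rate can yield different accuracies depending on the spam to non-spam ratio of the evaluation set.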

    A Comparative Study of Word Embedding Techniques for SMS Spam Detection

    The file attached to this record is the author's final peer-reviewed version. The Publisher's final version can be found by following the DOI link. E-mail and SMS are the most popular communication tools used by businesses, organizations, and educational institutions. Every day, people receive hundreds of messages, which may be either spam or ham. Spam is any form of unsolicited, unwanted digital communication, usually sent out in bulk. Spam emails and SMS waste resources by unnecessarily flooding network lines and consuming storage space. It is therefore important to develop high-accuracy spam detection models that effectively block spam messages, so as to optimize resources and protect users. Various word-embedding techniques such as Bag of Words (BOW), N-grams, TF-IDF, Word2Vec, and Doc2Vec have been widely applied to NLP problems; however, a comparative study of these techniques for SMS spam detection is currently lacking. Hence, in this paper, we provide a comparative analysis of these popular word-embedding techniques for SMS spam detection by evaluating their performance on a publicly available ham and spam dataset. We investigate the performance of the word-embedding techniques using five different machine learning classifiers, namely Multinomial Naive Bayes (MNB), KNN, SVM, Random Forest, and Extra Trees. On the dataset employed in the study, N-gram, BOW, and TF-IDF with oversampling recorded the highest F1 scores: 0.99 for ham and 0.94 for spam.
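To make one of the representations named above concrete, here is a minimal sketch of TF-IDF weighting in plain Python. It uses the textbook form, term frequency times log(N / document frequency), which is an assumption; the abstract does not say which variant the study used, and library implementations often apply smoothed variants. The function name and the toy messages are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs: list[str]) -> list[dict[str, float]]:
    """Return one {term: weight} mapping per document (unsmoothed TF-IDF)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Toy messages invented for illustration.
messages = [
    "win a free prize now",
    "free entry win cash now",
    "are you free for lunch now",
]
vectors = tf_idf(messages)
for term, weight in sorted(vectors[0].items()):
    print(f"{term:>6}: {weight:.4f}")
```

Terms that occur in every message ("free" and "now" here) receive a weight of zero, while terms unique to one message are weighted highest, which is the property that makes the representation useful as classifier input.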

    Prediction of bank frauds by SMS or voice, from cell phone data analysis: a systematic literature review

    CISTI 2021, 16th Iberian Conference on Information Systems and Technologies (16ª Conferência Ibérica de Sistemas e Tecnologias de Informação), held in Chaves, Portugal, 23 to 26 June 2021. Systematic literature review on SMS and voice fraud; state of the art in 2021. In recent years there has been a marked increase in bank fraud by SMS (Short Messaging System) and voice. Factors contributing to the increase in cases of SMS fraud include the low cost of acquiring large volumes of messages, reliability (the message will reach the recipient), and the fact that no Internet connection is needed to reach the victim. Voice-based financial fraud can be used to persuade victims to make bank transfers to the fraudsters' accounts with the promise of receiving large sums in prizes. The prevention of these types of fraud is not a trivial task, as it requires the application of appropriate techniques and methods depending on their nature. This article presents a Systematic Literature Review (SLR) covering 2015 to 2020, with the aim of analyzing the state of the art on bank fraud committed by SMS or voice. The SLR allowed the identification of the most common types of bank fraud by SMS or voice and the respective detection techniques. This work is financed by National Funds through the Portuguese funding agency FCT (Fundação para a Ciência e Tecnologia) within project UIDB/50014/2020. Published version.

    Fault Detection and Isolation of Wind Turbines using Immune System Inspired Algorithms

    Recently, research on renewable sources of energy has been growing intensively. This is mainly due to the potential depletion of fossil fuels and their associated environmental concerns, such as pollution and greenhouse gas emissions. Wind energy is one of the fastest growing sources of renewable energy, and policy makers in both developing and developed countries have built their vision of future energy supply around wind power. The increase in the number of wind turbines, as well as in their size, has led to considerable attention to health and condition monitoring and to fault diagnosis of wind turbine systems and their components. In this thesis, two main immune-inspired algorithms are used to perform Fault Detection and Isolation (FDI) of a Wind Turbine (WT), namely the Negative Selection Algorithm (NSA) and the Dendritic Cell Algorithm (DCA). First, an NSA-based fault diagnosis methodology is proposed in which a hierarchical bank of NSAs is used to detect and isolate both individual and simultaneously occurring faults common to wind turbines. A smoothing moving-window filter is then used to further improve the reliability and performance of the proposed FDI scheme. Moreover, the performance of the proposed scheme is compared with a state-of-the-art data-driven technique, namely the Support Vector Machine (SVM), to demonstrate the advantages of the proposed NSA-based FDI scheme. Finally, a nonparametric statistical comparison test is implemented to evaluate the proposed methodology against the SVM under various fault severities. In the second part, another immune-inspired methodology, the Dendritic Cell Algorithm (DCA), is used to perform online sensor FDI. A noise filter is also designed to attenuate the measurement noise, resulting in better FDI results. The proposed DCA-based FDI scheme is then compared with the previously developed NSA-based FDI scheme, and a nonparametric statistical comparison test is also performed. Both of the proposed immune-inspired frameworks are applied to a well-known wind turbine benchmark model in order to validate the effectiveness of the proposed methodologies.
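For readers unfamiliar with negative selection, the sketch below shows the generic textbook idea in two dimensions: random candidate detectors are kept only if they do not match any healthy ("self") sample, and a new sample is flagged when it falls inside any detector. This is not the thesis's hierarchical bank of NSAs; the function names, the fixed radius, the unit-square feature space, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def generate_detectors(self_samples, n_detectors, radius, rng, max_tries=10_000):
    """Keep random candidates that lie farther than `radius` from every self sample."""
    detectors = []
    for _ in range(max_tries):
        if len(detectors) == n_detectors:
            break
        candidate = (rng.random(), rng.random())
        if all(math.dist(candidate, s) > radius for s in self_samples):
            detectors.append(candidate)
    return detectors


def is_fault(sample, detectors, radius):
    """A sample is flagged when it lies within `radius` of any detector."""
    return any(math.dist(sample, d) <= radius for d in detectors)


RADIUS = 0.1
rng = random.Random(0)

# Synthetic healthy measurements clustered around (0.5, 0.5).
self_samples = [
    (0.5 + rng.uniform(-0.05, 0.05), 0.5 + rng.uniform(-0.05, 0.05))
    for _ in range(30)
]

detectors = generate_detectors(self_samples, n_detectors=200, radius=RADIUS, rng=rng)
flagged_self = sum(is_fault(s, detectors, RADIUS) for s in self_samples)

print(f"detectors generated: {len(detectors)}")
print(f"healthy training samples flagged: {flagged_self}")
```

By construction, no healthy training sample can be flagged, since every detector was accepted only after failing to match all of them. Coverage of the faulty region depends on the number of detectors and the radius, which is one reason practical schemes add further structure on top of this basic step.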