9,656 research outputs found

    Spam filtering based on preference ranking

    As the average number of spam messages received continues to grow exponentially, both Internet service providers (ISPs) and end users suffer. The lack of an efficient solution may threaten the usability of email as a means of communication. In this paper we present a filtering mechanism that applies the idea of preference ranking to distinguish spam emails from other email on the Internet. The preference ranking gives similarity values between nominated emails and the spam emails specified by users, so that ISPs and end users can deal with spam at filtering points. We designed three filtering points to classify nominated emails as spam, unsure, or legitimate. The mechanism can be applied both in middleware and on the client side. Experiments show that high precision, recall, and TCR (total cost ratio) for spam emails can be expected from the preference-based filtering mechanism.
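    The three evaluation measures named in this abstract can be sketched as follows. This is a minimal illustration, not the paper's own code: the function name, argument names, and the cost weight `cost_lambda` are assumptions, and TCR is computed with the commonly used definition (number of spam messages divided by the weighted sum of misclassified legitimate and missed spam messages).

```python
def spam_metrics(spam_as_spam, spam_as_legit, legit_as_spam, cost_lambda=1.0):
    """Return (precision, recall, TCR) for the spam class.

    cost_lambda is a hypothetical weight saying how many times worse it is
    to block a legitimate email than to let a spam email through.
    """
    n_spam = spam_as_spam + spam_as_legit
    flagged = spam_as_spam + legit_as_spam
    precision = spam_as_spam / flagged if flagged else 0.0
    recall = spam_as_spam / n_spam if n_spam else 0.0
    weighted_errors = cost_lambda * legit_as_spam + spam_as_legit
    # TCR > 1 means the filter is better than using no filter at all.
    tcr = n_spam / weighted_errors if weighted_errors else float("inf")
    return precision, recall, tcr


precision, recall, tcr = spam_metrics(90, 10, 5, cost_lambda=1.0)
```

    Raising `cost_lambda` penalizes blocked legitimate mail more heavily, so the same filter yields a lower TCR.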

    Spam - solutions and their problems

    We analyze the success of filtering as a solution to the spam problem when used alone or concurrently with sender and/or receiver pricing. We find that filters alone may exacerbate the spam problem if the spammer attempts to evade them by sending multiple variants of the message to each consumer. Sender and receiver prices can effectively reduce or eliminate spam, either on their own or when used together with filtering. Finally, we discuss the implications for social welfare of using the different spam controls. Keywords: spam; filtering; email; receiver pricing; sender pricing

    A Survey of Email Spam Filtering Methods

    E-mail is one of the most secure media for online communication and for transferring data or messages over the web. As its popularity has grown, the volume of unsolicited messages has also increased rapidly. Different approaches exist for filtering this data by automatically detecting and removing unwanted messages. Email spam filtering techniques include knowledge-based techniques, clustering techniques, learning-based techniques, heuristic processes, and so on. This paper surveys existing email spam filtering systems based on machine learning techniques (MLT) such as Naive Bayes, SVM, K-Nearest Neighbor, Bayes Additive Regression, KNN Tree, and rules. We present the classification, evaluation, and comparison of different email spam filtering systems. Keywords: e-mail spam, spam filtering methods, machine learning technique, classification, SVM, AN

    ANALISIS DAN IMPLEMENTASI SPAM EMAIL FILTERING MENGGUNAKAN VECTOR SPACE MODEL (ANALYSIS AND IMPLEMENTATION OF SPAM EMAIL FILTERING USING VECTOR SPACE MODEL)

    ABSTRACT: The widespread use of the internet as a medium for communication and news distribution, together with the growing number of email service providers, has caused the volume of spam email to become excessive. This harms email users, who must spend considerable time deleting spam, and it can fill the storage on email servers. Spam floods the internet with many copies of the same message in an attempt to force the message on people who would not choose to receive it. Spam email usually consists of commercial messages about a product or business, or even pornographic messages, that the user does not want. Many spam filtering techniques have been developed to combat spam email, for example rule-based filtering, naive Bayesian filtering, and support vector machines. Most email applications that use spam filtering, such as Yahoo Mail, cannot understand the semantics of an email document and rely on regular expression matching: if a given term appears in an email, the email is filtered. Although this approach can filter spam, it can occasionally also filter important emails that merely contain such a term. This final project designed and implemented a spam email filtering tool using an information retrieval technique called the Vector Space Model, which treats the query as a vector in a multidimensional space. The tool is given indexing data consisting of spam and legitimate emails, so that it can categorize an email by examining its content to determine whether it is spam. Whenever an email matches spam, it is filtered. Keywords: spam, email filtering, information retrieval, vector space model
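    The Vector Space Model idea described in this abstract can be sketched as below. This is a minimal illustration under stated assumptions, not the thesis implementation: it uses raw term-frequency vectors, whitespace tokenization, and a nearest-match rule, and the function names and sample messages are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Represent a text as a sparse term-frequency vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(message, spam_docs, legit_docs):
    """Label a message by its closest indexed document (assumed rule)."""
    query = tf_vector(message)
    best_spam = max((cosine(query, tf_vector(d)) for d in spam_docs), default=0.0)
    best_legit = max((cosine(query, tf_vector(d)) for d in legit_docs), default=0.0)
    return "spam" if best_spam > best_legit else "legitimate"


spam_index = ["cheap pills buy now", "win money now"]
legit_index = ["meeting agenda for monday", "project report attached"]
label = classify("buy cheap pills", spam_index, legit_index)
```

    Because the whole message is compared as a vector, a single shared term only contributes partially to the similarity, unlike the single-term regular expression match criticized in the abstract.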

    SHED: Spam Ham Email Dataset

    Automatic filtering of spam emails has become an essential feature of a good email service provider. Organizations and individuals send large volumes of spam email to gain direct or indirect benefits. Such emails not only distract the user but also consume considerable resources, including processing power, memory, and network bandwidth. These unwanted emails also raise security issues, since they may contain malicious content and/or links. Content-based spam filtering is one of the effective approaches to filtering; however, its effectiveness depends on the training set. Most existing datasets were collected and prepared long ago, and spammers have been changing their content to evade filters trained on them. In this paper, we introduce the Spam Ham Email Dataset (SHED), a dataset consisting of spam and ham emails. We evaluated the performance of filtering techniques trained on previous datasets and on SHED, and observed that the techniques trained on SHED outperformed those trained on other datasets. Furthermore, we classified the spam emails into various categories.

    IMPLEMENTASI INFORMATION GAIN DAN LEARNING VECTOR QUANTIZATION SEBAGAI TEXT CLASSIFICATION PADA SPAM FILTERING (AN IMPLEMENTATION OF INFORMATION GAIN AND LEARNING VECTOR QUANTIZATION AS TEXT CLASSIFICATION FOR SPAM FILTERING)

    ABSTRACT: The wide adoption of the internet has made email a popular way to send unsolicited advertising to many people; this kind of email is called spam email. Spam email has become a serious problem for email users: it floods their inboxes with unwanted messages and disrupts their study and work. Spam filtering helps recognize whether an email is spam or not. A spam filter can be built using text classification methods, with data preprocessing performed first. The purpose of this final project is to implement the Learning Vector Quantization (LVQ) neural network as the text classifier, with information gain (IG) for feature extraction, as a spam filtering technique, and to measure its accuracy. The test results show that LVQ can be used for text classification in spam filtering: it achieved 98% accuracy with 1001 learning samples and 100 test samples. Keywords: spam e-mail, spam filtering, text classification, data preprocessing, information gain, learning vector quantization
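    The information gain step mentioned in this abstract can be sketched as follows. This is an illustrative sketch assuming binary term presence and whitespace tokenization. The function names and toy data are hypothetical, and the LVQ classifier itself is not shown.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    probs = [labels.count(c) / total for c in set(labels)]
    return -sum(p * math.log2(p) for p in probs)


def information_gain(docs, labels, term):
    """Reduction in class entropy from knowing whether `term` occurs."""
    present = [lab for doc, lab in zip(docs, labels) if term in doc.lower().split()]
    absent = [lab for doc, lab in zip(docs, labels) if term not in doc.lower().split()]
    n = len(labels)
    conditional = (len(present) / n) * entropy(present) + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


docs = ["free money now", "free offer", "meeting today", "lunch today"]
labels = ["spam", "spam", "ham", "ham"]
# Rank candidate terms; the highest-scoring ones would be kept as features.
ranked = sorted(["free", "now", "today"], key=lambda t: information_gain(docs, labels, t), reverse=True)
```

    Terms that split the classes cleanly score highest, so keeping only the top-ranked terms shrinks the input vector passed to a classifier such as LVQ.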