266 research outputs found

Computing with Granular Words

Author: Hou Hailong
Publication venue: ScholarWorks @ Georgia State University
Publication date: 07/05/2011

Computational linguistics is a sub-field of artificial intelligence; it is an interdisciplinary field dealing with statistical and/or rule-based modeling of natural language from a computational perspective. Traditionally, fuzzy logic is used to deal with fuzziness among single linguistic terms in documents. However, linguistic terms may be related to other types of uncertainty. For instance, when different users search for ‘cheap hotel’ in a search engine, they may need distinct pieces of relevant hidden information such as shopping, transportation, weather, etc. Therefore, this research work focuses on studying granular words and developing new algorithms to process them in order to deal with uncertainty globally. To describe granular words precisely, a new structure called the Granular Information Hyper Tree (GIHT) is constructed. Furthermore, several techniques are developed to apply computing with granular words to spam filtering and query recommendation. Based on simulation results, the GIHT-Bayesian algorithm achieves a more accurate spam filtering rate than the conventional Naive Bayesian and SVM methods; computing with granular words also generates better recommendation results, based on users’ assessment, when applied to a search engine.

ScholarWorks @ Georgia State University

BlogForever: D2.5 Weblog Spam Filtering Report and Associated Methodology

Author: Banos Vangelis
Kasioumis Nikolaos
Kim Yunhyong
Kopidaki Stella
Ross Seamus
Rynning Morten
Stepanyan Karen
Publication venue: BlogForever
Publication date: 25/10/2013

This report is written as a first attempt to define the BlogForever spam detection strategy. It comprises a survey of weblog spam technology and approaches to its detection. While the report was written to help identify possible approaches to spam detection as a component within the BlogForever software, the discussion has been extended to include observations related to the historical, social and practical value of spam, and proposals of other ways of dealing with spam within the repository without necessarily removing it. It contains a general overview of spam types, ready-made anti-spam APIs available for weblogs, possible methods that have been suggested for preventing the introduction of spam into a blog, and research related to spam, focusing on spam that appears in the weblog context. It concludes with a proposal for a spam detection workflow that might form the basis for the spam detection component of the BlogForever software.

ZENODO

Enlighten

Email classification using data reduction method

Author: Islam Rafiqul
Xiang Yang
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2010

Correctly classifying user emails despite the penetration of spam is an important research issue for anti-spam researchers. This paper presents an effective and efficient email classification technique based on a data filtering method. In our testing we introduce an innovative filtering technique that uses an instance selection method (ISM) to remove pointless data instances from the training model and then classify the test data. The objective of ISM is to identify which instances (examples, patterns) in email corpora should be selected as representatives of the entire dataset, without significant loss of information. We used the WEKA interface in our integrated classification model and tested diverse classification algorithms. Our empirical studies show significant performance in terms of classification accuracy, with a reduction in false positive instances.

Deakin Research Online

Implementation and evaluation of a spam classifier based on the dynamic behaviour of immune cells

Author: Santos Luís André Brísio Marques dos
Publication venue: Porto : [s. n.]
Publication date: 01/01/2008

Repositório Aberto da Universidade do Porto

Artificial intelligence in the cyber domain: Offense and defense

Author: Diep Quoc Bao
Truong Thanh Cong
Zelinka Ivan
Publication venue: MDPI AG
Publication date: 01/01/2020

Artificial intelligence techniques have grown rapidly in recent years, and their applications in practice can be seen in many fields, ranging from facial recognition to image analysis. In the cybersecurity domain, AI-based techniques can provide better cyber defense tools and help adversaries improve methods of attack. However, malicious actors are aware of the new prospects too and will probably attempt to use them for nefarious purposes. This survey paper aims at providing an overview of how artificial intelligence can be used in the context of cybersecurity in both offense and defense.

Multidisciplinary Digital Publishing Institute

DSpace at VSB Technical University of Ostrava

Hybrid GA-SVM for Efficient Feature Selection in E-mail Classification

Author: Abimbola Adigun
Stephen Olabiyisi
Temitayo Fagbola
Publication venue: The International Institute for Science, Technology and Education (IISTE)
Publication date: 28/02/2012

Feature selection is a global combinatorial optimization problem in machine learning in which subsets of relevant features are selected to realize robust learning models. The inclusion of irrelevant and redundant features in the dataset can result in poor predictions and high computational overhead. Thus, selecting relevant feature subsets can help reduce the computational cost of feature measurement, speed up the learning process and improve model interpretability. The SVM classifier has proven inefficient on large e-mail datasets: it fails to produce accurate classification results and consumes a lot of computational resources. In this study, a Genetic Algorithm-Support Vector Machine (GA-SVM) feature selection technique is developed to optimize the SVM classification parameters, the prediction accuracy and computation time. The SpamAssassin dataset was used to validate the performance of the proposed system. The hybrid GA-SVM showed remarkable improvements over SVM in terms of classification accuracy and computation time. Keywords: E-mail Classification, Feature Selection, Genetic Algorithm, Support Vector Machine

International Institute for Science, Technology and Education (IISTE): E-Journals

A machine learning approach to server-side anti-spam e-mail filtering

Author: Gerasimov S.
Mashechkin I.
Petrovskiy M.
Rozinkin A.
Publication venue: Інститут програмних систем НАН України
Publication date: 01/01/2006

Spam-detection systems based on traditional methods have several obvious disadvantages, such as a low detection rate, the need for regular knowledge-base updates, and impersonal filtering rules. New intelligent methods for spam detection, which use statistical and machine learning algorithms, solve these problems successfully. However, these methods are not widespread in spam filtering for enterprise-level mail servers because of their high resource consumption and insufficient accuracy with regard to false-positive errors. The developed solution offers a precise and fast algorithm. Its classification quality is better than that of the Naïve Bayes method, which is currently the most widespread machine learning method. The problem of time efficiency, which is typical of all learning-based methods for spam filtering, is solved using a multi-agent architecture. It allows easy system scaling and the building of a unified corporate spam detection system based on heterogeneous enterprise mail systems. A pilot implementation and its experimental evaluation on standard data sets and on real mail flows have demonstrated that our approach outperforms existing learning-based and traditional spam filtering methods. This allows it to be considered a promising platform for constructing enterprise spam filtering systems.

Наукова електронна бібліотека періодичних видань НАН України (Vernadsky National Library of Ukraine)