2 research outputs found

    Feature Selection by Multiobjective Optimization: Application to Spam Detection System by Neural Networks and Grasshopper Optimization Algorithm

    Spam strains networks, overloads email servers, and fills mailboxes with unwanted messages and files. Setting the protection level for spam filtering can become even more important for email users when malicious steps are taken, because they must then deal with more valid messages being marked as spam. Spam detection systems (SDS) find patterns in email communications in order to keep track of spammers and filter email activity. SDS have also improved spam detection by reducing the false-positive rate and increasing detection accuracy. The difficulty with spam classifiers is the abundance of features. Feature selection (FS) matters because it guides the search for a feature subset that improves the SDS's classification performance and accuracy. To enhance the performance of the SDS, this study uses a wrapper technique based on the multi-objective grasshopper optimization algorithm (MOGOA) for feature extraction and the recently revised EGOA algorithm for multilayer perceptron (MLP) training. The suggested system's performance was verified on the SpamBase, SpamAssassin, and UK-2011 datasets. Our research showed that our novel approach outperformed a variety of established practices in the literature by as much as 97.5%, 98.3%, and 96.4%, respectively.
    © 2022 the Authors. Published by IEEE. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
    Peer reviewed.
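
The multi-objective wrapper idea in this abstract can be illustrated with a small sketch. The abstract does not state which objectives MOGOA optimizes, so the code below assumes the two that are common in wrapper feature selection: the classifier's error rate and the fraction of features kept, both minimized. The names `dominates`, `objectives`, and `pareto_front` are hypothetical, and the sketch shows only how candidate feature masks would be compared, not the grasshopper search or the MLP training.

```python
def dominates(a, b):
    """Return True if objective vector a Pareto-dominates b (minimization).

    a dominates b when it is no worse in every objective and strictly
    better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def objectives(mask, error_rate):
    """Assumed objective pair for a feature mask: (error rate, fraction kept).

    error_rate is whatever the wrapped classifier reports for this mask;
    computing it is outside the scope of this sketch.
    """
    return (error_rate, sum(mask) / len(mask))


def pareto_front(candidates):
    """Keep the (mask, objectives) pairs that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(o[1], c[1]) for o in candidates if o is not c)
    ]


# Illustrative masks with made-up error rates (not results from the paper).
candidates = [
    (mask, objectives(mask, err))
    for mask, err in [
        ((1, 1, 1, 1), 0.05),
        ((1, 0, 1, 0), 0.05),
        ((1, 0, 0, 0), 0.20),
        ((1, 1, 0, 0), 0.25),
    ]
]
front = pareto_front(candidates)
```

Here the front keeps the two masks that trade accuracy against size; a multi-objective optimizer would return such a set rather than a single best mask.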

    A spatial feature engineering algorithm for creating air pollution health datasets

    Air pollution is one of the significant causes of mortality and morbidity every year. In recent years, many researchers have focused on the associations between air pollution and health. These studies use air pollution data and health data, and apply feature engineering to create and optimize the air quality and health features. To associate these datasets, the residential address, community/county/block/city, and hospital/school address are used as association parameters. A spatial problem arises when the Air Quality Monitoring (AQM) stations are concentrated in the urban areas of a region and the residential address or any other spatial parameter is used. Where AQM stations operate in close proximity, their coverage areas intersect, which raises the question of how to associate patients with the relevant AQM station. In most studies, the distance from patients to the AQM stations is also not taken into account. In this study, we propose a spatial feature engineering algorithm with functions to find the coordinates of patients, calculate distances to the AQM stations, and associate patient records with the nearest AQM station, thereby removing the limitations of current air pollution health datasets. The proposed algorithm is applied to a case study in Klang Valley, Malaysia. The results show that the proposed algorithm can generate air pollution health datasets efficiently, and that it also provides a radius facility to exclude patients who are situated far away from the stations.
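
The distance and association steps described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes patient and station coordinates are already available as latitude/longitude in degrees (the geocoding step is omitted), it uses the haversine great-circle distance as one reasonable choice of metric, and the names `haversine_km`, `nearest_station`, and `max_radius_km` are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_station(patient, stations, max_radius_km=None):
    """Associate one patient with the closest station.

    patient: (lat, lon) tuple.
    stations: dict mapping station name to (lat, lon).
    max_radius_km: optional cutoff; patients farther than this from every
        station are excluded (None is returned).
    Returns (station_name, distance_km) or None.
    """
    if not stations:
        return None
    name, dist = min(
        ((n, haversine_km(*patient, *coords)) for n, coords in stations.items()),
        key=lambda item: item[1],
    )
    if max_radius_km is not None and dist > max_radius_km:
        return None
    return name, dist


# Made-up coordinates for illustration only; not real AQM station locations.
stations = {"station_a": (3.00, 101.50), "station_b": (3.20, 101.70)}
patient = (3.05, 101.52)

match = nearest_station(patient, stations)
excluded = nearest_station(patient, stations, max_radius_km=5.0)
```

With these made-up points the patient is matched to `station_a`, and is excluded once the radius is set to 5 km, which mirrors the radius facility mentioned in the abstract.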