Efficient Feature Subset Selection Algorithm for High Dimensional Data
The feature selection approach addresses the dimensionality problem by removing irrelevant and redundant features. Existing feature selection algorithms take considerable time to obtain a feature subset for high-dimensional data. This paper proposes a feature selection algorithm based on information gain measures for high-dimensional data, termed IFSA (Information gain based Feature Selection Algorithm), to produce an optimal feature subset in efficient time and improve the computational performance of learning algorithms. The IFSA algorithm works in two stages: first, a filter is applied to the dataset; second, a small feature subset is produced using the information gain measure. Extensive experiments compare the proposed algorithm with other methods using two different classifiers (Naive Bayes and IBk) on microarray and text datasets. The results demonstrate that IFSA not only produces a compact feature subset in efficient time but also improves classifier performance.
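As an illustration of the general idea described above (not the authors' IFSA code, which is not included here), the following sketch ranks features by an information-gain-style score and then checks the effect on a Naive Bayes classifier. It uses scikit-learn's mutual_info_classif as the relevance measure and the breast cancer dataset purely as a stand-in; both are assumptions for demonstration.

```python
# Hedged sketch: information-gain-like feature ranking + Naive Bayes evaluation.
# mutual_info_classif plays the role of the information gain score; the dataset
# and k=10 are illustrative choices, not values from the paper.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Keep the k features with the highest estimated mutual information with the class.
selector = SelectKBest(score_func=mutual_info_classif, k=10)
X_reduced = selector.fit_transform(X, y)

# Compare classifier accuracy on all features vs. the selected subset.
full = cross_val_score(GaussianNB(), X, y, cv=5).mean()
reduced = cross_val_score(GaussianNB(), X_reduced, y, cv=5).mean()
print(f"all features: {full:.3f}  selected subset: {reduced:.3f}")
```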
Correlation based feature selection with clustering for high dimensional data
Feature selection is an essential technique for reducing the dimensionality problem in data mining tasks. Traditional feature selection algorithms fail to scale to large feature spaces. This paper proposes a new method to address the dimensionality problem in which clustering is integrated with a correlation measure to produce a good feature subset. First, irrelevant features are eliminated using the k-means clustering method; then, non-redundant features are selected from each cluster using a correlation measure. The proposed method is evaluated on microarray and text datasets, and the results are compared with other well-known feature selection methods using a Naïve Bayes classifier. To verify the accuracy of the proposed method with different numbers of relevant features, a percentage-wise criterion is used. The experimental results demonstrate the efficiency and accuracy of the proposed method. Keywords: Clustering, Feature selection, Correlation, Dimensionality reduction
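A minimal sketch of the cluster-then-correlate pattern the abstract describes follows, under the assumption that features are grouped with k-means and one representative per cluster is kept by its correlation with the class label. The function name, the choice of Pearson correlation, and the cluster count are illustrative, not the paper's exact algorithm.

```python
# Hedged sketch of clustering-plus-correlation feature selection:
# cluster the feature columns with k-means, then keep from each cluster
# the feature most correlated (in absolute value) with the class label.
import numpy as np
from sklearn.cluster import KMeans

def cluster_correlation_select(X, y, n_clusters=10, random_state=0):
    """X: (n_samples, n_features) array, y: class labels. Returns selected column indices."""
    # Cluster the features (columns), treating each feature's sample profile as a point.
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=random_state).fit_predict(X.T)
    selected = []
    for c in range(n_clusters):
        members = np.where(labels == c)[0]
        if members.size == 0:
            continue
        # Relevance proxy: absolute Pearson correlation with the label.
        corrs = np.nan_to_num(
            [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in members])
        selected.append(int(members[np.argmax(corrs)]))
    return sorted(selected)
```

In this sketch the cluster step stands in for redundancy removal (features in the same cluster behave similarly), and the correlation step stands in for relevance filtering.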
Improving the anomaly detection by combining PSO search methods and J48 algorithm
Feature selection techniques are used to find the most important and relevant features in a dataset. In this study, feature selection was therefore used to improve the performance of anomaly detection. Many feature selection techniques have been developed and evaluated on the NSL-KDD dataset. However, with the rapid growth of network traffic, in which more applications, devices, and protocols participate, traffic data have become complex and heterogeneous, contributing to security issues; this makes the NSL-KDD dataset no longer reliable for the task. The detection model must also be able to recognize novel attack types in complex network datasets. A robust analysis technique for larger and more complex datasets is therefore required to cope with the growing security issues in big-data networks. This study proposes particle swarm optimization (PSO) search methods for feature selection. To contribute to feature analysis knowledge, combinations of PSO search methods with other search methods are examined in the experiments. To overcome the limitations of the NSL-KDD dataset, the experiments use the CICIDS2017 dataset. To validate the selected features, the J48 classification algorithm is used. The detection performance of the PSO search method combined with J48 is examined and compared with other feature selection methods and with a previous study. The proposed technique successfully finds the important features of the dataset, improving detection performance to 99.89% accuracy. Compared with the previous study, the proposed technique achieves better accuracy, TPR, and FPR. Keywords: Anomaly Detection, CICIDS2017
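To make the wrapper idea concrete, the sketch below shows a binary PSO that searches over feature subsets and scores each subset with a decision tree. It is an assumed, simplified illustration: scikit-learn's DecisionTreeClassifier stands in for Weka's J48 (C4.5), and the particle counts, inertia, and acceleration constants are generic defaults, not the study's PSO Search configuration.

```python
# Hedged sketch of wrapper feature selection with binary PSO.
# DecisionTreeClassifier is a stand-in for J48; all hyperparameters are illustrative.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

def pso_feature_selection(X, y, n_particles=20, n_iter=30, seed=0):
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]
    pos = rng.random((n_particles, n_feat))   # continuous positions in [0, 1]
    vel = np.zeros((n_particles, n_feat))

    def fitness(mask):
        # Cross-validated accuracy of a decision tree on the selected columns.
        if not mask.any():
            return 0.0
        clf = DecisionTreeClassifier(random_state=seed)
        return cross_val_score(clf, X[:, mask], y, cv=3).mean()

    masks = pos > 0.5                          # threshold positions to feature subsets
    scores = np.array([fitness(m) for m in masks])
    pbest_pos, pbest_score = pos.copy(), scores.copy()
    g = int(scores.argmax())
    gbest_pos, gbest_score = pos[g].copy(), scores[g]
    w, c1, c2 = 0.7, 1.5, 1.5                  # inertia and acceleration coefficients

    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, n_feat))
        vel = w * vel + c1 * r1 * (pbest_pos - pos) + c2 * r2 * (gbest_pos - pos)
        pos = np.clip(pos + vel, 0.0, 1.0)
        masks = pos > 0.5
        scores = np.array([fitness(m) for m in masks])
        improved = scores > pbest_score        # update personal bests
        pbest_pos[improved], pbest_score[improved] = pos[improved], scores[improved]
        if scores.max() > gbest_score:         # update global best
            g = int(scores.argmax())
            gbest_pos, gbest_score = pos[g].copy(), scores[g]

    return gbest_pos > 0.5, gbest_score        # selected feature mask, CV accuracy
```

The returned mask can then be used to train and evaluate the final classifier on the reduced dataset, which mirrors the select-then-validate workflow the abstract describes.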