BUGOPTIMIZE: Bugs dataset Optimization with Majority Vote Cluster-Based Fine-Tuned Feature Selection for Scalable Handling
Software bugs are prevalent throughout the software development lifecycle, posing challenges to developers in ensuring product quality and reliability. Accurate prediction of bug counts can significantly aid resource allocation and the prioritization of bug-fixing efforts. However, the vast number of attributes in bug datasets often requires effective feature selection techniques to enhance prediction accuracy and scalability. Existing feature selection methods, though diverse, suffer from limitations such as suboptimal feature subsets and a lack of scalability. This paper proposes BUGOPTIMIZE, a novel algorithm tailored to address these challenges. BUGOPTIMIZE integrates majority-voting cluster-based fine-tuned feature selection to optimize bug datasets for scalable handling and accurate prediction. The algorithm first clusters the dataset using the K-means, EM, and hierarchical clustering algorithms and performs majority voting to assign data points to final clusters. It then applies filter-based, wrapper-based, and embedded feature selection techniques within each cluster to identify common features. Feature selection is also applied to the entire dataset to extract another set of common features, and the two sets are combined to form the final best feature set. Experimental results demonstrate the efficacy of BUGOPTIMIZE compared to existing feature selection methods, reducing MAE and RMSE for Linear Regression (MAE: 0.2668 to 0.2609; RMSE: 0.3251 to 0.308) and Random Forest (MAE: 0.1626 to 0.1341; RMSE: 0.2363 to 0.224). By mitigating the disadvantages of current approaches and introducing a comprehensive, scalable solution, BUGOPTIMIZE represents a significant advancement in bug dataset optimization and prediction accuracy in software development environments.
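The combination step described above, intersecting the outputs of the three feature-selection families, can be illustrated in isolation. The sketch below is not the authors' implementation; the choice of scikit-learn selectors (an F-test filter, RFE as the wrapper, Lasso as the embedded method), the synthetic data, and all parameter values are assumptions:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE, SelectKBest, f_regression
from sklearn.linear_model import Lasso, LinearRegression

def common_features(X, y, k=5):
    """Return indices chosen by filter, wrapper, AND embedded selection."""
    # Filter: univariate F-test ranking
    filt = SelectKBest(f_regression, k=k).fit(X, y)
    f_idx = set(np.where(filt.get_support())[0])
    # Wrapper: recursive feature elimination around a linear model
    rfe = RFE(LinearRegression(), n_features_to_select=k).fit(X, y)
    w_idx = set(np.where(rfe.support_)[0])
    # Embedded: features with non-zero L1-regularised coefficients
    lasso = Lasso(alpha=0.1).fit(X, y)
    e_idx = set(np.where(np.abs(lasso.coef_) > 1e-6)[0])
    return sorted(f_idx & w_idx & e_idx)

X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       random_state=0)
selected = common_features(X, y)
```

The same intersection would be computed per cluster and once over the full dataset before taking the union as the final feature set.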
An Efficient Intrusion Detection System to Combat Cyber Threats using a Deep Neural Network Model
The proliferation of Internet of Things (IoT) solutions has led to a significant increase in cyber-attacks targeting IoT networks. Securing networks, especially wireless IoT networks, against these attacks has become a crucial but challenging task for organizations, and ensuring their security is of the utmost importance in today's world. Among the various solutions for detecting intruders, there is a growing demand for more effective techniques. This paper introduces a network intrusion detection system (NIDS) based on a deep neural network that utilizes network data features selected through bagging and boosting methods. The presented NIDS implements both binary and multiclass attack detection models and was evaluated using the KDDCUP 99 and CICDDoS datasets. The experimental results demonstrated that the presented NIDS achieved an impressive accuracy rate of 99.4% while using a minimal number of features. This high level of accuracy makes the presented NIDS a valuable tool for securing wireless IoT networks.
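One way to read "features selected through the bagging and boosting methods" is to rank features by the averaged importances of a bagged ensemble and a boosted ensemble, then train the neural model on the survivors. The sketch below uses scikit-learn stand-ins (a random forest for bagging, gradient boosting for boosting, an MLP in place of the deep network) on synthetic data; the 10-feature cutoff and all hyperparameters are assumptions, not the paper's configuration:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=30, n_informative=8,
                           random_state=0)

# Rank features by averaged bagging (random forest) and boosting importances
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
gb = GradientBoostingClassifier(random_state=0).fit(X, y)
scores = (rf.feature_importances_ + gb.feature_importances_) / 2
top = np.argsort(scores)[::-1][:10]  # keep the 10 strongest features

# Train a small neural network on the reduced feature set only
Xtr, Xte, ytr, yte = train_test_split(X[:, top], y, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500,
                    random_state=0).fit(Xtr, ytr)
acc = clf.score(Xte, yte)
```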
Embedded System Performance Analysis for Implementing a Portable Drowsiness Detection System for Drivers
Drowsiness on the road is a widespread problem with fatal consequences; thus, a multitude of systems and techniques have been proposed. Among existing methods, Ghoddoosian et al. utilized temporal blinking patterns to detect early signs of drowsiness, but their algorithm was tested only on a powerful desktop computer, which is not practical in a moving vehicle setting. In this paper, we propose an efficient platform to run Ghoddoosian's algorithm, detail the performance tests we ran to determine this platform, and explain our threshold optimization logic. After considering the Jetson Nano and the Beelink (Mini PC), we concluded that the Mini PC is the most efficient and practical device for running our embedded system in a vehicle. To determine this, we ran communication speed tests and evaluated total processing times for inference operations. Based on our experiments, the average total processing time to run the drowsiness detection model was 94.27 ms for the Jetson Nano and 22.73 ms for the Beelink (Mini PC). Considering the portability and power efficiency of each device, along with the processing time results, the Beelink (Mini PC) was determined to be most suitable. We also propose a threshold optimization algorithm, which determines whether the driver is drowsy or alert based on the trade-off between the sensitivity and specificity of the drowsiness detection model. Our study will serve as a crucial next step for drowsiness detection research and its application in vehicles. Through our experiments, we have determined a favorable platform that can run drowsiness detection algorithms in real time and can serve as a foundation for further drowsiness detection research. In doing so, we have bridged the gap between an existing embedded system and its actual implementation in vehicles, bringing drowsiness detection technology a step closer to widespread real-life deployment.
Comment: 26 pages, 13 figures, 4 tables
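The sensitivity/specificity trade-off behind the threshold optimization can be made concrete with a simple threshold sweep. This is a generic sketch (maximizing Youden's J statistic over candidate thresholds), not the authors' algorithm, and the toy drowsiness scores and labels are invented:

```python
import numpy as np

def optimize_threshold(scores, labels):
    """Pick the threshold maximising Youden's J = sensitivity + specificity - 1."""
    best_t, best_j = 0.0, -1.0
    for t in np.unique(scores):
        pred = scores >= t                      # predict "drowsy" above threshold
        tp = np.sum(pred & (labels == 1))
        tn = np.sum(~pred & (labels == 0))
        fn = np.sum(~pred & (labels == 1))
        fp = np.sum(pred & (labels == 0))
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        j = sens + spec - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

# Toy model outputs: alert drivers (0) score low, drowsy drivers (1) score high
scores = np.array([0.1, 0.2, 0.3, 0.4, 0.7, 0.8, 0.9, 0.95])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
t, j = optimize_threshold(scores, labels)  # separates the two groups at 0.7
```

A deployed system would weight sensitivity and specificity asymmetrically, since missing a drowsy driver is costlier than a false alarm.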
Ransomware detection using deep learning based unsupervised feature extraction and a cost sensitive Pareto Ensemble classifier
Ransomware attacks pose a serious threat to Internet resources due to their far-reaching effects. Their zero-day variants are even more hazardous, as less is known about them. When used for ransomware attack detection, conventional machine learning approaches may become data-dependent and insensitive to error cost, and thus may fail against zero-day ransomware attacks, which typically have an unseen underlying data distribution. This paper presents a Cost-Sensitive Pareto Ensemble strategy, CSPE-R, to detect novel ransomware attacks. Initially, the proposed framework exploits an unsupervised deep Contractive Auto-Encoder (CAE) to transform the varying underlying feature space into a more uniform and core semantic feature space. To learn robust features, the proposed CSPE-R ensemble technique explores different semantic spaces at various levels of detail. Heterogeneous base estimators are then trained over these extracted subspaces to find the core relevance between the various families of ransomware attacks. Next, a novel Pareto Ensemble-based estimator selection strategy is implemented to achieve a cost-sensitive compromise between false positives and false negatives. Finally, the decisions of the selected estimators are aggregated to improve detection of unknown ransomware attacks. The experimental results show that the proposed CSPE-R framework performs well against zero-day ransomware attacks.
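The Pareto selection step can be illustrated independently of the autoencoder: given each candidate estimator's false-positive and false-negative rates, keep only the estimators not dominated on both error costs. The sketch below is a generic Pareto-front filter with invented estimator names and rates, not the paper's CSPE-R selection procedure:

```python
def pareto_front(estimators):
    """estimators: list of (name, fp_rate, fn_rate) tuples.
    Keep estimators for which no other estimator is at least as good on
    both error costs and strictly better on at least one."""
    front = []
    for name, fp, fn in estimators:
        dominated = any(
            ofp <= fp and ofn <= fn and (ofp < fp or ofn < fn)
            for _, ofp, ofn in estimators
        )
        if not dominated:
            front.append(name)
    return front

# Hypothetical base estimators with (false-positive, false-negative) rates
candidates = [("svm", 0.10, 0.30), ("rf", 0.05, 0.25),
              ("knn", 0.20, 0.10), ("nb", 0.25, 0.40)]
kept = pareto_front(candidates)  # "rf" and "knn" survive; the rest are dominated
```

The surviving estimators trace the trade-off curve between the two error costs; a cost-sensitive aggregation rule would then weight their votes.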
Towards Effective Detection of Botnet Attacks using BoT-IoT Dataset
In the world of cybersecurity, intrusion detection systems (IDS) have leveraged the power of artificial intelligence for the efficient detection of attacks by applying supervised machine learning (ML) techniques to labeled datasets. A growing body of literature has been devoted to the use of the BoT-IoT dataset for IDS-based ML frameworks. A few related works have recognized the need for a balanced dataset and applied techniques to alleviate the issue of imbalance; however, a significant number of related research works failed to treat the imbalance in the BoT-IoT dataset, and a lack of unanimity was observed in the literature regarding the taxonomy of balancing techniques. The study presented here seeks to explore the degree to which the imbalance of the dataset has been treated and to determine a taxonomy of the techniques used. In this thesis, a comparison analysis is performed using a small subset of the entire dataset to determine the threshold sample limit at which the model achieves its highest accuracy. In addition, a study was conducted to determine the extent to which each feature of the dataset affects the threshold performance. The study is implemented on the BoT-IoT dataset using three supervised ML classifiers: K-nearest Neighbor, Random Forest, and Logistic Regression. The four principal findings of this thesis are: existing taxonomies are not well understood and the imbalance of the dataset is not treated; high performance across all metrics can be achieved on a highly imbalanced dataset; the model is able to reach threshold performance using a small subset of samples; and certain features had varying impact on the threshold value under different techniques.
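The threshold-sample analysis amounts to a learning curve: train each classifier on growing subsets and watch where accuracy plateaus. The sketch below uses synthetic data in place of BoT-IoT, and the subset sizes and hyperparameters are assumptions, not the thesis setup:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.25, random_state=0)

# The thesis's three classifiers; a fixed held-out set scores every subset size
models = {"knn": KNeighborsClassifier(),
          "rf": RandomForestClassifier(n_estimators=50, random_state=0),
          "logreg": LogisticRegression(max_iter=1000)}
sizes = [100, 200, 400, 800, 1500]
curves = {name: [m.fit(Xtr[:n], ytr[:n]).score(Xte, yte) for n in sizes]
          for name, m in models.items()}
```

The threshold sample limit would be the smallest size after which a model's curve stops improving meaningfully.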
IoT malicious traffic identification using wrapper-based feature selection mechanisms
Machine Learning (ML) plays a very significant role in Internet of Things (IoT) cybersecurity for malicious and intrusion traffic identification; ML algorithms are widely applied to IoT traffic identification in IoT risk management. However, due to inaccurate feature selection, ML techniques misclassify a number of malicious traffic flows in smart IoT networks. To address this problem, it is important to select a feature set that carries enough information for accurate identification of anomalous and intrusion traffic in smart IoT networks. In this paper, we first applied a bijective soft set approach to select effective features, and then proposed a novel feature selection metric, CorrACC. We then designed and developed a new feature selection algorithm, also named CorrACC, which is based on the wrapper technique: it filters the features and selects an effective feature set for a particular ML classifier using the ACC (accuracy) metric. To evaluate our proposed approaches, we used four different ML classifiers on the BoT-IoT dataset. Experimental results obtained by our algorithms are promising, achieving more than 95% accuracy.
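A minimal sketch of an accuracy-driven wrapper in the spirit of CorrACC follows; it is not the authors' algorithm. The greedy forward pass, the decision-tree classifier, and the synthetic data are all assumptions used to show the wrapper idea: keep a feature only if it raises the classifier's cross-validated accuracy.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def acc_wrapper_select(X, y, clf, min_gain=0.0):
    """Greedy forward wrapper: add a feature only if it improves CV accuracy."""
    chosen, best = [], 0.0
    for j in range(X.shape[1]):
        trial = chosen + [j]
        acc = cross_val_score(clf, X[:, trial], y, cv=3).mean()
        if acc > best + min_gain:   # keep the feature only on a measurable gain
            chosen, best = trial, acc
    return chosen, best

X, y = make_classification(n_samples=300, n_features=12, n_informative=4,
                           random_state=0)
feats, acc = acc_wrapper_select(X, y, DecisionTreeClassifier(random_state=0))
```

Because the evaluation is tied to one classifier's accuracy, the selected subset is classifier-specific, which matches the wrapper philosophy described in the abstract.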