A Comparative Performance Analysis of Explainable Machine Learning
Models With And Without RFECV Feature Selection Technique Towards Ransomware
Classification
Ransomware has emerged as one of the major global threats in recent days. The
alarming increasing rate of ransomware attacks and new ransomware variants
intrigue the researchers in this domain to constantly examine the
distinguishing traits of ransomware and refine their detection or
classification strategies. Among the broad range of different behavioral
characteristics, the trait of Application Programming Interface (API) calls and
network behaviors have been widely utilized as differentiating factors for
ransomware detection, or classification. Although many of the prior approaches
have shown promising results in detecting and classifying ransomware families
utilizing these features without applying any feature selection techniques,
feature selection, however, is one of the potential steps toward an efficient
detection or classification Machine Learning model because it reduces the
probability of overfitting by removing redundant data, improves the model's
accuracy by eliminating irrelevant features, and therefore reduces training
time. There have been a good number of feature selection techniques to date
that are being used in different security scenarios to optimize the performance
of the Machine Learning models. Hence, the aim of this study is to present the
comparative performance analysis of widely utilized Supervised Machine Learning
models with and without RFECV feature selection technique towards ransomware
classification utilizing the API call and network traffic features. Thereby,
this study provides insight into the efficiency of the RFECV feature selection
technique in the case of ransomware classification which can be used by peers
as a reference for future work in choosing the feature selection technique in
this domain.Comment: arXiv admin note: text overlap with arXiv:2210.1123