Enhancing Flight Delay Prediction through Feature Engineering in Machine Learning Classifiers: A Real Time Data Streams Case Study

Abstract

The process of creating and selecting features from raw data to enhance the accuracy of machine learning models is referred to as feature engineering. In the context of real-time data streams, feature engineering becomes particularly important because the data is constantly changing and the model must be able to adapt quickly. A case study of using feature engineering in a flight information system is described in this paper. We used feature engineering to improve the performance of machine learning classifiers for predicting flight delays and describe various techniques for extracting and constructing features from the raw data, including time-based features, trend-based features, and error-based features. Before applying these techniques, we applied feature pre-processing techniques, including the CTAO algorithm for feature pre-processing, followed by the SCSO (Sand cat swarm optimization) algorithm for feature extraction and the Enhanced harmony search for feature optimization. The resultant feature set contained the 9 most relevant features for deciding whether a flight would be delayed or not. Additionally, we evaluate the performance of various classifiers using these engineered features and contrast the results with those obtained using raw features. The results show that feature engineering significantly improves the performance of the classifiers and allows for more accurate prediction of flight delays in real-time

    Similar works