5 research outputs found
A Critical Study on Stability Measures of Feature Selection with a Novel Extension of Lustgarten Index
The stability of a feature selection algorithm refers to its robustness to perturbations of the training set, parameter settings, or initialization. A stable feature selection algorithm is crucial for identifying a relevant subset of meaningful and interpretable features, which is extremely important in knowledge discovery. Although many stability measures have been reported in the literature for evaluating the stability of feature selection, none of them satisfies all the requisite properties of a stability measure. Among them, the Kuncheva index and its modifications are widely used in practical problems. In this work, the merits and limitations of the Kuncheva index and its existing modifications (Lustgarten, Wald, nPOG/nPOGR, Nogueira) are studied and analysed with respect to the requisite properties of a stability measure. One further limitation of the most recent modified similarity measure, Nogueira’s measure, is pointed out. Finally, corrections to Lustgarten’s measure are proposed to define a new modified stability measure that satisfies the desired properties and overcomes the limitations of existing popular similarity-based stability measures. The effectiveness of the newly modified Lustgarten’s measure is evaluated with simple toy experiments.
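For orientation, the baseline that the abstract builds on can be sketched in a few lines. The snippet below assumes the commonly cited form of the Kuncheva consistency index for two equal-size subsets, (r·n − k²) / (k·(n − k)), where n is the total number of features, k the subset size, and r the size of the intersection. It is not the modified Lustgarten measure proposed in the paper, and the function names `kuncheva_index` and `average_stability` are hypothetical.

```python
from itertools import combinations


def kuncheva_index(a, b, n):
    """Kuncheva consistency index for two feature subsets of equal size.

    a, b: iterables of selected feature identifiers
    n:    total number of features available (assumed known)
    """
    a, b = set(a), set(b)
    k = len(a)
    if len(b) != k:
        # The plain Kuncheva index is only defined for equal-size subsets;
        # handling unequal sizes is what extensions such as Lustgarten's address.
        raise ValueError("subsets must have the same size")
    if k == 0 or k == n:
        # The denominator k * (n - k) vanishes in these cases.
        raise ValueError("index undefined for k = 0 or k = n")
    r = len(a & b)  # number of features common to both subsets
    return (r * n - k * k) / (k * (n - k))


def average_stability(subsets, n):
    """Mean pairwise Kuncheva index over a collection of selected subsets."""
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two subsets")
    return sum(kuncheva_index(a, b, n) for a, b in pairs) / len(pairs)
```

Identical subsets score 1, and the index goes negative when the overlap is smaller than expected by chance. The two `ValueError` branches mark the restrictions (equal subset sizes, undefined boundary cases) of the kind that later modifications aim to relax.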
Comparative Study of Univariate and Multivariate Long Short-Term Memory for Very Short-Term Forecasting of Global Horizontal Irradiance
Accurate global horizontal irradiance (GHI) forecasting is crucial for efficient management and forecasting of the output power of photovoltaic power plants. However, developing a reliable GHI forecasting model is challenging because GHI varies over time, and its variation is affected by changes in weather patterns. Recently, the long short-term memory (LSTM) deep learning network has become a powerful tool for modeling complex time series problems. This work aims to develop and compare univariate and several multivariate LSTM models that can predict GHI in Guntur, India on a very short-term basis. To build the multivariate time series models, we considered all possible combinations of the temperature, humidity, and wind direction variables along with GHI as inputs, which yielded seven multivariate models; the univariate model used only GHI. We collected meteorological data for Guntur from 1 January 2016 to 31 December 2016 and built 12 datasets, each containing one month of GHI, temperature, humidity, and wind direction data. We then constructed the models, each of which forecasts GHI up to 2 h ahead. Finally, to measure the symmetry among the models, we evaluated the performance of the prediction models using root mean square error (RMSE) and mean absolute error (MAE). The results indicate that, compared to the univariate method, each multivariate LSTM performs better in the very short-term GHI prediction task. Moreover, among the multivariate LSTM models, the model that takes temperature together with GHI as input outperformed the others, achieving average RMSE values of 0.74–1.5 W/m².
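The count of seven multivariate models follows from taking every non-empty combination of the three auxiliary variables alongside GHI. A minimal sketch of that enumeration, together with the two error metrics named in the abstract, is given below. The variable labels and function names are illustrative assumptions, not the authors' code, and no LSTM is trained here.

```python
import math
from itertools import combinations

# Hypothetical labels for the auxiliary inputs described in the abstract.
AUXILIARY = ["temperature", "humidity", "wind_direction"]


def multivariate_input_sets(target="GHI", extras=AUXILIARY):
    """List every input set made of the target plus a non-empty subset of extras."""
    input_sets = []
    for size in range(1, len(extras) + 1):
        for combo in combinations(extras, size):
            input_sets.append((target,) + combo)
    return input_sets


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

With three auxiliary variables there are 3 + 3 + 1 = 7 input sets. Both metrics are expressed in the units of the forecast quantity, here W/m².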