5 research outputs found

    Feature Selection via Binary Simultaneous Perturbation Stochastic Approximation

    Feature selection (FS) has become an indispensable task in dealing with today's highly complex pattern recognition problems with massive numbers of features. In this study, we propose a new wrapper approach for FS based on binary simultaneous perturbation stochastic approximation (BSPSA). This pseudo-gradient descent stochastic algorithm starts with an initial feature vector and moves toward the optimal feature vector via successive iterations. In each iteration, the individual components of the current feature vector are perturbed simultaneously by random offsets from a qualified probability distribution. We present computational experiments on datasets with numbers of features ranging from a few dozen to thousands, using three widely used classifiers as wrappers: nearest neighbor, decision tree, and linear support vector machine. We compare our methodology against the full set of features, as well as against a binary genetic algorithm and sequential FS methods, using cross-validated classification error rate and AUC as the performance criteria. Our results indicate that features selected by BSPSA compare favorably to alternative methods in general, and that BSPSA can yield superior feature sets for datasets with tens of thousands of features by examining an extremely small fraction of the solution space. We are not aware of any other wrapper FS methods that are computationally feasible with good convergence properties for such large datasets.
    Comment: This is the Istanbul Sehir University Technical Report #SHR-ISE-2016.01. A short version of this report has been accepted for publication at Pattern Recognition Letters.
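The iteration described in this abstract can be illustrated with a minimal sketch. It follows the generic SPSA recipe (perturb all components at once, estimate a pseudo-gradient from two loss measurements, take a step), and it is not the authors' exact algorithm. The function name `binary_spsa`, the 0.5 threshold, the gain values `a` and `c`, and the toy loss are all assumptions made for illustration; in the paper's setting the loss would be a cross-validated classification error from a wrapper classifier.

```python
import numpy as np

def binary_spsa(loss, n_features, n_iter=200, a=0.01, c=0.05, seed=0):
    """Illustrative SPSA-style search over binary feature masks (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    w = np.full(n_features, 0.5)            # continuous "importance" vector in [0, 1]
    best_mask = w >= 0.5                    # a feature is selected when w_i >= 0.5
    best_loss = loss(best_mask)
    for _ in range(n_iter):
        # Perturb every component simultaneously with a random +/-1 offset.
        delta = rng.choice([-1.0, 1.0], size=n_features)
        w_plus = np.clip(w + c * delta, 0.0, 1.0)
        w_minus = np.clip(w - c * delta, 0.0, 1.0)
        y_plus = loss(w_plus >= 0.5)
        y_minus = loss(w_minus >= 0.5)
        # Pseudo-gradient from only two loss measurements.
        g_hat = (y_plus - y_minus) / (2.0 * c * delta)
        w = np.clip(w - a * g_hat, 0.0, 1.0)
        mask = w >= 0.5
        y = loss(mask)
        if y < best_loss:                   # keep the best mask seen so far
            best_mask, best_loss = mask, y
    return best_mask, best_loss

# Toy stand-in for a wrapper loss: fraction of features that disagree
# with a made-up "true" mask. A real wrapper would train a classifier here.
true_mask = np.arange(20) < 5
toy_loss = lambda m: float(np.mean(m != true_mask))
mask, value = binary_spsa(toy_loss, n_features=20)
print(mask.astype(int), value)
```

Each iteration costs two loss evaluations for the pseudo-gradient regardless of the number of features (the sketch spends a third one to track the best mask), which is the property that makes this family of methods attractive for very wide datasets.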

    A DATA-DRIVEN PID CONTROLLER FOR FLEXIBLE JOINT MANIPULATOR USING NORMALIZED SIMULTANEOUS PERTURBATION STOCHASTIC APPROXIMATION

    This paper presents a data-driven PID controller based on Normalized Simultaneous Perturbation Stochastic Approximation (SPSA). Initially, the unstable convergence of conventional SPSA is illustrated, which motivates us to introduce an improved version. This unstable convergence occurred in data-driven controller tuning whenever the closed-loop control system became unstable. In the case of a flexible joint manipulator, the result is an unstable tip angular position with a high magnitude of vibration. Here, the conventional SPSA is modified by introducing a normalized gradient approximation to update the design variable. More specifically, each measurement of the cost function from the perturbations is normalized to the maximum cost function measurement at the current iteration. As a result, this improvement is expected to prevent the updated control parameters from producing unstable control performance. The effectiveness of the normalized SPSA is tested on the data-driven PID control scheme of a flexible joint plant. The simulation results show that data-driven controller tuning using the normalized SPSA provides stable convergence with a 76.68% improvement in the average cost function. Moreover, it exhibits lower average and best values for both the error norm and the input norm compared to the existing modified SPSA.
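The normalization step can be sketched as follows. This is one reading of the abstract's sentence, in which both cost measurements are divided by the larger of the two before the gradient estimate is formed; the paper's exact formula may differ. The function name `normalized_spsa_step`, the gains `a_k` and `c_k`, and the quadratic toy cost are assumptions for illustration, standing in for a closed-loop simulation of the plant.

```python
import numpy as np

def normalized_spsa_step(cost, theta, a_k, c_k, rng):
    """One SPSA update with cost measurements scaled by their maximum (illustrative)."""
    theta = np.asarray(theta, dtype=float)
    delta = rng.choice([-1.0, 1.0], size=theta.shape)   # simultaneous +/-1 perturbation
    j_plus = cost(theta + c_k * delta)
    j_minus = cost(theta - c_k * delta)
    j_max = max(abs(j_plus), abs(j_minus))
    if j_max == 0.0:
        return theta.copy()                             # both costs are zero: no update
    # Normalizing by j_max bounds the numerator, so one very large cost
    # (e.g. from an unstable closed loop) cannot produce a huge step.
    g_hat = (j_plus / j_max - j_minus / j_max) / (2.0 * c_k * delta)
    return theta - a_k * g_hat

# Toy usage: a badly scaled quadratic cost standing in for a control cost.
rng = np.random.default_rng(0)
theta = np.array([1.0, -2.0, 0.5])                      # e.g. hypothetical PID gains
cost = lambda t: 1e12 * float(np.sum(t ** 2))
print(normalized_spsa_step(cost, theta, a_k=0.1, c_k=0.05, rng=rng))
```

For non-negative costs the normalized difference lies in [-1, 1], so every component of the update is bounded by `a_k / (2 * c_k)` no matter how large the raw cost becomes.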

    Comparison of Machine Learning Methods in the Selection of Predictors of Atmospheric-Ocean General Circulation Models

    Introduction: Climate change is now one of the main challenges in the exploitation and management of water resources. Temperature, along with precipitation, is one of the most important climatic elements and one of the main factors in climatic zoning and classification. Because Iran lies within the drought belt and close to the tropical high-pressure zone, the country has an arid and semi-arid climate and suffers from drought in most years. Temperature fluctuations and variability are therefore important issues that make the study of temperature changes a necessity. In the current study, four data mining algorithms for selecting predictors for downscaling maximum temperature at the Birjand synoptic station were studied and compared, and the superior algorithm was identified. Because the number of large-scale features is high, the choice of machine learning algorithm plays an important role in the statistical downscaling of climatic variables such as maximum temperature.
    Materials and Methods: Today, data sets in environmental studies use many variables to describe a climatic phenomenon. Because the volume of data is huge, choosing the predictors is one of the most important steps in machine learning preprocessing. In this study, four machine learning methods, namely simultaneous perturbation stochastic approximation (SPSA), Least Absolute Shrinkage and Selection Operator (LASSO), Ridge, and Gradient Boosting Method (GBM), were studied and compared for selecting important features in downscaling maximum temperature at the Birjand synoptic station over the period 1961-2019. Feature selection is a mechanism for finding a combination of predictors that, with a minimum number of predictors, produces an acceptable evaluation index in estimating the variable under study. For the present study, the weather data of the Birjand synoptic meteorological station were provided by the Meteorological Organization of Iran. To calibrate and validate the machine learning algorithms, 70% and 30% of the available monthly data were allocated, respectively. The analysis was coded in the RStudio environment using the caret and fscaret packages. Three indices were used to evaluate the performance of the algorithms: relative Nash-Sutcliffe Efficiency (rNSE), Volumetric Efficiency (VE), and Kling-Gupta Efficiency (KGE).
    Results and Discussion: Before the algorithms were used to select large-scale predictors, the correlation between these variables and the observed maximum temperature at Birjand station was investigated. The large-scale variables mslp, P1_v, P8_v, P8_u, and P850 Temp showed a maximum correlation of 0.6 with temperature, which is acceptable given the complexity of the climate change phenomenon. In addition, the results show that all the algorithms used the important factors F1, F2, F15, F16, F18, F20, and F26 by more than 50%, and that the first variable (mean pressure at the ocean surface) was the most important parameter in downscaling maximum temperature. The highest importance was for P1_v and the lowest for P5_u, at 73.2% and 15%, respectively. Violin plots of the downscaled maximum temperature in the validation step for each algorithm, together with the observed maximum temperature at the Birjand synoptic station, showed that the first and third quartiles of the SPSA output were closer to the observed data than those of the other algorithms. According to the evaluation criteria, the SPSA algorithm performs better than the other algorithms in reproducing the monthly maximum temperature values at the Birjand synoptic station. Based on the Volumetric Efficiency and relative Nash-Sutcliffe criteria, the GBM algorithm was more successful in selecting predictors than the Ridge and LASSO algorithms. It was also observed that the SPSA algorithm shows different results from the other algorithms. In comparing the mean and variance of downscaled and observed maximum temperature, the t-test and F-test showed that the SPSA algorithm is more effective than the other algorithms in reproducing the mean and variance of the observed maximum temperature at the Birjand synoptic station at the 5% significance level.
    Conclusion: The data used in this study included large-scale atmospheric variables and the observed maximum temperature at Birjand station. The algorithms were used to select important predictors, and the performance of these methods was assessed in the validation step. According to the results of this study, the highest importance among the large-scale variables is that of P1_v and the lowest is that of P5_u, at 73.2% and 15%, respectively. The SPSA algorithm also performs better than other algorithms in selecting predictors and consequently the maximum temperature
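The abstract names its three evaluation indices without giving formulas. The sketch below uses the commonly cited definitions of rNSE, VE, and KGE, on the assumption that the study used the same ones. The study itself was coded in R; Python is used here only for illustration, and the function names and sample values are hypothetical.

```python
import numpy as np

def rnse(obs, sim):
    """Relative Nash-Sutcliffe Efficiency (1 is a perfect fit); obs must be nonzero."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    num = np.sum(((obs - sim) / obs) ** 2)
    den = np.sum(((obs - obs.mean()) / obs.mean()) ** 2)
    return 1.0 - num / den

def ve(obs, sim):
    """Volumetric Efficiency: 1 minus total absolute error over total observed."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return 1.0 - np.sum(np.abs(sim - obs)) / np.sum(obs)

def kge(obs, sim):
    """Kling-Gupta Efficiency from correlation, variability ratio, and bias ratio."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]        # linear correlation
    alpha = sim.std() / obs.std()          # variability ratio
    beta = sim.mean() / obs.mean()         # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Made-up monthly maximum temperatures (degrees C), for illustration only.
observed = np.array([12.0, 18.5, 25.0, 33.0, 38.5, 30.0])
simulated = np.array([13.0, 17.5, 26.0, 32.0, 37.0, 31.0])
print(rnse(observed, simulated), ve(observed, simulated), kge(observed, simulated))
```

All three indices equal 1 for a perfect reproduction of the observations and decrease as the simulated series departs from them.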