5 research outputs found

    A hybrid chaotic particle swarm optimization with differential evolution for feature selection

    Feature subset selection is widely used in data mining and machine learning to find a small set of features that improves classifier accuracy while reducing dataset dimensionality and sustaining high classification performance. Particle swarm optimization (PSO), a nature-inspired global optimization algorithm modeled on the social behavior of individuals in bird swarms, has been widely applied to feature selection because of its effectiveness and efficiency. PSO is easy to implement and has performed well on many real-world optimization tasks. However, because feature selection is a challenging task with a complex search space, PSO suffers from premature convergence and is easily trapped in local optima; hence the need to balance exploitation and exploration in the search. Our previous work introduced a novel chaotic dynamic weight particle swarm optimization (CHPSO), in which a chaotic map and a dynamic weight improve the PSO search process for feature selection. This paper improves on CHPSO by introducing CHPSODE, a hybrid of chaotic particle swarm optimization and differential evolution. The search accuracy and performance of the proposed CHPSODE algorithm were evaluated on eight commonly used classical benchmark functions. The experimental results show that CHPSODE finds realistic solutions to feature selection problems by balancing exploration and exploitation, which indicates that it is a reliable and efficient metaheuristic algorithm for feature selection.
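
For readers unfamiliar with PSO, the following is a minimal sketch of a standard global-best PSO minimizing the sphere function, which stands in here for one classical benchmark function. It is not CHPSO or CHPSODE: there is no chaotic map, no dynamic weight, and no differential evolution step. The function names (`sphere`, `pso`) and the parameter values (`w`, `c1`, `c2`, swarm size, bounds) are illustrative assumptions, not taken from the paper.

```python
import random


def sphere(x):
    """Sphere benchmark: sum of squares, global minimum 0 at the origin."""
    return sum(v * v for v in x)


def pso(objective, dim, n_particles=20, iters=300, lo=-5.0, hi=5.0,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain global-best PSO (assumed textbook variant, fixed inertia weight)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # global best so far

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


if __name__ == "__main__":
    best_pos, best_val = pso(sphere, dim=5)
    print("best value found:", best_val)
```

The fixed inertia weight `w` is the term that a chaotic or dynamic weighting scheme would replace; a larger `w` favors exploration and a smaller one favors exploitation.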

    Analysis of metaheuristics feature selection algorithm for classification

    Classification is a vital task in machine learning. A classifier is trained on a set of instances to predict the class labels of unseen instances, known as testing instances. Through training, the classifier learns the relationship between the class and the instances from their attributes. Feature selection removes redundant and irrelevant data from the dataset, which leaves room to improve classification performance. This research compares the performance of feature selection techniques, divided into wrapper-based metaheuristic algorithms and filter-based algorithms, on two educational datasets. Four classification techniques were applied to the datasets, and the results show that Decision Tree (DT) gave the best performance. Furthermore, the proposed CHPSO-DE outperformed the other feature selection algorithms, obtaining the best classification performance with fewer features. These results will help researchers choose the most efficient feature selection algorithms and classification techniques.
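
To illustrate what a filter-based method does (as opposed to a wrapper that repeatedly trains a classifier), here is a minimal sketch that ranks features by the absolute Pearson correlation with the class label and keeps the top `k`. The abstract does not say which filter methods were used, so the correlation criterion, the function names, and the toy data are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def filter_select(X, y, k):
    """Score each feature independently of any classifier, keep the top k."""
    n_features = len(X[0])
    scores = [abs(pearson([row[f] for row in X], y)) for f in range(n_features)]
    ranked = sorted(range(n_features), key=lambda f: scores[f], reverse=True)
    return sorted(ranked[:k]), scores


if __name__ == "__main__":
    # Toy data (hypothetical): feature 0 tracks the label, feature 1 is
    # constant, feature 2 alternates independently of the label.
    X = [[1, 5, 2], [2, 5, 1], [3, 5, 2], [4, 5, 1]]
    y = [0, 0, 1, 1]
    selected, scores = filter_select(X, y, k=1)
    print("selected feature indices:", selected)
```

A wrapper-based method would instead score each candidate subset by training and evaluating a classifier on it, which is costlier but accounts for interactions between features.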

    A data mining approach to predict academic performance of students using ensemble techniques

    Educational Data Mining (EDM) has recently emerged as a new area of research, driven by the growth of statistical methods used to explore data in educational settings. One application of EDM is the prediction of student performance. Applying data mining methods in an educational setting can uncover hidden knowledge and patterns that help administrators make decisions to enhance the educational system. In a web-based education system, the behavioral features of learners are significant because they show the interaction between students and the LMS. In this paper, we propose a new student performance prediction model based on data mining methods that includes a new category of features, the behavioral features of students. The proposed predictive model is evaluated using the classifiers Naïve Bayesian (NB), Decision Tree (DT), K-Nearest Neighbor (KNN), Discriminant Analysis (Disc) and Pairwise Coupling (PWC). Additionally, ensemble methods such as AdaBoost, Bag and RUSBoost were used to improve the accuracy of the student performance model. The results show a strong relationship between student behavior and academic performance. The proposed model achieved an accuracy of 84.2% with behavioral features and 72.6% without them. Moreover, an accuracy of 94.1% was obtained when the ensemble methods were applied to the classifiers. These results demonstrate the reliability of the proposed model.

    Boosting enabled efficient machine learning technique for accurate prediction of crop yield towards precision agriculture

    Due to the limited availability of natural resources, it is essential that agricultural productivity keep pace with population growth. The major objective of this project is to boost production despite unfavorable weather conditions. As a consequence of technological advancements in agriculture, precision farming is gaining appeal and becoming more prevalent as a way of enhancing crop yields. To predict future data, machine learning employs a number of methods, including building models and learning prediction rules from past data. In this manuscript, we examine various machine learning techniques, as well as an automated agricultural yield projection model based on selecting the most relevant features. For feature selection, the Grey Level Co-occurrence Matrix method is utilised. For classification, we use the AdaBoost Decision Tree, Artificial Neural Network (ANN), and K-Nearest Neighbour (KNN) algorithms. The data set used in this study is a compilation of information including yield, pesticide use, rainfall, and average temperature, and consists of 33 attributes in total. The crops soya beans, maize, potato, rice, paddy, wheat, and sorghum are included in this data collection. This data collection was made possible through the collaboration of the Food and Agriculture Organisation (FAO) and the World Data Bank, both of which make their data available to the public. The AdaBoost decision tree achieved the highest accuracy in predicting agricultural yield. Both the accuracy rate and the recall rate are high, at 99 percent.
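
As a rough illustration of how boosting combines weak decision trees, the sketch below implements discrete AdaBoost with one-level decision trees (stumps) on a tiny toy dataset. It is not the authors' pipeline or data: the stump learner, the function names, the labels in {-1, +1}, and the six-point dataset are all assumptions chosen so that no single stump can classify every point correctly.

```python
import math


def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                preds = [pol if row[f] <= thr else -pol for row in X]
                err = sum(wi for wi, p, t in zip(w, preds, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best


def stump_predict(stump, row):
    _, f, thr, pol = stump
    return pol if row[f] <= thr else -pol


def adaboost(X, y, rounds=3):
    """Discrete AdaBoost: reweight samples so later stumps focus on mistakes."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = max(stump[0], 1e-10)          # avoid division by zero
        if err >= 0.5:                      # no better than chance: stop
            break
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Increase weights of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * t * stump_predict(stump, row))
             for wi, row, t in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    score = sum(alpha * stump_predict(s, row) for alpha, s in model)
    return 1 if score >= 0 else -1


def accuracy(model, X, y):
    return sum(predict(model, r) == t for r, t in zip(X, y)) / len(X)


if __name__ == "__main__":
    # Toy data (hypothetical): the negative class sits in the middle.
    X = [[1], [2], [3], [4], [5], [6]]
    y = [1, 1, -1, -1, 1, 1]
    print("1 round :", accuracy(adaboost(X, y, rounds=1), X, y))
    print("3 rounds:", accuracy(adaboost(X, y, rounds=3), X, y))
```

On this toy data a single stump misclassifies two of the six points, while the weighted vote of three stumps fits all of them, which is the effect boosting relies on.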