Search CORE

27,487 research outputs found

A Multi-Gene Genetic Programming Application for Predicting Students Failure at School

Author: Eke B. O.
Orove J. O.
Osegi N. E.
Publication venue
Publication date: 11/03/2015
Field of study

Several efforts to predict student failure rate (SFR) at school accurately still remains a core problem area faced by many in the educational sector. The procedure for forecasting SFR are rigid and most often times require data scaling or conversion into binary form such as is the case of the logistic model which may lead to lose of information and effect size attenuation. Also, the high number of factors, incomplete and unbalanced dataset, and black boxing issues as in Artificial Neural Networks and Fuzzy logic systems exposes the need for more efficient tools. Currently the application of Genetic Programming (GP) holds great promises and has produced tremendous positive results in different sectors. In this regard, this study developed GPSFARPS, a software application to provide a robust solution to the prediction of SFR using an evolutionary algorithm known as multi-gene genetic programming. The approach is validated by feeding a testing data set to the evolved GP models. Result obtained from GPSFARPS simulations show its unique ability to evolve a suitable failure rate expression with a fast convergence at 30 generations from a maximum specified generation of 500. The multi-gene system was also able to minimize the evolved model expression and accurately predict student failure rate using a subset of the original expressionComment: 14 pages, 9 figures, Journal paper. arXiv admin note: text overlap with arXiv:1403.0623 by other author

arXiv.org e-Print Archive

CiteSeerX

Evolving Spatially Aggregated Features from Satellite Imagery for Regional Modeling

Author: AE Hoerl
D Buckingham
J Bongard
J Dong
J Dozier
J Martinec
J Rees
JR Koza
K Krawiec
M Hollander
M Schmidt
M Tedesco
MD Schmidt
R Tibshirani
TH Painter
WR Tobler
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/12/2017
Field of study

Satellite imagery and remote sensing provide explanatory variables at relatively high resolutions for modeling geospatial phenomena, yet regional summaries are often desirable for analysis and actionable insight. In this paper, we propose a novel method of inducing spatial aggregations as a component of the machine learning process, yielding regional model features whose construction is driven by model prediction performance rather than prior assumptions. Our results demonstrate that Genetic Programming is particularly well suited to this type of feature construction because it can automatically synthesize appropriate aggregations, as well as better incorporate them into predictive models compared to other regression methods we tested. In our experiments we consider a specific problem instance and real-world dataset relevant to predicting snow properties in high-mountain Asia

arXiv.org e-Print Archive

High-speed detection of emergent market clustering via an unsupervised parallel genetic algorithm

Author: Gebbie Tim
Hendricks Dieter
Wilcox Diane
Publication venue: 'Academy of Science of South Africa'
Publication date: 02/08/2015
Field of study

We implement a master-slave parallel genetic algorithm (PGA) with a bespoke log-likelihood fitness function to identify emergent clusters within price evolutions. We use graphics processing units (GPUs) to implement a PGA and visualise the results using disjoint minimal spanning trees (MSTs). We demonstrate that our GPU PGA, implemented on a commercially available general purpose GPU, is able to recover stock clusters in sub-second speed, based on a subset of stocks in the South African market. This represents a pragmatic choice for low-cost, scalable parallel computing and is significantly faster than a prototype serial implementation in an optimised C-based fourth-generation programming language, although the results are not directly comparable due to compiler differences. Combined with fast online intraday correlation matrix estimation from high frequency data for cluster identification, the proposed implementation offers cost-effective, near-real-time risk assessment for financial practitioners.Comment: 10 pages, 5 figures, 4 tables, More thorough discussion of implementatio

arXiv.org e-Print Archive

Academy of Science of South Africa (ASSAf): Open Journal Systems

Directory of Open Access Journals

Automatic epilepsy detection using fractal dimensions segmentation and GP-SVM classification

Author: Jirka Jakub
Krejcar Ondřej
Kuča Kamil
Prauzek Michal
Publication venue: 'Dove Medical Press Ltd.'
Publication date: 01/01/2018
Field of study

Objective: The most important part of signal processing for classification is feature extraction as a mapping from original input electroencephalographic (EEG) data space to new features space with the biggest class separability value. Features are not only the most important, but also the most difficult task from the classification process as they define input data and classification quality. An ideal set of features would make the classification problem trivial. This article presents novel methods of feature extraction processing and automatic epilepsy seizure classification combining machine learning methods with genetic evolution algorithms. Methods: Classification is performed on EEG data that represent electric brain activity. At first, the signal is preprocessed with digital filtration and adaptive segmentation using fractal dimensions as the only segmentation measure. In the next step, a novel method using genetic programming (GP) combined with support vector machine (SVM) confusion matrix as fitness function weight is used to extract feature vectors compressed into lower dimension space and classify the final result into ictal or interictal epochs. Results: The final application of GP SVM method improves the discriminatory performance of a classifier by reducing feature dimensionality at the same time. Members of the GP tree structure represent the features themselves and their number is automatically decided by the compression function introduced in this paper. This novel method improves the overall performance of the SVM classification by dramatically reducing the size of input feature vector. Conclusion: According to results, the accuracy of this algorithm is very high and comparable, or even superior to other automatic detection algorithms. In combination with the great efficiency, this algorithm can be used in real-time epilepsy detection applications. From the results of the algorithm's classification, we can observe high sensitivity, specificity results, except for the Generalized Tonic Clonic Seizure (GTCS). As the next step, the optimization of the compression stage and final SVM evaluation stage is in place. More data need to be obtained on GTCS to improve the overall classification score for GTCS.Web of Science142449243

'On the Application of Hierarchical Coevolutionary Genetic Algorithms: Recombination and Evaluation Partners'

Author: Aickelin Uwe
Bull Larry
Publication venue
Publication date: 01/01/2003
Field of study

This paper examines the use of a hierarchical coevolutionary genetic algorithm under different partnering strategies. Cascading clusters of sub-populations are built from the bottom up, with higher-level sub-populations optimising larger parts of the problem. Hence higher-level sub-populations potentially search a larger search space with a lower resolution whilst lower-level sub-populations search a smaller search space with a higher resolution. The effects of different partner selection schemes amongst the sub-populations on solution quality are examined for two constrained optimisation problems. We examine a number of recombination partnering strategies in the construction of higher-level individuals and a number of related schemes for evaluating sub-solutions. It is shown that partnering strategies that exploit problem-specific knowledge are superior and can counter inappropriate (sub-) fitness measurements

CiteSeerX