8 research outputs found

    Improved gene expression programming to solve the inverse problem for ordinary differential equations

    Many complex systems in the real world evolve with time. These dynamic systems are often modeled by ordinary differential equations in mathematics. The inverse problem of ordinary differential equations is to convert the observed data of a physical system into a mathematical model in terms of ordinary differential equations. The model may then be used to predict the future behavior of the physical system being modeled. Genetic programming has been used as a solver for this inverse problem. Similar to genetic programming, gene expression programming can do the same job, since it has a comparable ability to establish models of ordinary differential systems. Nevertheless, such research has seldom been studied before. This paper is one of the first attempts to apply gene expression programming to solving the inverse problem of ordinary differential equations. Based on a statistical observation of traditional gene expression programming, an improvement is made in our algorithm: genetic operators should act more often on the dominant part of genes than on the recessive part. This may help maintain population diversity and also speed up the convergence of the algorithm. Experiments show that the improved algorithm performs much better than genetic programming and traditional gene expression programming in terms of running time and prediction precision.
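    The key algorithmic idea here, biasing genetic operators toward the dominant part of a gene, is easy to picture in code. The sketch below assumes the usual GEP head/tail gene layout and treats the head region as a stand-in for the dominant (expressed) part; the symbol sets, lengths, and mutation rates are illustrative, not taken from the paper.

```python
import random

# Illustrative GEP alphabet: binary functions plus two terminals.
FUNCTIONS = ["+", "-", "*", "/"]   # max arity 2
TERMINALS = ["x", "y"]

HEAD_LEN = 7
TAIL_LEN = HEAD_LEN * (2 - 1) + 1  # tail length = h * (max_arity - 1) + 1

def random_gene():
    """Head may hold functions or terminals; tail must hold terminals only."""
    head = [random.choice(FUNCTIONS + TERMINALS) for _ in range(HEAD_LEN)]
    tail = [random.choice(TERMINALS) for _ in range(TAIL_LEN)]
    return head + tail

def biased_mutation(gene, head_rate=0.10, tail_rate=0.02):
    """Mutate head (dominant) positions more often than tail (recessive) ones."""
    mutated = gene[:]
    for i in range(HEAD_LEN):
        if random.random() < head_rate:
            mutated[i] = random.choice(FUNCTIONS + TERMINALS)
    for i in range(HEAD_LEN, HEAD_LEN + TAIL_LEN):
        if random.random() < tail_rate:
            mutated[i] = random.choice(TERMINALS)  # keep tail terminal-only
    return mutated

gene = random_gene()
print("".join(gene), "->", "".join(biased_mutation(gene)))
```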

    Schema theory based data engineering in gene expression programming for big data analytics

    Gene expression programming (GEP) is a data-driven evolutionary technique well suited to correlation mining. Parallel GEPs have been proposed to speed up the evolution process using a cluster of computers or a computer with multiple CPU cores. However, the generation structure of chromosomes and the size of the input data are two issues that tend to be neglected when speeding up GEP evolution. To fill this research gap, this paper proposes three guiding principles, based on an analysis of GEP schema theory, to elaborate the computational nature of GEP evolution. As a result, a novel data-engineered GEP is developed which follows the generation structure of chromosomes closely in parallelization and takes the input data size into account in segmentation. Experimental results on two data sets with complementary features show that the data-engineered GEP speeds up the evolution process significantly without loss of accuracy in data correlation mining. Based on the experimental tests, a computation model of the data-engineered GEP is further developed to demonstrate its high scalability in dealing with potential big data using a large number of CPU cores.
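    The data-segmentation principle, splitting the input data so fitness evaluation parallelizes cleanly, can be sketched as follows. The `decode` function is a hypothetical stand-in for the actual GEP expression decoder, and the segmentation scheme and worker count are assumptions for illustration, not the paper's implementation.

```python
from multiprocessing import Pool

import numpy as np

def decode(chromosome):
    """Hypothetical decoder: here a chromosome is just NumPy-expression text."""
    return lambda x: eval(chromosome, {"np": np, "x": x})

def partial_sse(args):
    """Score one chromosome against one contiguous data segment."""
    chromosome, xs, ys = args
    f = decode(chromosome)
    return float(np.sum((f(xs) - ys) ** 2))

def fitness(chromosome, xs, ys, n_workers=4):
    # Segment the input data so each worker scores one slice independently.
    x_parts = np.array_split(xs, n_workers)
    y_parts = np.array_split(ys, n_workers)
    with Pool(n_workers) as pool:
        parts = pool.map(partial_sse, [(chromosome, xp, yp)
                                       for xp, yp in zip(x_parts, y_parts)])
    return sum(parts)  # total squared error over all segments

if __name__ == "__main__":
    xs = np.linspace(0.0, 1.0, 10_000)
    ys = 3 * xs + 1
    print(fitness("3 * x + 1", xs, ys))  # ~0.0 for the true model
```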

    Capacity Estimation Methods Applied to Mini Hydro Plants


    Hybrid bootstrap-based approach with binary artificial bee colony and particle swarm optimization in Taguchi's T-Method

    Taguchi's T-Method is one of the Mahalanobis Taguchi System (MTS) prediction techniques, established specifically for, though not limited to, small multivariate sample data. When evaluating data with a system such as Taguchi's T-Method, bias issues often appear due to inconsistencies induced by model complexity, variation between parameters that are not thoroughly configured, and generalization aspects. In Taguchi's T-Method, unit space determination relies too heavily on the characteristics of the dependent variable, with no appropriate procedure defined. Similarly, the least-squares proportional coefficient is known not to be robust to outliers, which indirectly affects the accuracy of the SNR weighting that relies on model-fit accuracy. Even a small outlier effect in the data may influence the overall performance of the predictive model unless further development is incorporated into the current framework. In this research, an improved unit space determination mechanism was explicitly designed by implementing minimum-based error with the leave-one-out method, further enhanced by embedding strategies that minimize the impact of variance within each parameter estimator using the leave-one-out bootstrap (LOOB) and 0.632 estimators. The complexity aspect of the prediction model was addressed by removing features that contribute no valuable information to the overall prediction. To accomplish this, an Orthogonal Array (OA) matrix was used within the existing Taguchi's T-Method. However, OA's fixed-scheme matrix, as well as its difficulty in coping with high dimensionality, leads to sub-optimal solutions, while the use of SNR in decibels (dB) as the objective function proved to be a reliable measure. The architecture of a Hybrid Binary Artificial Bee Colony and Particle Swarm Optimization (Hybrid Binary ABC-PSO), comprising Binary Bitwise ABC (BitABC) and Probability Binary PSO (PBPSO), was therefore developed as a novel search engine that overcomes the limitations of OA. SNR (dB) and mean absolute error (MAE) were the main performance measures used in this research. Generalization was a fundamental addition incorporated to control the effect of overfitting in the analysis. The proposed enhanced parameter estimators with feature selection optimization were tested on 10 case studies and improved predictive accuracy by an average of 46.21%, depending on the case. The average standard deviation of MAE, which describes the variability impact of the optimized method across all 10 case studies, showed an improving trend relative to Taguchi's T-Method. Standardization and a robust approach to outliers are recommended for future research. This study proved that the developed Hybrid Binary ABC-PSO architecture, with bootstrap and minimum-based error using leave-one-out as the proposed enhanced parameter estimators, improves the prediction accuracy of Taguchi's T-Method.
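    For readers unfamiliar with the baseline being improved, the core of Taguchi's T-Method, per-feature proportional coefficients combined through SNR weights into an integrated estimate, can be sketched as below. This assumes a naive mean-based unit space; the paper's minimum-error leave-one-out unit space, LOOB/0.632 estimators, and ABC-PSO feature selection are deliberately not reproduced.

```python
import numpy as np

def t_method_fit(X, y):
    """Fit proportional coefficients beta_j and SNR weights eta_j per feature."""
    x0, y0 = X.mean(axis=0), y.mean()   # naive unit space: sample means
    Xn, M = X - x0, y - y0              # normalized signals and outputs
    r = float(M @ M)                    # effective divider
    beta = (M @ Xn) / r                 # proportional coefficient per feature
    S_b = (M @ Xn) ** 2 / r             # variation explained by the signal
    S_t = (Xn ** 2).sum(axis=0)
    V_e = (S_t - S_b) / (len(y) - 1)    # error variance per feature
    eta = np.where(S_b > V_e, (S_b - V_e) / (r * V_e), 0.0)  # SNR weights
    return x0, y0, beta, eta

def t_method_predict(X, x0, y0, beta, eta):
    """Integrated estimate: SNR-weighted average of per-feature predictions."""
    Xn = X - x0
    M_hat = (eta * Xn / beta).sum(axis=1) / eta.sum()
    return M_hat + y0

# Toy usage on synthetic data; the last feature is pure noise and should
# receive a near-zero (or zero) SNR weight.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
y = X @ np.array([2.0, -1.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=30)
x0, y0, beta, eta = t_method_fit(X, y)
print(t_method_predict(X[:3], x0, y0, beta, eta), y[:3])
```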

    Predicción local mediante algoritmos evolutivos (Local prediction using evolutionary algorithms)

    A classic problem in Artificial Intelligence has been the prediction of behaviors from a data set that models those behaviors. A clear example is time series, which represent the behavior of a phenomenon through measurements taken over a period of time. Another classic example is the prediction of stock prices, treated either as a time series or as a function of certain indicators. The problem with most machine learning algorithms lies in their search for a global approximation to these prediction problems: they build a single model over the whole pattern set and use it to predict any pattern. This thesis starts from the premise that such an approach is not the most appropriate, since not all patterns share the same characteristics. For example, a tide-prediction system that treats all patterns representing water-level measurements over a year identically can never be as accurate as one that separates the patterns into groups by season, learns a model for each group, and then predicts each new pattern with the model of the group it belongs to. Our objective has therefore been an intelligent algorithm that not only learns to predict, but also finds and classifies the peculiarities of each data subset, relieving the researcher of that task. The power of the algorithms developed in this thesis rests on searching for, and learning on, these special subsets. This even allows anomalous behaviors in the data to be detected and prediction rules to be built for them, which is of vital importance when predicting catastrophes. Two different algorithms, based on the same idea, have been developed for this objective. The first is based on Packard's premises about the prediction of dynamical systems; the second is a completely new idea that seeks a different way to improve on the first approach. Both algorithms have been tested on the same data sets, drawn from completely different domains (artificial time series, real time series, stock market data, etc.), with results that in most cases improve on those of other classic and advanced machine learning algorithms.
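    The global-versus-local modeling contrast at the heart of the thesis can be illustrated with a simple sketch: partition the patterns into subsets, fit one predictor per subset, and route each new pattern to its subset's model. K-means and linear regression stand in below for the thesis's evolutionary search, purely to show the structure; nothing here reproduces the actual algorithms.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

class LocalPredictor:
    """One model per discovered pattern group instead of one global model."""

    def __init__(self, n_groups=4):
        self.clusterer = KMeans(n_clusters=n_groups, n_init=10, random_state=0)
        self.models = {}

    def fit(self, X, y):
        labels = self.clusterer.fit_predict(X)
        for g in np.unique(labels):
            self.models[g] = LinearRegression().fit(X[labels == g], y[labels == g])
        return self

    def predict(self, X):
        # Route each pattern to the model of the group it belongs to.
        labels = self.clusterer.predict(X)
        return np.array([self.models[g].predict(x.reshape(1, -1))[0]
                         for g, x in zip(labels, X)])

# Toy usage: delayed-coordinate patterns from a sine wave, next value as target.
series = np.sin(np.linspace(0, 20 * np.pi, 2000))
X = np.stack([series[i:i + 5] for i in range(len(series) - 5)])
y = series[5:]
model = LocalPredictor().fit(X[:1500], y[:1500])
print(np.abs(model.predict(X[1500:]) - y[1500:]).max())
```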

    Time Series Prediction Based on Gene Expression Programming
