54 research outputs found

    Controlling the growth of individual size in genetic programming

    Genetic programming (GP) is an optimization hyperheuristic that has been applied successfully to a wide range of problems. However, its usefulness is often considerably diminished by its heavy use of computational resources and its laborious convergence. These problems are caused by uncontrolled growth in the size of the solutions and by the appearance of useless structures within them. In this thesis, we present HARM-GP, a new approach that largely resolves these problems by allowing the distribution of solution sizes to adapt dynamically, while minimizing the required computational effort. The performance of HARM-GP was tested on a set of twelve problems and compared with that of nine techniques from the literature. The results show that HARM-GP excels at controlling tree growth and overfitting, while maintaining good performance in other respects.

    Genetic programming is a hyperheuristic optimization approach that has been applied to a wide range of problems involving symbolic representations or complex data structures. However, the method can be severely hindered by the increased computational resources required and premature convergence caused by uncontrolled code growth. We introduce HARM-GP, a novel operator equalization approach that adaptively shapes the genotype size distribution of individuals in order to effectively control code growth. Its probabilistic nature minimizes the overhead on the evolutionary process, while its generic formulation allows the approach to remain independent of the problem and genetic operators used. Comparative results are provided over twelve problems with different dynamics, and over nine other algorithms taken from the literature. They show that HARM-GP is excellent at controlling code growth while maintaining good overall performance. Results also demonstrate the effectiveness of HARM-GP at limiting overtraining and overfitting in real-world supervised learning problems.
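
    The central mechanism, shaping the distribution of offspring sizes rather than imposing a hard limit, can be pictured with a minimal sketch (Python). The cutoff_factor parameter and the exponential tail below are illustrative assumptions, not the published HARM-GP procedure, which builds an explicit target histogram from the population.

        import random

        def acceptance_from_sizes(sizes, cutoff_factor=1.5):
            """Build a size-acceptance function from the current population's tree
            sizes. Sizes past cutoff_factor * mean are accepted with exponentially
            decaying probability (a hypothetical decay, not the HARM-GP formula)."""
            mean = sum(sizes) / len(sizes)
            cutoff = cutoff_factor * mean
            def acceptance(size):
                return 1.0 if size <= cutoff else 2.0 ** (-(size - cutoff) / mean)
            return acceptance

        def keep_offspring(size, acceptance):
            """Probabilistically reject offspring whose size falls in the penalized tail."""
            return random.random() < acceptance(size)

        # Toy demonstration: candidate offspring sizes drift well beyond the parents'
        parent_sizes = [random.randint(10, 60) for _ in range(200)]
        accept = acceptance_from_sizes(parent_sizes)
        candidates = [random.randint(10, 200) for _ in range(1000)]
        kept = [s for s in candidates if keep_offspring(s, accept)]
        print(f"mean candidate size {sum(candidates)/len(candidates):.1f}, "
              f"mean kept size {sum(kept)/len(kept):.1f}")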

    On the Challenges of Software Performance Optimization with Statistical Methods

    Most recent programming languages, such as Java, Python, and Ruby, include a collection framework as part of their standard library (or runtime). The Java Collection Framework provides a number of collection classes, some of which implement the same abstract data type, making them interchangeable. Developers can therefore choose between several functionally equivalent options. Since collections have different performance characteristics and can be allocated in thousands of program locations, the choice of collection has an important impact on performance. Unfortunately, programmers often make sub-optimal choices when selecting their collections.

    In this thesis, we consider the problem of building automated tools that would help the programmer choose between different collection implementations. We divide this problem into two sub-problems. First, we need to measure the performance of a collection and use relevant statistical methods to make meaningful comparisons. Second, we need to predict the performance of a collection with as little benchmarking as possible.

    To measure and analyze the performance of Java collections, we identify problems with the established methods and suggest the need for more appropriate statistical methods, borrowed from Bayesian statistics. We use these statistical methods in a reproduction of two state-of-the-art dynamic collection selection approaches: CoCo and CollectionSwitch. Our Bayesian approach allows us to make sound comparisons between the previously reported results and our own experimental evidence.

    We find that we cannot reproduce the original results, and we report on possible causes for the discrepancies between our results and theirs.

    To predict the performance of a collection, we consider an existing tool called Brainy. Brainy suggests collections to developers for C++ programs, using machine learning. One particularity of Brainy is that it generates its own training data by synthesizing programs and benchmarking them. As a result, Brainy can automatically learn about new collections and new CPU architectures, whereas other approaches require an expert to teach the system about collection performance. We adapt Brainy to the Java context and investigate whether Brainy's adaptability also holds for Java. We find that Brainy's benchmark synthesis methods do not apply well to the Java context, as they introduce significant biases. We propose a new generative model for collection benchmarks and present the challenges that porting Brainy to Java entails.
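
    The flavour of the Bayesian comparisons described above can be conveyed with a small sketch (Python, synthetic timings). It estimates the posterior probability that one hypothetical collection's mean running time is lower than another's under a normal model with flat priors and plugged-in sample variances; this is a deliberate simplification, not the statistical model used in the thesis.

        import random
        import statistics
        from math import sqrt

        def posterior_prob_faster(times_a, times_b, draws=20_000):
            """Monte Carlo estimate of P(mean(A) < mean(B)) under a normal model
            with flat priors and the sample variances plugged in (a simplification,
            not the thesis' exact Bayesian model)."""
            mu_a, se_a = statistics.mean(times_a), statistics.stdev(times_a) / sqrt(len(times_a))
            mu_b, se_b = statistics.mean(times_b), statistics.stdev(times_b) / sqrt(len(times_b))
            wins = sum(random.gauss(mu_a, se_a) < random.gauss(mu_b, se_b)
                       for _ in range(draws))
            return wins / draws

        # Synthetic timings (ms) for two hypothetical collection implementations
        array_list_times = [random.gauss(12.0, 1.5) for _ in range(30)]
        linked_list_times = [random.gauss(13.0, 2.0) for _ in range(30)]
        print("P(ArrayList faster than LinkedList):",
              posterior_prob_faster(array_list_times, linked_list_times))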

    Grammatical evolution hyper-heuristic for combinatorial optimization problems

    Designing generic problem solvers that perform well across a diverse set of problems is a challenging task. In this work, we propose a hyper-heuristic framework to automatically generate an effective and generic solution method by utilizing grammatical evolution. In the proposed framework, grammatical evolution is used as an online solver builder, which takes several heuristic components (e.g., different acceptance criteria and different neighborhood structures) as inputs and evolves templates of perturbation heuristics. The evolved templates are improvement heuristics, which represent a complete search method to solve the problem at hand. To test the generality and the performance of the proposed method, we consider two well-known combinatorial optimization problems: exam timetabling (Carter and ITC 2007 instances) and the capacitated vehicle routing problem (Christofides and Golden instances). We demonstrate that the proposed method is competitive with, if not superior to, state-of-the-art hyper-heuristics, as well as bespoke methods for these different problem domains. In order to further improve the performance of the proposed framework, we utilize an adaptive memory mechanism, which contains a collection of both high-quality and diverse solutions and is updated during the problem-solving process. Experimental results show that the grammatical evolution hyper-heuristic with an adaptive memory performs better than the grammatical evolution hyper-heuristic without a memory. The improved framework also outperforms some bespoke methodologies that have reported best-known results for some instances in both problem domains.
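
    The genotype-to-heuristic mapping that grammatical evolution performs can be sketched as follows (Python). The toy grammar, move names, and acceptance criteria are placeholders chosen for illustration; the framework in the paper uses a considerably richer grammar, but the modulo-based choice of productions shown here is the standard GE mapping.

        # A toy BNF-style grammar: each non-terminal maps to a list of productions.
        GRAMMAR = {
            "<heuristic>": [["apply", "<move>", "accept_if", "<accept>"]],
            "<move>": [["swap_two"], ["shift_one"], ["reverse_segment"]],
            "<accept>": [["improving_only"], ["simulated_annealing"], ["late_acceptance"]],
        }

        def map_genome(genome, symbol="<heuristic>"):
            """Map an integer genome to a heuristic template by repeatedly choosing
            productions with codon % number_of_productions (standard GE mapping)."""
            codons = iter(genome * 10)          # wrap the genome to avoid running out
            def expand(sym):
                if sym not in GRAMMAR:          # terminal symbol
                    return [sym]
                rules = GRAMMAR[sym]
                rule = rules[next(codons) % len(rules)]
                out = []
                for s in rule:
                    out.extend(expand(s))
                return out
            return " ".join(expand(symbol))

        print(map_genome([7, 2, 4]))   # "apply reverse_segment accept_if simulated_annealing"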

    Investigation into the use of evolutionary algorithms for fully automated planning

    This thesis presents a new approach to the Artificial Intelligence (AI) problem of fully automated planning. Planning is the act of deliberation before acting that guides rational behaviour, and it is a core area of AI. Many practical real-world problems can be classed as planning problems; therefore, practical and theoretical developments in AI planning are well motivated. Unfortunately, planning for even toy domains is hard; many different search algorithms have been proposed, and new approaches are actively encouraged. The approach taken in this thesis is to adopt ideas from Evolutionary Algorithms (EAs) and apply the techniques to fully automated plan synthesis. EA methods have enjoyed great success in many problem areas of AI. They are a kind of search technique whose foundations lie in evolution. Previous attempts to apply EAs to plan synthesis have shown promise, but have been ad hoc and piecemeal. This thesis thoroughly investigates the approach of applying evolutionary search to the fully automated planning problem. This is achieved by developing and modifying a proof-of-concept planner called GENPLAN. Before EA-based systems can be used, various parameter settings must be thoroughly explored. Once this was completed, the performance of GENPLAN was evaluated using a selection of benchmark domains and other competition-style planners. The difficulties raised by the benchmark domains, and the extent to which they cause problems for the approach, are highlighted, along with problems associated with EA search. Modifications are proposed and experimented with in an attempt to alleviate some of the identified problems. EAs offer a flexible framework for fully automated planning, but demonstrate a clear weakness across a range of currently used benchmark domains for plan synthesis.
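
    A toy sketch (Python) may help convey the general idea of evolutionary plan synthesis, although it is not GENPLAN itself: a plan is a fixed-length sequence of action indices, fitness counts how many goal facts hold after simulating the plan, and truncation selection with point mutation drives the search. The switch-toggling domain is invented purely for illustration.

        import random

        # Toy planning domain: three switches that actions can toggle; the goal
        # is to have all of them on.
        ACTIONS = [lambda s, i=i: s[:i] + (not s[i],) + s[i+1:] for i in range(3)]
        INITIAL = (False, False, False)
        GOAL = (True, True, True)

        def simulate(plan):
            state = INITIAL
            for a in plan:
                state = ACTIONS[a](state)
            return state

        def fitness(plan):
            """Number of goal facts satisfied after executing the plan."""
            return sum(g == s for g, s in zip(GOAL, simulate(plan)))

        def evolve(pop_size=50, plan_len=5, generations=40):
            pop = [[random.randrange(len(ACTIONS)) for _ in range(plan_len)]
                   for _ in range(pop_size)]
            for _ in range(generations):
                pop.sort(key=fitness, reverse=True)
                parents = pop[:pop_size // 2]              # truncation selection
                children = []
                for p in parents:
                    child = p[:]                           # point mutation
                    child[random.randrange(plan_len)] = random.randrange(len(ACTIONS))
                    children.append(child)
                pop = parents + children
            return max(pop, key=fitness)

        best = evolve()
        print("best plan:", best, "satisfied goals:", fitness(best))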

    Software redundancy: what, where, how

    Software systems have become pervasive in everyday life and are the core component of many crucial activities. An inadequate level of reliability may determine the commercial failure of a software product. Still, despite the commitment and the rigorous verification processes employed by developers, software is deployed with faults. To increase the reliability of software systems, researchers have investigated the use of various forms of redundancy. Informally, a software system is redundant when it performs the same functionality through the execution of different elements. Redundancy has been extensively exploited in many software engineering techniques, for example in fault tolerance and reliability engineering, and in self-adaptive and self-healing programs. Despite the many uses, though, there is no formalization or study of software redundancy to support a proper and effective design of software. Our intuition is that a systematic and formal investigation of software redundancy will lead to more, and more effective, uses of redundancy. This thesis develops this intuition and proposes a set of ways to characterize redundancy qualitatively as well as quantitatively. We first formalize the intuitive notion of redundancy whereby two code fragments are considered redundant when they perform the same functionality through different executions. On the basis of this abstract and general notion, we then develop a practical method to obtain a measure of software redundancy. We prove the effectiveness of our measure by showing that it distinguishes between shallow differences, where apparently different code fragments reduce to the same underlying code, and deep code differences, where the algorithmic nature of the computations differs. We also demonstrate that our measure is useful for developers, since it is a good predictor of the effectiveness of techniques that exploit redundancy. Besides formalizing the notion of redundancy, we investigate the pervasiveness of redundancy intrinsically found in modern software systems. Intrinsic redundancy is a form of redundancy that occurs as a by-product of modern design and development practices. We have observed that intrinsic redundancy is indeed present in software systems and that it can be successfully exploited for good purposes. This thesis proposes a technique to automatically identify equivalent method sequences in software systems to help developers assess the presence of intrinsic redundancy. We demonstrate the effectiveness of the technique by showing that it identifies the majority of equivalent method sequences in a system with good precision and performance.
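
    A deliberately naive sketch (Python) of how two method sequences might be checked for observational equivalence, which is the intuition behind the redundancy notion above, though not the thesis' actual technique: both sequences are run on copies of the same start state and flagged as redundancy candidates when the final states and visible return values match.

        from copy import deepcopy

        def run_sequence(obj, sequence):
            """Apply a list of (method_name, args) calls to a deep copy of obj,
            returning the final state and the collected return values."""
            target = deepcopy(obj)
            results = []
            for method, args in sequence:
                results.append(getattr(target, method)(*args))
            return target, results

        def observationally_equivalent(obj, seq_a, seq_b):
            """Flag two method sequences as redundancy candidates when they leave
            equal final states and produce equal visible (non-None) return values
            on the same start state; a naive oracle, for illustration only."""
            state_a, out_a = run_sequence(obj, seq_a)
            state_b, out_b = run_sequence(obj, seq_b)
            visible = lambda outs: [r for r in outs if r is not None]
            return state_a == state_b and visible(out_a) == visible(out_b)

        # Two different ways of appending the elements 1 and 2 to a Python list
        seq_a = [("append", (1,)), ("append", (2,))]
        seq_b = [("extend", ([1, 2],))]
        print(observationally_equivalent([0], seq_a, seq_b))   # True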

    Evolving Artificial Neural Networks using Cartesian Genetic Programming

    NeuroEvolution is the application of Evolutionary Algorithms to the training of Artificial Neural Networks. NeuroEvolution is thought to possess many benefits over traditional training methods, including: the ability to train recurrent network structures, the capability to adapt network topology, the ability to create heterogeneous networks of arbitrary transfer functions, and applicability to reinforcement as well as supervised learning tasks. This thesis presents a series of rigorous empirical investigations into many of these perceived advantages of NeuroEvolution. In this work it is demonstrated that the ability to simultaneously adapt network topology along with connection weights represents a significant advantage of many NeuroEvolutionary methods. It is also demonstrated that the ability to create heterogeneous networks comprising a range of transfer functions represents a further significant advantage. This thesis also investigates many potential benefits and drawbacks of NeuroEvolution which have been largely overlooked in the literature. These include the presence and role of genetic redundancy in NeuroEvolution's search and the question of whether program bloat is a limitation. The investigations presented focus on the use of a recently developed NeuroEvolution method based on Cartesian Genetic Programming. This thesis extends Cartesian Genetic Programming such that it can represent recurrent program structures, allowing for the creation of recurrent Artificial Neural Networks. This newly developed extension, Recurrent Cartesian Genetic Programming, and its application to Artificial Neural Networks are demonstrated to be extremely competitive in the domain of series forecasting.
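
    A much simplified sketch (Python) of how a Cartesian Genetic Programming genotype can be decoded into a small recurrent network: each node gene holds a function index and two connection genes, connections may address any node, and, purely to keep the sketch short, every internal connection reads the addressed node's previous-step value. The layout and the tanh transfer functions are illustrative assumptions rather than the thesis' exact encoding.

        import math

        # Node functions for a tiny CGP-style network: tanh "neurons" over two inputs
        FUNCS = [lambda a, b: math.tanh(a + b),
                 lambda a, b: math.tanh(a * b)]

        def evaluate_rcgp(genotype, output_gene, input_sequence):
            """Evaluate a recurrent CGP-style genotype over a sequence of inputs.
            Each node gene is (function_index, src1, src2); sources index the single
            program input first, then all nodes. Every internal connection reads the
            addressed node's previous-step value, which gives the network memory."""
            node_values = [0.0] * len(genotype)        # previous-step node outputs
            outputs = []
            for x in input_sequence:
                addressable = [x] + node_values        # input, then last-step values
                node_values = [FUNCS[f](addressable[s1], addressable[s2])
                               for f, s1, s2 in genotype]
                outputs.append(node_values[output_gene])
            return outputs

        # Node 0 sums the input with node 1's previous output; node 1 echoes node 0.
        genotype = [(0, 0, 2),    # node 0: tanh(input + previous node 1)
                    (0, 1, 1)]    # node 1: tanh(previous node 0 + previous node 0)
        print(evaluate_rcgp(genotype, output_gene=0, input_sequence=[1.0, 0.0, 0.0]))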

    Modelling, forecasting and trading of commodity spreads

    Historically, econometric models have been developed to model financial instruments and markets; however, the vast majority of these ‘traditional’ models have one thing in common: linearity. While this is convenient and sometimes intuitive, many linear models fail to fully capture the dynamic and complex nature of financial instruments and markets. More recently, ‘sophisticated’ methodologies have evolved to accurately capture the ‘non-linear’ relationships that exist between financial time series. This rapidly advancing field in quantitative finance is known as Artificial Intelligence. The earliest forms of artificial intelligence are Neural Networks; these have since been developed using more accurate learning algorithms. Neural networks are also of particular use because of their capability to continually learn as new information is fed into the network. In this research, new data is introduced using both fixed and sliding window approaches for training each of the networks. Furthermore, Genetic Programming Algorithms are also highly regarded in the financial industry and have been increasingly applied as an optimisation technique. Each of the non-linear models is therefore supported by existing research, and as a result these methodologies have become practical tools for optimising existing models and predicting future movements in financial assets. In the absence of computational algorithms to rationalise large amounts of data, investors are confronted with the difficult and seemingly impossible task of trying to comprehend large datasets of information. Nevertheless, advancements in computing technology have enabled market participants to benefit from the use of neural networks (NN) and genetic programming (GP) algorithms in order to optimise and identify patterns and trends between explanatory variables and target outputs. This is of particular importance in agricultural markets such as grains, as well as precious metals and other commodities, which are informationally rich, with large amounts of data readily available to evaluate. Among the first to use neural networks for financial analysis were Rumelhart and McClelland (1986), Lippman (1987), and Medsker et al. (1993). More recently, neural networks and genetic programming algorithms have been extensively applied to the foreign exchange market (Hornik et al., 1989; Lawrenz and Westerhoff, 2003), to credit analysis (Tam and Kiang, 1992), volatility forecasting (Ormoneit and Neuneier, 1996; Donaldson and Kamstra, 1997), option pricing (Hutchinson et al., 1994), portfolio optimisation (Chang et al., 2000; Lin et al., 2001), to both developed (Swales and Yoon, 1992) and emerging (Kimoto et al., 1990) stock markets, and to the optimisation of technical trading rules (Tsai et al., 1999; Neely et al., 2003). The application of non-linear methodologies to futures contracts, and in particular to commodity spread trading, is limited. Trippi and DeSieno (1992) and Kaastra and Boyd (1995), however, were among the first to explore and apply neural networks to forecast futures markets. Financial markets and assets are influenced by an array of factors including, but not limited to, human behaviour, economic variables, and many other systematic and non-systematic factors. As a result, many academics and practitioners have devised numerous approaches and models to explain financial time series, such as fundamental analysis, technical analysis, and behavioural finance.
    The purpose of this research, however, is to identify, forecast, and trade daily changes in commodity spreads using a combination of novel non-linear modelling techniques and performance-enhancing trading filters. During the research process, non-linear models such as neural networks and genetic algorithms are used to identify trends in complex and expansive commodity datasets. Each of the methodologies is used to produce predictions for future time periods; in this research, forecasts for t+1 horizons are examined. Progressively, each chapter presents an evolution of research in the area of non-linear forecasting to address inefficiencies associated with more traditional neural architectures. In total, a collection of five non-linear methodologies is proposed and analysed to trade commodity ‘spreads’. These non-linear methodologies are benchmarked against linear models, which include Naïve strategies, Moving Average Convergence Divergence (MACD) strategies, buy-and-hold strategies, Autoregressive Moving Average (ARMA) models, and Cointegration models. In the final chapter of the research, a mixed-model approach is employed to include linear outputs from benchmark models as inputs during the training of each neural network. The research includes various adaptations of existing non-linear methodologies such as neural networks and genetic programming. Through historical data input, each non-linear methodology is trained to construct ‘optimal’ trading models. Models are selected to trade commodity spreads using data from Exchange Traded Funds (ETFs) and Futures contracts. In all cases the reader is presented with results from both unfiltered and filtered trading simulations. The aim of this thesis is to benefit both hedgers and speculators who are interested in applying non-linear methodologies to the task of forecasting changes in commodity spreads. By allowing market participants to input numerous explanatory variables, non-linear methodologies such as neural networks and genetic programming algorithms can become a valuable tool for predicting changes in commodity spreads. Empirical evidence reveals that the non-linear methodologies are statistically superior to the existing linear models and that they also produce higher risk-adjusted returns. Moreover, by including output from linear models in the input dataset used to train the non-linear models, market participants are also able to benefit from a ‘synergy’ of information through a ‘mixed model’ approach. In order to improve trading results, the research also offers examples of numerous trading filters which can be of use to hedgers and speculators. On the whole, the research contributes a wealth of knowledge to academic studies, as it offers conclusive evidence to support the widespread integration and use of non-linear modelling in the form of artificial intelligence. Empirical results are evaluated by statistical measures as well as financial performance measures widely used by financial institutions.
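
    The fixed and sliding window training regimes mentioned above can be pictured with a short sketch (Python) in which a least-squares AR(1) model stands in for the neural networks used in the research: at every step the model is refit on the most recent window of spread changes and used to forecast the t+1 change. The window length and the synthetic mean-reverting series are illustrative assumptions.

        import random

        def fit_ar1(window):
            """Least-squares fit of change_{t+1} = a * change_t + b on one window
            (a linear stand-in for the neural networks used in the thesis)."""
            xs, ys = window[:-1], window[1:]
            n = len(xs)
            mean_x, mean_y = sum(xs) / n, sum(ys) / n
            cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
            var = sum((x - mean_x) ** 2 for x in xs) or 1e-12
            a = cov / var
            b = mean_y - a * mean_x
            return a, b

        def sliding_window_forecast(series, window_size=60):
            """Refit on the most recent window at every step and predict t+1."""
            forecasts = []
            for t in range(window_size, len(series)):
                a, b = fit_ar1(series[t - window_size:t])
                forecasts.append(a * series[t - 1] + b)
            return forecasts

        # Toy daily spread changes with mild mean reversion
        changes = [0.0]
        for _ in range(300):
            changes.append(-0.3 * changes[-1] + random.gauss(0, 1))
        preds = sliding_window_forecast(changes)
        print("last forecast for t+1:", preds[-1])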
    • …