An Overview of Schema Theory
The purpose of this paper is to give an introduction to the field of Schema
Theory written by a mathematician and for mathematicians. In particular, we
endeavor to highlight areas of the field which might be of interest to a
mathematician, to point out some related open problems, and to suggest some
large-scale projects. Schema theory seeks to give a theoretical justification
for the efficacy of the field of genetic algorithms, so readers who have
studied genetic algorithms stand to gain the most from this paper. However,
nothing beyond basic probability theory is assumed of the reader, and for this
reason we write in a fairly informal style.
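For readers new to the area, the central result the field is organised around is Holland's schema theorem, which in its classical form (binary strings, fitness-proportional selection, one-point crossover with probability p_c, bitwise mutation with probability p_m) reads:

```latex
\mathbb{E}\big[m(H,\,t+1)\big] \;\ge\; m(H,t)\,\frac{f(H,t)}{\bar f(t)}
\left[1 - p_c\,\frac{\delta(H)}{\ell-1} - o(H)\,p_m\right]
```

where m(H,t) is the number of individuals matching schema H at generation t, f(H,t) their average fitness, \bar f(t) the population's average fitness, \delta(H) the defining length of H, o(H) its order, and \ell the string length.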
Because the mathematics behind the theorems in schema theory is relatively
elementary, we focus more on the motivation and philosophy. Many of these
results have been proven elsewhere, so this paper is designed to serve a
primarily expository role. We attempt to cast known results in a new light,
which makes the suggested future directions natural. This involves devoting a
substantial amount of time to the history of the field.
We hope that this exposition will entice some mathematicians to do research
in this area, that it will serve as a road map for researchers new to the
field, and that it will help explain how schema theory developed. Furthermore,
we hope that the results collected in this document will serve as a useful
reference. Finally, as far as the author knows, the questions raised in the
final section are new.
Comment: 27 pages. Originally written in 2009 and hosted on my website, I've decided to put it on the arXiv as a more permanent home. The paper is primarily expository, so I don't really know where to submit it, but perhaps one day I will find an appropriate journal.
Real-coded genetic algorithm particle filters for high-dimensional state spaces
This thesis addresses the issues faced by particle filters in high-dimensional state spaces by comparing them with genetic algorithms and then using genetic algorithm theory to resolve those issues. Sequential Monte Carlo methods are a class of online posterior density estimation algorithms that are suitable for non-Gaussian and nonlinear environments; however, they are known to suffer from particle degeneracy, where the sample of particles becomes too sparse to approximate the posterior accurately. Various techniques have been proposed to address this issue, but they fail in high dimensions. In this thesis, after a careful comparison between genetic algorithms and particle filters, we posit that genetic-algorithm-theoretic arguments can be used to explain the working of particle filters. Analysing the working of a particle filter, we note that it is designed like a genetic algorithm but does not include recombination. We argue, based on the building-block hypothesis, that the addition of a recombination operator would address the sample impoverishment phenomenon in higher dimensions. We propose a novel real-coded genetic algorithm particle filter (RGAPF) based on these observations and test our hypothesis on the stochastic volatility estimation of financial stocks. The RGAPF successfully scales to higher dimensions. To further test whether building-block-hypothesis-like effects are due to the recombination operator, we compare the RGAPF with a mutation-only particle filter whose adjustable mutation rate is set to equal the population-to-population variance of the RGAPF. The RGAPF significantly and consistently performs better, indicating that recombination has a subtle but significant effect that may be theoretically explained by genetic algorithm theory.
After these two successful validations of our hypothesis, we compare the performance of the RGAPF using different real-recombination operators. Observing the behaviour of the RGAPF under these recombination operators, we propose a mean-centric recombination operator specifically for high-dimensional particle filtering. This recombination operator is successfully tested and compared with benchmark particle filters and a hybrid CMA-ES particle filter, using simulated data and finally real end-of-day data for the securities making up the FTSE-100 index. Each experiment is discussed in detail, and we conclude with a brief description of future directions of research.
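The mechanism the thesis argues for can be sketched in a few lines: a particle update that replaces plain resampling with fitness-proportional selection of parent pairs, real-coded (arithmetic) crossover, and a small mutation. This is a minimal illustration, not the RGAPF itself; the blend operator, mutation scale and toy Gaussian likelihood are assumptions of the sketch.

```python
import numpy as np

def rga_pf_step(particles, log_weights, rng, mut_std=0.05):
    """One GA-style particle filter update (illustrative sketch):
    weight-proportional selection, arithmetic crossover, Gaussian mutation."""
    n, _ = particles.shape
    w = np.exp(log_weights - log_weights.max())
    w /= w.sum()                                  # normalised weights act as fitness
    pa = particles[rng.choice(n, size=n, p=w)]    # parent A per offspring
    pb = particles[rng.choice(n, size=n, p=w)]    # parent B per offspring
    u = rng.uniform(size=(n, 1))
    children = u * pa + (1.0 - u) * pb            # real-coded recombination
    children += rng.normal(scale=mut_std, size=children.shape)  # mutation
    return children

rng = np.random.default_rng(0)
parts = rng.normal(size=(500, 10))                # 500 particles, 10-dim state
logw = -0.5 * (parts ** 2).sum(axis=1)            # toy Gaussian log-likelihood
new_parts = rga_pf_step(parts, logw, rng)
print(new_parts.shape)  # (500, 10)
```

Because each child is a convex combination of two high-weight parents, the recombined cloud contracts toward high-likelihood regions instead of merely duplicating a few surviving particles, which is the degeneracy the thesis targets.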
Adaptive scaling of evolvable systems
Neo-Darwinian evolution is an established natural inspiration for computational optimisation with a diverse range of forms. A particular feature of models such as Genetic Algorithms (GA) [18, 12] is the incremental combination of partial solutions distributed within a population of solutions. This mechanism in principle allows certain problems to be solved which would not be amenable to a simple local search. Such problems require these partial solutions, generally known as building-blocks, to be handled without disruption. The traditional means for this is a combination of a suitable chromosome ordering with a sympathetic recombination operator. More advanced algorithms attempt to adapt to accommodate these dependencies during the search. The recent approach of Estimation of Distribution Algorithms (EDA) aims to directly infer a probabilistic model of a promising population distribution from a sample of fitter solutions [23]. This model is then sampled to generate a new solution set. A symbiotic view of evolution is behind the recent development of the Compositional Search Evolutionary Algorithms (CSEA) [49, 19, 8] which build up an incremental model of variable dependencies conditional on a series of tests. Building-blocks are retained as explicit genetic structures and conditionally joined to form higher-order structures. These have been shown to be effective on special classes of hierarchical problems but are unproven on less tightly-structured problems. We propose that there exists a simple yet powerful combination of the above approaches: the persistent, adapting dependency model of a compositional pool with the expressive and compact variable weighting of probabilistic models. We review and deconstruct some of the key methods above for the purpose of determining their individual drawbacks and their common principles. By this reasoned approach we aim to arrive at a unifying framework that can adaptively scale to span a range of problem structure classes. 
This is implemented in a novel algorithm called the Transitional Evolutionary Algorithm (TEA), which is empirically validated in an incremental manner, verifying the various facets of the TEA and comparing it with related algorithms on an increasingly structured series of benchmark problems. This prompts some refinements, resulting in a simple and general algorithm that is nevertheless competitive with state-of-the-art methods.
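The EDA loop described above — fit a probabilistic model to the fitter solutions, then sample the next population from it — can be sketched with the simplest such model, a univariate marginal distribution (UMDA). The OneMax fitness and all parameter values here are illustrative assumptions, not the TEA or any algorithm from the thesis.

```python
import numpy as np

def umda_onemax(n_bits=40, pop=200, elite=50, gens=60, seed=1):
    """Minimal UMDA: estimate per-bit marginals from the fittest solutions,
    then sample the next population from that model."""
    rng = np.random.default_rng(seed)
    p = np.full(n_bits, 0.5)                        # initial bitwise model
    best_fit = 0
    for _ in range(gens):
        x = (rng.uniform(size=(pop, n_bits)) < p).astype(int)
        fit = x.sum(axis=1)                          # OneMax: count of 1-bits
        best_fit = max(best_fit, int(fit.max()))
        sel = x[np.argsort(fit)[-elite:]]            # truncation selection
        p = sel.mean(axis=0).clip(0.05, 0.95)        # refit marginals, keep margins
        if best_fit == n_bits:
            break
    return best_fit

print(umda_onemax())  # usually finds the optimum, 40
```

The contrast with compositional methods is visible even in this toy: UMDA's model is compact and weighted but assumes independent variables, whereas CSEA-style approaches retain explicit building-block structure — the combination the TEA aims for.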
The influence of population size in geometric semantic GP
In this work, we study the influence of the population size on the learning ability of Geometric Semantic Genetic Programming for the task of symbolic regression. A large set of experiments, considering different population size values on different regression problems, has been performed. Results show that, on real-life problems, small populations result in better training fitness than large populations after the same number of fitness evaluations. However, performance on the test instances varies among the different problems: in datasets with a high number of features, models obtained with large populations present better performance on unseen data, while in datasets characterized by a relatively small number of variables, better generalization is achieved with small population sizes. When synthetic problems are taken into account, large population sizes represent the best option for achieving good quality solutions on both training and test instances.
A Field Guide to Genetic Programming
xiv, 233 p. : il. ; 23 cm. Electronic book. A Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4) is an introduction to genetic programming (GP). GP is a systematic, domain-independent method for getting computers to solve problems automatically starting from a high-level statement of what needs to be done. Using ideas from natural evolution, GP starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination, until solutions emerge. All this without the user having to know or specify the form or structure of solutions in advance. GP has generated a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions.
Contents
1 Introduction
1.1 Genetic Programming in a Nutshell
1.2 Getting Started
1.3 Prerequisites
1.4 Overview of this Field Guide
I Basics
2 Representation, Initialisation and Operators in Tree-based GP
2.1 Representation
2.2 Initialising the Population
2.3 Selection
2.4 Recombination and Mutation
3 Getting Ready to Run Genetic Programming
3.1 Step 1: Terminal Set
3.2 Step 2: Function Set
3.2.1 Closure
3.2.2 Sufficiency
3.2.3 Evolving Structures other than Programs
3.3 Step 3: Fitness Function
3.4 Step 4: GP Parameters
3.5 Step 5: Termination and Solution Designation
4 Example Genetic Programming Run
4.1 Preparatory Steps
4.2 Step-by-Step Sample Run
4.2.1 Initialisation
4.2.2 Fitness Evaluation
4.2.3 Selection, Crossover and Mutation
4.2.4 Termination and Solution Designation
II Advanced Genetic Programming
5 Alternative Initialisations and Operators in Tree-based GP
5.1 Constructing the Initial Population
5.1.1 Uniform Initialisation
5.1.2 Initialisation may Affect Bloat
5.1.3 Seeding
5.2 GP Mutation
5.2.1 Is Mutation Necessary?
5.2.2 Mutation Cookbook
5.3 GP Crossover
5.4 Other Techniques
6 Modular, Grammatical and Developmental Tree-based GP
6.1 Evolving Modular and Hierarchical Structures
6.1.1 Automatically Defined Functions
6.1.2 Program Architecture and Architecture-Altering
6.2 Constraining Structures
6.2.1 Enforcing Particular Structures
6.2.2 Strongly Typed GP
6.2.3 Grammar-based Constraints
6.2.4 Constraints and Bias
6.3 Developmental Genetic Programming
6.4 Strongly Typed Autoconstructive GP with PushGP
7 Linear and Graph Genetic Programming
7.1 Linear Genetic Programming
7.1.1 Motivations
7.1.2 Linear GP Representations
7.1.3 Linear GP Operators
7.2 Graph-Based Genetic Programming
7.2.1 Parallel Distributed GP (PDGP)
7.2.2 PADO
7.2.3 Cartesian GP
7.2.4 Evolving Parallel Programs using Indirect Encodings
8 Probabilistic Genetic Programming
8.1 Estimation of Distribution Algorithms
8.2 Pure EDA GP
8.3 Mixing Grammars and Probabilities
9 Multi-objective Genetic Programming
9.1 Combining Multiple Objectives into a Scalar Fitness Function
9.2 Keeping the Objectives Separate
9.2.1 Multi-objective Bloat and Complexity Control
9.2.2 Other Objectives
9.2.3 Non-Pareto Criteria
9.3 Multiple Objectives via Dynamic and Staged Fitness Functions
9.4 Multi-objective Optimisation via Operator Bias
10 Fast and Distributed Genetic Programming
10.1 Reducing Fitness Evaluations/Increasing their Effectiveness
10.2 Reducing Cost of Fitness with Caches
10.3 Parallel and Distributed GP are Not Equivalent
10.4 Running GP on Parallel Hardware
10.4.1 Master-slave GP
10.4.2 GP Running on GPUs
10.4.3 GP on FPGAs
10.4.4 Sub-machine-code GP
10.5 Geographically Distributed GP
11 GP Theory and its Applications
11.1 Mathematical Models
11.2 Search Spaces
11.3 Bloat
11.3.1 Bloat in Theory
11.3.2 Bloat Control in Practice
III Practical Genetic Programming
12 Applications
12.1 Where GP has Done Well
12.2 Curve Fitting, Data Modelling and Symbolic Regression
12.3 Human Competitive Results – the Humies
12.4 Image and Signal Processing
12.5 Financial Trading, Time Series, and Economic Modelling
12.6 Industrial Process Control
12.7 Medicine, Biology and Bioinformatics
12.8 GP to Create Searchers and Solvers – Hyper-heuristics
12.9 Entertainment and Computer Games
12.10 The Arts
12.11 Compression
13 Troubleshooting GP
13.1 Is there a Bug in the Code?
13.2 Can you Trust your Results?
13.3 There are No Silver Bullets
13.4 Small Changes can have Big Effects
13.5 Big Changes can have No Effect
13.6 Study your Populations
13.7 Encourage Diversity
13.8 Embrace Approximation
13.9 Control Bloat
13.10 Checkpoint Results
13.11 Report Well
13.12 Convince your Customers
14 Conclusions
IV Tricks of the Trade
A Resources
A.1 Key Books
A.2 Key Journals
A.3 Key International Meetings
A.4 GP Implementations
A.5 On-Line Resources
B TinyGP
B.1 Overview of TinyGP
B.2 Input Data Files for TinyGP
B.3 Source Code
B.4 Compiling and Running TinyGP
Bibliography
Index
Evolutionary Computation
This book presents several recent advances in Evolutionary Computation, especially evolution-based optimization methods and hybrid algorithms, for several applications ranging from optimization and learning to pattern recognition and bioinformatics. It also presents new algorithms based on several analogies and metaphors, one of which draws on philosophy, specifically the philosophy of praxis and dialectics. The book also covers interesting applications in bioinformatics, especially the use of particle swarms to discover gene expression patterns in DNA microarrays. It therefore features representative work in the field of evolutionary computation and applied sciences. The intended audience is graduate and undergraduate students, researchers, and anyone who wishes to become familiar with the latest research in this field.
Evolutionary computation for trading systems
2007/2008
Evolutionary computations, also called evolutionary algorithms, consist of
several heuristics, which are able to solve optimization tasks by imitating
some aspects of natural evolution. They may use different levels of abstraction, but they are always working on populations of possible solutions for a
given task. The basic idea is that if only those individuals of a population
which meet a certain selection criteria reproduce, while the remaining individuals die, the population will converge to those individuals that best meet
the selection criteria. If imperfect reproduction is added, the population can
begin to explore the search space and will move to individuals that have an
increased selection probability and that hand down this property to their
descendants. These population dynamics follow the basic rule of Darwinian evolution theory, which can be described in short as the "survival of the fittest".
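This selection-plus-imperfect-reproduction loop can be written down directly. The sketch below is a toy illustration of the mechanism only; the bit-string encoding, truncation selection and the simple count-the-ones fitness are assumptions of the sketch, not part of the thesis.

```python
import random

def evolve(fitness, length=20, pop_size=100, gens=100, p_mut=0.02, seed=3):
    """Selection plus imperfect reproduction: only the fitter half of the
    population reproduces; each copy mutates bitwise with probability p_mut."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]          # the selection criterion
        children = []
        for parent in survivors:                  # imperfect reproduction
            children.append([b ^ (rng.random() < p_mut) for b in parent])
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve(fitness=sum)                        # fitness = number of 1-bits
print(sum(best))
```

Without mutation the population would only converge on the best bit-strings already present; the imperfect copies are what let it explore and improve beyond the initial sample, exactly as described above.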
Although evolutionary computations belong to a relatively new research area, from a computational perspective they have already shown some promising features:
• evolutionary methods reveal a remarkable balance between efficiency and efficacy;
• evolutionary computations are well suited for parameter optimisation;
• this type of algorithm allows a wide variety of extensions and constraints that traditional methods cannot accommodate;
• evolutionary methods are easily combined with other optimization techniques and can also be extended to multi-objective optimization.
From an economic perspective, these methods appear to be particularly well suited for a wide range of financial applications. In this thesis I study evolutionary algorithms
• for time series prediction;
• to generate trading rules;
• for portfolio selection.
It is commonly believed that asset prices are not random, but are permeated by complex interrelations that often translate into asset mispricing and may give rise to potentially profitable opportunities. Classical financial approaches, such as dividend discount models or even capital asset pricing theories, are not able to capture these market complexities. Thus, in the
last decades, researchers have employed intensive econometric and statistical
modeling that examine the effects of a multitude of variables, such as price-
earnings ratios, dividend yields, interest rate spreads and changes in foreign
exchange rates, on a broad and variegated range of stocks at the same time.
However, these models often result in complex functional forms difficult to
manage or interpret and, in the worst case, are solely able to fit a given time
series but are useless for prediction. In parallel with quantitative approaches, other researchers have focused on the impact of investor psychology (in particular, herding and overreaction) and on the consequences of considering informed signals from management and analysts, such as share repurchases and analyst recommendations. These theories are guided by intuition and experience, and are thus difficult to translate into a mathematical framework.
Hence there is a pressing need to combine these points of view in order to develop models that examine hundreds of variables simultaneously, including qualitative information, and that have user-friendly representations. To this end, the thesis focuses on the study of methodologies that satisfy these requirements by integrating economic insights, derived from academic and professional knowledge, with evolutionary computations.
The main task of this work is to provide efficient algorithms based on the
evolutionary paradigm of biological systems in order to compute optimal
trading strategies for various profit objectives under economic and statistical constraints. The motivations for constructing such optimal strategies
are:
i) the necessity to overcome data-snooping and survivorship bias in order to learn to predict good trading opportunities by using market and/or technical indicators as features on which to base the forecasting;
ii) the feasibility of using these rules as benchmarks for real trading systems;
iii) the capability of ranking various markets quantitatively with respect to their profitability according to a given criterion, thus making portfolio allocations possible.
More precisely, I present two algorithms that use artificial expert trading
systems to predict financial time series, and a procedure to generate integrated neutral strategies for active portfolio management.
The first algorithm is an automated procedure that simultaneously selects variables and detects outliers in a dynamic linear model, using information criteria as objective functions and diagnostic tests as constraints on the distributional properties of the errors. The novelties are the automatic implementation of econometric conditions in the model selection step, making possible a better exploration of the solution space on the one hand, and the use of evolutionary computations to efficiently carry out a reduction procedure over a very large number of independent variables on the other.
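The flavour of such a procedure can be sketched as a genetic search over variable-inclusion bitmasks scored by an information criterion. This is a toy illustration under stated assumptions, not the thesis's algorithm: the AIC-only objective (no diagnostic-test constraints, no outlier step), the synthetic data and all parameter values are my own.

```python
import numpy as np

def aic(y, X, mask):
    """AIC of an OLS fit using only the columns selected by the boolean mask."""
    n = len(y)
    k = int(mask.sum())
    if k == 0:
        rss = float(((y - y.mean()) ** 2).sum())
    else:
        beta, *_ = np.linalg.lstsq(X[:, mask], y, rcond=None)
        rss = float(((y - X[:, mask] @ beta) ** 2).sum())
    return n * np.log(rss / n) + 2 * (k + 1)

def ga_select(y, X, pop=60, gens=40, seed=7):
    """Toy GA over inclusion bitmasks, minimising AIC."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    P = rng.uniform(size=(pop, d)) < 0.5              # random initial masks
    for _ in range(gens):
        scores = np.array([aic(y, X, m) for m in P])
        elite = P[np.argsort(scores)[: pop // 2]]      # truncation selection
        pa = elite[rng.integers(len(elite), size=pop)]
        pb = elite[rng.integers(len(elite), size=pop)]
        cut = rng.uniform(size=(pop, d)) < 0.5         # uniform crossover
        P = np.where(cut, pa, pb) ^ (rng.uniform(size=(pop, d)) < 0.02)
    scores = np.array([aic(y, X, m) for m in P])
    return P[np.argmin(scores)]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                        # 10 candidate regressors
y = X[:, 0] + 2 * X[:, 1] - X[:, 2] + 0.1 * rng.normal(size=200)
mask = ga_select(y, X)
print(mask[:3])   # the three true predictors should be selected
```

In the thesis's setting the penalty terms (significance levels, Durbin-Watson) would enter this score as constraints; here only the information criterion drives the search.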
In the second algorithm, the novelty is given by the definition of evolutionary
learning in financial terms and its use in a multi-objective genetic algorithm
in order to generate technical trading systems.
The last tool is based on a trading strategy on six assets, where future
movements of each variable are obtained by an evolutionary procedure that
integrates various types of financial variables. The contribution is given by the introduction of a genetic algorithm to optimize trading signal parameters and by the way in which different sources of information are represented and collected.
In order to compare the contribution of this work to "classical" techniques
and theories, the thesis is divided into three parts. The first part, titled
Background, collects Chapters 2 and 3. Its purpose is to provide an introduction to search/optimization evolutionary techniques on one hand, and to
the theories that relate the predictability in financial markets with the concept of efficiency proposed over time by scholars on the other hand. More
precisely, Chapter 2 introduces the basic concepts and major areas of evolutionary computation. It presents a brief history of three major types of evolutionary algorithms, i.e. evolution strategies, evolutionary programming
and genetic algorithms, and points out similarities and differences among
them. Moreover it gives an overview of genetic algorithms and describes
classical and genetic multi-objective optimization techniques. Chapter 3
first presents an overview of the literature on the predictability of financial
time series. In particular, the extent to which the efficiency paradigm is
affected by the introduction of new theories, such as behavioral finance, is
described in order to justify the market forecasting methodologies developed
by practitioners and academics in the last decades. Then, a description of
the econometric and financial techniques that will be used in conjunction
with evolutionary algorithms in the successive chapters is provided. Special
attention is paid to economic implications, in order to highlight merits and
shortcomings from a practitioner perspective.
The second part of the thesis, titled Trading Systems, is devoted to the description of two procedures I have developed in order to generate artificial
trading strategies on the basis of evolutionary algorithms, and it groups
Chapters 4 and 5. In particular, chapter 4 presents a genetic algorithm for
variable selection by minimizing the error in a multiple regression model.
Measures of errors such as ME, RMSE, MAE, Theil's inequality coefficient and CDC are analyzed, choosing models based on AIC, BIC, ICOMP and similar criteria. Two components of penalty functions are taken into account: the level of significance and the Durbin-Watson statistic. Asymptotic properties of functions are tested on several financial variables including stocks, bonds, returns, and composite price indices from the US and EU economies. Variables with outliers that distort the efficiency and consistency of estimators
are removed to solve masking and smearing problems that they may cause in
estimations. Two examples complete the chapter. In both cases, models are
designed to produce short-term forecasts for the excess returns of the MSCI
Europe Energy sector over the MSCI Europe index, and a recursive estimation window is used to shed light on their predictive performance. In the first
application the data-set is obtained by a reduction procedure from a very
large number of leading macro indicators and financial variables stacked
at various lags, while in the second the complete set of 1-month lagged
variables is considered. Results show a promising capability to predict excess sector returns through the selection, using the proposed methodology,
of most valuable predictors. In Chapter 5 the paradigm of evolutionary
learning is defined and applied in the context of technical trading rules for
stock timing. A new genetic algorithm is developed by integrating statistical learning methods and the bootstrap into a multi-objective non-dominated sorting algorithm with variable string length, making it possible to evaluate statistical and economic criteria at the same time. Subsequently, the chapter discusses a practical case, represented by a simple trading strategy where total funds are invested in either the S&P 500 Composite Index or in 3-month Treasury Bills. In this application, the most informative technical indicators are selected by the algorithm from a set of almost 5000 signals. These signals are then combined into a unique trading signal by a learning method. I test the expert weighting solutions obtained by the plurality voting committee, Bayesian model averaging and Boosting procedures with data from the S&P 500 Composite Index, in three market phases (up-trend, down-trend and sideways movements) covering the period 2000-2006.
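The plurality-voting combination step can be sketched in a few lines. This is an illustrative toy, not the thesis's committee: the -1/0/+1 signal encoding and the tie-breaking toward the first option are my assumptions.

```python
import numpy as np

def plurality_vote(signals):
    """Combine expert signals (rows: days, cols: experts; values -1/0/+1)
    into one trading signal per day by plurality voting.
    Ties resolve toward the first option in (short, flat, long) order."""
    signals = np.asarray(signals)
    votes_short = (signals == -1).sum(axis=1)
    votes_flat = (signals == 0).sum(axis=1)
    votes_long = (signals == 1).sum(axis=1)
    stacked = np.stack([votes_short, votes_flat, votes_long], axis=1)
    return stacked.argmax(axis=1) - 1          # map column index back to -1/0/+1

S = [[1, 1, -1],     # two experts long, one short  -> long (+1)
     [0, -1, -1],    # plurality short              -> short (-1)
     [0, 0, 1]]      # plurality flat               -> flat (0)
print(plurality_vote(S))  # [ 1 -1  0]
```

Bayesian model averaging and boosting replace the equal votes here with data-driven expert weights, which is exactly the comparison the chapter runs.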
In the third part, titled Portfolio Selection, I explain how portfolio optimization models may be constructed on the basis of evolutionary algorithms and
on the signals produced by artificial trading systems. First, market neutral
strategies from an economic point of view are introduced, highlighting their
risks and benefits and focusing on their quantitative formulation. Then, a
description of the GA-Integrated Neutral tool, a MATLAB set of functions
based on genetic algorithms for active portfolio management, is given. The
algorithm specializes in the parameter optimization of trading signals for
an integrated market neutral strategy. The chapter concludes showing an
application of the tool as a support to decisions in the Absolute Return
Interest Rate Strategies sub-fund of Generali Investments.Gli âalgoritmi evolutiviâ, noti anche come âevolutionary computationsâ
comprendono varie tecniche di ottimizzazione per la risoluzione di problemi,
mediante alcuni aspetti suggeriti dallâevoluzione naturale. Tali metodologie
sono accomunate dal fatto che non considerano unâunica soluzione alla
volta, bens`Äą trattano intere popolazioni di possibili soluzioni per un dato
problema. Lâidea sottostante `e che, se un algoritmo fa evolvere solamente
gli individui di una data popolazione che soddisfano a un certo criterio di
selezione, e lascia morire i restanti, la popolazione converger`a agli individui
che meglio soddisfano il criterio di selezione. Con una selezione non ottimale,
cio`e una che ammette pure soluzioni sub-ottimali, la popolazione rappresenter`
a meglio lâintero spazio di ricerca e sar`a in grado di individuare in modo
pi`u consistente gli individui migliori da far evolvere. Queste dinamiche interne
alle popolazioni seguono i principi Darwiniani dellâevoluzione, che si
possono sinteticamente riassumere nella dicitura âla sopravvivenza del piĂš
adattoâ.
Sebbene gli algoritmi evolutivi siano unâarea di ricerca relativamente nuova,
dal punto di vista computazionale hanno dimostrato alcune caratteristiche
interessanti fra cui le seguenti:
⢠permettono un notevole equilibrio tra efficienza ed efficacia;
⢠sono particolarmente indicati per la configurazione dei parametri in
problemi di ottimizzazione;
⢠consentono una flessibilit`a nella definizione matematica dei problemi
e dei vincoli che non si trova nei metodi tradizionali;
⢠possono facilmente essere integrati con altre tecniche di ottimizzazione
ed essere essere modificati per risolvere problemi multi-obiettivo.
Dal un punto di vista economico, lâapplicazione di queste procedure pu`o
risultare utile specialmente in campo finanziario. In particolare, nella mia
tesi ho studiato degli algoritmi evolutivi per
⢠la previsione di serie storiche finanziarie;
⢠la costruzione di regole di trading;
⢠la selezione di portafogli.
Da un punto di vista pi`u ampio, lo scopo di questa ricerca `e dunque lâanalisi
dellâevoluzione e della complessit`a dei mercati finanziari. In tal senso, dal
momento che i prezzi non seguono andamenti puramente casuali, ma sono
governati da un insieme molto articolato di eventi correlati, i modelli e le
teorie classiche, come i dividend discount model e le varie capital asset pricing
theories, non sono pi`u sufficienti per determinare potenziali opportunit`a di
profitto. A tal fine, negli ultimi decenni, alcuni ricercatori hanno sviluppato
una vasta gamma di modelli econometrici e statistici in grado di esaminare
contemporaneamente le relazioni e gli effetti di centinaia di variabili, come
ad esempio, price-earnings ratios, dividendi, differenziali fra tassi di interesse
e variazioni dei tassi di cambio, per una vasta gamma di assets. Comunque,
questo approccio, che fa largo impiego di strumenti di calcolo, spesso porta
a dei modelli troppo complicati per essere gestiti o interpretati, e, nel peggiore
dei casi, pur essendo ottimi per descrivere situazioni passate, risultano
inutili per fare previsioni. Parallelamente a questi approcci quantitativi, si
`e manifestato un grande interesse sulla psicologia degli investitori e sulle
conseguenze derivanti dalle opinioni di esperti e analisti nelle dinamiche del
mercato. Questi studi sono difficilmente traducibili in modelli matematici
e si basano principalmente sullâintuizione e sullâesperienza. Da qui la necessit`
a di combinare insieme questi due punti di vista, al fine di sviluppare
modelli che siano in grado da una parte di trattare contemporaneamente
un elevato numero di variabili in modo efficiente e, dallâaltra, di incorporare
informazioni e opinioni qualitative. La tesi affronta queste tematiche integrando
le conoscenze economiche, sia accademiche che professionali, con gli
algoritmi evolutivi. Pi`u pecisamente, il principale obiettivo di questo lavoro
`e lo sviluppo di algoritmi efficienti basati sul paradigma dellâevoluzione dei
sistemi biologici al fine di determinare strategie di trading ottimali in termini
di profitto e di vincoli economici e statistici. Le ragioni che motivano
lo studio di tali strategie ottimali sono:
i) la necessit`a di risolvere i problemi di data-snooping e supervivorship
bias al fine di ottenere regole di investimento vantaggiose utilizzando
indicatori di mercato e/o tecnici per la previsione;
ii) la possibilitĂ di impiegare queste regole come benchmark per sistemi
di trading reali;
iii) la capacit`a di individuare gli asset pi`u vantaggiosi in termini di profitto,
o di altri criteri, rendendo possibile una migliore allocazione di
risorse nei portafogli.
In particolare, nella tesi descrivo due algoritmi che impiegano sistemi di trading
artificiali per predire serie storiche finanziarie e una procedura di calcolo
per strategie integrate neutral market per la gestione attiva di portafogli.
Il primo algoritmo `e una procedura automatica che seleziona le variabili
e simultaneamente determina gli outlier in un modello dinamico lineare
utilizzando criteri informazionali come funzioni obiettivo e test diagnostici
come vincoli per le caratteristiche delle distribuzioni degli errori. Le novit`a
del metodo sono da una parte lâimplementazione automatica di condizioni
econometriche nella fase di selezione, consentendo una migliore analisi dello
EVOLUTIONARY COMPUTATIONS FOR TRADING SYSTEMS 3
spazio delle soluzioni, e dallâaltra parte, lâintroduzione di una procedura di
riduzione evolutiva capace di riconoscere in modo efficiente le variabili pi`u
informative.
Nel secondo algoritmo, le novitĂ sono costituite dalla definizione dellâapprendimento
evolutivo in termini finanziari e dallâapplicazione di un algoritmo
genetico multi-obiettivo per la costruzione di sistemi di trading basati
su indicatori tecnici.
Lâultimo metodo proposto si basa su una strategia di trading su sei assets,
in cui le dinamiche future di ciascuna variabile sono ottenute impiegando
una procedura evolutiva che integra diverse tipologie di variabili finanziarie.
Il contributo è dato dallâimpiego di un algoritmo genetico per ottimizzare i
parametri negli indicatori tecnici e dal modo in cui le differenti informazioni
sono presentate e collegate.
The thesis is organized in three parts. The first part, entitled Background,
comprises Chapters 2 and 3 and is intended to provide an introduction to
evolutionary search and optimization techniques on the one hand, and to the
theories dealing with the efficiency and predictability of financial markets on
the other. More precisely, Chapter 2 introduces the basic concepts and the main
fields of study of evolutionary computation. To this end, it gives a brief
historical presentation of three of the main types of evolutionary algorithms,
namely evolution strategies, evolutionary programming, and genetic algorithms,
highlighting their common features and differences. The chapter closes with an
overview of genetic algorithms and of classical and genetic techniques for
multi-objective optimization. Chapter 3 addresses in detail the problem of the
predictability of financial time series, highlighting in particular how much
the efficiency paradigm is influenced by the most recent financial theories,
such as behavioral finance. The aim is to give a theoretical justification for
the forecasting methodologies developed in the thesis. There follows a
description of the econometric and technical-analysis methods that will be
employed together with the evolutionary algorithms in the subsequent chapters.
Particular attention is given to their economic implications, in order to
highlight their merits and shortcomings from a practical point of view.
The second part, entitled Trading Systems, groups Chapters 4 and 5 and is
devoted to the description of two procedures that I developed to generate
artificial trading systems on the basis of evolutionary algorithms. In
particular, Chapter 4 presents a genetic algorithm for variable selection
through the minimization of the error in a multiple regression model. Error
measures such as the ME, the RMSE, the MAE, Theil's coefficient, and the CDC
are analyzed for models selected on the basis of information criteria such as
AIC, BIC, and ICOMP. As diagnostic constraints, I considered a two-component
penalty function and the Durbin-Watson statistic. The program employs financial
variables of various kinds, such as stock returns, bonds, and composite index
prices drawn from the United States and European economies.
MASSIMILIANO KAUCIC
Whenever the time series under consideration contain outliers that distort the
efficiency and consistency of the estimators, the algorithm is able to identify
them and remove them from the series, solving the masking-and-smearing problem.
The chapter concludes with two applications, in which the models are designed
to produce short-term forecasts of the excess return of the MSCI Europe Energy
sector over the MSCI Europe index, and a recursive estimation-window procedure
is used to assess their forecasting performance. In the first example, the data
set is obtained by extracting the variables of interest from a considerable
number of macro indicators and from financial variables lagged with respect to
the dependent variable. In the second example I instead considered the whole
set of variables lagged by 1 month. The results show a remarkable forecasting
ability for the excess return, identifying the most informative indicators. In
Chapter 5, the concept of evolutionary learning is defined and applied to the
construction of trading rules on technical indicators for stock timing. To this
end, I developed an algorithm that integrates statistical learning and
bootstrap methods with a particular multi-objective algorithm. The resulting
procedure is able to evaluate economic and statistical criteria simultaneously.
To describe its operation, I considered a simple trading example in which all
the capital is invested either in an index (in the case at hand, the S&P 500
Composite index) or in a low-risk security (in the example, 3-month Treasury
Bills). The final trading signal is the result of selecting the most
informative technical indicators from a set of about 5000 indicators and of
their subsequent integration by a learning method (a plurality voting
committee, Bayesian model averaging, or boosting).
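Of the three combination schemes mentioned, the plurality voting committee is the simplest to illustrate. The toy function below is invented for illustration (it is not from the thesis): it combines individual indicator signals into one trading decision, resolving ties in favor of the low-risk asset.

```python
from collections import Counter

def plurality_vote(signals):
    """Combine indicator signals (+1 = equity index, -1 = low-risk asset)
    into one decision by plurality voting; ties default to the low-risk asset."""
    top = Counter(signals).most_common()
    if len(top) > 1 and top[0][1] == top[1][1]:
        return -1          # tie-break: stay in the low-risk asset
    return top[0][0]

print(plurality_vote([+1, +1, -1]))   # → 1
print(plurality_vote([+1, -1]))       # → -1  (tie → low-risk asset)
```

In practice each committee member would itself be a selected technical indicator, and weighting schemes such as Bayesian model averaging or boosting would replace the equal-weight vote.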
The analysis was conducted over the period from 2000 to 2006, divided
into three sub-periods: the first represents the index in a
Empirical Analysis of Schemata in Genetic Programming
Schemata and building blocks have been used in Genetic Programming
(GP) in several contexts, including subroutines, theoretical analysis, and
empirical analysis. Of these three, the least explored is empirical analysis.
This thesis presents a powerful empirical analysis technique for GP that
analyzes all schemata of a given form occurring in any program of a
given population, at scales not previously possible for this kind of global
analysis.
There are many competing forms of schema in GP and, rather than choosing
one for analysis, the thesis defines the match-tree meta-form of schema as
a general language for expressing forms of schema for use by the analysis
system. This language can express most forms of schema previously used in
tree-based GP.
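As a toy illustration of what a form of schema looks like in tree-based GP, the sketch below matches programs (nested tuples) against patterns containing a node wildcard '=' and a subtree wildcard '#', conventions similar to Poli's hyperschemata. The match-tree meta-form defined in the thesis is a general language subsuming such forms; this code is only an invented, minimal instance.

```python
# A GP program is a nested tuple: ('+', ('x',), ('*', ('x',), ('1',))).
# A schema has the same shape but may contain the wildcards
#   '=' matching exactly one node label (arity is fixed), and
#   '#' matching any whole subtree.

def matches(schema, tree):
    """Does `tree` instantiate `schema`?"""
    if schema[0] == '#':               # subtree wildcard: matches anything
        return True
    if schema[0] != '=' and schema[0] != tree[0]:
        return False                   # labels must agree unless '=' is used
    if len(schema) != len(tree):
        return False                   # arities must agree
    return all(matches(s, t) for s, t in zip(schema[1:], tree[1:]))

prog = ('+', ('x',), ('*', ('x',), ('1',)))
print(matches(('+', ('#',), ('*', ('=',), ('#',))), prog))   # → True
print(matches(('+', ('#',), ('-', ('#',), ('#',))), prog))   # → False
```

An empirical analysis in the spirit of the thesis would run such a matcher over every schema of a given form in every program of a population, which is exactly the combinatorial explosion the maximal and representative structures below are designed to tame.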
The new method can perform wide-ranging analyses on the prohibitively
large set of all schemata in the programs by introducing the concepts of
maximal schema, maximal program subset, representative set of schemata, and
representative program subset. These structures are used to optimize the
analysis, shrinking its complexity to a manageable size without sacrificing
the result.
Characterization experiments analyze GP populations of up to 501 60-
node programs, using 11 forms of schema including rooted hyperschemata
and non-rooted fragments. The new method has close to quadratic complexity
in population size and quartic complexity in program size. Efficacy
experiments present example analyses using the new method. The
experiments offer interesting insights into the dynamics of GP runs including
fine-grained analysis of convergence and the visualization of schemata
during a GP evolution.
Future work will apply the many possible extensions of this new method
to understanding how GP operates, including studies of convergence, building
blocks and schema fitness. This method provides a much finer-resolution
microscope into the inner workings of GP and will be used to provide accessible
visualizations of the evolutionary process.