266 research outputs found

    A survey on financial applications of metaheuristics

    Modern heuristics, or metaheuristics, are optimization algorithms that have been increasingly used over recent decades to support complex decision-making in a number of fields, such as logistics and transportation, telecommunication networks, bioinformatics, and finance. The continuous increase in computing power, together with advances in metaheuristic frameworks and parallelization strategies, is making these algorithms one of the best alternatives for solving the rich, real-life combinatorial optimization problems that arise in a number of financial and banking activities. This article reviews some of the works on the use of metaheuristics to solve both classical and emergent problems in finance. A non-exhaustive list of examples includes rich portfolio optimization, index tracking, enhanced indexation, credit risk, stock investments, financial project scheduling, option pricing, feature selection, bankruptcy and financial distress prediction, and credit risk assessment. The article also discusses some open opportunities for researchers in the field, and forecasts how metaheuristics will evolve to include real-life uncertainty conditions in the optimization problems being considered. This work has been partially supported by the Spanish Ministry of Economy and Competitiveness (TRA2013-48180-C3-P, TRA2015-71883-REDT), FEDER, and the Universitat Jaume I mobility program (E-2015-36).

    A Field Guide to Genetic Programming

    Electronic book. xiv, 233 p., ill., 23 cm. A Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4) is an introduction to genetic programming (GP). GP is a systematic, domain-independent method for getting computers to solve problems automatically starting from a high-level statement of what needs to be done. Using ideas from natural evolution, GP starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination, until solutions emerge. All this without the user having to know or specify the form or structure of solutions in advance. GP has generated a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions. The authors …

    Contents:
    1 Introduction: 1.1 Genetic Programming in a Nutshell; 1.2 Getting Started; 1.3 Prerequisites; 1.4 Overview of this Field Guide
    Part I: Basics
    2 Representation, Initialisation and Operators in Tree-based GP: 2.1 Representation; 2.2 Initialising the Population; 2.3 Selection; 2.4 Recombination and Mutation
    3 Getting Ready to Run Genetic Programming: 3.1 Step 1: Terminal Set; 3.2 Step 2: Function Set (3.2.1 Closure; 3.2.2 Sufficiency; 3.2.3 Evolving Structures other than Programs); 3.3 Step 3: Fitness Function; 3.4 Step 4: GP Parameters; 3.5 Step 5: Termination and solution designation
    4 Example Genetic Programming Run: 4.1 Preparatory Steps; 4.2 Step-by-Step Sample Run (4.2.1 Initialisation; 4.2.2 Fitness Evaluation; 4.2.3 Selection, Crossover and Mutation; 4.2.4 Termination and Solution Designation)
    Part II: Advanced Genetic Programming
    5 Alternative Initialisations and Operators in Tree-based GP: 5.1 Constructing the Initial Population (5.1.1 Uniform Initialisation; 5.1.2 Initialisation may Affect Bloat; 5.1.3 Seeding); 5.2 GP Mutation (5.2.1 Is Mutation Necessary?; 5.2.2 Mutation Cookbook); 5.3 GP Crossover; 5.4 Other Techniques
    6 Modular, Grammatical and Developmental Tree-based GP: 6.1 Evolving Modular and Hierarchical Structures (6.1.1 Automatically Defined Functions; 6.1.2 Program Architecture and Architecture-Altering); 6.2 Constraining Structures (6.2.1 Enforcing Particular Structures; 6.2.2 Strongly Typed GP; 6.2.3 Grammar-based Constraints; 6.2.4 Constraints and Bias); 6.3 Developmental Genetic Programming; 6.4 Strongly Typed Autoconstructive GP with PushGP
    7 Linear and Graph Genetic Programming: 7.1 Linear Genetic Programming (7.1.1 Motivations; 7.1.2 Linear GP Representations; 7.1.3 Linear GP Operators); 7.2 Graph-Based Genetic Programming (7.2.1 Parallel Distributed GP (PDGP); 7.2.2 PADO; 7.2.3 Cartesian GP; 7.2.4 Evolving Parallel Programs using Indirect Encodings)
    8 Probabilistic Genetic Programming: 8.1 Estimation of Distribution Algorithms; 8.2 Pure EDA GP; 8.3 Mixing Grammars and Probabilities
    9 Multi-objective Genetic Programming: 9.1 Combining Multiple Objectives into a Scalar Fitness Function; 9.2 Keeping the Objectives Separate (9.2.1 Multi-objective Bloat and Complexity Control; 9.2.2 Other Objectives; 9.2.3 Non-Pareto Criteria); 9.3 Multiple Objectives via Dynamic and Staged Fitness Functions; 9.4 Multi-objective Optimisation via Operator Bias
    10 Fast and Distributed Genetic Programming: 10.1 Reducing Fitness Evaluations/Increasing their Effectiveness; 10.2 Reducing Cost of Fitness with Caches; 10.3 Parallel and Distributed GP are Not Equivalent; 10.4 Running GP on Parallel Hardware (10.4.1 Master–slave GP; 10.4.2 GP Running on GPUs; 10.4.3 GP on FPGAs; 10.4.4 Sub-machine-code GP); 10.5 Geographically Distributed GP
    11 GP Theory and its Applications: 11.1 Mathematical Models; 11.2 Search Spaces; 11.3 Bloat (11.3.1 Bloat in Theory; 11.3.2 Bloat Control in Practice)
    Part III: Practical Genetic Programming
    12 Applications: 12.1 Where GP has Done Well; 12.2 Curve Fitting, Data Modelling and Symbolic Regression; 12.3 Human Competitive Results – the Humies; 12.4 Image and Signal Processing; 12.5 Financial Trading, Time Series, and Economic Modelling; 12.6 Industrial Process Control; 12.7 Medicine, Biology and Bioinformatics; 12.8 GP to Create Searchers and Solvers – Hyper-heuristics; 12.9 Entertainment and Computer Games; 12.10 The Arts; 12.11 Compression
    13 Troubleshooting GP: 13.1 Is there a Bug in the Code?; 13.2 Can you Trust your Results?; 13.3 There are No Silver Bullets; 13.4 Small Changes can have Big Effects; 13.5 Big Changes can have No Effect; 13.6 Study your Populations; 13.7 Encourage Diversity; 13.8 Embrace Approximation; 13.9 Control Bloat; 13.10 Checkpoint Results; 13.11 Report Well; 13.12 Convince your Customers
    14 Conclusions
    Part IV: Tricks of the Trade
    A Resources: A.1 Key Books; A.2 Key Journals; A.3 Key International Meetings; A.4 GP Implementations; A.5 On-Line Resources
    B TinyGP: B.1 Overview of TinyGP; B.2 Input Data Files for TinyGP; B.3 Source Code; B.4 Compiling and Running TinyGP
    Bibliography
    Index
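
The loop this abstract describes (a random initial population refined by recombination and mutation) can be illustrated with a minimal tree-based sketch. It is not TinyGP or any code from the book. The function set, terminal set, tournament size, mutation rate, node limit and the target curve x² + x + 1 are all assumptions chosen for illustration.

```python
import operator
import random

FUNCTIONS = [operator.add, operator.sub, operator.mul]  # all binary (assumed set)
TERMINALS = ["x", 1.0, 2.0]                             # assumed terminal set
MAX_NODES = 60                                          # crude bloat limit (assumed)


def random_tree(depth, rng):
    """Grow a random expression: a terminal or a (function, left, right) tuple."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    return (rng.choice(FUNCTIONS),
            random_tree(depth - 1, rng), random_tree(depth - 1, rng))


def evaluate(tree, x):
    """Evaluate an expression tree at the input value x."""
    if not isinstance(tree, tuple):
        return x if tree == "x" else tree
    func, left, right = tree
    return func(evaluate(left, x), evaluate(right, x))


def paths(tree, path=()):
    """Yield the path (tuple of child indices) to every node of the tree."""
    yield path
    if isinstance(tree, tuple):
        yield from paths(tree[1], path + (1,))
        yield from paths(tree[2], path + (2,))


def subtree(tree, path):
    for index in path:
        tree = tree[index]
    return tree


def replace(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)


def crossover(a, b, rng):
    """Subtree crossover: graft a random subtree of b into a random point of a."""
    donor = subtree(b, rng.choice(list(paths(b))))
    return replace(a, rng.choice(list(paths(a))), donor)


def mutate(tree, rng):
    """Subtree mutation: replace a random node with a fresh random subtree."""
    return replace(tree, rng.choice(list(paths(tree))), random_tree(2, rng))


def error(tree, cases):
    """Fitness: sum of absolute errors over the fitness cases (lower is better)."""
    return sum(abs(evaluate(tree, x) - y) for x, y in cases)


def evolve(cases, pop_size=60, generations=20, seed=0):
    rng = random.Random(seed)
    population = [random_tree(3, rng) for _ in range(pop_size)]
    history = []

    def tournament():
        return min(rng.sample(population, 3), key=lambda t: error(t, cases))

    for _ in range(generations):
        best = min(population, key=lambda t: error(t, cases))
        history.append(error(best, cases))
        next_population = [best]  # elitism: the best program always survives
        while len(next_population) < pop_size:
            parent = tournament()
            child = crossover(parent, tournament(), rng)
            if rng.random() < 0.2:
                child = mutate(child, rng)
            if sum(1 for _ in paths(child)) > MAX_NODES:
                child = parent  # reject oversized offspring
            next_population.append(child)
        population = next_population
    best = min(population, key=lambda t: error(t, cases))
    history.append(error(best, cases))
    return best, history


# Fitness cases sampled from the assumed target x**2 + x + 1 on [-1, 1].
CASES = [(x / 4, (x / 4) ** 2 + x / 4 + 1) for x in range(-4, 5)]
best_tree, history = evolve(CASES)
```

Because the best program is copied unchanged into each new generation, the values in `history` never increase. Whether the run reaches the exact target depends on the seed and the parameters.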

    Speeding up Multiple Instance Learning Classification Rules on GPUs

    Multiple instance learning is a challenging task in supervised learning and data mining. However, algorithm performance becomes slow when learning from large-scale and high-dimensional data sets. Graphics processing units (GPUs) are being used to reduce the computing time of algorithms. This paper presents an implementation of the G3P-MI algorithm on GPUs for solving multiple instance problems using classification rules. The proposed GPU model can be distributed across multiple GPUs, aiming for scalability on large-scale and high-dimensional data sets. The proposal is compared to the multi-threaded CPU algorithm with SSE parallelism over a series of data sets. Experimental results show that computation time can be significantly reduced and scalability improved. Specifically, a speedup of up to 149× can be achieved over the multi-threaded CPU algorithm when using four GPUs, and the rules interpreter runs over 108 billion Genetic Programming operations per second.

    Field Guide to Genetic Programming


    Dynamic Feature Engineering and model selection methods for temporal tabular datasets with regime changes

    The application of deep learning algorithms to temporal panel datasets is difficult due to heavy non-stationarities which can lead to over-fitted models that under-perform under regime changes. In this work we propose a new machine learning pipeline for ranking predictions on temporal panel datasets which is robust under regime changes of data. Different machine-learning models, including Gradient Boosting Decision Trees (GBDTs) and Neural Networks with and without simple feature engineering are evaluated in the pipeline with different settings. We find that GBDT models with dropout display high performance, robustness and generalisability with relatively low complexity and reduced computational cost. We then show that online learning techniques can be used in post-prediction processing to enhance the results. In particular, dynamic feature neutralisation, an efficient procedure that requires no retraining of models and can be applied post-prediction to any machine learning model, improves robustness by reducing drawdown in regime changes. Furthermore, we demonstrate that the creation of model ensembles through dynamic model selection based on recent model performance leads to improved performance over baseline by improving the Sharpe and Calmar ratios of out-of-sample prediction performances. We also evaluate the robustness of our pipeline across different data splits and random seeds with good reproducibility of results
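
The abstract does not spell out how feature neutralisation is computed. A common reading, assumed here, is that predictions are made orthogonal to a set of feature columns by subtracting their linear projection onto those features. The sketch below shows only that basic step in plain Python. The function names, the `proportion` parameter and the toy data are hypothetical, and the dynamic choice of which features to neutralise is not modelled.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def neutralise(predictions, feature_columns, proportion=1.0):
    """Subtract from predictions their projection onto the span of the features.

    feature_columns is a list of columns, each as long as predictions.
    proportion=1.0 removes the linear exposure fully, 0.0 leaves it untouched.
    """
    # Build an orthonormal basis of the feature span (Gram-Schmidt).
    basis = []
    for column in feature_columns:
        vector = list(column)
        for b in basis:
            coeff = dot(vector, b)
            vector = [v - coeff * bi for v, bi in zip(vector, b)]
        norm = dot(vector, vector) ** 0.5
        if norm > 1e-12:  # skip columns that are linearly dependent
            basis.append([v / norm for v in vector])

    # Projection of the predictions onto that span.
    exposure = [0.0] * len(predictions)
    for b in basis:
        coeff = dot(predictions, b)
        exposure = [e + coeff * bi for e, bi in zip(exposure, b)]

    return [p - proportion * e for p, e in zip(predictions, exposure)]


# Toy data (hypothetical): a constant column and one trending feature.
features = [[1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 2.0, 3.0, 4.0, 5.0]]
predictions = [0.1, 0.4, 0.35, 0.8, 0.9]
neutral = neutralise(predictions, features)
```

The step acts only on the model's outputs, which is consistent with the abstract's point that no retraining is needed.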

    Trading the stock market : hybrid financial analyses and evolutionary computation

    Unpublished thesis of the Universidad Complutense de Madrid, Facultad de Informática, Departamento de Arquitectura de Computadores y Automática, defended on 02-07-2014. This thesis presents the implementation of a complex and pioneering automated trading system that uses three key analyses to decide when and in which portfolios to invest. To this end, the work examines automated trading systems and studies time series of historical prices of companies listed on stock markets. The time series are studied and classified using a novel methodology based on clustering with software compressors. This approach allows a theoretical study of price formation that shows divergence between real market prices and prices modelled by random walks, thus supporting predictive models based on the analysis of historical patterns. The methodology also makes it possible to study the behaviour of historical price series across different industrial sectors by searching for patterns among companies in the same industry. The results show clusters of companies that point to market trends shared by companies with similar activities, suggesting that a macroeconomic analysis can benefit investment decisions. Having tested the feasibility of prediction systems based on historical price series and the existence of macroeconomic trends within industries, we propose implementing a hybrid automated trading system in several stages that iteratively describe and test the components of the final system. The early stages implement an automated trading system based on technical and fundamental analysis of companies, which delivers high returns and reduces the risk of losses. The implementation is guided by a modified genetic algorithm with novel genetic operators that avoid premature convergence and improve the final results. Using the same trading system, we propose novel optimization techniques for one of the characteristic problems of these systems: execution time. We parallelise the system using two parallel computing techniques, first distributed computation and second a version for graphics processors. The two architectures achieve high speed-ups, reaching 50x and 256x respectively, as required by systems that analyse huge amounts of financial data. Subsequent stages change the methodology from genetic algorithms to grammatical evolution, which allows us to compare the two evolutionary strategies and to implement more advanced features such as more complex rules or the self-generation of new technical indicators. In this context, we describe several versions of the automated trading system guided by different fitness functions, including an innovative multi-objective version, and test them with recent financial data to analyse the advantages of each fitness function. Finally, we describe and test the methodology of an automated trading system based on a double layer of grammatical evolution that combines technical, fundamental and macroeconomic analysis in a hybrid top-down analysis. The results show average returns of 30% with a low number of losing operations.
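
The abstract does not say which compressor-based measure the thesis uses for clustering. One widely used choice, assumed here, is the normalized compression distance (NCD), NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size. The sketch below uses Python's `zlib` as the compressor. The up/down encoding of price moves and the synthetic series are hypothetical choices for illustration, not the thesis's method.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed data at the highest level."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: near 0 for similar inputs, near 1 otherwise."""
    cx, cy = compressed_size(x), compressed_size(y)
    return (compressed_size(x + y) - min(cx, cy)) / max(cx, cy)


def encode_moves(prices):
    """Hypothetical encoding: 'u' if the price rose or held, 'd' if it fell."""
    return bytes(ord("u") if later >= earlier else ord("d")
                 for earlier, later in zip(prices, prices[1:]))


# Synthetic series (not market data): a periodic sawtooth and a random walk.
rng = random.Random(0)
sawtooth = [i % 8 for i in range(2000)]
walk = [0]
for _ in range(1999):
    walk.append(walk[-1] + rng.choice([-1, 1]))

regular = encode_moves(sawtooth)
noisy = encode_moves(walk)
same_pattern = ncd(regular, regular)
different_pattern = ncd(regular, noisy)
```

A clustering step would take the matrix of pairwise distances between all series as input. That step is omitted here.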

    Predicting Financial Markets using Text on the Web


    A theoretical and computational basis for CATNETS

    The main content of this report is the identification and definition of market mechanisms for Application Layer Networks (ALNs). On the basis of the structured Market Engineering process, the work identifies the requirements that adequate market mechanisms for ALNs have to fulfill. Subsequently, two mechanisms each for the centralized and the decentralized case are described in this document. These build the theoretical foundation for the work within the following two years of the CATNETS project. Keywords: Grid Computing