57 research outputs found

    Evolutionary Algorithms Based on Effective Search Space Reduction for Financial Optimization Problems

    Doctoral thesis, Seoul National University Graduate School, Department of Electrical and Computer Engineering, August 2015. Byung-Ro Moon. This thesis presents evolutionary algorithms that incorporate effective search space reduction for financial optimization problems. Typical evolutionary algorithms try to find optimal solutions in the original, unrestricted search space. However, they can be unsuccessful if the optimal solutions are too complex to be discovered from scratch. This can be relieved by restricting the forms of meaningful solutions or by providing the initial population with some promising solutions. To this end, we propose three evolutionary approaches for financial optimization problems: modular, grammatical, and seeded evolution. We also adopt local optimization for fine-tuning the solutions, resulting in hybrid evolutionary algorithms. First, the thesis proposes a modular evolution. In the modular evolution, the possible forms of solutions are statically restricted to certain combinations of module solutions, which reflect more domain knowledge. To preserve the module solutions, we devise modular genetic operators that work on the modular search space. The modular genetic operators and statically defined modules help genetic programming focus on highly promising regions of the search space. Second, the thesis introduces a grammatical evolution. We restrict the possible forms of solutions in genetic programming by a context-free grammar. In the grammatical evolution, genetic programming works on a more extended search space than the modular one. Grammatically typed genetic operators are introduced for the grammatical evolution. Compared with the modular evolution, grammatical evolution requires less domain knowledge. Finally, the thesis presents a seeded evolution. Our seeded evolution provides the initial population with partially optimized solutions. The set of genes for the partial optimization is selected in terms of encoding complexity. The partially optimized solutions help the genetic algorithm find more promising solutions efficiently. Since they are not excessively optimized, the genetic algorithm is still able to search for better solutions. Extensive empirical results are provided using three real-world financial optimization problems: attractive technical pattern discovery, extended attractive technical pattern discovery, and large-scale stock selection. They show that our search space reductions are fairly effective for the problems. By combining the search space reductions with systematic evolutionary algorithm frameworks, we show that evolutionary algorithms can be exploited for realistic profitable trading.

    Contents:
    1. Introduction: 1.1 Search Methods; 1.2 Search Space Reduction; 1.3 Main Contributions; 1.4 Organization
    2. Preliminaries: 2.1 Evolutionary Algorithms (2.1.1 Genetic Algorithm; 2.1.2 Genetic Programming); 2.2 Evolutionary Algorithms in Finance; 2.3 Search Space Reduction (2.3.1 Modular Evolution; 2.3.2 Grammatical Evolution; 2.3.3 Seeded Evolution; 2.3.4 Summary); 2.4 Terminology (2.4.1 Technical Pattern and Technical Trading Rule; 2.4.2 Forecasting Model and Trading Model; 2.4.3 Portfolio and Rebalancing; 2.4.4 Data Snooping Bias); 2.5 Financial Optimization Problems (2.5.1 Attractive Technical Pattern Discovery and Its Extension; 2.5.2 Stock Selection); 2.6 Issues (2.6.1 General Assumptions; 2.6.2 Performance Measure)
    3. Modular Evolution: 3.1 Modular Genetic Programming; 3.2 Hybrid Genetic Programming; 3.3 Attractive Technical Pattern Discovery (3.3.1 Introduction; 3.3.2 Problem Formulation; 3.3.3 Modular Search Space; 3.3.4 Experimental Results; 3.3.5 Summary)
    4. Grammatical Evolution: 4.1 Grammatical Type System; 4.2 Hybrid Genetic Programming; 4.3 Extended Attractive Technical Pattern Discovery (4.3.1 Introduction; 4.3.2 Problem Formulation; 4.3.3 Experimental Results; 4.3.4 Summary)
    5. Seeded Evolution: 5.1 Heuristic Seeding; 5.2 Hybrid Genetic Algorithm; 5.3 Large-Scale Stock Selection (5.3.1 Introduction; 5.3.2 Problem Formulation; 5.3.3 Ranking with Partitions; 5.3.4 Experimental Results; 5.3.5 Summary)
    6. Conclusions
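
The seeding idea in this thesis abstract (start the genetic algorithm from partially optimized individuals rather than purely random ones) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the bit-string encoding, the toy `fitness` function, and the names `partial_optimize` and `seeded_population` do not come from the thesis, and the thesis's gene-selection criterion (encoding complexity) is replaced here by a fixed list of gene indices.

```python
import random

def fitness(bits):
    # Toy objective (an assumption): the number of 1-bits.
    return sum(bits)

def partial_optimize(bits, gene_indices):
    # Greedy local search restricted to the chosen genes only:
    # a gene is flipped only if the flip improves fitness.
    best = list(bits)
    for i in gene_indices:
        trial = list(best)
        trial[i] ^= 1
        if fitness(trial) > fitness(best):
            best = trial
    return best

def seeded_population(size, length, gene_indices, rng):
    # Each individual starts random, then only the selected genes are tuned,
    # so the remaining genes are left for the evolutionary search to explore.
    population = []
    for _ in range(size):
        individual = [rng.randint(0, 1) for _ in range(length)]
        population.append(partial_optimize(individual, gene_indices))
    return population

rng = random.Random(0)
population = seeded_population(size=20, length=16, gene_indices=range(4), rng=rng)
```

Because only the first four genes are tuned, the individuals stay diverse in the other twelve positions, which loosely mirrors the abstract's point that seeds should not be excessively optimized.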

    Tools and Algorithms for the Construction and Analysis of Systems

    This open access two-volume set constitutes the proceedings of the 26th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2020, which took place in Dublin, Ireland, in April 2020, and was held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020. The total of 60 regular papers presented in these volumes was carefully reviewed and selected from 155 submissions. The papers are organized in topical sections as follows: Part I: Program verification; SAT and SMT; Timed and Dynamical Systems; Verifying Concurrent Systems; Probabilistic Systems; Model Checking and Reachability; and Timed and Probabilistic Systems. Part II: Bisimulation; Verification and Efficiency; Logic and Proof; Tools and Case Studies; Games and Automata; and SV-COMP 2020

    Proceedings of the RESOLVE Workshop 2002

    Proceedings of the RESOLVE Workshop 2002

    Programming Languages and Systems

    This open access book constitutes the proceedings of the 29th European Symposium on Programming, ESOP 2020, which was planned to take place in Dublin, Ireland, in April 2020, as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020. The actual ETAPS 2020 meeting was postponed due to the Corona pandemic. The papers deal with fundamental issues in the specification, design, analysis, and implementation of programming languages and systems

    Density-Aware Linear Algebra in a Column-Oriented In-Memory Database System

    Linear algebra operations appear in nearly every application in advanced analytics, machine learning, and various science domains. To this day, many data analysts and scientists use statistics software packages or hand-crafted solutions for their analysis. In the era of data deluge, however, external statistics packages and custom analysis programs, which often run on single workstations, are unable to keep up with the vast increase in data volume and size. In particular, scientists increasingly demand large-scale data manipulation, orchestration, and advanced data management capabilities. These are among the key features of a mature relational database management system (DBMS). With the rise of main-memory database systems, it has become feasible to also consider applications that build on linear algebra. This thesis presents a deep integration of linear algebra functionality into an in-memory column-oriented database system. In particular, this work shows that it has become feasible to execute linear algebra queries on large data sets directly in a DBMS-integrated engine (LAPEG), without the need to transfer data and without being restricted by hard disk latencies. From various application examples that are cited in this work, we deduce a number of requirements that are relevant for a database system that includes linear algebra functionality. Besides the deep integration of matrices and numerical algorithms, these include optimization of expressions, transparent matrix handling, scalability and data parallelism, and data manipulation capabilities. These requirements are addressed by our linear algebra engine. In particular, the core contributions of this thesis are as follows. First, we show that the columnar storage layer of an in-memory DBMS allows an easy adoption of efficient sparse matrix data types and algorithms. 
Furthermore, we show that the execution of linear algebra expressions benefits significantly from different techniques that are inspired by database technology. In a novel way, we implemented several of these optimization strategies in LAPEG's optimizer (SpMachO), which uses an advanced density estimation method (SpProdest) to predict the matrix density of intermediate results. Moreover, we present an adaptive matrix data type, AT Matrix, so that scientists no longer need to select appropriate matrix representations themselves. The tiled substructure of AT Matrix is exploited by our matrix multiplication to saturate the different sockets of a multicore main-memory platform, reaching a speed-up of up to 6x compared to alternative approaches. Finally, a major part of this thesis is devoted to the topic of data manipulation, where we propose a matrix manipulation API and present different mutable matrix types to enable fast insertions and deletions. We conclude that our linear algebra engine is well suited to process dynamic, large matrix workloads in an optimized way. In particular, the DBMS-integrated LAPEG fills the linear algebra gap and makes columnar in-memory DBMS attractive as an efficient, scalable ad hoc analysis platform for scientists.
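
To illustrate why predicting the density of intermediate results matters for expression optimization, here is a minimal sketch of a textbook-style estimate. It is not SpProdest, whose details the abstract does not give. The assumption is that nonzeros are placed uniformly and independently and that no numerical cancellation occurs; the function name `estimate_product_density` is hypothetical. Under that assumption, for `C = A @ B` with inner dimension `k`, an entry of `C` is nonzero with probability `1 - (1 - d_A * d_B) ** k`.

```python
def estimate_product_density(density_a, density_b, inner_dim):
    """Estimate the density of A @ B under an independence assumption.

    density_a, density_b: fraction of nonzero entries in A and B (0.0 to 1.0).
    inner_dim: the shared dimension k (columns of A, rows of B).
    """
    if not (0.0 <= density_a <= 1.0 and 0.0 <= density_b <= 1.0):
        raise ValueError("densities must lie in [0, 1]")
    if inner_dim < 0:
        raise ValueError("inner_dim must be non-negative")
    # Probability that a single term A[i, j] * B[j, l] is nonzero.
    term_nonzero = density_a * density_b
    # An output entry is zero only if all inner_dim terms are zero.
    return 1.0 - (1.0 - term_nonzero) ** inner_dim

# Two matrices with 10% nonzeros each and an inner dimension of 100.
estimated = estimate_product_density(0.1, 0.1, 100)
```

Even for sparse inputs the estimated product can be fairly dense, which is the kind of information an optimizer could use when choosing a multiplication order or a storage format for intermediates.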

    IST Austria Thesis

    This dissertation focuses on algorithmic aspects of program verification, and presents modeling and complexity advances on several problems related to the static analysis of programs, the stateless model checking of concurrent programs, and the competitive analysis of real-time scheduling algorithms. Our contributions can be broadly grouped into five categories. Our first contribution is a set of new algorithms and data structures for the quantitative and data-flow analysis of programs, based on the graph-theoretic notion of treewidth. It has been observed that the control-flow graphs of typical programs have special structure, and are characterized as graphs of small treewidth. We utilize this structural property to provide faster algorithms for the quantitative and data-flow analysis of recursive and concurrent programs. In most cases we give an algebraic treatment of the considered problem, where several interesting analyses, such as reachability, shortest path, and certain kinds of data-flow analysis problems, follow as special cases. We exploit the constant-treewidth property to obtain algorithmic improvements for on-demand versions of the problems, and provide data structures with various tradeoffs between the resources spent in the preprocessing and querying phases. We also improve on the algorithmic complexity of quantitative problems outside the algebraic path framework, namely the minimum mean-payoff, minimum ratio, and minimum initial credit for energy problems. Our second contribution is a set of algorithms for Dyck reachability with applications to data-dependence analysis and alias analysis. In particular, we develop an optimal algorithm for Dyck reachability on bidirected graphs, which are ubiquitous in context-insensitive, field-sensitive points-to analysis. 
Additionally, we develop an efficient algorithm for context-sensitive data-dependence analysis via Dyck reachability, where the task is to obtain analysis summaries of library code in the presence of callbacks. Our algorithm preprocesses libraries in almost linear time, after which the contribution of the library to the complexity of the client analysis is (i) linear in the number of call sites and (ii) only logarithmic in the size of the whole library, as opposed to linear in the size of the whole library. Finally, we prove that Dyck reachability is Boolean Matrix Multiplication-hard in general, and the hardness also holds for graphs of constant treewidth. This hardness result strongly indicates that there exist no combinatorial algorithms for Dyck reachability with truly subcubic complexity. Our third contribution is the formalization and algorithmic treatment of the Quantitative Interprocedural Analysis framework. In this framework, the transitions of a recursive program are annotated as good, bad or neutral, and receive a weight which measures the magnitude of their respective effect. The Quantitative Interprocedural Analysis problem asks to determine whether there exists an infinite run of the program where the long-run ratio of the bad weights over the good weights is above a given threshold. We illustrate how several quantitative problems related to static analysis of recursive programs can be instantiated in this framework, and present some case studies in this direction. Our fourth contribution is a new dynamic partial-order reduction for the stateless model checking of concurrent programs. Traditional approaches rely on the standard Mazurkiewicz equivalence between traces, by means of partitioning the trace space into equivalence classes, and attempting to explore a few representatives from each class. We present a new dynamic partial-order reduction method called the Data-centric Partial Order Reduction (DC-DPOR). 
Our algorithm is based on a new equivalence between traces, called the observation equivalence. DC-DPOR explores a coarser partitioning of the trace space than any exploration method based on the standard Mazurkiewicz equivalence. Depending on the program, the new partitioning can even be exponentially coarser. Additionally, DC-DPOR spends only polynomial time in each explored class. Our fifth contribution is the use of automata and game-theoretic verification techniques in the competitive analysis and synthesis of real-time scheduling algorithms for firm-deadline tasks. On the analysis side, we leverage automata on infinite words to compute the competitive ratio of real-time schedulers subject to various environmental constraints. On the synthesis side, we introduce a new instance of two-player mean-payoff partial-information games, and show how the synthesis of an optimal real-time scheduler can be reduced to computing winning strategies in this new type of game.
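
To make the contrast between the Mazurkiewicz equivalence and the observation equivalence concrete, the sketch below compares two traces of a toy three-thread program. It is not the DC-DPOR algorithm. It only checks the two notions on given traces, assuming that each read observes the most recent preceding write to the same variable (sequential consistency). The event encoding `(thread, op, variable)` and all function names are assumptions made for illustration.

```python
def label_events(trace):
    # Give each event a stable identity: (thread, position within that thread).
    counters = {}
    labelled = []
    for thread, op, var in trace:
        index = counters.get(thread, 0)
        counters[thread] = index + 1
        labelled.append(((thread, index), op, var))
    return labelled

def observation_function(trace):
    # Map each read event to the write event it observes
    # (None means the read sees the initial value).
    last_write = {}
    observed = {}
    for event_id, op, var in label_events(trace):
        if op == "w":
            last_write[var] = event_id
        else:
            observed[event_id] = last_write.get(var)
    return observed

def conflict_order(trace):
    # Ordered pairs of conflicting events from different threads:
    # same variable, at least one of the two is a write.
    events = label_events(trace)
    pairs = set()
    for i, (id_a, op_a, var_a) in enumerate(events):
        for id_b, op_b, var_b in events[i + 1:]:
            if var_a == var_b and "w" in (op_a, op_b) and id_a[0] != id_b[0]:
                pairs.add((id_a, id_b))
    return pairs

# Thread t1 writes x, thread t2 writes x, thread t3 reads x.
trace_a = [("t1", "w", "x"), ("t2", "w", "x"), ("t3", "r", "x")]
trace_b = [("t2", "w", "x"), ("t3", "r", "x"), ("t1", "w", "x")]

same_observations = observation_function(trace_a) == observation_function(trace_b)
same_conflict_order = conflict_order(trace_a) == conflict_order(trace_b)
```

In both traces the read observes the write of `t2`, so the traces agree on what every read sees, yet they order the two conflicting writes differently and therefore fall into different Mazurkiewicz classes. This is the sense in which a partitioning by observations can be coarser.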

    Interactive Visualization of Molecular Dynamics Simulation Data

    Molecular Dynamics (MD) simulation plays an essential role in the field of computational biology. The simulations produce extensive high-dimensional, spatio-temporal data describing the motion of atoms and molecules. A central challenge in the field is the extraction and visualization of useful behavioral patterns from these simulations. Throughout this thesis, I collaborated with a computational biologist who works on MD simulation data. For the sake of exploration, I was provided with a large and complex membrane simulation. I contributed solutions to his data challenges by developing a set of novel visualization tools to help him get a better understanding of his simulation data. I employed both scientific and information visualization, and applied concepts of abstraction and dimension projection in the proposed solutions. The first solution enables the user to interactively filter and highlight dynamic and complex trajectories formed by the motion of molecules. The molecular dynamics trajectories are identified based on path length, edge length, curvature, and normalized curvature, and their combinations. The tool exploits new interactive visualization techniques and provides a combination of 2D-3D path rendering in a dual dimension representation to highlight differences arising from the 2D projection on a plane. The second solution introduces a novel abstract interaction space for Protein-Lipid interaction. The proposed solution addresses the challenge of visualizing complex, time-dependent interactions between protein and lipid molecules. It also proposes a fast GPU-based implementation that maps lipid constituents involved in the interaction onto the abstract protein interaction space. I also introduced two abstract level-of-detail (LoD) representations with six levels of detail for lipid molecules and protein interaction. Finally, I proposed a novel framework consisting of four linked views: a time-dependent 3D view, a novel hybrid view, a clustering timeline, and a details-on-demand window. The framework exploits abstraction and projection to enable the user to study the molecular interaction and the behavior of the protein-protein interaction and clusters. I introduced a selection of visual designs to convey the behavior of protein-lipid interaction and protein-protein interaction through a unified coordinate system. Abstraction is used to present proteins in hybrid 2D space, and a projected tiled space is used to present both Protein-Lipid Interaction (PLI) and Protein-Protein Interaction (PPI) at the particle level in a heat-map style visual design. Glyphs are used to represent PPI at the molecular level. I coupled visually separable visual designs in a unified coordinate space. The result lets the user study both PLI and PPI separately, or together in a unified visual analysis framework.
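
The trajectory filter described in this abstract relies on per-trajectory measures such as path length, edge length, and curvature. The sketch below shows one plausible way to compute such measures for a polyline of 3D positions. The abstract does not define its curvature measures, so using the total turning angle as a curvature proxy, and dividing it by path length as a normalized variant, are assumptions, as are all function names.

```python
import math

def edge_lengths(points):
    # Length of each segment between consecutive positions.
    return [math.dist(p, q) for p, q in zip(points, points[1:])]

def path_length(points):
    return sum(edge_lengths(points))

def total_turning_angle(points):
    # Sum of the angles (in radians) between consecutive segments.
    total = 0.0
    for a, b, c in zip(points, points[1:], points[2:]):
        u = [bi - ai for ai, bi in zip(a, b)]
        v = [ci - bi for bi, ci in zip(b, c)]
        norm = math.hypot(*u) * math.hypot(*v)
        if norm == 0.0:
            continue  # skip repeated positions, where the angle is undefined
        cos_angle = sum(ui * vi for ui, vi in zip(u, v)) / norm
        total += math.acos(max(-1.0, min(1.0, cos_angle)))
    return total

def normalized_turning(points):
    # Turning angle per unit of path length (0.0 for a degenerate path).
    length = path_length(points)
    return total_turning_angle(points) / length if length > 0.0 else 0.0

def filter_by_path_length(trajectories, min_length):
    # Keep only the trajectories whose path length reaches the threshold.
    return [t for t in trajectories if path_length(t) >= min_length]

corner = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
straight = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
long_paths = filter_by_path_length([corner, straight], min_length=2.5)
```

An interactive tool could recompute such a filter whenever the user moves a threshold slider, and combine several measures with logical AND or OR.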

    The Epigenetic Research Program (EPR): a transdisciplinary approach for the dynamics of knowledge, society - and beyond

    'With the 'epigenetic approach', a unified research program has been established that serves to analyze 'knowledge-based processes' in a vast number of domains. Concretely, the epigenetic program has so far produced, on the one hand, an ambitious 'transdisciplinary research program' and, on the other hand, a series of applications in organizational analysis and in 'National Innovation Systems'. Moreover, the epigenetic program makes it possible to move beyond the currently discussed features of 'knowledge societies', such as the diffusion of information and communication technologies or the expansion within the traditional sites of knowledge production, namely universities and research institutes. Finally, it should be noted that the new architecture of knowledge and information societies in particular sheds innovative light on questions of social inequality and is able to bring current problem areas in this field into sharp focus.' (author's abstract, translated from German) 'With the 'epigenetic approach', an entire research program has been set up which is devoted to the study of 'knowledge-based processes' in human societies - and beyond. More concretely, an epigenetic approach has been built up in which two different areas are addressed and dealt with simultaneously, namely theoretical foundations for the analysis of 'knowledge-based processes' and a comparatively large number of empirical applications, ranging from the study of organizations to the level of 'National Innovation Systems'. Moreover, the emphasis on 'knowledge and information societies' is not motivated by current reconfigurations via communication and information technologies or the expansion of 'knowledge generating capacities' beyond the confines of traditional universities or research institutes. Likewise, 'knowledge and information societies' are not conceptualized as a stage beyond socio-economic inequality, contrasting it, for example, to traditional 'class societies', but, once again, as a theoretical approach which offers new insights into the basic structure of current societal disparities.' (author's abstract)