193 research outputs found

    A Field Guide to Genetic Programming

    Get PDF
    xiv, 233 p. : il. ; 23 cm.Libro ElectrónicoA Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4) is an introduction to genetic programming (GP). GP is a systematic, domain-independent method for getting computers to solve problems automatically starting from a high-level statement of what needs to be done. Using ideas from natural evolution, GP starts from an ooze of random computer programs, and progressively refines them through processes of mutation and sexual recombination, until solutions emerge. All this without the user having to know or specify the form or structure of solutions in advance. GP has generated a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions. The authorsIntroduction -- Representation, initialisation and operators in Tree-based GP -- Getting ready to run genetic programming -- Example genetic programming run -- Alternative initialisations and operators in Tree-based GP -- Modular, grammatical and developmental Tree-based GP -- Linear and graph genetic programming -- Probalistic genetic programming -- Multi-objective genetic programming -- Fast and distributed genetic programming -- GP theory and its applications -- Applications -- Troubleshooting GP -- Conclusions.Contents xi 1 Introduction 1.1 Genetic Programming in a Nutshell 1.2 Getting Started 1.3 Prerequisites 1.4 Overview of this Field Guide I Basics 2 Representation, Initialisation and GP 2.1 Representation 2.2 Initialising the Population 2.3 Selection 2.4 Recombination and Mutation Operators in Tree-based 3 Getting Ready to Run Genetic Programming 19 3.1 Step 1: Terminal Set 19 3.2 Step 2: Function Set 20 3.2.1 Closure 21 3.2.2 Sufficiency 23 3.2.3 Evolving Structures other than Programs 23 3.3 Step 3: Fitness Function 24 3.4 Step 4: GP Parameters 26 3.5 Step 5: Termination and solution designation 27 4 Example Genetic Programming Run 4.1 Preparatory Steps 29 4.2 Step-by-Step Sample Run 31 4.2.1 Initialisation 31 4.2.2 Fitness Evaluation Selection, Crossover and Mutation Termination and Solution Designation Advanced Genetic Programming 5 Alternative Initialisations and Operators in 5.1 Constructing the Initial Population 5.1.1 Uniform Initialisation 5.1.2 Initialisation may Affect Bloat 5.1.3 Seeding 5.2 GP Mutation 5.2.1 Is Mutation Necessary? 5.2.2 Mutation Cookbook 5.3 GP Crossover 5.4 Other Techniques 32 5.5 Tree-based GP 39 6 Modular, Grammatical and Developmental Tree-based GP 47 6.1 Evolving Modular and Hierarchical Structures 47 6.1.1 Automatically Defined Functions 48 6.1.2 Program Architecture and Architecture-Altering 50 6.2 Constraining Structures 51 6.2.1 Enforcing Particular Structures 52 6.2.2 Strongly Typed GP 52 6.2.3 Grammar-based Constraints 53 6.2.4 Constraints and Bias 55 6.3 Developmental Genetic Programming 57 6.4 Strongly Typed Autoconstructive GP with PushGP 59 7 Linear and Graph Genetic Programming 61 7.1 Linear Genetic Programming 61 7.1.1 Motivations 61 7.1.2 Linear GP Representations 62 7.1.3 Linear GP Operators 64 7.2 Graph-Based Genetic Programming 65 7.2.1 Parallel Distributed GP (PDGP) 65 7.2.2 PADO 67 7.2.3 Cartesian GP 67 7.2.4 Evolving Parallel Programs using Indirect Encodings 68 8 Probabilistic Genetic Programming 8.1 Estimation of Distribution Algorithms 69 8.2 Pure EDA GP 71 8.3 Mixing Grammars and Probabilities 74 9 Multi-objective Genetic Programming 75 9.1 Combining Multiple Objectives into a Scalar Fitness Function 75 9.2 Keeping the Objectives Separate 76 9.2.1 Multi-objective Bloat and Complexity Control 77 9.2.2 Other Objectives 78 9.2.3 Non-Pareto Criteria 80 9.3 Multiple Objectives via Dynamic and Staged Fitness Functions 80 9.4 Multi-objective Optimisation via Operator Bias 81 10 Fast and Distributed Genetic Programming 83 10.1 Reducing Fitness Evaluations/Increasing their Effectiveness 83 10.2 Reducing Cost of Fitness with Caches 86 10.3 Parallel and Distributed GP are Not Equivalent 88 10.4 Running GP on Parallel Hardware 89 10.4.1 Master–slave GP 89 10.4.2 GP Running on GPUs 90 10.4.3 GP on FPGAs 92 10.4.4 Sub-machine-code GP 93 10.5 Geographically Distributed GP 93 11 GP Theory and its Applications 97 11.1 Mathematical Models 98 11.2 Search Spaces 99 11.3 Bloat 101 11.3.1 Bloat in Theory 101 11.3.2 Bloat Control in Practice 104 III Practical Genetic Programming 12 Applications 12.1 Where GP has Done Well 12.2 Curve Fitting, Data Modelling and Symbolic Regression 12.3 Human Competitive Results – the Humies 12.4 Image and Signal Processing 12.5 Financial Trading, Time Series, and Economic Modelling 12.6 Industrial Process Control 12.7 Medicine, Biology and Bioinformatics 12.8 GP to Create Searchers and Solvers – Hyper-heuristics xiii 12.9 Entertainment and Computer Games 127 12.10The Arts 127 12.11Compression 128 13 Troubleshooting GP 13.1 Is there a Bug in the Code? 13.2 Can you Trust your Results? 13.3 There are No Silver Bullets 13.4 Small Changes can have Big Effects 13.5 Big Changes can have No Effect 13.6 Study your Populations 13.7 Encourage Diversity 13.8 Embrace Approximation 13.9 Control Bloat 13.10 Checkpoint Results 13.11 Report Well 13.12 Convince your Customers 14 Conclusions Tricks of the Trade A Resources A.1 Key Books A.2 Key Journals A.3 Key International Meetings A.4 GP Implementations A.5 On-Line Resources 145 B TinyGP 151 B.1 Overview of TinyGP 151 B.2 Input Data Files for TinyGP 153 B.3 Source Code 154 B.4 Compiling and Running TinyGP 162 Bibliography 167 Inde

    Regression tree construction by bootstrap: Model search for DRG-systems applied to Austrian health-data

    Get PDF
    Background. DRG-systems are used to allocate resources fairly to hospitals based on their performance. Statistically, this allocation is based on simple rules that can be modeled with regression trees. However, the resulting models often have to be adjusted manually to be medically reasonable and ethical. Methods. Despite the possibility of manual, performance degenerating adaptations of the original model, alternative trees are systematically searched. The bootstrap-based method bumping is used to build diverse and accurate regression tree models for DRG-systems. A two-step model selection approach is proposed. First, a reasonable model complexity is chosen, based on statistical, medical and economical considerations. Second, a medically meaningful and accurate model is selected. An analysis of 8 data-sets from Austrian DRG-data is conducted and evaluated based on the possibility to produce diverse and accurate models for predefined tree complexities. Results. The best bootstrap-based trees offer increased predictive accuracy compared to the trees built by the CART algorithm. The analysis demonstrates that even for very small tree sizes, diverse models can be constructed being equally or even more accurate than the single model built by the standard CART algorithm. Conclusions. Bumping is a powerful tool to construct diverse and accurate regression trees, to be used as candidate models for DRG-systems. Furthermore, Bumping and the proposed model selection approach are also applicable to other medical decision and prognosis tasks. 2010 Grubinger et al; licensee BioMed Central Ltd

    Field Guide to Genetic Programming

    Get PDF

    Assessing and improving the quality of model transformations

    Get PDF
    Software is pervading our society more and more and is becoming increasingly complex. At the same time, software quality demands remain at the same, high level. Model-driven engineering (MDE) is a software engineering paradigm that aims at dealing with this increasing software complexity and improving productivity and quality. Models play a pivotal role in MDE. The purpose of using models is to raise the level of abstraction at which software is developed to a level where concepts of the domain in which the software has to be applied, i.e., the target domain, can be expressed e??ectively. For that purpose, domain-speci??c languages (DSLs) are employed. A DSL is a language with a narrow focus, i.e., it is aimed at providing abstractions speci??c to the target domain. This makes that the application of models developed using DSLs is typically restricted to describing concepts existing in that target domain. Reuse of models such that they can be applied for di??erent purposes, e.g., analysis and code generation, is one of the challenges that should be solved by applying MDE. Therefore, model transformations are typically applied to transform domain-speci??c models to other (equivalent) models suitable for di??erent purposes. A model transformation is a mapping from a set of source models to a set of target models de??ned as a set of transformation rules. MDE is gradually being adopted by industry. Since MDE is becoming more and more important, model transformations are becoming more prominent as well. Model transformations are in many ways similar to traditional software artifacts. Therefore, they need to adhere to similar quality standards as well. The central research question discoursed in this thesis is therefore as follows. How can the quality of model transformations be assessed and improved, in particular with respect to development and maintenance? Recall that model transformations facilitate reuse of models in a software development process. We have developed a model transformation that enables reuse of analysis models for code generation. The semantic domains of the source and target language of this model transformation are so far apart that straightforward transformation is impossible, i.e., a semantic gap has to be bridged. To deal with model transformations that have to bridge a semantic gap, the semantics of the source and target language as well as possible additional requirements should be well understood. When bridging a semantic gap is not straightforward, we recommend to address a simpli??ed version of the source metamodel ??rst. Finally, the requirements on the transformation may, if possible, be relaxed to enable automated model transformation. Model transformations that need to transform between models in di??erent semantic domains are expected to be more complex than those that merely transform syntax. The complexity of a model transformation has consequences for its quality. Quality, in general, is a subjective concept. Therefore, quality can be de??ned in di??erent ways. We de??ned it in the context of model transformation. A model transformation can either be considered as a transformation de??nition or as the process of transforming a source model to a target model. Accordingly, model transformation quality can be de??ned in two di??erent ways. The quality of the de??nition is referred to as its internal quality. The quality of the process of transforming a source model to a target model is referred to as its external quality. There are also two ways to assess the quality of a model transformation (both internal and external). It can be assessed directly, i.e., by performing measurements on the transformation de??nition, or indirectly, i.e., by performing measurements in the environment of the model transformation. We mainly focused on direct assessment of internal quality. However, we also addressed external quality and indirect assessment. Given this de??nition of quality in the context of model transformations, techniques can be developed to assess it. Software metrics have been proposed for measuring various kinds of software artifacts. However, hardly any research has been performed on applying metrics for assessing the quality of model transformations. For four model transformation formalisms with di??fferent characteristics, viz., for ASF+SDF, ATL, Xtend, and QVTO, we de??ned sets of metrics for measuring model transformations developed with these formalisms. While these metric sets can be used to indicate bad smells in the code of model transformations, they cannot be used for assessing quality yet. A relation has to be established between the metric sets and attributes of model transformation quality. For two of the aforementioned metric sets, viz., the ones for ASF+SDF and for ATL, we conducted an empirical study aiming at establishing such a relation. From these empirical studies we learned what metrics serve as predictors for di??erent quality attributes of model transformations. Metrics can be used to quickly acquire insights into the characteristics of a model transformation. These insights enable increasing the overall quality of model transformations and thereby also their maintainability. To support maintenance, and also development in a traditional software engineering process, visualization techniques are often employed. For model transformations this appears as a feasible approach as well. Currently, however, there are few visualization techniques available tailored towards analyzing model transformations. One of the most time-consuming processes during software maintenance is acquiring understanding of the software. We expect that this holds for model transformations as well. Therefore, we presented two complementary visualization techniques for facilitating model transformation comprehension. The ??rst-technique is aimed at visualizing the dependencies between the components of a model transformation. The second technique is aimed at analyzing the coverage of the source and target metamodels by a model transformation. The development of the metric sets, and in particular the empirical studies, have led to insights considering the development of model transformations. Also, the proposed visualization techniques are aimed at facilitating the development of model transformations. We applied the insights acquired from the development of the metric sets as well as the visualization techniques in the development of a chain of model transformations that bridges a number of semantic gaps. We chose to solve this transformational problem not with one model transformation, but with a number of smaller model transformations. This should lead to smaller transformations, which are more understandable. The language on which the model transformations are de??ned, was subject to evolution. In particular the coverage visualization proved to be bene??cial for the co-evolution of the model transformations. Summarizing, we de??ned quality in the context of model transformations and addressed the necessity for a methodology to assess it. Therefore, we de??ned metric sets and performed empirical studies to validate whether they serve as predictors for model transformation quality. We also proposed a number of visualizations to increase model transformation comprehension. The acquired insights from developing the metric sets and the empirical studies, as well as the visualization tools, proved to be bene??cial for developing model transformations

    Algorithms to Explore the Structure and Evolution of Biological Networks

    Get PDF
    High-throughput experimental protocols have revealed thousands of relationships amongst genes and proteins under various conditions. These putative associations are being aggressively mined to decipher the structural and functional architecture of the cell. One useful tool for exploring this data has been computational network analysis. In this thesis, we propose a collection of novel algorithms to explore the structure and evolution of large, noisy, and sparsely annotated biological networks. We first introduce two information-theoretic algorithms to extract interesting patterns and modules embedded in large graphs. The first, graph summarization, uses the minimum description length principle to find compressible parts of the graph. The second, VI-Cut, uses the variation of information to non-parametrically find groups of topologically cohesive and similarly annotated nodes in the network. We show that both algorithms find structure in biological data that is consistent with known biological processes, protein complexes, genetic diseases, and operational taxonomic units. We also propose several algorithms to systematically generate an ensemble of near-optimal network clusterings and show how these multiple views can be used together to identify clustering dynamics that any single solution approach would miss. To facilitate the study of ancient networks, we introduce a framework called ``network archaeology'') for reconstructing the node-by-node and edge-by-edge arrival history of a network. Starting with a present-day network, we apply a probabilistic growth model backwards in time to find high-likelihood previous states of the graph. This allows us to explore how interactions and modules may have evolved over time. In experiments with real-world social and biological networks, we find that our algorithms can recover significant features of ancestral networks that have long since disappeared. Our work is motivated by the need to understand large and complex biological systems that are being revealed to us by imperfect data. As data continues to pour in, we believe that computational network analysis will continue to be an essential tool towards this end

    Cyber Security and Critical Infrastructures

    Get PDF
    This book contains the manuscripts that were accepted for publication in the MDPI Special Topic "Cyber Security and Critical Infrastructure" after a rigorous peer-review process. Authors from academia, government and industry contributed their innovative solutions, consistent with the interdisciplinary nature of cybersecurity. The book contains 16 articles: an editorial explaining current challenges, innovative solutions, real-world experiences including critical infrastructure, 15 original papers that present state-of-the-art innovative solutions to attacks on critical systems, and a review of cloud, edge computing, and fog's security and privacy issues

    QUALITY-DRIVEN CROSS LAYER DESIGN FOR MULTIMEDIA SECURITY OVER RESOURCE CONSTRAINED WIRELESS SENSOR NETWORKS

    Get PDF
    The strong need for security guarantee, e.g., integrity and authenticity, as well as privacy and confidentiality in wireless multimedia services has driven the development of an emerging research area in low cost Wireless Multimedia Sensor Networks (WMSNs). Unfortunately, those conventional encryption and authentication techniques cannot be applied directly to WMSNs due to inborn challenges such as extremely limited energy, computing and bandwidth resources. This dissertation provides a quality-driven security design and resource allocation framework for WMSNs. The contribution of this dissertation bridges the inter-disciplinary research gap between high layer multimedia signal processing and low layer computer networking. It formulates the generic problem of quality-driven multimedia resource allocation in WMSNs and proposes a cross layer solution. The fundamental methodologies of multimedia selective encryption and stream authentication, and their application to digital image or video compression standards are presented. New multimedia selective encryption and stream authentication schemes are proposed at application layer, which significantly reduces encryption/authentication complexity. In addition, network resource allocation methodologies at low layers are extensively studied. An unequal error protection-based network resource allocation scheme is proposed to achieve the best effort media quality with integrity and energy efficiency guarantee. Performance evaluation results show that this cross layer framework achieves considerable energy-quality-security gain by jointly designing multimedia selective encryption/multimedia stream authentication and communication resource allocation
    • …
    corecore