16 research outputs found

    A Field Guide to Genetic Programming

    Get PDF
    xiv, 233 p. : ill. ; 23 cm. Electronic book. A Field Guide to Genetic Programming (ISBN 978-1-4092-0073-4) is an introduction to genetic programming (GP). GP is a systematic, domain-independent method for getting computers to solve problems automatically, starting from a high-level statement of what needs to be done. Using ideas from natural evolution, GP starts from an ooze of random computer programs and progressively refines them through processes of mutation and sexual recombination until solutions emerge. All this without the user having to know or specify the form or structure of solutions in advance. GP has generated a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions.
    Contents: Introduction -- Representation, initialisation and operators in Tree-based GP -- Getting ready to run genetic programming -- Example genetic programming run -- Alternative initialisations and operators in Tree-based GP -- Modular, grammatical and developmental Tree-based GP -- Linear and graph genetic programming -- Probabilistic genetic programming -- Multi-objective genetic programming -- Fast and distributed genetic programming -- GP theory and its applications -- Applications -- Troubleshooting GP -- Conclusions -- Appendix A: Resources -- Appendix B: TinyGP -- Bibliography -- Index.
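    As a minimal illustration of the evolutionary loop described in the abstract above (random programs refined by selection, crossover and mutation), the sketch below evolves expression trees for a toy symbolic-regression task in Python. It is in the spirit of the book's TinyGP appendix, but the function set, parameters and target problem are illustrative assumptions, not the book's code.

# Minimal tree-based GP sketch (illustrative only; not the book's implementation).
import random, operator

FUNCS = [(operator.add, 2), (operator.sub, 2), (operator.mul, 2)]
TERMS = ['x'] + [random.uniform(-1, 1) for _ in range(3)]   # variable + random constants

def random_tree(depth=3):
    """Grow a random expression tree: nested tuples (func, child, child) or a terminal."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMS)
    f, arity = random.choice(FUNCS)
    return (f, *[random_tree(depth - 1) for _ in range(arity)])

def evaluate(tree, x):
    if tree == 'x':
        return x
    if not isinstance(tree, tuple):
        return tree
    f, *children = tree
    return f(*[evaluate(c, x) for c in children])

def fitness(tree, cases):
    """Negative total absolute error on the fitness cases (higher is better)."""
    return -sum(abs(evaluate(tree, x) - y) for x, y in cases)

def nodes(tree, path=()):
    """All (path, subtree) pairs; used to pick crossover and mutation points."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, c in enumerate(tree[1:], start=1):
            yield from nodes(c, path + (i,))

def replace(tree, path, new):
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(a, b):
    pa, _ = random.choice(list(nodes(a)))       # crossover point in parent a
    _, sb = random.choice(list(nodes(b)))       # donated subtree from parent b
    return replace(a, pa, sb)

def mutate(tree):
    p, _ = random.choice(list(nodes(tree)))
    return replace(tree, p, random_tree(2))     # subtree mutation

def tournament(pop, cases, k=3):
    return max(random.sample(pop, k), key=lambda t: fitness(t, cases))

# Toy target: symbolic regression of y = x^2 + x over a few sample points.
cases = [(x / 10.0, (x / 10.0) ** 2 + x / 10.0) for x in range(-10, 11)]
pop = [random_tree() for _ in range(200)]
for gen in range(30):
    pop = [mutate(crossover(tournament(pop, cases), tournament(pop, cases)))
           if random.random() < 0.9 else tournament(pop, cases)
           for _ in range(len(pop))]
    best = max(pop, key=lambda t: fitness(t, cases))
    print(gen, fitness(best, cases))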

    Sound Processing for Autonomous Driving

    Get PDF
    A variety of intelligent systems for autonomous driving have been developed and have already shown a very high level of capability. One of the prerequisites for autonomous driving is an accurate and reliable representation of the environment around the vehicle. Current systems rely on cameras, RADAR, and LiDAR to capture the visual environment and to locate and track other traffic participants. Human drivers, however, also have hearing and use a great deal of auditory information to understand the environment alongside visual cues. In this thesis, we present a sound signal processing system for auditory-based environment representation. Sound is less affected by occlusion than the other sensing modalities and, in some situations, is less sensitive to weather conditions such as snow, ice, fog or rain. Various audio processing algorithms provide the detection, classification and localization of audio signals specific to certain types of vehicles. First, the ambient sound is classified into fourteen major categories covering traffic objects and the actions they perform; in addition, three specific types of emergency vehicle sirens are classified. Second, each object is localized using a combined localization algorithm based on time difference of arrival and amplitude. The system is evaluated on real data with a focus on reliable detection and accurate localization of emergency vehicles. Third, the sound source can be visualized on the image from the autonomous vehicle's camera system; for this purpose, a method for camera-to-microphone calibration has been developed. The presented approaches and methods have great potential to increase the accuracy of environment perception and, consequently, to improve the reliability and safety of autonomous driving systems in general.
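    As a rough illustration of the time-difference-of-arrival component of the localisation described above, the sketch below estimates the delay between two microphone signals by cross-correlation and converts it to a far-field bearing. The sampling rate, microphone spacing and synthetic "siren" burst are assumptions for the example, not the thesis configuration.

# Illustrative TDOA sketch (not the thesis implementation).
import numpy as np

FS = 16000          # sampling rate [Hz] (assumed)
C = 343.0           # speed of sound [m/s]
MIC_SPACING = 0.5   # distance between the two microphones [m] (assumed)

def tdoa(sig_a, sig_b, fs=FS):
    """Delay (seconds) of sig_b relative to sig_a, positive if mic B hears it later."""
    corr = np.correlate(sig_b, sig_a, mode="full")
    lag = int(np.argmax(corr)) - (len(sig_a) - 1)
    return lag / fs

def bearing(delta_t, spacing=MIC_SPACING, c=C):
    """Far-field angle of arrival implied by a TDOA, clipped to a valid range."""
    return np.degrees(np.arcsin(np.clip(c * delta_t / spacing, -1.0, 1.0)))

if __name__ == "__main__":
    t = np.arange(0, 0.1, 1 / FS)
    siren = np.sin(2 * np.pi * 700 * t) * np.hanning(t.size)   # toy siren burst
    true_delay = 10                                             # samples
    mic_a = np.concatenate([siren, np.zeros(true_delay)])
    mic_b = np.concatenate([np.zeros(true_delay), siren])       # arrives later at mic B
    dt = tdoa(mic_a, mic_b)
    print(f"estimated TDOA: {dt*1e3:.2f} ms, bearing: {bearing(dt):.1f} deg")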

    Field Guide to Genetic Programming

    Get PDF

    Harnessing the Power of Generative Models for Mobile Continuous and Implicit Authentication

    Get PDF
    Authenticating a user's identity lies at the heart of securing any information system. A trade-off currently exists between user experience and the level of security a system enforces. With Continuous and Implicit Authentication, a user's identity can be verified without any active participation, increasing both the level of security, thanks to continuous verification, and the user experience, thanks to its implicit nature. This thesis studies using the inertial sensor data of mobile devices to identify unique movements and patterns that identify the owner of the device at all times. We implement and evaluate approaches proposed in related work as well as novel approaches based on a variety of machine learning models, in particular a kind of autoencoder (AE) called the Variational Autoencoder (VAE), which belongs to the family of generative models. We evaluate numerous machine learning models for the anomaly (outlier) detection task of spotting a malicious user or unauthorised entity currently using the smartphone. We evaluate the results under conditions similar to other works as well as under conditions typically observed in real-world applications. We find that the shallow VAE is the best-performing semi-supervised anomaly detector in our evaluations and hence the most suitable for the proposed design. The thesis concludes with recommendations for enhancing the system and for the research body dedicated to Continuous and Implicit Authentication for mobile security.
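    To make the anomaly-detection design concrete, the sketch below shows a shallow Variational Autoencoder trained only on the device owner's inertial-sensor windows and used to score new windows by their loss (reconstruction error plus KL term). The window size, layer widths and training setup are illustrative assumptions rather than the architecture evaluated in the thesis.

# Shallow-VAE anomaly scoring sketch (assumed architecture, synthetic data).
import torch
import torch.nn as nn

WINDOW = 128 * 6   # e.g. 128 samples x 6 accelerometer/gyroscope channels, flattened (assumed)

class ShallowVAE(nn.Module):
    def __init__(self, d_in=WINDOW, d_hidden=256, d_latent=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.mu = nn.Linear(d_hidden, d_latent)
        self.logvar = nn.Linear(d_hidden, d_latent)
        self.dec = nn.Sequential(nn.Linear(d_latent, d_hidden), nn.ReLU(),
                                 nn.Linear(d_hidden, d_in))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterisation trick
        return self.dec(z), mu, logvar

def elbo_loss(x, x_hat, mu, logvar):
    recon = ((x - x_hat) ** 2).sum(dim=1)                         # Gaussian reconstruction term
    kl = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum(dim=1)  # KL divergence to N(0, I)
    return recon + kl

# Train on the legitimate owner's windows only; at test time a window is flagged
# as anomalous (possible impostor) if its loss exceeds a threshold calibrated on
# held-out owner data.
model = ShallowVAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
owner_windows = torch.randn(512, WINDOW)          # placeholder for real sensor windows
for epoch in range(5):
    x_hat, mu, logvar = model(owner_windows)
    loss = elbo_loss(owner_windows, x_hat, mu, logvar).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

test_window = torch.randn(1, WINDOW)
score = elbo_loss(test_window, *model(test_window)).item()
print("anomaly score:", score)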

    Workshop Proceedings of the 12th edition of the KONVENS conference

    Get PDF
    The 2014 edition of KONVENS is even more a forum for exchange: its main topic is the interaction between Computational Linguistics and Information Science, and the synergies that such interaction, cooperation and integrated views can produce. This topic, at the crossroads of research traditions that deal with natural language as a container of knowledge and with methods to extract and manage linguistically represented knowledge, is close to the heart of many researchers at the Institut fĂŒr Informationswissenschaft und Sprachtechnologie of UniversitĂ€t Hildesheim: it has long been one of the institute's research topics, and it has received even more attention over the last few years.

    Using MapReduce Streaming for Distributed Life Simulation on the Cloud

    Get PDF
    Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome the complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway's Life according to a general MR streaming pattern. We chose Life because it is simple enough as a testbed for MR's applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms' performance on Amazon's Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.
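    A minimal MapReduce-streaming sketch of one Game-of-Life generation, in the style the abstract describes: the mapper emits a contribution from each live cell to its neighbours, and the reducer aggregates per cell and applies the birth/survival rules. It omits the strip-partitioning optimisation and the continuous-Life variant; the file names and invocation are assumptions, not the authors' code.

# Hadoop-Streaming-style mapper/reducer pair for one Game-of-Life generation (sketch).
import sys
from itertools import groupby

def mapper(lines):
    """For every live cell "row,col", emit a count of 1 to each neighbour and mark itself alive."""
    for line in lines:
        r, c = map(int, line.strip().split(","))
        print(f"{r},{c}\tALIVE")
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr or dc:
                    print(f"{r+dr},{c+dc}\t1")

def reducer(lines):
    """Sum neighbour counts per cell key (input sorted by key) and apply the B3/S23 rules."""
    parsed = (line.rstrip("\n").split("\t") for line in lines)
    for key, group in groupby(parsed, key=lambda kv: kv[0]):
        values = [v for _, v in group]
        alive = "ALIVE" in values
        neighbours = sum(int(v) for v in values if v != "ALIVE")
        if neighbours == 3 or (alive and neighbours == 2):
            print(key)                       # cell is live in the next generation

if __name__ == "__main__":
    # Hadoop Streaming would run mapper and reducer as separate scripts; locally you can pipe:
    #   cat cells.txt | python life.py map | sort | python life.py reduce
    stage = sys.argv[1] if len(sys.argv) > 1 else "map"
    (mapper if stage == "map" else reducer)(sys.stdin)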

    Improving transductive data selection algorithms for machine translation

    Get PDF
    In this work, we study different ways of improving Machine Translation models by using the subset of training data that is most relevant to the test set. This is achieved by using Transductive Algorithms (TA) for data selection. In particular, we explore two methods: Infrequent N-gram Recovery (INR) and Feature Decay Algorithms (FDA). Statistical Machine Translation (SMT) models do not always perform better when more data are used for training: using these techniques to extract the training sentences leads to better performance when translating a particular test set than using the complete training dataset. Neural Machine Translation (NMT) models can outperform SMT models, but they require more data to achieve their best performance. In this thesis, we explore how INR and FDA can also be beneficial for improving NMT models with just a fraction of the available data. On top of that, we propose several improvements to these data-selection methods by exploiting information on the target side. First, we use the alignment between words on the source and target sides to modify the selection criteria: sentences containing n-grams that are more difficult to translate are promoted, so that more occurrences of these n-grams are selected. Another proposed extension is to select sentences based not on the test set itself but on an MT-generated approximate translation of it, so that the target side of the sentences is considered in the selection criteria. Finally, target-language sentences can be translated into the source language so that INR and FDA have more candidate sentences to select from.
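    The sketch below illustrates the core idea behind FDA-style transductive data selection: score each candidate training sentence by its overlap with the test set's n-grams, and decay the weight of n-grams that are already covered so that later picks favour still-uncovered ones. The decay factor, n-gram order and length normalisation are simplifying assumptions, not the INR/FDA variants studied in the thesis.

# Simplified FDA-style data selection sketch (toy data, assumed parameters).
from collections import Counter

def ngrams(tokens, max_n=3):
    return [tuple(tokens[i:i+n]) for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]

def fda_select(train_sents, test_sents, k, decay=0.5):
    weights = Counter()
    for sent in test_sents:                       # initial weights: test-set n-gram counts
        weights.update(ngrams(sent.split()))
    selected, pool = [], list(train_sents)
    for _ in range(min(k, len(pool))):
        def score(sent):
            feats = ngrams(sent.split())
            return sum(weights[f] for f in feats) / (len(feats) or 1)
        best = max(pool, key=score)               # greedily pick the most relevant sentence
        pool.remove(best)
        selected.append(best)
        for f in ngrams(best.split()):            # decay the weight of now-covered features
            weights[f] *= decay
    return selected

test = ["the cat sat on the mat"]
train = ["a dog barked", "the cat sat quietly", "on the mat it sat", "hello world"]
print(fda_select(train, test, k=2))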

    Design and development of prognostic and health management system for fly-by-wire primary flight control

    Get PDF
    Electro-Hydraulic Servo Actuators (EHSA) are the principal technology used for primary flight control in new aircraft and legacy platforms. The development of Prognostic and Health Management (PHM) technologies and their application to EHSA systems is of great interest to both the aerospace industry and air fleet operators. This Ph.D. thesis is the result of research activity focused on the development of a PHM system for the servovalve of fly-by-wire primary flight control EHSAs. One of the key features of the research is the implementation of a PHM system without the addition of new sensors, taking advantage of sensing and information already available. This choice allows the PHM capability to be extended to the EHSAs of legacy platforms and not only to new aircraft. The enabling technologies borrow from Bayesian estimation theory, specifically particle filtering: the information acquired from the EHSA during the pre-flight check is processed by appropriate algorithms in order to obtain relevant features, detect degradation and estimate the Remaining Useful Life (RUL). The results are evaluated through appropriate metrics in order to assess the performance and effectiveness of the implemented PHM system. The major objective of this contribution is to develop an innovative fault diagnosis and failure prognosis framework for critical aircraft components that effectively integrates mathematically rigorous and validated signal processing, feature extraction, diagnostic and prognostic algorithms with novel uncertainty representation and management tools, in a platform that is computationally efficient and ready to be transitioned on board an aircraft.
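    As a toy illustration of the particle-filtering idea behind the prognostic framework, the sketch below tracks a hidden degradation indicator from noisy pre-flight-check features and reports a median Remaining Useful Life as the predicted number of flights until a failure threshold is crossed. The exponential degradation model, noise levels and threshold are invented for the example and are not the thesis' servovalve model.

# Toy particle filter for degradation tracking and RUL estimation (assumed model).
import numpy as np

rng = np.random.default_rng(0)
N = 1000                 # number of particles
THRESHOLD = 1.0          # assumed failure threshold on the degradation indicator

def propagate(particles):
    """Assumed exponential-growth degradation model with process noise."""
    d, rate = particles[:, 0], particles[:, 1]
    d = d * np.exp(rate) + rng.normal(0, 0.002, d.size)
    rate = np.abs(rate + rng.normal(0, 0.001, rate.size))
    return np.column_stack([d, rate])

def update(particles, measurement, sigma=0.02):
    """Weight particles by the likelihood of the observed feature, then resample."""
    w = np.exp(-0.5 * ((measurement - particles[:, 0]) / sigma) ** 2) + 1e-12
    w /= w.sum()
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]

def rul(particles):
    """Median number of flights until particles are predicted to cross the threshold."""
    d, rate = particles[:, 0], np.maximum(particles[:, 1], 1e-6)
    return np.median(np.log(THRESHOLD / np.maximum(d, 1e-6)) / rate)

# Initial belief and a synthetic sequence of pre-flight-check features.
particles = np.column_stack([rng.uniform(0.05, 0.15, N), rng.uniform(0.01, 0.05, N)])
true_d = 0.1
for flight in range(1, 31):
    true_d *= np.exp(0.03)                               # "true" hidden degradation
    z = true_d + rng.normal(0, 0.02)                     # noisy feature from the check
    particles = update(propagate(particles), z)
    print(f"flight {flight:2d}  feature {z:.3f}  median RUL ~ {rul(particles):.0f} flights")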

    Information technologies for pain management

    Get PDF
    Millions of people around the world suffer from acute or chronic pain, which raises the importance of its screening, assessment and treatment. The importance of pain is attested by the fact that it is considered the fifth vital sign for indicating basic bodily functions, health and quality of life, together with the four other vital signs: blood pressure, body temperature, pulse rate and respiratory rate. However, while these four signals represent objective physical parameters, pain expresses an emotional state that exists only in the mind of each individual and is therefore highly subjective, which makes its management and evaluation difficult. For this reason, self-report is considered the most accurate pain assessment method, in which patients are periodically asked to rate their pain severity and related symptoms. Thus, in recent years computerised systems based on mobile and web technologies have become increasingly used to enable patients to report their pain, leading to the development of electronic pain diaries (ED). This approach can give healthcare professionals (HCP) and patients the ability to interact with the system anywhere and at any time; it thoroughly changes the coordinates of time and place and offers invaluable opportunities for healthcare delivery. However, most of these systems were designed to interact directly with patients without the presence of a healthcare professional and without evidence of reliability and accuracy. In fact, an observation of the existing systems revealed a lack of integration with mobile devices, limited use of web-based interfaces and reduced interaction with patients in terms of obtaining and viewing information. In addition, the reliability and accuracy of computerised systems for pain management are rarely demonstrated, and their effects on HCP and patient outcomes remain understudied. This thesis focuses on technology for pain management and proposes a monitoring system with ubiquitous interfaces specifically oriented to either patients or HCP, using mobile devices and the Internet, so as to allow decisions based on the knowledge obtained from analysis of the collected data. With interoperability and cloud computing technologies in mind, the system uses web services (WS) to manage data, which are stored in a Personal Health Record (PHR). A Randomised Controlled Trial (RCT) was implemented to determine the effectiveness of the proposed computerised monitoring system. The six-week RCT evidenced the advantages of ubiquitous access: HCP and patients were able to interact with the system anywhere and at any time, using WS to send and receive data. In addition, the collected data were stored in a PHR, which offers integrity and security as well as permanent online accessibility to both patients and HCP. The study showed not only that the majority of participants recommend the system, but also that they recognise its suitability for pain management without requiring advanced skills or experienced users. Furthermore, the system enabled the definition and management of patient-oriented treatments with reduced therapist time.
    The study also revealed that the guidance of HCP at the beginning of the monitoring is crucial to patients' satisfaction with, and experience of, the system, as evidenced by the high correlation between recommendation of the application and its suitability to improve pain management and to provide medical information. There were no significant differences in the improvement of the quality of pain treatment between the intervention group and the control group. Based on the data collected during the RCT, a clinical decision support system (CDSS) was developed to offer tailored alarms, reports and clinical guidance. This CDSS, called the Patient Oriented Method of Pain Evaluation System (POMPES), is based on the combination of several statistical models (one-way ANOVA, Kruskal-Wallis and Tukey-Kramer) with an imputation model based on linear regression. The decisions suggested by the system fully agreed with the medical diagnosis, demonstrating its suitability for managing pain. Finally, drawing on the capability of aerospace systems to deal with complex data sources of varied complexity and accuracy, an innovative model was proposed. This model is characterised by a qualitative analysis stemming from a data fusion method, combined with a quantitative model based on comparing standard deviations together with the values of mathematical expectations. The model was used to compare the effects of technological and pen-and-paper systems on different dimensions of pain, such as pain intensity, anxiety, catastrophizing, depression, disability and interference. It was observed that pen-and-paper and technology produced equivalent effects on anxiety, depression, interference and pain intensity; in contrast, technology showed favourable effects in terms of catastrophizing and disability. The proposed method proved to be suitable, intelligible, easy to implement and economical in time and resources. Further work is needed to evaluate the proposed system with participants followed up for longer periods of time, including a complementary RCT encompassing patients with chronic pain symptoms. Finally, additional studies should address the economic effects not only on patients but also on the healthcare system.
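    The statistical building blocks named above for POMPES can be illustrated with a short script: one-way ANOVA, Kruskal-Wallis and a Tukey HSD post-hoc test (standing in for the Tukey-Kramer procedure mentioned in the abstract) comparing pain scores across assessment weeks, plus a linear-regression imputation of a missing diary entry. The numbers are synthetic and the grouping by week is an assumption; the thesis combines these models on real RCT data.

# Sketch of the POMPES statistical components on synthetic pain scores.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
week1 = rng.normal(6.0, 1.0, 20)     # self-reported pain scores (0-10), week 1 (synthetic)
week3 = rng.normal(5.2, 1.0, 20)
week6 = rng.normal(4.1, 1.0, 20)

f_stat, p_anova = stats.f_oneway(week1, week3, week6)        # parametric comparison
h_stat, p_kruskal = stats.kruskal(week1, week3, week6)       # non-parametric analogue
tukey = stats.tukey_hsd(week1, week3, week6)                 # pairwise post-hoc comparisons
print(f"ANOVA p={p_anova:.4f}  Kruskal-Wallis p={p_kruskal:.4f}")
print(tukey)

# Linear-regression imputation of a missing pain report from the days around it.
days = np.array([1, 2, 3, 5, 6, 7])          # day 4 is missing
scores = np.array([6.5, 6.1, 5.9, 5.2, 5.0, 4.8])
slope, intercept = np.polyfit(days, scores, 1)
print(f"imputed day-4 score ~ {slope * 4 + intercept:.2f}")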

    XV. Magyar Szåmítógépes Nyelvészeti Konferencia (15th Hungarian Conference on Computational Linguistics)

    Get PDF