8,591 research outputs found

    Automatically Defined Templates for Improved Prediction of Non-stationary, Nonlinear Time Series in Genetic Programming

    Get PDF
    Soft methods of artificial intelligence are often used in the prediction of non-deterministic time series that cannot be modeled using standard econometric methods. These series, such as occur in finance, often undergo changes to their underlying data generation process resulting in inaccurate approximations or requiring additional human judgment and input in the process, hindering the potential for automated solutions. Genetic programming (GP) is a class of nature-inspired algorithms that aims to evolve a population of computer programs to solve a target problem. GP has been applied to time series prediction in finance and other domains. However, most GP-based approaches to these prediction problems do not consider regime change. This paper introduces two new genetic programming modularity techniques, collectively referred to as automatically defined templates, which better enable prediction of time series involving regime change. These methods, based on earlier established GP modularity techniques, take inspiration from software design patterns and are more closely modeled after the way humans actually develop software. Specifically, a regime detection branch is incorporated into the GP paradigm. Regime specific behavior evolves in a separate program branch, implementing the template method pattern. A system was developed to test, validate, and compare the proposed approach with earlier approaches to GP modularity. Prediction experiments were performed on synthetic time series and on the S&P 500 index. The performance of the proposed approach was evaluated by comparing prediction accuracy with existing methods. One of the two techniques proposed is shown to significantly improve performance of time series prediction in series undergoing regime change. The second proposed technique did not show any improvement and performed generally worse than existing methods or the canonical approaches. The difference in relative performance was shown to be due to a decoupling of reusable modules from the evolving main program population. This observation also explains earlier results regarding the inferior performance of genetic programming techniques using a similar, decoupled approach. Applied to financial time series prediction, the proposed approach beat a buy and hold return on the S&P 500 index as well as the return achieved by other regime aware genetic programming methodologies. No approach tested beat the benchmark return when factoring in transaction costs

    Geometric combinatorics and computational molecular biology: branching polytopes for RNA sequences

    Full text link
    Questions in computational molecular biology generate various discrete optimization problems, such as DNA sequence alignment and RNA secondary structure prediction. However, the optimal solutions are fundamentally dependent on the parameters used in the objective functions. The goal of a parametric analysis is to elucidate such dependencies, especially as they pertain to the accuracy and robustness of the optimal solutions. Techniques from geometric combinatorics, including polytopes and their normal fans, have been used previously to give parametric analyses of simple models for DNA sequence alignment and RNA branching configurations. Here, we present a new computational framework, and proof-of-principle results, which give the first complete parametric analysis of the branching portion of the nearest neighbor thermodynamic model for secondary structure prediction for real RNA sequences.Comment: 17 pages, 8 figure

    Evolutionary Computation and QSAR Research

    Get PDF
    [Abstract] The successful high throughput screening of molecule libraries for a specific biological property is one of the main improvements in drug discovery. The virtual molecular filtering and screening relies greatly on quantitative structure-activity relationship (QSAR) analysis, a mathematical model that correlates the activity of a molecule with molecular descriptors. QSAR models have the potential to reduce the costly failure of drug candidates in advanced (clinical) stages by filtering combinatorial libraries, eliminating candidates with a predicted toxic effect and poor pharmacokinetic profiles, and reducing the number of experiments. To obtain a predictive and reliable QSAR model, scientists use methods from various fields such as molecular modeling, pattern recognition, machine learning or artificial intelligence. QSAR modeling relies on three main steps: molecular structure codification into molecular descriptors, selection of relevant variables in the context of the analyzed activity, and search of the optimal mathematical model that correlates the molecular descriptors with a specific activity. Since a variety of techniques from statistics and artificial intelligence can aid variable selection and model building steps, this review focuses on the evolutionary computation methods supporting these tasks. Thus, this review explains the basic of the genetic algorithms and genetic programming as evolutionary computation approaches, the selection methods for high-dimensional data in QSAR, the methods to build QSAR models, the current evolutionary feature selection methods and applications in QSAR and the future trend on the joint or multi-task feature selection methods.Instituto de Salud Carlos III, PIO52048Instituto de Salud Carlos III, RD07/0067/0005Ministerio de Industria, Comercio y Turismo; TSI-020110-2009-53)Galicia. ConsellerĂ­a de EconomĂ­a e Industria; 10SIN105004P

    Metaheuristic Design Patterns: New Perspectives for Larger-Scale Search Architectures

    Get PDF
    Design patterns capture the essentials of recurring best practice in an abstract form. Their merits are well established in domains as diverse as architecture and software development. They offer significant benefits, not least a common conceptual vocabulary for designers, enabling greater communication of high-level concerns and increased software reuse. Inspired by the success of software design patterns, this chapter seeks to promote the merits of a pattern-based method to the development of metaheuristic search software components. To achieve this, a catalog of patterns is presented, organized into the families of structural, behavioral, methodological and component-based patterns. As an alternative to the increasing specialization associated with individual metaheuristic search components, the authors encourage computer scientists to embrace the ‘cross cutting' benefits of a pattern-based perspective to optimization algorithms. Some ways in which the patterns might form the basis of further larger-scale metaheuristic component design automation are also discussed

    Navigation for automatic guided vehicles using omnidirectional optical sensing

    Get PDF
    Thesis (M. Tech. (Engineering: Electrical)) -- Central University of technology, Free State, 2013Automatic Guided Vehicles (AGVs) are being used more frequently in a manufacturing environment. These AGVs are navigated in many different ways, utilising multiple types of sensors for detecting the environment like distance, obstacles, and a set route. Different algorithms or methods are then used to utilise this environmental information for navigation purposes applied onto the AGV for control purposes. Developing a platform that could be easily reconfigured in alternative route applications utilising vision was one of the aims of the research. In this research such sensors detecting the environment was replaced and/or minimised by the use of a single, omnidirectional Webcam picture stream utilising an own developed mirror and Perspex tube setup. The area of interest in each frame was extracted saving on computational recourses and time. By utilising image processing, the vehicle was navigated on a predetermined route. Different edge detection methods and segmentation methods were investigated on this vision signal for route and sign navigation. Prewitt edge detection was eventually implemented, Hough transfers used for border detection and Kalman filtering for minimising border detected noise for staying on the navigated route. Reconfigurability was added to the route layout by coloured signs incorporated in the navigation process. The result was the manipulation of a number of AGV’s, each on its own designated coloured signed route. This route could be reconfigured by the operator with no programming alteration or intervention. The YCbCr colour space signal was implemented in detecting specific control signs for alternative colour route navigation. The result was used generating commands to control the AGV through serial commands sent on a laptop’s Universal Serial Bus (USB) port with a PIC microcontroller interface board controlling the motors by means of pulse width modulation (PWM). A total MATLAB® software development platform was utilised by implementing written M-files, Simulink® models, masked function blocks and .mat files for sourcing the workspace variables and generating executable files. This continuous development system lends itself to speedy evaluation and implementation of image processing options on the AGV. All the work done in the thesis was validated by simulations using actual data and by physical experimentation

    ePlant and the 3D Data Display Initiative: Integrative Systems Biology on the World Wide Web

    Get PDF
    Visualization tools for biological data are often limited in their ability to interactively integrate data at multiple scales. These computational tools are also typically limited by two-dimensional displays and programmatic implementations that require separate configurations for each of the user's computing devices and recompilation for functional expansion. Towards overcoming these limitations we have developed “ePlant” (http://bar.utoronto.ca/eplant) – a suite of open-source world wide web-based tools for the visualization of large-scale data sets from the model organism Arabidopsis thaliana. These tools display data spanning multiple biological scales on interactive three-dimensional models. Currently, ePlant consists of the following modules: a sequence conservation explorer that includes homology relationships and single nucleotide polymorphism data, a protein structure model explorer, a molecular interaction network explorer, a gene product subcellular localization explorer, and a gene expression pattern explorer. The ePlant's protein structure explorer module represents experimentally determined and theoretical structures covering >70% of the Arabidopsis proteome. The ePlant framework is accessed entirely through a web browser, and is therefore platform-independent. It can be applied to any model organism. To facilitate the development of three-dimensional displays of biological data on the world wide web we have established the “3D Data Display Initiative” (http://3ddi.org)

    The search for spinning black hole binaries in mock LISA data using a genetic algorithm

    Full text link
    Coalescing massive Black Hole binaries are the strongest and probably the most important gravitational wave sources in the LISA band. The spin and orbital precessions bring complexity in the waveform and make the likelihood surface richer in structure as compared to the non-spinning case. We introduce an extended multimodal genetic algorithm which utilizes the properties of the signal and the detector response function to analyze the data from the third round of mock LISA data challenge (MLDC 3.2). The performance of this method is comparable, if not better, to already existing algorithms. We have found all five sources present in MLDC 3.2 and recovered the coalescence time, chirp mass, mass ratio and sky location with reasonable accuracy. As for the orbital angular momentum and two spins of the Black Holes, we have found a large number of widely separated modes in the parameter space with similar maximum likelihood values.Comment: 25 pages, 9 figure
    • …
    corecore