4 research outputs found

    Preliminary Results For Neuroevolutionary Optimization Phase Order Generation For Static Compilation

    There is a complex web of interactions between optimization phases in static program compilation. Because there are many different types of optimizations, and each changes the form of the program and can affect the result of subsequent optimizations, selecting which optimizations to apply is challenging; this is known as the optimization phase ordering problem. To gain the most benefit, both the choice of optimizations and the order in which they are applied need to be tuned to the statistics and other features of the program. In this work we propose the use of evolved neural networks to intelligently choose which optimizations are applied, and in what order, to a method or a program as a whole, based on its features. In this paper we study the use of a memetic-algorithm-based neuroevolutionary system called DXNN, and a genetic-algorithm-based neuroevolutionary system called NEAT, to evolve such neural networks. Copyright 2014 ACM

    Data Representation in Machine Learning Methods with its Application to Compilation Optimization and Epitope Prediction

    In this dissertation we explore the application of machine learning algorithms to compilation phase order optimization and to epitope prediction. The common thread running through these two disparate domains is the type of data being dealt with. In both problem domains we are dealing with categorical data, whose representation plays a significant role in the performance of classification algorithms. We first present a neuroevolutionary approach which orders optimization phases to generate compiled programs with performance superior to those compiled using LLVM's -O3 optimization level. Performance improvements, calculated as the speed of the compiled program's execution, ranged from 27% for the ccbench program to 40.8% for bzip2. This dissertation then explores the problem of data representation of 3D biological data, such as amino acids. A new approach for distributed representation of 3D biological data through the process of embedding is proposed and explored. Analogously to word embedding, we developed a system that uses atomic and residue coordinates to generate distributed representations for residues, which we call 3D Residue BioVectors. Preliminary results are presented which demonstrate that even low-dimensional 3D Residue BioVectors can be used to predict conformational epitopes and protein-protein interactions with promising proficiency. The generation of such 3D BioVectors, and the proposed methodology, opens the door for substantial future improvements and application domains. The dissertation then explores the problem domain of linear B-cell epitope prediction. This problem domain deals with predicting epitopes based strictly on the protein sequence. We present the DRREP system, which demonstrates how an ensemble of shallow neural networks can be combined with string kernels and an analytical learning algorithm to produce state-of-the-art epitope prediction results. DRREP was tested on the SARS subsequence, the HIV, Pellequer, and AntiJen datasets, and the standard SEQ194 test dataset. AUC improvements achieved over the state of the art ranged from 3% to 8%. Finally, we present the SEEP epitope classifier, a multi-resolution SVM-ensemble-based classifier which uses the conjoint triad feature representation and produces state-of-the-art classification results. SEEP leverages the domain-specific, knowledge-based protein sequence encoding developed within the protein-protein interaction research domain. Using an ensemble of multi-resolution SVMs and a sliding-window-based pre- and post-processing pipeline, SEEP achieves an AUC of 91.2 on the standard SEQ194 test dataset, a 24% improvement over the state of the art.

    Inductive Logic Programming for Compiler Tuning
