    End-to-End Translation Validation for the Halide Language

    This paper considers the correctness of domain-specific compilers for tensor programming languages through the study of Halide, a popular representative. It describes a translation validation algorithm for affine Halide specifications that is independent of the scheduling language. The algorithm relies on "prophetic" annotations added by the compiler to the generated array assignments. These annotations provide a refinement mapping from assignments in the generated code to the tensor definitions in the specification. Our implementation leverages an affine solver and a general SMT solver, and scales to complete Halide benchmarks.

    Análisis de recursos de programas enteros y abstractos

    Unpublished thesis of the Universidad Complutense de Madrid, Facultad de Informática, Departamento de Sistemas Informáticos y de Computación, defended on 27-05-2022. Since the beginning of automated computing in the middle of the last century, the development of computer science has been linked to its growing importance in all areas of society. The inclusion of computing processes in everyday life and, in particular, in critical situations cannot rest only on the production of hardware and software; it also requires the analysis and verification of all their components. Hardware analysis is crucial for building and maintaining the computing infrastructure, because it can detect or predict components that may malfunction. Software analysis, in turn, focuses on the behavior of computer programs in order to address properties such as security, correctness, or optimality. Depending on the type of analysis applied to the software, we can detect potential vulnerabilities in the code, find incorrect specifications, apply optimizations based on the maximum and minimum cost of the programs, calculate the resource consumption of a program...

    Logical Inference Techniques for Loop Parallelization

    This paper presents a fully automatic approach to loop parallelization that integrates static and run-time analysis and thus overcomes many known difficulties, such as nonlinear and indirect array indexing and complex control flow. Our hybrid analysis framework validates the parallelization transformation by verifying the independence of the loop's memory references. To this end, it represents array references using the USR (uniform set representation) language and expresses the independence condition as an equation, S = ∅, where S is a set expression representing array indexes. Using a language instead of an array-abstraction representation for S results in fewer conservative approximations but carries a potentially high run-time cost. To alleviate this cost, we introduce a language translation F from the USR set-expression language to an equally rich language of predicates (F(S) ⇒ S = ∅). Loop parallelization is then validated using a novel logic inference algorithm that factorizes the resulting complex predicates (F(S)) into a sequence of sufficient-independence conditions, which are evaluated first statically and, when needed, dynamically, in increasing order of their estimated complexities. We evaluate our automated solution on 26 benchmarks from the PERFECT-CLUB and SPEC suites and show that our approach is effective in parallelizing large, complex loops and obtains much better full-program speedups than the Intel and IBM Fortran compilers.
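    The idea of trying sufficient-independence conditions in increasing order of cost can be illustrated with a minimal sketch. It is not the USR language or the paper's inference algorithm. It assumes a toy loop in which iteration i writes a[writes[i]] and reads a[reads[i]], and the function names are hypothetical.

```python
def interval_test(writes, reads):
    """Cheap sufficient condition (hypothetical): one linear scan, no hashing.

    True only if the writes are strictly increasing (no two iterations
    write the same cell) and the read range lies entirely outside the
    write range. False means "unknown", not "dependent".
    """
    if not writes:
        return True
    strictly_increasing = all(a < b for a, b in zip(writes, writes[1:]))
    disjoint = (not reads
                or max(reads) < min(writes)
                or min(reads) > max(writes))
    return strictly_increasing and disjoint


def exact_test(writes, reads):
    """Costlier exact check that the cross-iteration dependence set is empty."""
    writer = {}
    for i, w in enumerate(writes):
        if w in writer:
            return False  # two iterations write the same cell
        writer[w] = i
    # A read conflicts only if a *different* iteration writes that cell.
    return all(writer.get(r, j) == j for j, r in enumerate(reads))


def loop_is_independent(writes, reads):
    """Try the cheap predicate first and fall back to the exact one."""
    if interval_test(writes, reads):
        return True, "interval"
    return exact_test(writes, reads), "exact"
```

    For example, loop_is_independent([10, 11, 12], [0, 1, 2]) is settled by the cheap test, while interleaved index lists such as [0, 2, 4] and [1, 3, 5] need the exact test.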

    Improved static analysis and verification of energy consumption and other resources via abstract interpretation

    Resource analysis aims at inferring the cost of executing programs for any possible input, in terms of a given resource, such as the traditional execution steps, time or memory, and, more recently, energy consumption or user-defined resources (e.g., number of bits sent over a socket, number of database accesses, number of calls to particular procedures, etc.). This is performed statically, i.e., without actually running the programs. Resource usage information is useful for a variety of optimization and verification applications, as well as for guiding software design. For example, programmers can use such information to choose between different algorithmic solutions to a problem; program transformation systems can use cost information to choose between alternative transformations; parallelizing compilers can use cost estimates for granularity control, which tries to balance the overheads of task creation and manipulation against the benefits of parallelization. In this thesis we have significantly improved an existing prototype implementation for resource usage analysis based on abstract interpretation, addressing a number of relevant challenges and overcoming many limitations it presented. The goal of that prototype was to show the viability of casting the resource analysis as an abstract domain, and how it could overcome important limitations of the state-of-the-art resource usage analysis tools. For this purpose, it was implemented as an abstract domain in PLAI, the abstract interpretation framework of the CiaoPP system. We have improved both the design and implementation of the prototype, so that the tool can eventually evolve to the level of industrial application. The abstract operations of this tool depend heavily on setting up and finding closed-form solutions of recurrence relations that represent the resource usage behavior of program components and of the whole program. 
While there exist many tools, such as Computer Algebra Systems (CAS) and libraries, that are able to find closed-form solutions for some types of recurrences, none of them alone is able to handle all the types of recurrences arising during program analysis. In addition, there are some types of recurrences that cannot be solved by any existing tool. This clearly constitutes a bottleneck for this kind of resource usage analysis. Thus, one of the major challenges we have addressed in this thesis is the design and development of a novel modular framework for solving recurrence relations, able to combine and take advantage of the results of existing solvers. Additionally, we have developed and integrated into our novel solver a technique for finding upper-bound closed-form solutions of a special class of recurrence relations that arise during the analysis of programs with accumulating parameters. Finally, we have integrated the improved resource analysis into the CiaoPP general framework for resource usage verification, and specialized the framework for verifying energy consumption specifications of embedded imperative programs in a real application, showing the usefulness and practicality of the resulting tool.
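    To make the notion of a modular recurrence solver concrete, the following is a minimal sketch and not the CiaoPP implementation. It assumes a deliberately small recurrence class, C(0) = c0 and C(n) = a * C(n - 1) + b with integer coefficients, and two hypothetical back-ends that each either return a closed form or decline. A dispatcher tries them in order, which mirrors the idea of combining solvers that individually cover only some recurrence types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearRec:
    """C(0) = c0, C(n) = a * C(n - 1) + b for n >= 1 (integer coefficients)."""
    a: int
    b: int
    c0: int


def solve_arithmetic(rec):
    """Hypothetical back-end: handles only a == 1, giving C(n) = c0 + b * n."""
    if rec.a != 1:
        return None
    return lambda n: rec.c0 + rec.b * n


def solve_geometric(rec):
    """Hypothetical back-end: handles a != 1 via the geometric-sum formula."""
    if rec.a == 1:
        return None
    # a**n - 1 is always divisible by a - 1, so integer division is exact.
    return lambda n: rec.a**n * rec.c0 + rec.b * (rec.a**n - 1) // (rec.a - 1)


BACKENDS = [solve_arithmetic, solve_geometric]


def solve(rec):
    """Return the first closed form any back-end produces, else None."""
    for backend in BACKENDS:
        closed_form = backend(rec)
        if closed_form is not None:
            return closed_form
    return None


def unroll(rec, n):
    """Reference value obtained by evaluating the recurrence directly."""
    value = rec.c0
    for _ in range(n):
        value = rec.a * value + rec.b
    return value
```

    A closed form returned by solve can be checked against unroll for small n, which is a cheap sanity test when a new back-end is added to the list.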

    Certified Abstract Cost Analysis

    A program containing placeholders for unspecified statements or expressions is called an abstract (or schematic) program. Placeholder symbols occur naturally in program transformation rules, as used in refactoring, compilation, optimization, or parallelization. We present a generalization of automated cost analysis that can handle abstract programs and, hence, can analyze the impact of program transformations on cost. This kind of relational property requires provably precise cost bounds, which are not always produced by cost analysis. Therefore, we certify by deductive verification that the inferred abstract cost bounds are correct and sufficiently precise. It is the first approach to solve this problem. Both abstract cost analysis and certification are based on quantitative abstract execution (QAE), which in turn is a variation of abstract execution, a recently developed symbolic execution technique for abstract programs. To realize QAE, the new concept of a cost invariant is introduced. QAE is implemented and runs fully automatically on a benchmark set consisting of representative optimization rules.

    Efficient method for detection of periodic orbits in chaotic maps and flows

    An algorithm for detecting unstable periodic orbits in chaotic systems [Phys. Rev. E, 60 (1999), pp. 6172-6175], which combines the set of stabilising transformations proposed by Schmelcher and Diakonos [Phys. Rev. Lett., 78 (1997), pp. 4733-4736] with a modified semi-implicit Euler iterative scheme and seeding with periodic orbits of neighbouring periods, has been shown to be highly efficient when applied to low-dimensional systems. The difficulty in applying the algorithm to higher-dimensional systems is mainly due to the fact that the number of stabilising transformations grows extremely fast with increasing system dimension. In this thesis, we construct stabilising transformations based on the knowledge of the stability matrices of already detected periodic orbits (used as seeds). The advantage of our approach is a substantial reduction in the number of transformations, which increases the efficiency of the detection algorithm, especially in the case of high-dimensional systems. The performance of the new approach is illustrated by its application to the four-dimensional kicked double rotor map, a six-dimensional system of three coupled Hénon maps, and the Kuramoto-Sivashinsky system in the weakly turbulent regime. Comment: PhD thesis, 119 pages.
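    The stabilising transformation of Schmelcher and Diakonos replaces a map f by the iteration x ← x + λ C (f(x) − x), where C is a matrix with entries in {0, ±1} and exactly one nonzero entry per row and column. The sketch below applies it to period-1 orbits of a single two-dimensional Hénon map. It uses plain explicit steps rather than the semi-implicit Euler scheme mentioned in the abstract, and the parameter values, step size λ, tolerance, and seeds are assumptions chosen for illustration.

```python
import math

A, B = 1.4, 0.3  # classic Hénon parameters (assumed for this sketch)


def henon(x, y):
    """One step of the Hénon map in the form (1 - A x^2 + y, B x)."""
    return 1.0 - A * x * x + y, B * x


def stabilised_search(f, seed, C, lam=0.1, tol=1e-12, max_iter=5000):
    """Iterate x <- x + lam * C * (f(x) - x) from the given seed.

    Returns the point once |f(x) - x| < tol, or None if the iteration
    diverges or does not converge within max_iter steps.
    """
    x, y = seed
    for _ in range(max_iter):
        fx, fy = f(x, y)
        gx, gy = fx - x, fy - y
        if not (math.isfinite(gx) and math.isfinite(gy)):
            return None  # iteration blew up for this choice of C
        if math.hypot(gx, gy) < tol:
            return x, y
        x += lam * (C[0][0] * gx + C[0][1] * gy)
        y += lam * (C[1][0] * gx + C[1][1] * gy)
    return None


IDENTITY = [[1, 0], [0, 1]]
FLIP_X = [[-1, 0], [0, 1]]

# Different matrices C stabilise different fixed points of the same map.
p = stabilised_search(henon, (0.0, 0.0), IDENTITY)
q = stabilised_search(henon, (-1.0, -0.3), FLIP_X)
```

    In two dimensions there are only eight candidate matrices C, so trying all of them is cheap. The count grows quickly with dimension, which is the scaling problem the abstract refers to.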

    Hybrid analysis of memory references and its application to automatic parallelization

    Executing sequential code in parallel on a multithreaded machine has been an elusive goal of the academic and industrial research communities for many years. It has recently become more important due to the widespread introduction of multicores in PCs. Automatic multithreading has not been achieved because classic, static compiler analysis was not powerful enough and program behavior was found to be, in many cases, input dependent. Speculative thread-level parallelization was a welcome avenue for advancing parallelization coverage, but its performance was not always optimal due to the sometimes unnecessary overhead of checking every dynamic memory reference. In this dissertation we introduce a novel analysis technique, Hybrid Analysis, which unifies static and dynamic memory reference techniques into a seamless compiler framework that extracts almost the maximum available parallelism from scientific codes and incurs close to the minimum necessary run-time overhead. We show how to extract maximum information from the quantities that could not be sufficiently analyzed through static compiler methods, and how to generate sufficient conditions which, when evaluated dynamically, can validate optimizations. Our techniques have been fully implemented in the Polaris compiler and resulted in whole-program speedups on a large number of industry-standard benchmark applications.

    Automatic test generation for the detection of performance bugs in code optimization

    Software is everywhere in our daily lives, and it is important that software behaves as expected. Testing is a widely accepted method for improving software quality. Testing detects the presence of bugs by comparing the actual outcome of a computation to its expected outcome. Testing for correctness is a well-studied problem. Typically, the expected output of a computation is unambiguous, since computations in computer software typically have clear semantics defined by the programming language. However, testing for performance is less studied. The expected outcome of a test may require context knowledge not apparent in the test program itself. For example, by simply inspecting the code of a web server, one cannot determine the expected throughput. This makes testing for performance a challenging task. Testing compilers adds another layer of complexity. For compilers, a correctness bug during compiler optimization may introduce a bug in the resulting binary, even though the bug was not present in the source code. Similarly, a performance bug during optimization may cause inconsistencies in the runtimes of equivalent programs, where equivalent programs are defined as programs with identical outcomes but whose sources may differ through semantics-preserving transformations. Performance bugs prevent compilers from producing efficient code when they have the ability to do so. Many testing techniques have been proposed. Random testing is a powerful testing technique often associated with test generation. It allows a large testing space to be explored efficiently through sampling and is suitable for large and complex software with a large testing space, such as compilers. Random test generation for compilers has been shown to be effective in detecting correctness bugs. 
However, to the best of our knowledge, there is no previous study on random test generation for performance bugs in compilers. We believe one of the main reasons is that quantifying performance headroom is context dependent. We propose a random test generation infrastructure for evaluating the performance of compilers. We quantify the performance headroom of tests by borrowing existing ideas from previous studies. Namely, when a set of equivalent programs is compiled by a compiler, all programs should aim to perform as well as the best-performing program. Additionally, when a program is compiled by a set of compilers, all compilers should aim to generate code that performs as well as the code generated by the best-performing compiler. We define metrics to evaluate compilers based on these ideas. We used our system to evaluate four modern compilers (Intel's ICC, GNU's GCC, the Portland Group Inc.'s PGI compiler, and Clang) on how well they handle loop unrolling, loop interchange, and loop unroll-and-jam. Results suggest that ICC typically performs better than the other three compilers. On the other hand, our system also identified extreme outliers for ICC where, for example, one program becomes 180,000 times slower after unrolling a loop. Due to the nature of random testing, we also study the methodologies required to achieve reproducible results by using statistical methods. We apply these methodologies to our compiler evaluation and provide evidence that our experiments are reproducible across different randomly generated collections of code segments.
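    The "perform as well as the best" idea can be written down as a simple ratio. The sketch below is one possible reading, not the metrics defined in the thesis: each measured runtime is divided by the best runtime in its group, so a value of 1.0 means no headroom. The function name and the runtimes are hypothetical.

```python
def headroom(runtimes):
    """Map each variant to runtime / best runtime within the same group.

    The group can be a set of equivalent programs under one compiler, or
    one program under several compilers; the ratio is computed the same way.
    """
    best = min(runtimes.values())
    return {name: t / best for name, t in runtimes.items()}


# Hypothetical runtimes (seconds) of three equivalent loop variants.
variants = {"original": 2.0, "unrolled": 1.0, "interchanged": 4.0}
ratios = headroom(variants)
```

    A large ratio for one variant flags a candidate performance bug, since an equivalent program compiled by the same compiler ran much faster.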

    Programming Languages and Systems

    This open access book constitutes the proceedings of the 30th European Symposium on Programming, ESOP 2021, which was held from March 27 to April 1, 2021, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg and changed to an online format due to the COVID-19 pandemic. The 24 papers included in this volume were carefully reviewed and selected from 79 submissions. They deal with fundamental issues in the specification, design, analysis, and implementation of programming languages and systems.
