280 research outputs found
End-to-End Translation Validation for the Halide Language
This paper considers the correctness of domain-specific compilers for tensor programming languages through the study of Halide, a popular representative. It describes a translation validation algorithm for affine Halide specifications, independently of the scheduling language. The algorithm relies on "prophetic" annotations added by the compiler to the generated array assignments. The annotations provide a refinement mapping from assignments in the generated code to the tensor definitions from the specification. Our implementation leverages an affine solver and a general SMT solver, and scales to complete Halide benchmarks.
Análisis de recursos de programas enteros y abstractos
Unpublished thesis of the Universidad Complutense de Madrid, Facultad de Informática, Departamento de Sistemas Informáticos y de Computación, defended on 27-05-2022. Since the beginning of automated computing in the middle of the last century, the development of computer science has been linked to its growing importance in all areas of society. The inclusion of computing processes in everyday life and, in particular, in critical situations cannot be tied only to the production of hardware and software; it must also be tied to the analysis and verification of all its components. While hardware analysis is crucial for building and maintaining the computing infrastructure, since it can detect or predict components that may misbehave, software analysis focuses on the behavior of computer programs to address properties such as security, correctness or optimality. Depending on the type of analysis applied to the software, we can detect potential vulnerabilities in the code, find incorrect specifications, apply optimizations based on the maximum and minimum cost of the programs, calculate the resource consumption of a program...
Logical Inference Techniques for Loop Parallelization
This paper presents a fully automatic approach to loop parallelization that integrates static and run-time analysis and thus overcomes many known difficulties such as nonlinear and indirect array indexing and complex control flow. Our hybrid analysis framework validates the parallelization transformation by verifying the independence of the loop's memory references. To this end it represents array references using the USR (uniform set representation) language and expresses the independence condition as an equation, S = ∅, where S is a set expression representing array indexes. Using a language instead of an array-abstraction representation for S results in fewer conservative approximations but exhibits a potentially high run-time cost. To alleviate this cost we introduce a language translation F from the USR set-expression language to an equally rich language of predicates (F(S) ⇒ S = ∅). Loop parallelization is then validated using a novel logic inference algorithm that factorizes the obtained complex predicates (F(S)) into a sequence of sufficient-independence conditions that are evaluated first statically and, when needed, dynamically, in increasing order of their estimated complexities. We evaluate our automated solution on 26 benchmarks from the PERFECT-CLUB and SPEC suites and show that our approach is effective in parallelizing large, complex loops and obtains much better full-program speedups than the Intel and IBM Fortran compilers.
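The idea of evaluating sufficient-independence conditions in increasing order of estimated complexity can be pictured with a small sketch. It is an illustration under stated assumptions, not the paper's USR machinery: the toy loop is `for i in range(n): a[i + k] = a[i] + 1`, and the names `Condition`, `exact_set_test`, `prove_independent` and the cost numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Condition:
    name: str
    cost: int  # made-up complexity estimate, used only for ordering
    holds: Callable[[int, int], bool]  # predicate over run-time values (n, k)


def exact_set_test(n: int, k: int) -> bool:
    # Most expensive check: build the read set and verify that no write
    # a[i + k] touches an element read by a different iteration (S is empty).
    reads = set(range(n))
    return all(k == 0 or (i + k) not in reads for i in range(n))


# Hypothetical sufficient conditions for the toy loop described above.
CONDITIONS: List[Condition] = [
    Condition("k == 0", 1, lambda n, k: k == 0),
    Condition("|k| >= n", 2, lambda n, k: abs(k) >= n),
    Condition("exact set test", 1000, exact_set_test),
]


def prove_independent(n: int, k: int) -> Optional[str]:
    """Return the name of the cheapest sufficient condition that holds."""
    for cond in sorted(CONDITIONS, key=lambda c: c.cost):
        if cond.holds(n, k):
            return cond.name
    return None  # no proof found: keep the loop sequential


print(prove_independent(10, 0))
print(prove_independent(10, 12))
print(prove_independent(10, 3))
```

In this toy the cheap predicates already cover every independent case, so the exact test only serves to show where a costlier fallback would sit in the cascade.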
Improved static analysis and verification of energy consumption and other resources via abstract interpretation
Resource analysis aims at inferring the cost of executing programs for any possible input,
in terms of a given resource, such as the traditional execution steps, time or memory,
and, more recently, energy consumption or user-defined resources (e.g., number of
bits sent over a socket, number of database accesses, number of calls to particular procedures,
etc.). This is performed statically, i.e., without actually running the programs.
Resource usage information is useful for a variety of optimization and verification
applications, as well as for guiding software design. For example, programmers
can use such information to choose different algorithmic solutions to a problem; program
transformation systems can use cost information to choose between alternative
transformations; parallelizing compilers can use cost estimates for granularity control,
which tries to balance the overheads of task creation and manipulation against the
benefits of parallelization.
In this thesis we have significantly improved an existing prototype implementation
for resource usage analysis based on abstract interpretation, addressing a number
of relevant challenges and overcoming many limitations it presented. The goal of that
prototype was to show the viability of casting the resource analysis as an abstract domain,
and how it could overcome important limitations of the state-of-the-art resource
usage analysis tools. For this purpose, it was implemented as an abstract domain in the
abstract interpretation framework of the CiaoPP system, PLAI. We have improved both
the design and implementation of the prototype, so that the tool can eventually evolve
to the level of industrial application.
The abstract operations of such a tool depend heavily on setting up recurrence relations
that represent the resource usage behavior of program components and of the whole
program, and on finding closed-form solutions for them. While there exist many
tools, such as Computer Algebra Systems (CAS) and libraries able to find closed-form
solutions for some types of recurrences, none of them alone is able to handle all the
types of recurrences arising during program analysis. In addition, there are some types
of recurrences that cannot be solved by any existing tool. This clearly constitutes a bottleneck
for this kind of resource usage analysis. Thus, one of the major challenges we
have addressed in this thesis is the design and development of a novel modular framework
for solving recurrence relations, able to combine and take advantage of the results
of existing solvers. Additionally, we have developed and integrated into our novel
solver a technique for finding upper-bound closed-form solutions of a special class of
recurrence relations that arise during the analysis of programs with accumulating parameters.
Finally, we have integrated the improved resource analysis into the CiaoPP general
framework for resource usage verification, and specialized the framework for verifying
energy consumption specifications of embedded imperative programs in a real application,
showing the usefulness and practicality of the resulting tool.
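To make the idea of a modular recurrence solver more concrete, here is a minimal sketch, assuming a toy family of cost recurrences C(n) = a*C(n-1) + b with C(0) = c0. It is not the CiaoPP solver: the class `Recurrence`, the two back-end solvers and the dispatcher `solve` are hypothetical names chosen for illustration. Each back end handles only some recurrences, and the dispatcher tries them in turn and keeps the first candidate that agrees with the recurrence on a few sample points.

```python
from typing import Callable, List, Optional

ClosedForm = Callable[[int], int]


class Recurrence:
    """Toy cost recurrence C(n) = a*C(n-1) + b with C(0) = c0 (integers)."""

    def __init__(self, a: int, b: int, c0: int) -> None:
        self.a, self.b, self.c0 = a, b, c0

    def value(self, n: int) -> int:
        # Reference value obtained by unfolding the recurrence n times.
        c = self.c0
        for _ in range(n):
            c = self.a * c + self.b
        return c


def solve_arithmetic(r: Recurrence) -> Optional[ClosedForm]:
    # Back end 1: only handles a == 1, giving C(n) = c0 + b*n.
    if r.a != 1:
        return None
    return lambda n: r.c0 + r.b * n


def solve_geometric(r: Recurrence) -> Optional[ClosedForm]:
    # Back end 2: only handles a != 1, giving
    # C(n) = a^n * c0 + b * (a^n - 1) / (a - 1); the division is exact.
    if r.a == 1:
        return None
    return lambda n: r.a**n * r.c0 + r.b * (r.a**n - 1) // (r.a - 1)


SOLVERS: List[Callable[[Recurrence], Optional[ClosedForm]]] = [
    solve_arithmetic,
    solve_geometric,
]


def solve(r: Recurrence, checks: int = 8) -> Optional[ClosedForm]:
    """Try each back end; accept the first candidate that passes sampling."""
    for solver in SOLVERS:
        candidate = solver(r)
        if candidate is not None and all(
            candidate(n) == r.value(n) for n in range(checks)
        ):
            return candidate
    return None


linear = solve(Recurrence(a=1, b=1, c0=0))  # e.g. one step per list element
print(linear(10))
```

The sampling check is only a sanity test, not a proof that the closed form is correct; a real framework would need a sound verification step.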
Certified Abstract Cost Analysis
A program containing placeholders for unspecified statements or expressions is called an abstract (or schematic) program. Placeholder symbols occur naturally in program transformation rules, as used in refactoring, compilation, optimization, or parallelization. We present a generalization of automated cost analysis that can handle abstract programs and, hence, can analyze the impact of program transformations on cost. This kind of relational property requires provably precise cost bounds, which are not always produced by cost analysis. Therefore, we certify by deductive verification that the inferred abstract cost bounds are correct and sufficiently precise. It is the first approach to solve this problem. Both abstract cost analysis and certification are based on quantitative abstract execution (QAE), which in turn is a variation of abstract execution, a recently developed symbolic execution technique for abstract programs. To realize QAE, the new concept of a cost invariant is introduced. QAE is implemented and runs fully automatically on a benchmark set consisting of representative optimization rules.
Efficient method for detection of periodic orbits in chaotic maps and flows
An algorithm for detecting unstable periodic orbits in chaotic systems [Phys.
Rev. E, 60 (1999), pp. 6172-6175] which combines the set of stabilising
transformations proposed by Schmelcher and Diakonos [Phys. Rev. Lett., 78
(1997), pp. 4733-4736] with a modified semi-implicit Euler iterative scheme and
seeding with periodic orbits of neighbouring periods, has been shown to be
highly efficient when applied to low-dimensional systems. The difficulty in
applying the algorithm to higher dimensional systems is mainly due to the fact
that the number of stabilising transformations grows extremely fast with
increasing system dimension. In this thesis, we construct stabilising
transformations based on the knowledge of the stability matrices of already
detected periodic orbits (used as seeds). The advantage of our approach is in a
substantial reduction of the number of transformations, which increases the
efficiency of the detection algorithm, especially in the case of
high-dimensional systems. The performance of the new approach is illustrated by
its application to the four-dimensional kicked double rotor map, a
six-dimensional system of three coupled Hénon maps and to the
Kuramoto-Sivashinsky system in the weakly turbulent regime. Comment: PhD thesis, 119 pages.
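A minimal sketch of the stabilising-transformation idea, assuming the plain explicit Schmelcher-Diakonos iteration x ← x + λ·C·(f(x) − x) rather than the modified semi-implicit Euler scheme the abstract refers to. The logistic map, the step size `lam` and the function names are assumptions chosen for illustration. In one dimension C is just +1 or −1; in n dimensions the candidate matrices have a single ±1 entry in each row and column, so there are 2^n · n! of them, which is the growth the abstract describes.

```python
from math import factorial


def f(x: float) -> float:
    # Logistic map at r = 4; its fixed points 0 and 0.75 are both unstable.
    return 4.0 * x * (1.0 - x)


def stabilised_search(x0: float, c: int, lam: float = 0.1, steps: int = 200) -> float:
    """Iterate x <- x + lam * c * (f(x) - x), with c = +1 or -1 in 1-D."""
    x = x0
    for _ in range(steps):
        x = x + lam * c * (f(x) - x)
    return x


def count_transformations(n: int) -> int:
    # Number of n x n matrices with exactly one +1/-1 entry per row and column.
    return 2**n * factorial(n)


print(stabilised_search(0.6, c=+1))   # approaches the fixed point 0.75
print(stabilised_search(0.1, c=-1))   # approaches the fixed point 0
print([count_transformations(n) for n in (2, 4, 6)])
```

Each choice of C stabilises a different subset of the unstable fixed points, which is why the full set must be tried in the original method and why reducing its size matters in higher dimensions.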
Hybrid analysis of memory references and its application to automatic parallelization
Executing sequential code in parallel on a multithreaded machine has been an
elusive goal of the academic and industrial research communities for many years. It
has recently become more important due to the widespread introduction of multicores
in PCs. Automatic multithreading has not been achieved because classic, static
compiler analysis was not powerful enough and program behavior was found to be, in
many cases, input dependent. Speculative thread level parallelization was a welcome
avenue for advancing parallelization coverage but its performance was not always optimal
due to the sometimes unnecessary overhead of checking every dynamic memory
reference.
In this dissertation we introduce a novel analysis technique, Hybrid Analysis,
which unifies static and dynamic memory reference techniques into a seamless compiler
framework which extracts almost maximum available parallelism from scientific
codes and incurs close to the minimum necessary run time overhead. We present how
to extract maximum information from the quantities that could not be sufficiently
analyzed through static compiler methods, and how to generate sufficient conditions
which, when evaluated dynamically, can validate optimizations.
Our techniques have been fully implemented in the Polaris compiler and resulted
in whole-program speedups on a large number of industry-standard benchmark applications.
Automatic test generation for the detection of performance bugs in code optimization
Software is everywhere in our daily lives, and it is important that software behaves in ways it is expected to. Testing is a widely accepted method for improving software quality. Testing detects the presence of bugs by comparing the actual outcome to the expected outcome of a computation.
Testing for correctness is a well-studied problem. Testing for correctness compares the actual outcome of computation against its expected output. Typically, the expected output of a computation is unambiguous, since computations in computer software typically have clear semantics defined by the programming language.
However, testing for performance is less studied. The expected outcome of a test may require contextual knowledge not apparent in the test program itself. For example, by simply inspecting the code of a web server, one cannot determine its expected throughput. This makes testing for performance a challenging task.
Testing compilers adds another layer of complexity. For compilers, a correctness bug during compiler optimization may introduce a bug in the resulting binary, even though the bug was not present in the source code. Similarly, a performance bug during optimization may cause inconsistencies in the runtimes of equivalent programs, where equivalent programs are defined as programs with identical outcomes but whose sources may differ through semantic-preserving transformations. Performance bugs prevent compilers from producing efficient code when they have the ability to do so.
Many testing techniques have been proposed. Random testing is a powerful testing technique often associated with test generation. It allows a large testing space to be explored efficiently through sampling and is suitable for large and complex software with a large testing space, such as compilers.
Random test generation for compilers has been shown to be effective in detecting correctness bugs. However, to the best of our knowledge, there is no previous study on random test generation for performance bugs in compilers. We believe one of the main reasons is the context-dependent nature of quantifying performance headroom.
We propose a random test generation infrastructure for evaluating the performance of compilers. We quantify the performance headroom of tests by borrowing existing ideas from previous studies. Namely, when a set of equivalent programs is compiled by a compiler, all programs should aim to perform as well as the best-performing program. Additionally, when a program is compiled by a set of compilers, all compilers should aim to generate code that performs as well as the code generated by the best-performing compiler. We define metrics to evaluate compilers based on these ideas.
We used our system to evaluate four modern compilers -- Intel's ICC, GNU's GCC, the Portland Group Inc.'s PGI compiler, and Clang -- on how well they handle loop unrolling, loop interchange, and loop unroll-and-jam. Results suggest that ICC typically performs better than the other three compilers. On the other hand, our system also identified extreme outliers for ICC where, for example, one program becomes 180,000 times slower after unrolling a loop.
Due to the nature of random testing, we also study the methodologies required to achieve reproducible results by using statistical methods. We apply these methodologies to our compiler evaluation and provide evidence that our experiments are reproducible across different randomly generated collections of code segments.
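The two headroom ideas in this abstract (every equivalent program should match the best-performing variant, and every compiler should match the best-performing compiler) can be sketched with one small helper. This is only an illustration: the function name `headroom` and all runtime figures are made up, and the thesis's actual metrics may be defined differently.

```python
from typing import Dict


def headroom(runtimes: Dict[str, float]) -> Dict[str, float]:
    """Slowdown of each entry relative to the best (smallest) runtime."""
    best = min(runtimes.values())
    if best <= 0:
        raise ValueError("runtimes must be positive")
    return {name: t / best for name, t in runtimes.items()}


# Hypothetical runtimes (seconds) of equivalent programs under one compiler.
variants = {"original": 2.0, "unrolled": 1.0, "interchanged": 4.0}
print(headroom(variants))

# Hypothetical runtimes of one program under several compilers.
compilers = {"compiler_a": 3.0, "compiler_b": 1.5}
print(headroom(compilers))
```

A ratio of 1.0 marks the best entry; larger values indicate how much performance a variant or compiler leaves on the table relative to it.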
Programming Languages and Systems
This open access book constitutes the proceedings of the 30th European Symposium on Programming, ESOP 2021, which was held from March 27 to April 1, 2021, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg and changed to an online format due to the COVID-19 pandemic. The 24 papers included in this volume were carefully reviewed and selected from 79 submissions. They deal with fundamental issues in the specification, design, analysis, and implementation of programming languages and systems.