Code Generation = A* + BURS
BURS, a system based on term rewrite systems, is combined with the A* search algorithm to produce a code generator that generates optimal code. The theory underlying BURS is re-developed, formalised and explained in this work. The search algorithm uses a cost heuristic derived from the term rewrite system to direct the search. The advantage of using a search algorithm is that only those costs that may be part of an optimal rewrite sequence need to be computed.
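The cost-directed search described in the abstract can be illustrated generically. The following sketch is not BURS itself: the rewrite rules, their costs and the heuristic are invented for illustration, but they show the key idea of an admissible heuristic derived from the rules (each rewrite removes at most one `add`, so counting remaining `add` nodes never overestimates the remaining cost):

```python
import heapq

def astar(start, is_goal, successors, h):
    """Generic A*: successors(s) yields (next_state, step_cost, rule_name)
    triples; h is an admissible heuristic (never overestimates)."""
    frontier = [(h(start), 0, start, [])]
    seen = {}
    while frontier:
        f, g, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return g, path
        if state in seen and seen[state] <= g:
            continue
        seen[state] = g
        for nxt, cost, rule in successors(state):
            heapq.heappush(frontier, (g + cost + h(nxt), g + cost, nxt, path + [rule]))
    return None

# Toy rewrite rules lowering a term to a register "r" (illustrative only).
rules = [  # (pattern, replacement, cost, name)
    ("add(c,r)", "r", 1, "add-imm"),   # add-immediate instruction
    ("add(r,r)", "r", 1, "add-reg"),   # register-register add
    ("c",        "r", 1, "load-imm"),  # materialise a constant
]

def successors(term):
    for pat, rep, cost, name in rules:
        i = term.find(pat)
        if i >= 0:
            yield term[:i] + rep + term[i + len(pat):], cost, name

# Heuristic derived from the rules: every remaining "add" needs at least
# one unit-cost rewrite, and no rule removes more than one, so this is
# admissible.
def h(term):
    return term.count("add")

cost, seq = astar("add(add(c,r),r)", lambda t: t == "r", successors, h)
# Optimal sequence: fold the constant into an add-immediate, then add-reg.
```

Because the heuristic is admissible, A* only expands states whose estimated total cost could still beat the optimum, which is exactly the "compute only the costs that may be part of an optimal rewrite sequence" advantage the abstract mentions.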
Automatic Generation of Efficient Linear Algebra Programs
The level of abstraction at which application experts reason about linear
algebra computations and the level of abstraction used by developers of
high-performance numerical linear algebra libraries do not match. The former is
conveniently captured by high-level languages and libraries such as Matlab and
Eigen, while the latter is embodied by the kernels included in the BLAS and
LAPACK libraries. Unfortunately, the translation from a high-level computation
to an efficient sequence of kernels is far from trivial and requires
extensive knowledge of both linear algebra and high-performance computing.
Internally, almost all high-level languages and libraries use efficient
kernels; however, the translation algorithms are too simplistic and thus lead
to a suboptimal use of said kernels, with significant performance losses. In
order both to achieve the productivity that comes with high-level languages
and to exploit the efficiency of low-level kernels, we are developing Linnea,
a code generator for linear algebra problems. As input, Linnea takes a
high-level description of a linear algebra problem and produces as output an
efficient sequence of calls to high-performance kernels. In 25 application
problems, the code generated by Linnea always outperforms Matlab, Julia, Eigen
and Armadillo, with speedups up to and exceeding 10x.
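The kind of decision Linnea automates can be shown with a toy cost model. The kernel names below are standard BLAS conventions (GEMM for matrix-matrix, GEMV for matrix-vector products) and the flop counts are the textbook ones, but the selection table is a hypothetical simplification, not Linnea's actual algorithm:

```python
# For y = A @ B @ x (A, B both n-by-n, x a vector), the two evaluation
# orders need very different kernels:
#   (A @ B) @ x : one GEMM (2n^3 flops) + one GEMV (2n^2 flops)
#   A @ (B @ x) : two GEMVs (2n^2 flops each)
def cost_matmat_then_matvec(n):
    return 2 * n**3 + 2 * n**2

def cost_two_matvec(n):
    return 4 * n**2

# Hypothetical operand-shape -> BLAS kernel table (illustrative only).
def pick_kernel(lhs, rhs):
    table = {("matrix", "matrix"): "dgemm",
             ("matrix", "vector"): "dgemv"}
    return table[(lhs, rhs)]

n = 1000
speedup = cost_matmat_then_matvec(n) / cost_two_matvec(n)  # ~500x fewer flops

# Evaluating right-to-left, the whole computation becomes two GEMV calls.
plan = [pick_kernel("matrix", "vector"),   # t = B @ x
        pick_kernel("matrix", "vector")]   # y = A @ t
```

A naive left-to-right translation would emit a GEMM where a GEMV suffices; multiplied across a whole expression, such choices account for the performance gap between simplistic translators and Linnea's generated kernel sequences.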
REALISTIC CORRECT SYSTEMS IMPLEMENTATION
The present article and the forthcoming second part on Trusted Compiler Implementation
address the correct construction and functioning of large computer-based systems. In view
of so many annoying and dangerous system misbehaviours we ask: can informaticians
rightly be held accountable for the incorrectness of systems, and will they be able to justify
that systems work correctly as intended? We understand the word justification in the
following sense: the design of computer-based systems, the formulation of mathematical
models of information flows, and the construction of controlling software are to be such
that the expected system effects, the absence of internal failures, and the robustness
towards misuse and malicious external attacks are foreseeable as logical consequences of
the models.
For more than 40 years, theoretical informatics, software engineering and compiler
construction have made important contributions to correct specification and also to correct
high-level implementation of compilers. But the third step, the translation (bootstrapping) of
high-level compiler programs to host machine code by existing host compilers, is just as
important. So far there are no realistic recipes to close this correctness gap, although it has
been known for some years that trust in executable code can be dangerously compromised
by Trojan Horses in compiler executables, even if they pass the strongest tests.
In this first article we give a comprehensive motivation and develop a mathematical
theory in order to rigorously prove the correctness of an initial, fully trusted compiler
executable. The task is modularised in three steps. The third step, machine-level compiler
implementation verification, is the topic of the forthcoming second part on Trusted
Compiler Implementation. It closes the implementation gap, not only for compilers but
also for correct software-based systems in general. Thus the two articles together give a
rather confident answer to the question raised in the title.
Retrospective on high-level language computer architecture
High-level language computers (HLLCs) have attracted interest in the architecture and programming communities during the last 15 years; proposals have been made for machines directed towards the execution of various languages such as ALGOL [1,2], APL [3-5], and BASIC [6].
Global allocation of address registers for array references in DSPs
Advisor: Guido Costa Souza de Araujo. Master's dissertation, Universidade Estadual de Campinas, Instituto de Computação.
Abstract: The technological advances in computing systems have stimulated the growth of the embedded-systems market, which is becoming ever more present in everyday life, for example in mobile phones, palmtops and automotive control systems. Because of their characteristics, these new applications demand a combination of low cost, high performance and low power consumption. One way to meet these constraints is through the design of specialized processors. However, processor specialization imposes new challenges on the development of software for these systems. In particular, compilers, generally responsible for code optimization, need to be adapted in order to produce efficient code for these new processors. In the digital signal processing arena, such as in cellular telephony, specialized processors known as DSPs (Digital Signal Processors) are widely used. DSPs typically have few general-purpose registers and very restricted addressing modes. In addition, many DSP applications involve the processing of large data streams, which are usually stored in arrays. As a result, the study of array-reference optimization techniques has become a central problem in compiling for DSPs. This problem, known as Global Array Reference Allocation (GARA), is the subject of this dissertation. The central GARA subproblem consists of determining, for a given set of array references to be allocated to the same address register, the minimum cost of the instructions required to keep this register holding the correct address at all program points.
In this work, this subproblem is modeled as a graph-theoretical problem and proved to be NP-hard. In addition, an efficient algorithm, based on dynamic programming, is proposed to solve this subproblem exactly under certain restrictions. Based on this algorithm, two techniques to solve GARA are proposed. Experimental results, obtained by implementing these techniques in the GCC compiler, compare them with previous work in the literature. The results show the effectiveness of the techniques proposed in this work.
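As a rough illustration of the cost being minimised (this is not the dissertation's algorithm, and the unit-cost model and auto-increment range are simplifying assumptions): on a typical DSP, moving an address register between two consecutive array references is free when the address distance fits the auto-increment/decrement range, and otherwise requires an explicit update instruction.

```python
def update_cost(refs, auto_range=1):
    """Cost of keeping one address register correct for a sequence of
    array references, in execution order.

    refs: array indices touched, all assigned to the same address register.
    Moves within +/- auto_range ride along with the memory access for
    free (post-increment/decrement); larger jumps need one explicit
    address-update instruction each."""
    cost = 0
    for prev, cur in zip(refs, refs[1:]):
        if abs(cur - prev) > auto_range:
            cost += 1  # explicit address-arithmetic instruction needed
    return cost

# a[0], a[1], a[2], a[5]: only the jump from index 2 to 5 needs an update.
jumpy = update_cost([0, 1, 2, 5])
# a[0]..a[3]: a pure stride-1 walk is free.
smooth = update_cost([0, 1, 2, 3])
```

The hard part, which the dissertation shows to be NP-hard in general, is choosing which references share a register and where the updates go across all control-flow paths, not just along one straight-line sequence as here.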
A Transformation-Based Foundation for Semantics-Directed Code Generation
Interpreters and compilers are two different ways of implementing
programming languages. An interpreter directly executes its program
input. It is a concise definition of the semantics of a programming
language and is easily implemented. A compiler translates its program
input into another language. It is more difficult to construct, but
the code that it generates runs faster than interpreted code.
In this dissertation, we propose a transformation-based foundation for
deriving compilers from semantic specifications in the form of four
rules. These rules give a priori advice for staging, and allow
explicit compiler derivation that would be less succinct with partial
evaluation. When applied, these rules turn an interpreter that
directly executes its program input into a compiler that emits the
code that the interpreter would have executed.
We formalize the language syntax and semantics to be used for the
interpreter and the compiler, and also specify a notion of equality.
It is then possible to precisely state the transformation rules and to
prove both local and global correctness theorems. And although the
transformation rules were developed so as to apply to an interpreter
written in a denotational style, we consider how to modify
non-denotational interpreters so that the rules apply. Finally, we
illustrate these ideas by considering a larger example: a Prolog
implementation.
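The core staging idea, turning an interpreter that executes its program input into a compiler that emits the code the interpreter would have executed, can be shown in miniature. This toy is not the dissertation's four transformation rules: it simply replaces each run-time action of a small expression interpreter with emission of equivalent Python source text:

```python
def interp(expr, env):
    """Directly execute a tiny expression language:
    ("lit", n), ("var", name), ("add", e1, e2)."""
    op = expr[0]
    if op == "lit":
        return expr[1]
    if op == "var":
        return env[expr[1]]
    if op == "add":
        return interp(expr[1], env) + interp(expr[2], env)

def compile_(expr):
    """Same recursive structure as interp, but every run-time action is
    replaced by emission of the Python code that performs it."""
    op = expr[0]
    if op == "lit":
        return str(expr[1])
    if op == "var":
        return f"env[{expr[1]!r}]"
    if op == "add":
        return f"({compile_(expr[1])} + {compile_(expr[2])})"

e = ("add", ("var", "x"), ("lit", 1))
env = {"x": 41}
code = compile_(e)  # the residual program: "(env['x'] + 1)"
# Local correctness in miniature: running the emitted code agrees with
# interpreting the original expression.
same = eval(code, {"env": env}) == interp(e, env)
```

The emitted code pays the dispatch on `expr[0]` once, at compile time, rather than on every execution, which is exactly the speed advantage of compilation over interpretation that the abstract describes.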
Compiler Construction: A Short Overview
Compiler-construction research today is shaped mainly by the development
of optimization and code-generation techniques, whose aim is to enable
compilers to exploit the many features of modern processors efficiently.
The foundations of modern compilers, in particular the checking of the
input program against syntactic and semantic rules, are well understood
and rest on results from the 1960s and 1970s. The generation of
executable code is equally well studied. These techniques are
fundamental to computer science well beyond compiler construction.
In the following chapters we give a short introduction to how compilers
work. We deliberately omit most (formal) details and algorithms, because
we want to enable newcomers to survey the various components of these
complex systems and to form an intuitive picture of compilers.
For a detailed study of the concepts, techniques and algorithms
presented, we point to the literature in each case. The standard works
discussed in the appendix are of general interest.
Although a detailed description of the individual techniques is
indispensable for understanding them precisely, we believe that a rough
idea of what a technique accomplishes makes deeper understanding easier.
There is nevertheless a danger that a misguided intuition gets in the
way of deeper understanding. Should this text have caused that, we ask
for the reader's indulgence. We are extremely grateful for criticism and
suggestions for improvement, since we are keen to improve this text
continually.