28 research outputs found
Optimal program variant generation for hybrid manycore systems
Field Programmable Gate Arrays promise to deliver superior energy efficiency in heterogeneous high performance computing, as compared to multicore CPUs and GPUs. The rate of adoption is however hampered by the relative difficulty of programming FPGAs. High-level synthesis tools such as Xilinx Vivado, Altera OpenCL or Intel's HLS address a large part of the programmability issue by synthesizing a Hardware Description Language representation from a high-level specification of the application, given in programming languages such as OpenCL C, typically used to program CPUs and GPUs. Although HLS solutions make programming easier, they fail to also lighten the burden of optimization. Application developers must rely on expert knowledge to manually optimize their applications for each target device, meaning that traditional HLS solutions do not solve the issue of performance portability. This state of affairs prompted the development of compiler frameworks such as TyTra that operate at an even higher level of abstraction, one that is amenable to the use of Design Space Exploration (DSE). With DSE the initial program specification can be seen as the starting location in a search space of correct-by-construction program transformations. In TyTra the search space is generated from the transitive closure of term-level transformations derived from type-level transformations. Compiler frameworks such as TyTra theoretically solve the issue of performance portability by providing a way to automatically generate alternative correct program variants. They however suffer from the very practical issue that the generated space is often too large to fully explore. As a consequence, the globally optimal solution may be overlooked.
In this work we provide a novel solution to the issue of performance portability by deriving an efficient yet effective DSE strategy for the TyTra compiler framework. We make use of categorical data types to derive categorical semantics for the formal languages that describe the terms, types, cost-performance estimates and their transformations. From these we define a category of interpretations for TyTra applications, from which we derive a DSE strategy that finds the globally optimal transformation sequence in polynomial time. This is achieved by reducing the size of the generated search space. We formally state and prove a theorem for this claim and then show that the polynomial run-time for our DSE strategy has practically negligible coefficients, leading to sub-second exploration times for realistic applications.
A design methodology for portable software on parallel computers
This final report for research that was supported by grant number NAG-1-995 documents our progress in addressing two difficulties in parallel programming. The first difficulty is developing software that will execute quickly on a parallel computer. The second difficulty is transporting software between dissimilar parallel computers. In general, we expect that more hardware-specific information will be included in software designs for parallel computers than in designs for sequential computers. This inclusion is an instance of portability being sacrificed for high performance. New parallel computers are being introduced frequently. Trying to keep one's software on the current high performance hardware, a software developer almost continually faces yet another expensive software transportation. The goal of the proposed research is to create a design methodology that helps designers to more precisely control both portability and hardware-specific programming details. The proposed research emphasizes programming for scientific applications. We completed our study of the parallelizability of a subsystem of the NASA Earth Radiation Budget Experiment (ERBE) data processing system. This work is summarized in section two. A more detailed description is provided in Appendix A ('Programming Practices to Support Eventual Parallelism'). Mr. Chrisman, a graduate student, wrote and successfully defended a Ph.D. dissertation proposal which describes our research associated with the issues of software portability and high performance. The list of research tasks is specified in the proposal. The proposal 'A Design Methodology for Portable Software on Parallel Computers' is summarized in section three and is provided in its entirety in Appendix B. We are currently studying a proposed subsystem of the NASA Clouds and the Earth's Radiant Energy System (CERES) data processing system. This software is the proof-of-concept for the Ph.D. dissertation.
We have implemented and measured the performance of a portion of this subsystem on the Intel iPSC/2 parallel computer. These results are provided in section four. Our future work is summarized in section five, our acknowledgements are stated in section six, and references for published papers associated with NAG-1-995 are provided in section seven.
Safe code transformations for speculative execution in real-time systems
Although compiler optimization techniques are standard and successful in non-real-time systems, if naively applied, they can destroy safety guarantees and deadlines in hard real-time systems. For this reason, real-time systems developers have tended to avoid automatic compiler optimization of their code. However, real-time applications in several areas have been growing substantially in size and complexity in recent years. This size and complexity make it impossible for real-time programmers to write optimal code, and consequently indicate a need for compiler optimization. Recently researchers have developed or modified analyses and transformations to improve performance without degrading worst-case execution times. Moreover, these optimization techniques can sometimes transform programs which may not meet constraints/deadlines, or which result in timeouts, into deadline-satisfying programs.
One such technique, speculative execution, also used for example in parallel computing and databases, can enhance performance by executing parts of the code whose execution may or may not be needed. In some cases, rollback is necessary if the computation turns out to be invalid. However, speculative execution must be applied carefully to real-time systems so that the worst-case execution path is not extended.
Deterministic worst-case execution for satisfying hard real-time constraints, and speculative execution with rollback for improving average-case throughput, appear to lie on opposite ends of a spectrum of performance requirements and strategies. Nonetheless, this thesis shows that there are situations in which speculative execution can improve the performance of a hard real-time system, either by enhancing average performance while not affecting the worst case, or by actually decreasing the worst-case execution time. The thesis proposes a set of compiler transformation rules to identify opportunities for speculative execution and to transform the code. Proofs of semantic correctness and timeliness preservation are provided to verify the safety of applying the transformation rules to real-time systems. Moreover, an extensive experiment using simulation of randomly generated real-time programs has been conducted to evaluate the applicability and profitability of speculative execution. The simulation results indicate that speculative execution improves average execution time and program timeliness. Finally, a prototype implementation is described in which these transformations can be evaluated for realistic applications.
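The safety condition described in this abstract (speculation must not extend the worst-case execution path) can be illustrated with a toy timing model. The sketch below is not the thesis's transformation rules. It assumes a single two-way branch whose "then" arm is started speculatively while the condition is still being resolved, and it assumes a mis-speculated arm can be aborted the moment the condition is known. The names `Branch`, `wcet_sequential`, `wcet_speculative` and `is_safe` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    """Hypothetical timing model of `if cond: then_arm else: else_arm`."""
    t_cond: float      # time needed to resolve the condition
    t_then: float      # execution time of the then-arm
    t_else: float      # execution time of the else-arm
    t_rollback: float  # cost of undoing a mis-speculated then-arm


def wcet_sequential(b: Branch) -> float:
    # Without speculation: resolve the condition, then run the slower arm.
    return b.t_cond + max(b.t_then, b.t_else)


def wcet_speculative(b: Branch) -> float:
    # Assumption: the then-arm runs concurrently with condition resolution.
    hit = max(b.t_cond, b.t_then)              # speculation was right
    miss = b.t_cond + b.t_rollback + b.t_else  # abort, roll back, run else-arm
    return max(hit, miss)


def is_safe(b: Branch) -> bool:
    # Speculation is admissible only if it does not lengthen the worst case.
    return wcet_speculative(b) <= wcet_sequential(b)


cheap_rollback = Branch(t_cond=4.0, t_then=3.0, t_else=2.0, t_rollback=0.5)
long_else_arm = Branch(t_cond=4.0, t_then=1.0, t_else=3.0, t_rollback=0.5)
print(is_safe(cheap_rollback), is_safe(long_else_arm))
```

In this model the first branch lowers the worst case from 7.0 to 6.5 time units, while the second would raise it from 7.0 to 7.5 and is rejected.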
Generating program analyzers
In this work the automatic generation of program analyzers from
concise specifications is presented. It focuses on provably correct
and complex interprocedural analyses for real world sized imperative
programs. Thus, a powerful and flexible specification mechanism
is required, enabling both correctness proofs and efficient
implementations. The generation process relies on the theory of
data flow analysis and on abstract interpretation. The theory of
data flow analysis provides methods to efficiently implement analyses.
Abstract interpretation provides the relation to the semantics
of the programming language. This allows the systematic derivation
of efficient, provably correct, and terminating analyses. The
approach has been implemented in the program analyzer generator
PAG. It addresses analyses ranging from "simple" intraprocedural
bit vector frameworks to complex interprocedural alias
analyses. A high level specialized functional language is used as
specification mechanism enabling elegant and concise specifications
even for complex analyses. Additionally, it allows the automatic
selection of efficient implementations for the underlying
abstract datatypes, such as balanced binary trees, binary decision
diagrams, bit vectors, and arrays. For the interprocedural analysis
the functional approach, the call string approach, and a novel
approach specifically targeting the precise analysis of loops can
be chosen. In this work the implementation of PAG as well as a
large number of applications of PAG are presented.
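As an illustration of the "simple" intraprocedural bit vector frameworks mentioned in this abstract, the following minimal sketch computes live variables by fixed-point iteration, with Python integers standing in for bit vectors. It is not PAG's specification language or generated code. The function name `live_variables`, the control-flow-graph encoding, and the example program are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def live_variables(
    succ: Dict[int, List[int]],
    use: Dict[int, int],
    defs: Dict[int, int],
) -> Tuple[Dict[int, int], Dict[int, int]]:
    """Backward bit-vector analysis on a control-flow graph.

    live_out[n] = union of live_in[s] over successors s of n
    live_in[n]  = use[n] | (live_out[n] & ~defs[n])
    """
    nodes = list(succ)
    live_in = {n: 0 for n in nodes}
    live_out = {n: 0 for n in nodes}
    changed = True
    while changed:  # iterate until the least fixed point is reached
        changed = False
        for n in reversed(nodes):  # reverse order suits a backward analysis
            out = 0
            for s in succ[n]:
                out |= live_in[s]
            inn = use[n] | (out & ~defs[n])
            if out != live_out[n] or inn != live_in[n]:
                live_out[n], live_in[n] = out, inn
                changed = True
    return live_in, live_out


# One bit per variable (hypothetical example program).
A, B = 0b01, 0b10
# 0: a = 0
# 1: b = a + 1
# 2: a = b * 2
# 3: if a < 10 goto 1
# 4: return b
succ = {0: [1], 1: [2], 2: [3], 3: [1, 4], 4: []}
use = {0: 0, 1: A, 2: B, 3: A, 4: B}
defs = {0: A, 1: B, 2: A, 3: 0, 4: 0}

live_in, live_out = live_variables(succ, use, defs)
print({n: bin(v) for n, v in live_in.items()})
```

Termination follows because each bit vector only grows and the lattice is finite, which is the kind of guarantee the abstract attributes to the underlying theory.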
Intensional Cyberforensics
This work focuses on the application of intensional logic to cyberforensic
analysis and compares its benefits and difficulties with those of the
finite-state-automata approach. This work extends the use of the intensional
programming paradigm to the modeling and implementation of a cyberforensics
investigation process with backtracing of event reconstruction, in which
evidence is modeled by multidimensional hierarchical contexts, and proofs or
disproofs of claims are undertaken in an eductive manner of evaluation. This
approach is a practical, context-aware improvement over the finite state
automata (FSA) approach we have seen in previous work. As a base implementation
language model, we use in this approach a new dialect of the Lucid programming
language, called Forensic Lucid, and we focus on defining hierarchical contexts
based on intensional logic for the distributed evaluation of cyberforensic
expressions. We also augment the work with credibility factors surrounding
digital evidence and witness accounts, which have not been previously modeled.
The Forensic Lucid programming language, used for this intensional
cyberforensic analysis, is formally presented through its syntax and operational
semantics. In large part, the language is based on its predecessor and
codecessor Lucid dialects, such as GIPL, Indexical Lucid, Lucx, Objective
Lucid, and JOOIP, bound by the underlying intensional programming paradigm.
Comment: 412 pages, 94 figures, 18 tables, 19 algorithms and listings; PhD
thesis; v2 corrects some typos and refs; also available on Spectrum at
http://spectrum.library.concordia.ca/977460
Proceedings of Monterey Workshop 2001 Engineering Automation for Software Intensive System Integration
The 2001 Monterey Workshop on Engineering Automation for Software Intensive System Integration was sponsored by the Office of Naval Research, Air Force Office of Scientific Research, Army Research Office and the Defense Advanced Research Projects Agency. It is our pleasure to thank the workshop advisory and sponsors for their vision of a principled engineering solution for software and for their many-year tireless effort in supporting a series of workshops to bring everyone together. This workshop is the 8th in a series of international workshops. The workshop was held at the Monterey Beach Hotel, Monterey, California during June 18-22, 2001. The general theme of the workshop has been to present and discuss research work that aims at increasing the practical impact of formal methods for software and systems engineering. The particular focus of this workshop was "Engineering Automation for Software Intensive System Integration". Previous workshops have focused on issues including "Real-time & Concurrent Systems", "Software Merging and Slicing", "Software Evolution", "Software Architecture", "Requirements Targeting Software" and "Modeling Software System Structures in a fastly moving scenario". Approved for public release, distribution unlimited.