28 research outputs found

    Optimal program variant generation for hybrid manycore systems

    Get PDF
    Field Programmable Gate Arrays (FPGAs) promise to deliver superior energy efficiency in heterogeneous high performance computing, as compared to multicore CPUs and GPUs. The rate of adoption is, however, hampered by the relative difficulty of programming FPGAs. High-level synthesis (HLS) tools such as Xilinx Vivado, Altera OpenCL or Intel's HLS address a large part of the programmability issue by synthesizing a Hardware Description Language representation from a high-level specification of the application, given in programming languages such as OpenCL C, typically used to program CPUs and GPUs. Although HLS solutions make programming easier, they fail to also lighten the burden of optimization. Application developers must rely on expert knowledge to manually optimize their applications for each target device, meaning that traditional HLS solutions do not offer a solution to the issue of performance portability. This state of affairs prompted the development of compiler frameworks such as TyTra that operate at an even higher level of abstraction, one amenable to Design Space Exploration (DSE). With DSE, the initial program specification can be seen as the starting location in a search space of correct-by-construction program transformations. In TyTra the search space is generated from the transitive closure of term-level transformations derived from type-level transformations. Compiler frameworks such as TyTra theoretically solve the issue of performance portability by providing a way to automatically generate alternative correct program variants. They suffer, however, from the very practical issue that the generated space is often too large to explore fully. As a consequence, the globally optimal solution may be overlooked. In this work we provide a novel solution to the issue of performance portability by deriving an efficient yet effective DSE strategy for the TyTra compiler framework. We make use of categorical data types to derive categorical semantics for the formal languages that describe the terms, types, cost-performance estimates and their transformations. From these we define a category of interpretations for TyTra applications, from which we derive a DSE strategy that finds the globally optimal transformation sequence in polynomial time. This is achieved by reducing the size of the generated search space. We formally state and prove a theorem for this claim and then show that the polynomial run-time of our DSE strategy has practically negligible coefficients, leading to sub-second exploration times for realistic applications.
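    As an illustration only, the following Python sketch shows the general shape of cost-guided DSE over a space of correct-by-construction rewrites: variants reachable through a rewrite rule are enumerated with deduplication and scored by a cost model. This is not the TyTra algorithm; the tuple IR, the explore/fuse_maps names, and the cost model are all hypothetical assumptions made for the sketch.

        # Minimal sketch of cost-guided design-space exploration (DSE).
        # NOT the TyTra algorithm; the IR, rules, and cost model are assumed.

        def explore(program, rewrites, cost, max_depth=5):
            """Breadth-first enumeration of variants reachable via `rewrites`,
            deduplicating structurally identical variants; returns the cheapest."""
            seen = {program}
            frontier = [program]
            best = (cost(program), program)
            for _ in range(max_depth):
                nxt = []
                for p in frontier:
                    for rule in rewrites:
                        for q in rule(p):  # each rule yields zero or more variants
                            if q not in seen:
                                seen.add(q)
                                nxt.append(q)
                                best = min(best, (cost(q), q))
                frontier = nxt
            return best

        # Toy example: a program is a tuple of ops; one rule fuses adjacent maps.
        def fuse_maps(p):
            for i in range(len(p) - 1):
                if p[i].startswith("map") and p[i + 1].startswith("map"):
                    yield p[:i] + (f"fused({p[i]},{p[i + 1]})",) + p[i + 2:]

        prog = ("map f", "map g", "reduce +")
        print(explore(prog, [fuse_maps], cost=len))  # -> (2, ('fused(map f,map g)', 'reduce +'))

    A full enumeration like this one is exponential in general; the thesis's contribution is precisely a strategy that avoids it by shrinking the search space while preserving the global optimum.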

    A design methodology for portable software on parallel computers

    Get PDF
    This final report for research supported by grant number NAG-1-995 documents our progress in addressing two difficulties in parallel programming. The first difficulty is developing software that will execute quickly on a parallel computer. The second is transporting software between dissimilar parallel computers. In general, we expect that more hardware-specific information will be included in software designs for parallel computers than in designs for sequential computers. This inclusion is an instance of portability being sacrificed for high performance. New parallel computers are being introduced frequently, so a software developer trying to keep software on the current high-performance hardware almost continually faces yet another expensive software transportation. The problem of the proposed research is to create a design methodology that helps designers to more precisely control both portability and hardware-specific programming details. The proposed research emphasizes programming for scientific applications. We completed our study of the parallelizability of a subsystem of the NASA Earth Radiation Budget Experiment (ERBE) data processing system. This work is summarized in section two; a more detailed description is provided in Appendix A ('Programming Practices to Support Eventual Parallelism'). Mr. Chrisman, a graduate student, wrote and successfully defended a Ph.D. dissertation proposal which describes our research on the issues of software portability and high performance. The list of research tasks is specified in the proposal. The proposal, 'A Design Methodology for Portable Software on Parallel Computers', is summarized in section three and is provided in its entirety in Appendix B. We are currently studying a proposed subsystem of the NASA Clouds and the Earth's Radiant Energy System (CERES) data processing system. This software is the proof of concept for the Ph.D. dissertation. We have implemented and measured the performance of a portion of this subsystem on the Intel iPSC/2 parallel computer. These results are provided in section four. Our future work is summarized in section five, our acknowledgements are stated in section six, and references for published papers associated with NAG-1-995 are provided in section seven.

    Workshop - Systems Design Meets Equation-based Languages

    Get PDF

    Automatic and Explicit Parallelization Approaches for Mathematical Simulation Models

    Full text link

    Safe code transformations for speculative execution in real-time systems

    Get PDF
    Although compiler optimization techniques are standard and successful in non-real-time systems, if naively applied they can destroy safety guarantees and deadlines in hard real-time systems. For this reason, real-time systems developers have tended to avoid automatic compiler optimization of their code. However, real-time applications in several areas have been growing substantially in size and complexity in recent years. This size and complexity makes it impossible for real-time programmers to write optimal code, and consequently indicates a need for compiler optimization. Recently researchers have developed or modified analyses and transformations to improve performance without degrading worst-case execution times. Moreover, these optimization techniques can sometimes transform programs which may not meet constraints/deadlines, or which result in timeouts, into deadline-satisfying programs. One such technique, speculative execution, also used for example in parallel computing and databases, can enhance performance by executing parts of the code whose execution may or may not be needed. In some cases, rollback is necessary if the computation turns out to be invalid. However, speculative execution must be applied carefully to real-time systems so that the worst-case execution path is not extended. Deterministic worst-case execution for satisfying hard real-time constraints, and speculative execution with rollback for improving average-case throughput, appear to lie on opposite ends of a spectrum of performance requirements and strategies. Nonetheless, this thesis shows that there are situations in which speculative execution can improve the performance of a hard real-time system, either by enhancing average performance while not affecting the worst case, or by actually decreasing the worst-case execution time. The thesis proposes a set of compiler transformation rules to identify opportunities for speculative execution and to transform the code. Proofs of semantic correctness and timeliness preservation are provided to verify the safety of applying the transformation rules to real-time systems. Moreover, an extensive experiment using simulation of randomly generated real-time programs has been conducted to evaluate the applicability and profitability of speculative execution. The simulation results indicate that speculative execution improves average execution time and program timeliness. Finally, a prototype implementation is described in which these transformations can be evaluated for realistic applications.
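    A minimal sketch of one such rewrite, assuming a toy tuple-based IR: an assignment guarded by a branch is hoisted above the branch and executed speculatively when its right-hand side is known to be side-effect-free, so the result can simply be discarded (an implicit rollback) if the branch is not taken. The PURE set and the IR shape are hypothetical and do not reproduce the thesis's actual transformation rules.

        # Speculative hoisting of a pure, branch-guarded assignment (toy IR).
        # Statements: ('assign', var, expr) and ('if', cond, [body...]).

        PURE = {"f", "g"}  # assumed side-effect-free functions

        def hoist_speculative(stmts):
            """Rewrite  if c: x = f(a)   into   t = f(a); if c: x = t
            whenever f is in PURE, so f(a) can run speculatively."""
            out = []
            for s in stmts:
                if (s[0] == "if" and len(s[2]) == 1
                        and s[2][0][0] == "assign"
                        and s[2][0][2][0] == "call"
                        and s[2][0][2][1] in PURE):
                    cond, (_, var, call) = s[1], s[2][0]
                    tmp = f"_spec_{var}"
                    out.append(("assign", tmp, call))                 # speculate early
                    out.append(("if", cond, [("assign", var, tmp)]))  # commit if taken
                else:
                    out.append(s)
            return out

        prog = [("if", "c", [("assign", "y", ("call", "f", "a"))])]
        print(hoist_speculative(prog))

    Note that on the worst-case path the speculated work replaces otherwise idle or already-budgeted time; the safety argument in the thesis is exactly that such a rule must never lengthen the worst-case execution path.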

    Generating program analyzers

    Get PDF
    In this work the automatic generation of program analyzers from concise specifications is presented. It focuses on provably correct and complex interprocedural analyses for real-world-sized imperative programs. A powerful and flexible specification mechanism is therefore required, enabling both correctness proofs and efficient implementations. The generation process relies on the theory of data flow analysis and on abstract interpretation. The theory of data flow analysis provides methods to implement analyses efficiently; abstract interpretation provides the relation to the semantics of the programming language. This allows the systematic derivation of efficient, provably correct, and terminating analyses. The approach has been implemented in the program analyzer generator PAG. It addresses analyses ranging from "simple" intraprocedural bit vector frameworks to complex interprocedural alias analyses. A high-level specialized functional language is used as the specification mechanism, enabling elegant and concise specifications even for complex analyses. Additionally, it allows the automatic selection of efficient implementations for the underlying abstract datatypes, such as balanced binary trees, binary decision diagrams, bit vectors, and arrays. For interprocedural analysis, the functional approach, the call string approach, and a novel approach specifically targeting the precise analysis of loops can be chosen. In this work the implementation of PAG as well as a large number of applications of PAG are presented.
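    Generated analyzers of this kind typically instantiate the classic worklist fixed-point algorithm from data-flow theory. The Python sketch below illustrates that algorithm against a hypothetical specification interface (init, bottom, join, transfer); it is an illustration of the underlying theory, not PAG's actual implementation or specification language.

        # Worklist fixed-point solver for a monotone data-flow framework.
        # The (init, bottom, join, transfer) interface is an assumed stand-in
        # for what a generator like PAG derives from a specification.

        def analyze(cfg, entry, init, bottom, join, transfer):
            """cfg: {node: [successors]}; returns the least fixed point mapping
            each node to its data-flow value on entry to that node."""
            values = {n: bottom for n in cfg}
            values[entry] = init
            work = list(cfg)
            while work:
                n = work.pop()
                out = transfer(n, values[n])
                for s in cfg[n]:
                    new = join(values[s], out)
                    if new != values[s]:       # value grew: re-process successor
                        values[s] = new
                        work.append(s)
            return values

        # Toy reachability-style analysis over sets (join = union), with a loop.
        cfg = {"a": ["b"], "b": ["c"], "c": ["b"]}
        res = analyze(cfg, "a",
                      init=frozenset({"start"}), bottom=frozenset(),
                      join=lambda x, y: x | y,
                      transfer=lambda n, v: v | {n})
        print(res)

    Termination follows because the lattice (finite sets under union) is finite and transfer functions are monotone, which is exactly the property a generator must establish for the analyses it emits.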

    A Language for Specifying Compiler Optimizations for Generic Software

    Full text link

    Intensional Cyberforensics

    Get PDF
    This work focuses on the application of intensional logic to cyberforensic analysis; its benefits and difficulties are compared with the finite-state-automata (FSA) approach. This work extends the use of the intensional programming paradigm to the modeling and implementation of a cyberforensics investigation process with backtracing of event reconstruction, in which evidence is modeled by multidimensional hierarchical contexts, and proofs or disproofs of claims are undertaken in an eductive manner of evaluation. This approach is a practical, context-aware improvement over the FSA approach we have seen in previous work. As a base implementation language model, we use in this approach a new dialect of the Lucid programming language, called Forensic Lucid, and we focus on defining hierarchical contexts based on intensional logic for the distributed evaluation of cyberforensic expressions. We also augment the work with credibility factors surrounding digital evidence and witness accounts, which have not been previously modeled. The Forensic Lucid programming language, used for this intensional cyberforensic analysis, is formally presented through its syntax and operational semantics. In large part, the language is based on its predecessor and codecessor Lucid dialects, such as GIPL, Indexical Lucid, Lucx, Objective Lucid, and JOOIP, bound by the underlying intensional programming paradigm.
    Comment: 412 pages, 94 figures, 18 tables, 19 algorithms and listings; PhD thesis; v2 corrects some typos and refs; also available on Spectrum at http://spectrum.library.concordia.ca/977460
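    A minimal sketch of the core intensional idea, assuming a hypothetical tuple AST: a variable denotes a function of a multidimensional context, and an "at" operator rebinds one dimension, in the spirit of Lucid's @ operator. Forensic Lucid itself is far richer (hierarchical contexts, credibility factors, operational semantics), so this only illustrates context-dependent, demand-driven evaluation.

        # Tiny intensional (Lucid-style) evaluator over a context of dimensions.
        # The AST shape and names are assumptions for illustration only.

        def eval_expr(e, defs, ctx):
            """ctx maps dimension names to tags; ('at', dim, tag, body) evaluates
            body in a context where `dim` is rebound, like Lucid's @ operator."""
            kind = e[0]
            if kind == "const":
                return e[1]
            if kind == "dim":                 # query the current context (#.d)
                return ctx[e[1]]
            if kind == "var":                 # eductive (demand-driven) lookup
                return eval_expr(defs[e[1]], defs, ctx)
            if kind == "at":                  # body @ [dim : tag]
                _, dim, tag_e, body = e
                new_ctx = dict(ctx, **{dim: eval_expr(tag_e, defs, ctx)})
                return eval_expr(body, defs, new_ctx)
            if kind == "add":
                return eval_expr(e[1], defs, ctx) + eval_expr(e[2], defs, ctx)
            raise ValueError(kind)

        # x varies with dimension t: x = #.t + 1; evaluate x @ [t : 41].
        defs = {"x": ("add", ("dim", "t"), ("const", 1))}
        print(eval_expr(("at", "t", ("const", 41), ("var", "x")), defs, {"t": 0}))  # 42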

    Proceedings of Monterey Workshop 2001 Engineering Automation for Software Intensive System Integration

    Get PDF
    The 2001 Monterey Workshop on Engineering Automation for Software Intensive System Integration was sponsored by the Office of Naval Research, the Air Force Office of Scientific Research, the Army Research Office and the Defense Advanced Research Projects Agency. It is our pleasure to thank the workshop advisory committee and sponsors for their vision of a principled engineering solution for software and for their tireless effort over many years in supporting the series of workshops that bring everyone together. This workshop is the 8th in a series of international workshops. The workshop was held at the Monterey Beach Hotel, Monterey, California, during June 18-22, 2001. The general theme of the workshop series has been to present and discuss research that aims at increasing the practical impact of formal methods for software and systems engineering. The particular focus of this workshop was "Engineering Automation for Software Intensive System Integration". Previous workshops have focused on topics including "Real-time & Concurrent Systems", "Software Merging and Slicing", "Software Evolution", "Software Architecture", "Requirements Targeting Software" and "Modeling Software System Structures in a fastly moving scenario".
    Office of Naval Research; Air Force Office of Scientific Research; Army Research Office; Defense Advanced Research Projects Agency
    Approved for public release, distribution unlimited