
    APOLLO: Automatic speculative POLyhedral Loop Optimizer

    A few weeks ago, we were glad to announce the first release of Apollo, the Automatic speculative POLyhedral Loop Optimizer. Apollo applies polyhedral optimizations on the fly to loop nests whose control flow and memory access patterns cannot be determined at compile time. In contrast to existing tools, Apollo can handle any kind of loop nest, including those whose memory accesses are performed through pointers and indirections. At runtime, Apollo builds a predictive polyhedral model, which is used for speculative optimization, including parallelization. Being a dynamic system, Apollo can even apply the polyhedral model to nonlinear loops. This paper describes Apollo from the perspective of a user, as well as some of its main contributions and mechanisms, including just-in-time polyhedral compilation, which significantly extends the scope of polyhedral techniques.
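
    To make the scope concrete, here is a hedged C++ sketch (our illustration, not code from the paper; the function and variable names are ours): the kernel indexes one array through another, so a static polyhedral compiler cannot prove the accesses affine, while a runtime system such as Apollo can observe the actual pattern, build a predictive model, and speculate.

        #include <vector>

        // If, at run time, idx[i] happens to follow a linear pattern (say
        // idx[i] = 2*i + 3), a speculative optimizer can fit a polyhedral
        // model to the observed iterations, transform and parallelize the
        // loop, and keep verifying the prediction as execution proceeds.
        void sparse_update(std::vector<double>& a,
                           const std::vector<int>& idx,
                           const std::vector<double>& w)
        {
            for (std::size_t i = 0; i < idx.size(); ++i) {
                a[idx[i]] += w[i];  // memory access through an indirection
            }
        }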

    Création d'une Représentation Polyédrique depuis une Exécution (Building a Polyhedral Representation from an Execution)

    The polyhedral model has been successfully used in production compilers. Nevertheless, only a very restricted class of applications can benefit from it. Recent proposals investigated how runtime information could be used to apply polyhedral optimization on applications that do not statically fit the model. In this work, we go one step further in that direction. We propose a dynamic analysis that builds a compact polyhedral representation from a program execution. It is able to accurately detect affine dependencies and fixed-stride memory accesses in programs. The analysis scales to real-life applications, which often include some non-affine dependencies and accesses in otherwise affine code. This is enabled by a safe fine-grain polyhedral over-approximation mechanism applied to each analyzed expression. We evaluate our analysis on the entire Rodinia benchmark suite, enabling accurate feedback about the potential for complex polyhedral transformations.
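
    As a rough illustration of the over-approximation idea (ours, not the paper's implementation; all names are hypothetical), the C++ sketch below summarizes one instrumented expression's address stream: if the stream has a fixed stride it is kept exact, otherwise it is safely over-approximated by the interval of touched addresses instead of invalidating the whole representation.

        #include <algorithm>
        #include <cstdint>
        #include <vector>

        // Summary of one expression's accesses: either exact fixed-stride
        // (base + i * stride) or an over-approximating address interval.
        struct AccessSummary {
            bool           exact;
            std::uintptr_t base;
            std::intptr_t  stride;   // valid when exact
            std::uintptr_t lo, hi;   // valid when !exact
        };

        // Precondition: addrs is non-empty.
        AccessSummary summarize(const std::vector<std::uintptr_t>& addrs)
        {
            AccessSummary s{};
            s.base = addrs.front();
            if (addrs.size() < 2) { s.exact = true; return s; }
            const auto d = static_cast<std::intptr_t>(addrs[1] - addrs[0]);
            bool fixed = true;
            for (std::size_t i = 2; i < addrs.size() && fixed; ++i)
                fixed = static_cast<std::intptr_t>(addrs[i] - addrs[i - 1]) == d;
            if (fixed) { s.exact = true; s.stride = d; return s; }
            // Non-affine stream: fall back to a safe interval so the rest of
            // the polyhedral representation stays usable.
            auto [mn, mx] = std::minmax_element(addrs.begin(), addrs.end());
            s.lo = *mn; s.hi = *mx;
            return s;
        }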

    Disjunctive invariants for modular static analysis

    Ph.D. thesis (Doctor of Philosophy).

    Formal modelling and analysis of broadcasting embedded control systems

    PhD Thesis. Embedded systems are real-time, communicating systems, and the effective modelling and analysis of these aspects of their behaviour is regarded as essential for acquiring confidence in their correct operation. In practice, it is important to minimise the burden of model construction and to automate the analysis, if possible. Among the most promising techniques for real-time systems are reachability analysis and model-checking of networks of timed automata. We identify two obstacles to the application of these techniques to a large class of distributed embedded systems: firstly, the language of timed automata is too low-level for straightforward model construction, and secondly, the synchronous, handshake communication mechanism of the timed automata model does not fit well with the asynchronous, broadcast mechanism employed in many distributed embedded systems. As a result, the task of model construction can be unduly onerous. This dissertation proposes an expressive language for the construction of models of real-time, broadcasting control systems, and demonstrates how efficient analysis techniques can be applied to them. The dissertation is concerned in particular with the Controller Area Network (CAN) protocol, which is emerging as a de facto standard in the automotive industry. An abstract formal model of CAN is developed. This model is adopted as the communication primitive in a new language, bCANDLE, which includes value passing, broadcast communication, message priorities and explicit time. A high-level language, CANDLE, is introduced and its semantics defined by translation to bCANDLE. We show how realistic CAN systems can be described in CANDLE and how a timed transition model of a system can be extracted for analysis. Finally, it is shown how efficient methods of analysis, such as 'on-the-fly' and symbolic techniques, can be applied to these models. The dissertation contributes to the practical application of formal methods within the domain of broadcasting, embedded control systems. (School of Computing and Mathematics, University of Northumbria.)
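
    The communication primitive such a model must capture can be sketched in a few lines of C++ (our toy, not bCANDLE itself; Frame and Bus are hypothetical names): every pending frame competes for the bus, the lowest identifier wins arbitration, and the winning frame is broadcast to all attached nodes.

        #include <cstdint>
        #include <functional>
        #include <queue>
        #include <vector>

        struct Frame {
            std::uint32_t id;       // CAN identifier: lower value = higher priority
            std::uint64_t payload;  // real CAN frames carry up to 8 data bytes
        };

        struct ByPriority {
            bool operator()(const Frame& a, const Frame& b) const { return a.id > b.id; }
        };

        class Bus {
            std::priority_queue<Frame, std::vector<Frame>, ByPriority> pending_;
            std::vector<std::function<void(const Frame&)>> nodes_;
        public:
            void attach(std::function<void(const Frame&)> rx) { nodes_.push_back(std::move(rx)); }
            void request(const Frame& f) { pending_.push(f); }
            void step() {  // one arbitration round
                if (pending_.empty()) return;
                const Frame winner = pending_.top();  // lowest id wins arbitration
                pending_.pop();
                for (auto& rx : nodes_) rx(winner);   // broadcast to every node
            }
        };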

    Methods for Real-time Visualization and Interaction with Landforms

    This thesis presents methods to enrich data modeling and analysis in the geoscience domain, with a particular focus on geomorphological applications. First, a short overview of the relevant characteristics of the remote sensing data used, and the basics of its processing and visualization, is provided. Then, two new methods for the visualization of vector-based maps on digital elevation models (DEMs) are presented. The first method uses a texture-based approach that generates a texture from the input maps at runtime, taking the current viewpoint into account. In contrast, the second method utilizes the stencil buffer to create a mask in image space that is then used to render the map on top of the DEM. A particular challenge in this context is posed by the view-dependent level-of-detail representation of the terrain geometry. After suitable visualization methods for vector-based maps have been investigated, two landform mapping tools for the interactive generation of such maps are presented. The user can carry out the mapping directly on the textured digital elevation model and thus benefit from the 3D visualization of the relief. Additionally, semi-automatic image segmentation techniques are applied in order to reduce the amount of user interaction required and thus make the mapping process more efficient and convenient. The challenge in adapting these methods lies in transferring the algorithms to the quadtree representation of the data and in applying out-of-core and hierarchical methods to ensure interactive performance. Although high-resolution remote sensing data are often available today, their effective resolution at steep slopes is rather low due to the oblique acquisition angle. For this reason, remote sensing data are suitable only to a limited extent for visualization and landform mapping purposes. To provide an easy way to supply additional imagery, an algorithm for registering uncalibrated photos to a textured digital elevation model is presented. A particular challenge in registering the images is posed by large variations among the photos concerning resolution, lighting conditions, seasonal changes, etc. The registered photos can be used to increase the visual quality of the textured DEM, in particular at steep slopes. To this end, a method is presented that combines several georegistered photos into textures for the DEM. The difficulty in this compositing process is to create a consistent appearance and avoid visible seams between the photos. In addition, the photos provide valuable means to improve landform mapping. To this end, an extension of the landform mapping methods is presented that allows the registered photos to be utilized during mapping. This way, a detailed and exact mapping becomes feasible even at steep slopes.
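
    The stencil-based variant can be sketched with standard OpenGL state (a minimal sketch under our own assumptions; drawMapPolygons() and drawTerrain() stand in for the thesis's actual rendering code): the map is rasterized into a stencil mask in image space, then the terrain is redrawn only where the mask is set.

        #include <GL/gl.h>

        void drawMapPolygons();  // hypothetical: vector map projected onto the terrain
        void drawTerrain();      // hypothetical: level-of-detail DEM geometry

        void renderMapOverlay()
        {
            glClear(GL_STENCIL_BUFFER_BIT);
            glEnable(GL_STENCIL_TEST);

            // Pass 1: write the mask; touch neither color nor depth.
            glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE);
            glDepthMask(GL_FALSE);
            glStencilFunc(GL_ALWAYS, 1, 0xFF);
            glStencilOp(GL_KEEP, GL_KEEP, GL_REPLACE);
            drawMapPolygons();

            // Pass 2: redraw the terrain only where the mask was written,
            // independent of the current level-of-detail triangulation.
            glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
            glDepthMask(GL_TRUE);
            glStencilFunc(GL_EQUAL, 1, 0xFF);
            glStencilOp(GL_KEEP, GL_KEEP, GL_KEEP);
            drawTerrain();

            glDisable(GL_STENCIL_TEST);
        }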

    Exploiting BSP Abstractions for Compiler Based Optimizations of GPU Applications on multi-GPU Systems

    Graphics Processing Units (GPUs) are accelerators for computers and provide massive amounts of computational power and bandwidth for amenable applications. While effectively utilizing an individual GPU already requires a high level of skill, effectively utilizing multiple GPUs introduces completely new types of challenges. This work sets out to investigate how the hierarchical execution model of GPUs can be exploited to simplify the utilization of such multi-GPU systems. The investigation starts with an analysis of the memory access patterns exhibited by applications from common GPU benchmark suites. Memory access patterns are collected using custom instrumentation, and a simple simulation then analyzes the patterns and identifies implicit communication across the different levels of the execution hierarchy. The analysis reveals that for most GPU applications memory accesses are highly localized and there exists a way to partition the workload so that the communication volume grows more slowly than the aggregated bandwidth as the number of GPUs grows. Next, an application model based on Z-polyhedra is derived that formalizes the distribution of work across multiple GPUs and allows the identification of data dependencies. The model is then used to implement a prototype compiler that consumes single-GPU programs and produces executables that distribute GPU workloads across all available GPUs in a system. It uses static analysis to identify memory access patterns, and polyhedral code generation in combination with a dynamic tracking system to efficiently resolve data dependencies. The prototype is implemented as an extension to the LLVM/Clang compiler and is published in full source. The prototype compiler is then evaluated using a set of benchmark applications. While the prototype is limited in its applicability by technical issues, it provides impressive speedups of up to 12.4x on 16 GPUs for amenable applications. An in-depth analysis of the application runtime reveals that dependency resolution takes up less than 10% of the runtime, often significantly less. A discussion follows that puts the work into context by presenting and differentiating related work, reflecting critically on the work itself, and giving an outlook on aspects that could be explored in future research. The work concludes with a summary and a closing opinion.
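
    The partitioning argument can be made concrete with a host-side CUDA sketch (ours, not output of the prototype compiler; the slab layout and names are assumptions): each GPU owns a contiguous slab of grid rows plus two halo rows, and only the boundary rows are exchanged, so per-GPU communication is O(nx) while aggregate bandwidth grows with the number of GPUs.

        #include <cuda_runtime.h>
        #include <vector>

        // slab[d] points to GPU d's slab of (rows[d] + 2) * nx doubles:
        // row 0 is the top halo, rows 1..rows[d] are owned, and the last
        // row is the bottom halo.
        void exchangeHalos(std::vector<double*>& slab,
                           const std::vector<int>& rows, int ngpus, int nx)
        {
            const std::size_t rowBytes = nx * sizeof(double);
            for (int d = 0; d + 1 < ngpus; ++d) {
                // Last owned row of GPU d -> top halo of GPU d+1.
                cudaMemcpyPeer(slab[d + 1], d + 1,
                               slab[d] + rows[d] * nx, d, rowBytes);
                // First owned row of GPU d+1 -> bottom halo of GPU d.
                cudaMemcpyPeer(slab[d] + (rows[d] + 1) * nx, d,
                               slab[d + 1] + nx, d + 1, rowBytes);
            }
        }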

    Worst-Case Execution Time Analysis for C++ based Real-Time On-Board Software Systems

    Autonomous systems are today's trend in the aerospace domain. These systems require more on-board data processing capabilities. They follow data-flow programming and have similar software architectures. Developing a framework that is applicable to these architectures reduces development effort and improves re-usability. However, an essential requirement of its design is a programming language that can offer both abstraction and static memory capabilities. As a result, C++ was chosen to develop the Tasking Framework, which is used to develop on-board data-flow-oriented applications. Validating the timing requirements for such a framework is a long, complicated process. Estimating the worst-case execution time (WCET) is the first step within this process. Thus, in this thesis, we focus on performing WCET analysis for C++ model-based applications developed with the Tasking Framework. This work deals with two main challenges that emerge from using C++: using objects imposes the need for a memory model, and using virtual methods implies indirect jumps. To this end, we developed a tool based on symbolic execution that can handle both challenges. The tool showed a high precision of nearly 90% in bounding the loops of the benchmark suite. We then integrated our advanced analysis with an open toolbox for adaptive WCET analysis. Finally, we evaluated our approach by estimating the WCET of tasks developed with the Tasking Framework.
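
    The indirect-jump challenge can be illustrated with a toy C++ bound (our sketch, not the thesis's tool; all names are hypothetical): a virtual call is an indirect jump through a vtable, so a safe WCET analysis enumerates every possible override and takes the worst case among them.

        #include <algorithm>
        #include <cstdint>
        #include <map>
        #include <string>
        #include <vector>

        using Cycles = std::uint64_t;

        // overriders maps a virtual method to the implementations that may be
        // dispatched at the call site; calleeWcet gives each one's WCET bound.
        Cycles boundVirtualCall(
            const std::string& method,
            const std::map<std::string, std::vector<std::string>>& overriders,
            const std::map<std::string, Cycles>& calleeWcet)
        {
            Cycles worst = 0;
            for (const auto& impl : overriders.at(method))
                worst = std::max(worst, calleeWcet.at(impl));
            return worst;  // safe upper bound for the indirect jump
        }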