363 research outputs found

    Optimal kernel development for real-time communications

    Get PDF
    The purpose of this research is to develop an optimal kernel which would be used in a real-time engineering and communications system. Since the application is a real-time system, relevant real-time issues are studied in conjunction with kernel related issues. The emphasis of the research is the development of a kernel which would not only adhere to the criteria of a real-time environment, namely determinism and performance, but also provide the flexibility and portability associated with non-real-time environments. The essence of the research is to study how the features found in non-real-time systems could be applied to the real-time system in order to generate an optimal kernel which would provide flexibility and architecture independence while maintaining the performance needed by most of the engineering applications. Traditionally, development of real-time kernels has been done using assembly language. By utilizing the powerful constructs of the C language, a real-time kernel was developed which addressed the goals of flexibility and portability while still meeting the real-time criteria. The implementation of the kernel is carried out using the powerful 68010/20/30/40 microprocessor based systems

    Isolating crosscutting concerns in system software

    Get PDF
    This paper reports upon our experience in automatically migrating the crosscutting concerns of a large-scale software system, written in C, to an aspect-oriented implementation. We zoom in on one particular crosscutting concern, and show how detailed information about it is extracted from the source code, and how this information enables us to characterise this code and define an appropriate aspect automatically. Additionally, we compare the already existing solution to the aspect-oriented solution, and discuss advantages as well as disadvantages of both in terms of selected quality attributes. Our results show that automated migration is feasible, and can lead to significant improvements in source code qualit

    Dynamic optimization through the use of automatic runtime specialization

    Get PDF
    Thesis (S.B. and M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1999.Includes bibliographical references (leaves 99-115).by John Whaley.S.B.and M.Eng

    Array bounds check elimination in the context of deoptimization

    Get PDF
    AbstractWhenever an array element is accessed, Java virtual machines execute a compare instruction to ensure that the index value is within the valid bounds. This reduces the execution speed of Java programs. Array bounds check elimination identifies situations in which such checks are redundant and can be removed. We present an array bounds check elimination algorithm for the Java HotSpot™ VM based on static analysis in the just-in-time compiler.The algorithm works on an intermediate representation in static single assignment form and maintains conditions for index expressions. It fully removes bounds checks if it can be proven that they never fail. Whenever possible, it moves bounds checks out of loops. The static number of checks remains the same, but a check inside a loop is likely to be executed more often. If such a check fails, the executing program falls back to interpreted mode, avoiding the problem that an exception is thrown at the wrong place.The evaluation shows a speedup near to the theoretical maximum for the scientific SciMark benchmark suite and also significant improvements for some Java Grande benchmarks. The algorithm slightly increases the execution speed for the SPECjvm98 benchmark suite. The evaluation of the DaCapo benchmarks shows that array bounds checks do not have a significant impact on the performance of object-oriented applications

    Ada (trademark) projects at NASA. Runtime environment issues and recommendations

    Get PDF
    Ada practitioners should use this document to discuss and establish common short term requirements for Ada runtime environments. The major current Ada runtime environment issues are identified through the analysis of some of the Ada efforts at NASA and other research centers. The runtime environment characteristics of major compilers are compared while alternate runtime implementations are reviewed. Modifications and extensions to the Ada Language Reference Manual to address some of these runtime issues are proposed. Three classes of projects focusing on the most critical runtime features of Ada are recommended, including a range of immediately feasible full scale Ada development projects. Also, a list of runtime features and procurement issues is proposed for consideration by the vendors, contractors and the government

    MetaBETA: Model and Implementation

    Get PDF
    Object-oriented programming languages are excellent for expressing abstractions in many application domains. The object-oriented programming methodology allows real-world concepts to modelled in an easy and direct fashion and it supports refinement of concepts. However, many object-oriented languages and their implementations fall short in two areas: dynamic extensibility and reflection.Dynamic extensibility is the ability to incorporate new classes into an application at runtime. Reflection makes it possible for a language to extend its own domain, e.g., to build type-orthogonal functionality. MetaBETA is an extension of the BETA language that supports dynamic extensibility and reflection. MetaBETA has a metalevel interface that provides access to the state of a running application and to the default implementation of language primities.This report presents the model behind MetaBETA. In particular, we discuss the execution model of a MetaBETA program and how type- orthogonal abstractions can be built. This includes precentation of dynamic slots, a mechanism that makes is possible ectend objects at runtime. The other main area covered in this report is the implementation of MetaBETA. The central component of the architecture is a runtime system, which is viewed as a virtual machine whose baselevel interface implements the functionality needed by the programming language

    Generation of Application Specific Hardware Extensions for Hybrid Architectures: The Development of PIRANHA - A GCC Plugin for High-Level-Synthesis

    Get PDF
    Architectures combining a field programmable gate array (FPGA) and a general-purpose processor on a single chip became increasingly popular in recent years. On the one hand, such hybrid architectures facilitate the use of application specific hardware accelerators that improve the performance of the software on the host processor. On the other hand, it obliges system designers to handle the whole process of hardware/software co-design. The complexity of this process is still one of the main reasons, that hinders the widespread use of hybrid architectures. Thus, an automated process that aids programmers with the hardware/software partitioning and the generation of application specific accelerators is an important issue. The method presented in this thesis neither requires restrictions of the used high-level-language nor special source code annotations. Usually, this is an entry barrier for programmers without deeper understanding of the underlying hardware platform. This thesis introduces a seamless programming flow that allows generating hardware accelerators for unrestricted, legacy C code. The implementation consists of a GCC plugin that automatically identifies application hot-spots and generates hardware accelerators accordingly. Apart from the accelerator implementation in a hardware description language, the compiler plugin provides the generation of a host processor interfaces and, if necessary, a prototypical integration with the host operating system. An evaluation with typical embedded applications shows general benefits of the approach, but also reveals limiting factors that hamper possible performance improvements

    Simulation, Analysis, and Optimization of Heterogeneous CPU-GPU Systems

    Get PDF
    With the computing industry\u27s recent adoption of the Heterogeneous System Architecture (HSA) standard, we have seen a rapid change in heterogeneous CPU-GPU processor designs. State-of-the-art heterogeneous CPU-GPU processors tightly integrate multicore CPUs and multi-compute unit GPUs together on a single die. This brings the MIMD processing capabilities of the CPU and the SIMD processing capabilities of the GPU together into a single cohesive package with new HSA features comprising better programmability, coherency between the CPU and GPU, shared Last Level Cache (LLC), and shared virtual memory address spaces. These advancements can potentially bring marked gains in heterogeneous processor performance and have piqued the interest of researchers who wish to unlock these potential performance gains. Therefore, in this dissertation I explore the heterogeneous CPU-GPU processor and application design space with the goal of answering interesting research questions, such as, (1) what are the architectural design trade-offs in heterogeneous CPU-GPU processors and (2) how do we best maximize heterogeneous CPU-GPU application performance on a given system. To enable my exploration of the heterogeneous CPU-GPU design space, I introduce a novel discrete event-driven simulation library called KnightSim and a novel computer architectural simulator called M2S-CGM. M2S-CGM includes all of the simulation elements necessary to simulate coherent execution between a CPU and GPU with shared LLC and shared virtual memory address spaces. I then utilize M2S-CGM for the conduct of three architectural studies. First, I study the architectural effects of shared LLC and CPU-GPU coherence on the overall performance of non-collaborative GPU-only applications. Second, I profile and analyze a set of collaborative CPU-GPU applications to determine how to best optimize them for maximum collaborative performance. Third, I study the impact of varying four key architectural parameters on collaborative CPU-GPU performance by varying GPU compute unit coalesce size, GPU to memory controller bandwidth, GPU frequency, and system wide switching fabric latency
    • …
    corecore