8 research outputs found

    Identifying and exploiting concurrency in object-based real-time systems

    The use of object-based mechanisms, i.e., abstract data types (ADTs), for constructing software systems can help to decrease development costs and to increase understandability and maintainability. However, execution efficiency may be sacrificed due to the large number of procedure calls, and due to contention for shared ADTs in concurrent systems. Such inefficiencies are a concern in real-time applications with stringent timing requirements. To address these issues, the potentially inefficient procedure calls are turned into a source of concurrency via asynchronous remote procedure calls (ARPCs), and contention for shared ADTs is reduced via ADT cloning. A framework for concurrency analysis in object-based systems is developed, and compiler techniques for identifying potential concurrency via ARPCs and cloning are introduced. Exploitation of the parallelizing compiler techniques is illustrated in the context of an incremental schedule construction algorithm that enhances concurrency incrementally so that feasible real-time schedules can be constructed. Experimental results show large speedup gains with these techniques. Additionally, experiments show that the concurrency enhancement techniques are often useful in constructing feasible schedules for hard real-time systems.
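The two transformations named in this abstract can be illustrated with a minimal Python sketch. All names here (`Counter`, `add_range`) are hypothetical, and a thread pool stands in for the thesis's compiler-generated ARPC machinery: calls on an ADT are issued asynchronously, and per-task clones of the ADT avoid contention on a single shared instance.

```python
# Sketch of ARPC-style asynchrony plus ADT cloning (hypothetical names).
from concurrent.futures import ThreadPoolExecutor

class Counter:
    """A tiny shared ADT: accumulates a sum."""
    def __init__(self):
        self.total = 0

    def add_range(self, lo, hi):
        for i in range(lo, hi):
            self.total += i
        return self.total

# Synchronous baseline: both calls serialize on one shared instance.
shared = Counter()
shared.add_range(0, 500)
shared.add_range(500, 1000)

# ARPC-style: issue the calls asynchronously, each against its own
# clone of the ADT, then merge the partial results afterwards
# (safe here because integer addition commutes and associates).
with ThreadPoolExecutor(max_workers=2) as pool:
    clones = [Counter(), Counter()]
    futures = [pool.submit(c.add_range, lo, hi)
               for c, (lo, hi) in zip(clones, [(0, 500), (500, 1000)])]
    merged = sum(f.result() for f in futures)

assert merged == shared.total  # same answer, computed concurrently
```

The merge step is the cost of cloning: it is only valid when the ADT's operations can be recombined, which is the kind of condition the thesis's concurrency analysis must establish.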

    Identifying reusable functions in code using specification driven techniques

    The work described in this thesis addresses the field of software reuse. Software reuse is widely considered a way to increase productivity and to improve the quality and reliability of new software systems. Identifying, extracting and reengineering software components which implement abstractions within existing systems is a promising, cost-effective way to create reusable assets. Such a process is referred to as reuse reengineering. A reference paradigm defined within the RE(^2) project decomposes a reuse reengineering process into five sequential phases. In particular, the first phase of the reference paradigm, called the Candidature phase, is concerned with the analysis of source code to identify software components that implement abstractions and are therefore candidates for reuse. Different candidature criteria exist for the identification of reuse-candidate software components. They can be classified into structural methods (based on structural properties of the software) and specification driven methods (which search for software components implementing a given specification). In this thesis a new specification driven candidature criterion for the identification and extraction of code fragments implementing functional abstractions is presented. The method is driven by a formal specification of the function to be isolated (given in terms of a precondition and a postcondition) and is based on the theoretical frameworks of program slicing and symbolic execution. Symbolic execution and theorem proving techniques are used to map the specification of the functional abstraction onto a slicing criterion. Once the slicing criterion has been identified, the slice is isolated using algorithms based on dependence graphs. The method has been specialised for programs written in the C language. Both symbolic execution and program slicing are performed by exploiting the Combined C Graph (CCG), a fine-grained dependence-based program representation that can be used for several software maintenance tasks.
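The core of slice isolation is a backward walk over def/use dependences. The following toy sketch (hypothetical statement encoding, far simpler than the thesis's CCG, and ignoring control dependences) shows the idea: starting from a slicing criterion, collect every statement whose definitions transitively reach the criterion's variables.

```python
# Toy backward program slicing over statement-level def/use sets.
# Each statement: (id, variables defined, variables used).
stmts = [
    (1, {"a"}, set()),   # a = input()
    (2, {"b"}, set()),   # b = input()
    (3, {"s"}, {"a"}),   # s = a * a
    (4, {"t"}, {"b"}),   # t = b + 1
    (5, {"r"}, {"s"}),   # r = s - 2   <- slicing criterion: r at stmt 5
]

def backward_slice(stmts, criterion_id):
    """Return ids of statements the criterion transitively depends on."""
    relevant = set()    # variables currently relevant to the criterion
    slice_ids = set()
    for sid, defs, uses in reversed(stmts):
        if sid == criterion_id or (defs & relevant):
            slice_ids.add(sid)
            # A definition satisfies the demand for its variables;
            # the statement's own uses become newly relevant.
            relevant = (relevant - defs) | uses
    return sorted(slice_ids)

print(backward_slice(stmts, 5))  # statements 2 and 4 are sliced away
```

In the thesis's setting, the slicing criterion itself is not given by hand but derived from the precondition/postcondition specification via symbolic execution and theorem proving; this sketch covers only the final dependence-graph step.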

    Array optimizations for high productivity programming languages

    While the HPCS languages (Chapel, Fortress and X10) have introduced improvements in programmer productivity, several challenges still remain in delivering high performance. In the absence of optimization, the high-level language constructs that improve productivity can result in order-of-magnitude runtime performance degradations. This dissertation addresses the problem of efficient code generation for high-level array accesses in the X10 language. The X10 language supports rank-independent specification of loop and array computations using regions and points. Three aspects of high-level array accesses in X10 are important for productivity but also pose significant performance challenges: high-level accesses are performed through Point objects rather than integer indices, variables containing references to arrays are rank-independent, and array subscripts are verified as legal array indices during runtime program execution. Our solution to the first challenge is to introduce new analyses and transformations that enable automatic inlining and scalar replacement of Point objects. Our solution to the second challenge is a hybrid approach. We use an interprocedural rank analysis algorithm to automatically infer ranks of arrays in X10. We use rank analysis information to enable storage transformations on arrays. If rank-independent array references still remain after compiler analysis, the programmer can use X10's dependent type system to safely annotate array variable declarations with additional information for the rank and region of the variable, and to enable the compiler to generate efficient code in cases where the dependent type information is available. Our solution to the third challenge is to use a new interprocedural array bounds analysis approach using regions to automatically determine when runtime bounds checks are not needed. 
Our performance results show that our optimizations deliver performance that rivals the performance of hand-tuned code with explicit rank-specific loops and lower-level array accesses, and is up to two orders of magnitude faster than unoptimized, high-level X10 programs. These optimizations also result in scalability improvements of X10 programs as we increase the number of CPUs. While we perform the optimizations primarily in X10, these techniques are applicable to other high-productivity languages such as Chapel and Fortress.
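The first transformation described above, inlining and scalar replacement of Point objects, can be mimicked in a short Python sketch. The `Point` class here is a hypothetical stand-in for X10's point values, not its actual API; the point is the before/after shape of the loop.

```python
# Before/after sketch of Point inlining + scalar replacement.
class Point:
    """Hypothetical stand-in for an X10 point: a boxed (i, j) index."""
    def __init__(self, i, j):
        self.i, self.j = i, j

def sum_high_level(a):
    # High-level style: a fresh Point object is materialized for
    # every element access, and subscripts go through its fields.
    total = 0
    for i in range(len(a)):
        for j in range(len(a[0])):
            p = Point(i, j)
            total += a[p.i][p.j]
    return total

def sum_scalar_replaced(a):
    # After scalar replacement: the Point's fields have been replaced
    # by the integer induction variables themselves, so no per-element
    # object allocation or field load remains.
    total = 0
    for i in range(len(a)):
        for j in range(len(a[0])):
            total += a[i][j]
    return total

grid = [[1, 2, 3], [4, 5, 6]]
assert sum_high_level(grid) == sum_scalar_replaced(grid) == 21
```

In X10 the payoff is larger than this sketch suggests, since the abstract also relies on rank analysis and region-based bounds analysis to remove rank-generic dispatch and runtime subscript checks from the optimized form.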

    Compiling for parallel multithreaded computation on symmetric multiprocessors

    Thesis (Ph.D.) -- Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998. By Andrew Shaw. Includes bibliographical references (p. 145-149).

    Multipurpose short-term memory structures

    Thesis (M.Phil.) -- Chinese University of Hong Kong, 1995. By Yung, Chan. Includes bibliographical references (leaves 107-110).

    Table of contents:
        Abstract (p.i)
        Acknowledgement (p.iii)
        Chapter 1  Introduction (p.1)
            1.1  Cache (p.1)
                1.1.1  Introduction (p.1)
                1.1.2  Data Prefetching (p.2)
            1.2  Register (p.2)
            1.3  Problems and Challenges (p.3)
                1.3.1  Overhead of registers (p.3)
                1.3.2  EReg (p.5)
            1.4  Organization of the Thesis (p.6)
        Chapter 2  Previous Studies (p.8)
            2.1  Introduction (p.8)
            2.2  Data aliasing (p.9)
            2.3  Data prefetching (p.12)
                2.3.1  Introduction (p.12)
                2.3.2  Hardware Prefetching (p.12)
                2.3.3  Prefetching with Software Support (p.13)
                2.3.4  Reducing Cache Pollution (p.14)
        Chapter 3  BASIC and ADM Models (p.15)
            3.1  Introduction of Basic Model (p.15)
            3.2  Architectural and Operational Detail of Basic Model (p.18)
            3.3  Discussion (p.19)
                3.3.1  Implicit Storing (p.19)
                3.3.2  Associative Logic (p.22)
            3.4  Example for Basic Model (p.22)
            3.5  Simulation Results (p.23)
            3.6  Temporary Storage Problem in Basic Model (p.29)
                3.6.1  Introduction (p.29)
                3.6.2  Discussion on the Solutions (p.31)
            3.7  Introduction of ADM Model (p.35)
            3.8  Architectural and Operational Detail of ADM Model (p.37)
            3.9  Discussion (p.39)
                3.9.1  File Partition (p.39)
                3.9.2  STORE Instruction (p.39)
            3.10  Example for ADM Model (p.40)
            3.11  Simulation Results (p.40)
            3.12  Temporary Storage Problem of ADM Model (p.46)
                3.12.1  Introduction (p.46)
                3.12.2  Discussion on the Solutions (p.46)
        Chapter 4  ADS Model and ADSM Model (p.49)
            4.1  Introduction of ADS Model (p.49)
            4.2  Architectural and Operational Detail of ADS Model (p.50)
            4.3  Discussion (p.52)
                4.3.1  Prefetching Priority (p.52)
                4.3.2  Data Prefetching (p.53)
                4.3.3  EReg File Splitting (p.53)
                4.3.4  Compiling Procedure (p.53)
            4.4  Example for ADS Model (p.54)
            4.5  Simulation Results (p.55)
            4.6  Discussion on the Architectural and Operational Variations for ADS Model (p.62)
                4.6.1  Temporary Storage Problem (p.62)
                4.6.2  Operational Variation for Data Prefetching (p.63)
            4.7  Introduction of ADSM Model (p.64)
            4.8  Architectural and Operational Detail of ADSM Model (p.65)
            4.9  Discussion (p.67)
            4.10  Example for ADSM Model (p.67)
            4.11  Simulation Results (p.68)
            4.12  Discussion on the Architectural and Operational Variations for ADSM Model (p.71)
                4.12.1  Temporary Storage Problem (p.71)
                4.12.2  Operational Variation for Data Prefetching (p.73)
        Chapter 5  IADSM Model and IADSMC&IDLC Model (p.75)
            5.1  Introduction of IADSM Model (p.75)
            5.2  Architectural and Operational Detail of IADSM Model (p.76)
            5.3  Discussion (p.79)
                5.3.1  Implicit Loading (p.79)
                5.3.2  Compiling Procedure (p.81)
            5.4  Example for IADSM Model (p.81)
            5.5  Simulation Results (p.84)
            5.6  Temporary Storage Problem of IADSM Model (p.87)
            5.7  Introduction of IADSMC&IDLC Model (p.88)
            5.8  Architectural and Operational Detail of IADSMC&IDLC Model (p.89)
            5.9  Discussion (p.90)
                5.9.1  Additional Operations (p.90)
                5.9.2  Compiling Procedure (p.93)
            5.10  Example for IADSMC&IDLC Model (p.93)
            5.11  Simulation Results (p.94)
            5.12  Temporary Storage Problem of IADSMC&IDLC Model (p.96)
        Chapter 6  Compiler and Memory System Support for EReg (p.99)
            6.1  Impact on Compiler (p.99)
                6.1.1  Register Usage (p.99)
                6.1.2  Effect of Unrolling (p.100)
                6.1.3  Code Scheduling Algorithm (p.101)
            6.2  Impact on Memory System (p.102)
                6.2.1  Memory Bottleneck (p.102)
                6.2.2  Size of EReg Files (p.103)
        Chapter 7  Conclusions (p.104)
            7.1  Summary (p.104)
            7.2  Future Research (p.105)
        Bibliography (p.107)
        Appendix A  Source code of the Kernels (p.111)
        Appendix B  Program Analysis (p.126)
            B.1  Program analysed by Basic Model (p.126)
            B.2  Program analysed by ADM Model (p.133)
            B.3  Program analysed by ADS Model (p.140)
            B.4  Program analysed by ADSM Model (p.148)
            B.5  Program analysed by IADSM Model (p.156)
            B.6  Program analysed by IADSMC&IDLC Model (p.163)
        Appendix C  Cache Simulation on Prefetching of ADS model (p.17)