7 research outputs found

    Interrupt-generating active data objects

    Get PDF
    An investigation is presented into an interrupt-generating object model which is designed to reduce the effort of programming distributed memory multicomputer networks. The object model is aimed at the natural modelling of problem domains in which a number of concurrent entities interrupt one another as they lay claim to shared resources. The proposed computational model provides for the safe encapsulation of shared data, and incorporates inherent arbitration for simultaneous access to the data. It supplies a predicate triggering mechanism for use in conditional synchronization and as an alternative mechanism to polling. Linguistic support for the proposal requires a novel form of control structure which is able to interface sensibly with interrupt-generating active data objects. The thesis presents the proposal as an elemental language structure, with axiomatic guarantees which enforce safety properties and aid in program proving. The established theory of CSP is used to reason about the object model and its interface. An overview is presented of a programming language called HUL, whose semantics reflect the proposed computational model. Using the syntax of HUL, the application of the interrupt-generating active data object is illustrated. A range of standard concurrent problems is presented to demonstrate the properties of the interrupt-generating computational model. Furthermore, the thesis discusses implementation considerations which enable the model to be mapped precisely onto multicomputer networks, and which sustain the abstract programming level provided by the interrupt-generating active data object in the wider programming structures of HUL

    Interrupt-generating active data objects

    Get PDF
    An investigation is presented into an interrupt-generating object model which is designed to reduce the effort of programming distributed memory multicomputer networks. The object model is aimed at the natural modelling of problem domains in which a number of concurrent entities interrupt one another as they lay claim to shared resources. The proposed computational model provides for the safe encapsulation of shared data, and incorporates inherent arbitration for simultaneous access to the data. It supplies a predicate triggering mechanism for use in conditional synchronization and as an alternative mechanism to polling. Linguistic support for the proposal requires a novel form of control structure which is able to interface sensibly with interrupt-generating active data objects. The thesis presents the proposal as an elemental language structure, with axiomatic guarantees which enforce safety properties and aid in program proving. The established theory of CSP is used to reason about the object model and its interface. An overview is presented of a programming language called HUL, whose semantics reflect the proposed computational model. Using the syntax of HUL, the application of the interrupt-generating active data object is illustrated. A range of standard concurrent problems is presented to demonstrate the properties of the interrupt-generating computational model. Furthermore, the thesis discusses implementation considerations which enable the model to be mapped precisely onto multicomputer networks, and which sustain the abstract programming level provided by the interrupt-generating active data object in the wider programming structures of HUL

    Optimizations and Cost Models for multi-core architectures: an approach based on parallel paradigms

    Get PDF
    The trend in modern microprocessor architectures is clear: multi-core chips are here to stay, and researchers expect multiprocessors with 128 to 1024 cores on a chip in some years. Yet the software community is slowly taking the path towards parallel programming: while some works target multi-cores, these are usually inherited from the previous tools for SMP architectures, and rarely exploit specific characteristics of multi-cores. But most important, current tools have no facilities to guarantee performance or portability among architectures. Our research group was one of the first to propose the structured parallel programming approach to solve the problem of performance portability and predictability. This has been successfully demonstrated years ago for distributed and shared memory multiprocessors, and we strongly believe that the same should be applied to multi-core architectures. The main problem with performance portability is that optimizations are effective only under specific conditions, making them dependent on both the specific program and the target architecture. For this reason in current parallel programming (in general, but especially with multi-cores) optimizations usually follows a try-and-decide approach: each one must be implemented and tested on the specific parallel program to understand its benefits. If we want to make a step forward and really achieve some form of performance portability, we require some kind of prediction of the expected performance of a program. The concept of performance modeling is quite old in the world of parallel programming; yet, in the last years, this kind of research saw small improvements: cost models to describe multi-cores are missing, mainly because of the increasing complexity of microarchitectures and the poor knowledge of specific implementation details of current processors. In the first part of this thesis we prove that the way of performance modeling is still feasible, by studying the Tilera TilePro64. The high number of cores on-chip in this processor (64) required the use of several innovative solutions, such as a complex interconnection network and the use of multiple memory interfaces per chip. For these features the TilePro64 can be considered an insight of what to expect in future multi-core processors. The availability of a cycle-accurate simulator and an extensive documentation allowed us to model the architecture, and in particular its memory subsystem, at the accuracy level required to compare optimizations In the second part, focused on optimizations, we cover one of the most important issue of multi-core architectures: the memory subsystem. In this area multi-core strongly differs in their structure w.r.t off-chip parallel architectures, both SMP and NUMA, thus opening new opportunities. In detail, we investigate the problem of data distribution over the memory controllers in several commercial multi-cores, and the efficient use of the cache coherency mechanisms offered by the TilePro64 processor. Finally, by using the performance model, we study different implementations, derived from the previous optimizations, of a simple test-case application. We are able to predict the best version using only profiled data from a sequential execution. The accuracy of the model has been verified by experimentally comparing the implementations on the real architecture, giving results within 1 − 2% of accuracy
    corecore