2,560 research outputs found

    The MANGO clockless network-on-chip: Concepts and implementation

    Get PDF

    Non-intrusive on-the-fly data race detection using execution replay

    Full text link
    This paper presents a practical solution for detecting data races in parallel programs. The solution consists of a combination of execution replay (RecPlay) with automatic on-the-fly data race detection. This combination enables us to perform the data race detection on an unaltered execution (almost no probe effect). Furthermore, the usage of multilevel bitmaps and snooped matrix clocks limits the amount of memory used. As the record phase of RecPlay is highly efficient, there is no need to switch it off, hereby eliminating the possibility of Heisenbugs because tracing can be left on all the time.Comment: In M. Ducasse (ed), proceedings of the Fourth International Workshop on Automated Debugging (AAdebug 2000), August 2000, Munich. cs.SE/001003

    Process Management and Exception Handling in Multiprocessor Operating Systems Using Object-Oriented Design Techniques

    Get PDF
    The programming of the interrupt handling mechanisms, process switching primitives, scheduling mechanism, and synchronization primitives of an operating system for a multiprocessor require both efficient code in order to support the needs of high- performance or real-time applications and careful organization to facilitate maintenance. Although many advantages have been claimed for object-oriented class hierarchical languages and their corresponding design methodologies, the application of these techniques to the design of the primitives within an operating system has not been widely demonstrated. To investigate the role of class hierarchical design in systems programming, the authors have constructed the Choices multiprocessor operating system architecture the C++ programming language. During the implementation, it was found that many operating system design concerns can be represented advantageously using a class hierarchical approach, including: the separation of mechanism and policy; the organization of an operating system into layers, each of which represents an abstract machine; and the notions of process and exception management. In this paper, we discuss an implementation of the low-level primitives of this system and outline the strategy by which we developed our solution

    A UNIFIED HARDWARE/SOFTWARE PRIORITY SCHEDULING MODEL FOR GENERAL PURPOSE SYSTEMS

    Get PDF
    Migrating functionality from software to hardware has historically held the promise of enhancing performance through exploiting the inherent parallel nature of hardware. Many early exploratory efforts in repartitioning traditional software based services into hardware were hampered by expensive ASIC development costs. Recent advancements in FPGA technology have made it more economically feasible to explore migrating functionality across the hardware/software boundary. The flexibility of the FPGA fabric and availability of configurable soft IP components has opened the potential to rapidly and economically investigate different hardware/software partitions. Within the real time operating systems community, there has been continued interest in applying hardware/software co-design approaches to address scheduling issues such as latency and jitter. Many hardware based approaches have been reported to reduce the latency of computing the scheduling decision function itself. However continued adherence to classic scheduler invocation mechanisms can still allow variable latencies to creep into the time taken to make the scheduling decision, and ultimately into application timelines. This dissertation explores how hardware/software co-design can be applied past the scheduling decision itself to also reduce the non-predictable delays associated with interrupts and timers. By expanding the window of hardware/software co-design to these invocation mechanisms, we seek to understand if the jitter introduced by classical hardware/software partitionings can be removed from the timelines of critical real time user processes. This dissertation makes a case for resetting the classic boundaries of software thread level scheduling, software timers, hardware timers and interrupts. We show that reworking the boundaries of the scheduling invocation mechanisms helps to rectify the current imbalance of traditional hardware invocation mechanisms (timers and interrupts) and software scheduling policy (operating system scheduler). We re-factor these mechanisms into a unified hardware software priority scheduling model to facilitate improvements in performance, timeliness and determinism in all domains of computing. This dissertation demonstrates and prototypes the creation of a new framework that effects this basic policy change. The advantage of this approach lies within it's ability to unify, simplify and allow for more control within the operating systems scheduling policy

    Digital signal conditioning on multiprocessor systems

    Get PDF
    An important application area of modem computer systems is that of digital signal processing. This discipline is concerned with the analysis or modification of digitally represented signals, through the use of simple mathematical operations. A primary need of such systems is that of high data throughput. Although optimised programmable processors are available, system designers are now looking towards parallel processing to gain further performance increases. Such parallel systems may be easily constructed using the transputer family of processors. However, although these devices are comparatively easy to program, they possess a general von Neumann core and so are relatively inefficient at implementing digital signal processing algorithms. The power of the transputer lies in its ability to communicate effectively, not in its computational capability. The converse is true of specialised digital signal processors. These devices have been designed specifically to implement the type of small data intensive operations required by digital signal processing algorithms, but have not been designed to operate efficiently in a multiprocessor environment. This thesis examines the performance of both types of processors with reference to a common signal processing application, multichannel filtering. The transputer is examined in both uniprocessor and multiprocessor configurations, and its performance analysed. A theoretical model of program behaviour is developed, in order to assess the performance benefits of particular code structures and the effects of such parameters as data block size. The transputer implementation is contrasted with that of the Motorola DSP56001 digital signal processor. This device is found to be much more efficient at implementing such algorithms on a single device, but provides limited multiprocessor support. Using the conclusions of this assessment, a hybrid multiprocessor has been designed. This consists of a transputer controlling a number of signal processors, communicating through shared memory, separating tiie tasks of computation and communication. Forcing the transputer to communicate through shared memory causes problems, and these have been addressed. A theoretical performance model of the system has been produced. A small system has been constructed, and is currently running performance test software

    Scheduling support for concurrency and parallelism in the Mach operating system

    Full text link
    • …
    corecore