1,040 research outputs found

    MURAC: A unified machine model for heterogeneous computers

    Includes bibliographical references. Heterogeneous computing enables the performance and energy advantages of multiple distinct processing architectures to be efficiently exploited within a single machine. These systems are capable of delivering large performance increases by matching applications to the architectures that are most suited to them. The Multiple Runtime-reconfigurable Architecture Computer (MURAC) model has been proposed to tackle the problems commonly found in the design and usage of these machines. This model presents a system-level approach that creates a clear separation of concerns between the system implementer and the application developer. The three key concepts that make up the MURAC model are a unified machine model, a unified instruction stream and a unified memory space. A simple programming model built upon these abstractions provides the user application with a consistent interface for interacting with the underlying machine. This programming model simplifies application partitioning between hardware and software and allows the easy integration of different execution models within the single control flow of a mixed-architecture application. The theoretical and practical trade-offs of the proposed model have been explored through the design of several systems. An instruction-accurate system simulator has been developed that supports the simulated execution of mixed-architecture applications. An embedded System-on-Chip implementation has been used to measure the overhead in hardware resources required to support the model, which was found to be minimal. An implementation of the model within an operating system on a tightly-coupled reconfigurable processor platform has been created. This implementation is used to extend the software scheduler to allow for the full support of mixed-architecture applications in a multitasking environment. Different scheduling strategies have been tested using this scheduler for mixed-architecture applications.
The design and implementation of these systems have shown that a unified abstraction model for heterogeneous computers provides important usability benefits to system and application designers. These benefits are achieved by presenting a consistent view of the multiple different architectures to the operating system and user applications. This allows designers to focus on achieving their performance and efficiency goals by gaining the benefits of different execution models during runtime, without having to deal with the complex implementation details of system-level synchronisation and coordination.

    An automatic programming tool for heterogeneous

    Recent advances in network technology and the higher levels of circuit integration due to VLSI have led to widespread interest in the use of multiprocessor systems in solving many practical problems. As the hardware continues to diminish in size and cost, new possibilities are being created for systems that are heterogeneous by design. Parallel multiprocessor architectures are now feasible and provide a valid solution to the throughput demands of increasingly sophisticated control and/or instrumentation systems. Increasing the number of processors and the complexity of the problems to be solved makes programming multiprocessor systems more difficult and error-prone. This paper describes some already implemented parts (mainly the scheduler) of a software development tool for heterogeneous multiprocessor systems that will automatically perform code generation, execution time estimation, scheduling, and the insertion of communication primitives.

    An architecture for intelligent task interruption

    In the design of real-time systems the capability for task interruption is often considered essential. The problem of task interruption in knowledge-based domains is examined. It is proposed that task interruption can often be avoided by using appropriate functional architectures and knowledge engineering principles. For situations in which task interruption is indispensable, a preliminary architecture based on priority hierarchies is described.

    Performance and energy footprint assessment of FPGAs and GPUs on HPC systems using Astrophysics application

    New challenges in Astronomy and Astrophysics (AA) are urging the need for a large number of exceptionally computationally intensive simulations. "Exascale" (and beyond) computational facilities are mandatory to address the size of theoretical problems and data coming from the new generation of observational facilities in AA. Currently, the High Performance Computing (HPC) sector is undergoing a profound phase of innovation, in which the primary challenge to the achievement of the "Exascale" is power consumption. The goal of this work is to give some insights into the performance and energy footprint of contemporary architectures for a real astrophysical application in an HPC context. We use a state-of-the-art N-body application that we re-engineered and optimized to fully exploit the underlying heterogeneous hardware. We quantitatively evaluate the impact of computation on energy consumption when running on four different platforms. Two of them represent the current HPC systems (Intel-based and equipped with NVIDIA GPUs), one is a micro-cluster based on ARM-MPSoC, and one is a "prototype towards Exascale" equipped with ARM-MPSoCs tightly coupled with FPGAs. We investigate the behavior of the different devices: the high-end GPUs excel in terms of time-to-solution, while MPSoC-FPGA systems outperform GPUs in power consumption. Our experience reveals that considering FPGAs for computationally intensive applications seems very promising, as their performance is improving to meet the requirements of scientific applications. This work can be a reference for the development of future platforms for astrophysics applications where computationally intensive calculations are required. Comment: 15 pages, 4 figures, 3 tables; Preprint (V2) submitted to MDPI (Special Issue: Energy-Efficient Computing on Parallel Architectures).

    Evaluation of hardware architectures for parallel execution of complex database operations

    New database applications, primarily in the areas of engineering and knowledge-based systems, refer to complex objects (e.g. representation of a CAD workpiece or a VLSI chip) while performing their tasks. Retrieval, maintenance, and integrity checking of such complex objects consume substantial computing resources which were traditionally used by conventional database management systems in a sequential manner. Rigid performance goals dictated by interactive use and design environments call for new approaches to master the functionality of complex objects under satisfactory time restrictions. Because of the object granularity, the set orientation of the database interface, and the complicated algorithms for object handling, the exploitation of parallelism within such operations seems to be promising. Our main goal is the investigation and evaluation of different hardware architectures and their suitability to efficiently cope with workloads generated by database operations on complex objects. Apparently, employing just a number of processors is not a panacea for our database problem. The sheer horsepower of machines does not help very much when data synchronization and event serialization requirements play a major role during object handling. What are the critical hardware architecture properties? How can the existing MIPS be best utilized for the data management functions when processing complex objects? To answer these questions and related issues, we discuss different kinds of architectures combining multiple processors: loosely-, tightly-, and closely-coupled. Furthermore, we consider parallelism at different levels of abstraction: the distribution of (sub-)queries or the decomposition of such queries and their concurrent evaluation at an inter- or intra-object level. Finally, we give some thoughts as to the problems of load control and transaction management.