MURAC: A unified machine model for heterogeneous computers
Includes bibliographical references. Heterogeneous computing enables the performance and energy advantages of multiple distinct processing architectures to be efficiently exploited within a single machine. These systems are capable of delivering large performance increases by matching the applications to architectures that are most suited to them. The Multiple Runtime-reconfigurable Architecture Computer (MURAC) model has been proposed to tackle the problems commonly found in the design and usage of these machines. This model presents a system-level approach that creates a clear separation of concerns between the system implementer and the application developer. The three key concepts that make up the MURAC model are a unified machine model, a unified instruction stream and a unified memory space. A simple programming model built upon these abstractions provides a consistent interface for interacting with the underlying machine to the user application. This programming model simplifies application partitioning between hardware and software and allows the easy integration of different execution models within the single control flow of a mixed-architecture application. The theoretical and practical trade-offs of the proposed model have been explored through the design of several systems. An instruction-accurate system simulator has been developed that supports the simulated execution of mixed-architecture applications. An embedded System-on-Chip implementation has been used to measure the overhead in hardware resources required to support the model, which was found to be minimal. An implementation of the model within an operating system on a tightly-coupled reconfigurable processor platform has been created. This implementation is used to extend the software scheduler to allow for the full support of mixed-architecture applications in a multitasking environment. Different scheduling strategies have been tested using this scheduler for mixed-architecture applications.
The design and implementation of these systems have shown that a unified abstraction model for heterogeneous computers provides important usability benefits to system and application designers. These benefits are achieved through a consistent view of the multiple different architectures to the operating system and user applications. This allows them to focus on achieving their performance and efficiency goals by gaining the benefits of different execution models during runtime without the complex implementation details of the system-level synchronisation and coordination.
An automatic programming tool for heterogeneous
Recent advances in network technology and the higher levels of circuit integration due to VLSI have led to widespread interest in the use of multiprocessor systems in solving many practical problems. As the hardware continues to diminish in size and cost, new possibilities are being created for systems that are heterogeneous by design. Parallel multiprocessor architectures are now feasible and provide a valid solution to the throughput demands of increasingly sophisticated control and/or instrumentation systems. Increasing the number of processors and the complexity of the problems to be solved makes programming multiprocessor systems more difficult and error-prone. This paper describes the parts already implemented (mainly the scheduler) of a software development tool for heterogeneous multiprocessor systems that will automatically perform code generation, execution time estimation, scheduling, and insertion of communication primitives.
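The abstract above does not say which scheduling algorithm the tool uses. As a rough illustration of the problem it addresses, the following is a minimal sketch of one common greedy heuristic: each task goes to the processor on which it would finish earliest, given a per-processor execution time estimate. The function name `schedule`, the task names and all timings are hypothetical, and communication costs between processors are ignored.

```python
def schedule(tasks, n_procs):
    """Greedy earliest-finish-time assignment of independent tasks.

    tasks: list of (name, costs), where costs[p] is the estimated
           execution time of the task on processor p (hypothetical values).
    Returns (assignment, makespan); assignment maps each task name
    to (processor, start_time, finish_time).
    """
    free_at = [0.0] * n_procs  # time at which each processor becomes idle
    assignment = {}
    # Place the longest tasks first, ranked by their best-case time.
    for name, costs in sorted(tasks, key=lambda t: min(t[1]), reverse=True):
        # Pick the processor on which this task would finish earliest.
        proc = min(range(n_procs), key=lambda p: free_at[p] + costs[p])
        start = free_at[proc]
        free_at[proc] = start + costs[proc]
        assignment[name] = (proc, start, free_at[proc])
    return assignment, max(free_at)


# Hypothetical workload on two dissimilar processors.
tasks = [
    ("fft", [4.0, 1.0]),
    ("filter", [2.0, 3.0]),
    ("io", [1.0, 5.0]),
]
assignment, makespan = schedule(tasks, n_procs=2)
```

Because the processors are heterogeneous, the same task has a different cost on each one, so the choice of processor depends on both the current load and the per-processor estimate. A real tool would also have to respect task dependencies and account for the inserted communication primitives.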
An architecture for intelligent task interruption
In the design of real-time systems, the capability for task interruption is often considered essential. The problem of task interruption in knowledge-based domains is examined. It is proposed that task interruption can often be avoided by using appropriate functional architectures and knowledge engineering principles. For situations in which task interruption is indispensable, a preliminary architecture based on priority hierarchies is described.
Performance and energy footprint assessment of FPGAs and GPUs on HPC systems using Astrophysics application
New challenges in Astronomy and Astrophysics (AA) are driving the need for a
large number of exceptionally computationally intensive simulations. "Exascale"
(and beyond) computational facilities are mandatory to address the size of
theoretical problems and data coming from the new generation of observational
facilities in AA. Currently, the High Performance Computing (HPC) sector is
undergoing a profound phase of innovation, in which the primary challenge to
the achievement of the "Exascale" is power consumption. The goal of this
work is to give some insights about performance and energy footprint of
contemporary architectures for a real astrophysical application in an HPC
context. We use a state-of-the-art N-body application that we re-engineered and
optimized to exploit the heterogeneous underlying hardware fully. We
quantitatively evaluate the impact of computation on energy consumption when
running on four different platforms. Two of them represent the current HPC
systems (Intel-based and equipped with NVIDIA GPUs), one is a micro-cluster
based on ARM-MPSoC, and one is a "prototype towards Exascale" equipped with
ARM-MPSoCs tightly coupled with FPGAs. We investigate the behavior of the
different devices where the high-end GPUs excel in terms of time-to-solution
while MPSoC-FPGA systems outperform GPUs in power consumption. Our experience
reveals that considering FPGAs for computationally intensive applications seems
very promising, as their performance is improving to meet the requirements of
scientific applications. This work can be a reference for future platform
development for astrophysics applications where computationally intensive
calculations are required. Comment: 15 pages, 4 figures, 3 tables; Preprint (V2) submitted to MDPI
(Special Issue: Energy-Efficient Computing on Parallel Architectures)
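The trade-off described in this abstract, where one platform wins on time-to-solution and another on power draw, can be made concrete with a small calculation. The sketch below assumes energy-to-solution is average power multiplied by time-to-solution. The `Run` class, the platform labels and every number are hypothetical placeholders, not measurements from the paper.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One measured run; all values used below are hypothetical."""
    platform: str
    time_s: float       # time-to-solution in seconds
    avg_power_w: float  # average power draw in watts

    @property
    def energy_j(self) -> float:
        # Energy-to-solution = average power x time-to-solution.
        return self.avg_power_w * self.time_s

    @property
    def edp(self) -> float:
        # Energy-delay product: penalises slow runs even when they are frugal.
        return self.energy_j * self.time_s


runs = [
    Run("gpu-node (hypothetical)", time_s=10.0, avg_power_w=300.0),
    Run("mpsoc-fpga (hypothetical)", time_s=100.0, avg_power_w=20.0),
]

fastest = min(runs, key=lambda r: r.time_s)
lowest_power = min(runs, key=lambda r: r.avg_power_w)
lowest_energy = min(runs, key=lambda r: r.energy_j)
best_edp = min(runs, key=lambda r: r.edp)
```

With these placeholder figures the faster platform draws more power and also uses more energy overall, yet it has the better energy-delay product. Which platform counts as the winner therefore depends on the metric chosen.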
Evaluation of hardware architectures for parallel execution of complex database operations
New database applications, primarily in the areas of engineering and knowledge-based systems, refer to complex objects (e.g. representation of a CAD workpiece or a VLSI chip) while performing their tasks. Retrieval, maintenance, and integrity checking of such complex objects consume substantial computing resources which were traditionally used by conventional database management systems in a sequential manner. Rigid performance goals dictated by interactive use and design environments imply new approaches to master the functionality of complex objects under satisfactory time restrictions. Because of the object granularity, the set orientation of the database interface, and the complicated algorithms for object handling, the exploitation of parallelism within such operations seems to be promising. Our main goal is the investigation and evaluation of different hardware architectures and their suitability to efficiently cope with workloads generated by database operations on complex objects. Apparently, employing just a number of processors is not a panacea for our database problem. The sheer horsepower of machines does not help very much when data synchronization and event serialization requirements play a major role during object handling. What are the critical hardware architecture properties? How can the existing MIPS be best utilized for the data management functions when processing complex objects? To answer these questions and related issues, we discuss different kinds of architectures combining multiple processors: loosely-, tightly-, and closely-coupled. Furthermore, we consider parallelism at different levels of abstraction: the distribution of (sub-)queries or the decomposition of such queries and their concurrent evaluation at an inter- or intra-object level. Finally, we give some thoughts as to the problems of load control and transaction management.
Design and Performance Evaluation of Energy-Aware DVS-Based Scheduling Strategies for Hard Real-Time Embedded Multiprocessor Systems
Ph.D. DOCTOR OF PHILOSOPHY