Efficient and portable multi-tasking for heterogeneous systems
Modern computing systems are heterogeneous designs that combine multiple
and diverse architectures in a single system. These designs offer the potential for
high performance at reduced power, but require advanced resource
management and workload scheduling across the available processors.
Programmability frameworks, such as OpenCL and CUDA, enable resource management
and workload scheduling on heterogeneous systems. These frameworks assign
full control of resource allocation and scheduling to the application. This design
serves dedicated application systems well but introduces significant
challenges for multi-tasking environments, where multiple users and applications
compete for access to system resources.
This thesis considers these challenges and presents three major contributions that
enable efficient multi-tasking on heterogeneous systems. The presented contributions
are compatible with existing systems, remain portable across vendors and do not require
application changes or recompilation.
The first contribution of this thesis is an optimization technique that reduces host-device
communication overhead for OpenCL applications. It does this without modification
or recompilation of the application source code and is portable across platforms.
This work enables efficiency and performance improvements for diverse application
workloads found on multi-tasking systems.
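The cost the first contribution targets can be illustrated with a simple analytic model (an assumption for illustration, not the thesis's actual technique or measurements): each host-device transfer pays a fixed launch latency plus a bandwidth-proportional cost, so many small transfers accumulate far more overhead than one coalesced transfer of the same data.

```python
# Illustrative cost model: per-transfer latency plus bandwidth cost.
# The constants below are assumed, PCIe-class ballpark figures.

LATENCY_S = 10e-6         # assumed per-transfer launch latency: 10 us
BANDWIDTH_B_PER_S = 12e9  # assumed host-device bandwidth: 12 GB/s

def transfer_time(num_transfers: int, total_bytes: int) -> float:
    """Total time to move total_bytes split evenly across num_transfers."""
    return num_transfers * LATENCY_S + total_bytes / BANDWIDTH_B_PER_S

total = 64 * 1024 * 1024  # 64 MiB of kernel argument data

many_small = transfer_time(1024, total)  # 1024 separate 64 KiB copies
one_large = transfer_time(1, total)      # one coalesced copy

print(f"1024 small transfers: {many_small * 1e3:.2f} ms")
print(f"1 coalesced transfer: {one_large * 1e3:.2f} ms")
```

Under this model the fixed latency term dominates for fine-grained communication patterns, which is why reducing the number of host-device round trips pays off for diverse workloads.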
The second contribution is the design and implementation of a secure, user-space
virtualization layer that integrates the accelerator resources of a system with the standard
multi-tasking and user-space virtualization facilities of the commodity Linux OS.
It enables fine-grained sharing of mixed-vendor accelerator resources, targets heterogeneous
systems found in data center nodes, and requires no modification to the OS,
OpenCL or applications.
Lastly, the third contribution is a technique and software infrastructure that enable
resource sharing control on accelerators while supporting software-managed scheduling.
The infrastructure remains transparent to existing systems and
applications and requires no modifications or recompilation. It enforces the fair accelerator
sharing required for multi-tasking.
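One way to picture software-managed fair sharing of a non-preemptive accelerator (a minimal sketch under assumed semantics, not the thesis's infrastructure) is a scheduler that lets each kernel run to completion but always picks next the application furthest behind in accumulated device time:

```python
# Sketch: fair-share scheduling of run-to-completion kernels.
import heapq

def fair_schedule(apps):
    """apps: dict name -> list of kernel durations. Returns execution order."""
    # Priority queue keyed by accumulated device time per application.
    ready = [(0.0, name, 0) for name in sorted(apps)]
    heapq.heapify(ready)
    order = []
    while ready:
        used, name, idx = heapq.heappop(ready)
        order.append(name)
        used += apps[name][idx]
        if idx + 1 < len(apps[name]):
            heapq.heappush(ready, (used, name, idx + 1))
    return order

# Application A submits long kernels, B short ones; fair sharing interleaves
# B's kernels instead of letting A monopolize the accelerator.
order = fair_schedule({"A": [4.0, 4.0, 4.0], "B": [1.0, 1.0, 1.0]})
print(order)
```

Because kernels cannot be preempted, fairness here is enforced only at kernel boundaries, which is exactly why submission-time control matters on such devices.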
MURAC: A unified machine model for heterogeneous computers
Heterogeneous computing enables the performance and energy advantages of multiple distinct processing architectures to be efficiently exploited within a single machine. These systems can deliver large performance increases by matching applications to the architectures best suited to them. The Multiple Runtime-reconfigurable Architecture Computer (MURAC) model has been proposed to tackle the problems commonly found in the design and usage of these machines. This model presents a system-level approach that creates a clear separation of concerns between the system implementer and the application developer. The three key concepts that make up the MURAC model are a unified machine model, a unified instruction stream and a unified memory space. A simple programming model built upon these abstractions provides the user application with a consistent interface for interacting with the underlying machine. This programming model simplifies application partitioning between hardware and software and allows the easy integration of different execution models within the single control flow of a mixed-architecture application. The theoretical and practical trade-offs of the proposed model have been explored through the design of several systems. An instruction-accurate system simulator has been developed that supports the simulated execution of mixed-architecture applications. An embedded System-on-Chip implementation has been used to measure the overhead in hardware resources required to support the model, which was found to be minimal. An implementation of the model within an operating system on a tightly-coupled reconfigurable processor platform has been created. This implementation is used to extend the software scheduler to allow for the full support of mixed-architecture applications in a multitasking environment. Different scheduling strategies have been tested using this scheduler for mixed-architecture applications.
The design and implementation of these systems has shown that a unified abstraction model for heterogeneous computers provides important usability benefits to system and application designers. These benefits are achieved through a consistent view of the multiple different architectures to the operating system and user applications. This allows them to focus on achieving their performance and efficiency goals by gaining the benefits of different execution models during runtime without the complex implementation details of the system-level synchronisation and coordination.
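The unified-interface idea above can be sketched as a dispatch layer (a hypothetical illustration; MURAC's actual programming model and names are not shown here) in which one control flow calls functions that are transparently routed to whichever architecture implements them:

```python
# Hypothetical sketch of a unified machine model's dispatch layer.

class UnifiedMachine:
    """Routes calls to per-architecture implementations transparently."""

    def __init__(self):
        self._impls = {}  # function name -> (architecture, callable)

    def register(self, name, architecture, fn):
        self._impls[name] = (architecture, fn)

    def call(self, name, *args):
        architecture, fn = self._impls[name]
        # A real system would marshal arguments into the unified memory
        # space and switch execution to the target architecture here.
        return architecture, fn(*args)

machine = UnifiedMachine()
machine.register("fir_filter", "fpga", lambda xs: [x * 2 for x in xs])
machine.register("parse_input", "cpu", lambda s: s.split(","))

print(machine.call("parse_input", "1,2,3"))   # takes the "cpu" path
print(machine.call("fir_filter", [1, 2, 3]))  # takes the "fpga" path
```

The point of such an abstraction is that the caller's control flow never branches on architecture, which is what lets a scheduler move work between execution models at runtime.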
FairGV: Fair and Fast GPU Virtualization
Increasingly, high-performance computing (HPC) application developers are opting to use cloud resources due to higher availability. Virtualized GPUs would be an obvious and attractive option for HPC application developers using cloud hosting services. Unfortunately, existing GPU virtualization software is not ready to address fairness, utilization, and performance limitations associated with consolidating mixed HPC workloads. This paper presents FairGV, a radically redesigned GPU virtualization system that achieves system-wide weighted fair sharing and strong performance isolation in mixed workloads that use GPUs with variable degrees of intensity. To achieve its objectives, FairGV introduces a trap-less GPU processing architecture, a new fair queuing method integrated with work-conserving and GPU-centric co-scheduling policies, and a collaborative scheduling method for non-preemptive GPUs. Our prototype implementation achieves near ideal fairness (≥ 0.97 Min-Max Ratio) with little performance degradation (≤ 1.02 aggregated overhead) in a range of mixed HPC workloads that leverage GPUs.
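The Min-Max Ratio cited in the abstract can be computed as follows (a hedged sketch of one common reading of the metric, not necessarily FairGV's exact formulation): normalize each tenant's progress by its weight and take the ratio of the minimum to the maximum, so 1.0 means perfectly fair sharing.

```python
# Min-Max Ratio fairness metric: min/max of weight-normalized progress.

def min_max_ratio(progress, weights):
    """progress[i]: work completed by tenant i; weights[i]: its fair share."""
    normalized = [p / w for p, w in zip(progress, weights)]
    return min(normalized) / max(normalized)

# Equal-weight tenants making near-equal progress score close to 1.0.
print(min_max_ratio([97.0, 100.0, 99.0], [1.0, 1.0, 1.0]))
```

A starved tenant drags the minimum down, so the metric directly penalizes the monopolization that fair queuing is meant to prevent.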