6,195 research outputs found

    A Review on Software Architectures for Heterogeneous Platforms

    Demand for computing performance keeps increasing despite the requirements for smaller and more energy-efficient devices. For years, the strategy adopted by industry was to make a single processor more powerful by raising its clock frequency and adding more transistors so that more calculations could be executed. However, the physical limits of such processors are being reached, and one way to meet the growing demand has been heterogeneous computing, i.e., using a platform that contains more than one type of processor. This way, different types of tasks can be executed by processors that are specialized in them. Heterogeneous computing, however, poses a number of challenges to software engineering, especially in the architecture and deployment phases. In this paper, we conduct an empirical study that aims to discover the state of the art in software architecture for heterogeneous computing, with a focus on deployment. We conduct a systematic mapping study that retrieved 28 studies, which were critically assessed to obtain an overview of the research field. We identified gaps and trends that both researchers and practitioners can use as guides to investigate the topic further.

    A Domain Specific Approach to High Performance Heterogeneous Computing

    Users of heterogeneous computing systems face two problems: first, understanding the trade-offs between the observable characteristics of their applications, such as latency and quality of the result, and second, exploiting knowledge of these characteristics to allocate work to distributed computing platforms efficiently. A domain specific approach addresses both of these problems. By considering a subset of operations or functions, models of the observable characteristics, or domain metrics, may be formulated in advance and populated at run-time for task instances. These metric models can then be used to express the allocation of work as a constrained integer program, which can be solved using heuristics, machine learning or Mixed Integer Linear Programming (MILP) frameworks. These claims are illustrated using the example domain of derivatives pricing in computational finance, with the domain metrics of workload latency (makespan) and pricing accuracy. For a large, varied workload of 128 Black-Scholes and Heston model-based option pricing tasks, running on a diverse array of 16 multicore CPU, GPU and FPGA platforms, predictions made by models of both the makespan and accuracy are generally within 10% of the run-time performance. When these models are used as inputs to machine learning and MILP-based workload allocation approaches, latency improvements of up to 24 and 270 times, respectively, over the heuristic approach are seen. Comment: 14 pages, preprint draft, minor revisio
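
    The allocation step described in this abstract can be illustrated with a minimal sketch. It assumes a table of predicted latencies per task and platform and uses a simple greedy heuristic. It is not the paper's MILP or machine learning approach, and the function name `greedy_allocate` and the example numbers are hypothetical.

```python
from typing import List, Tuple


def greedy_allocate(latency: List[List[float]]) -> Tuple[List[int], float]:
    """Assign each task to one platform, trying to keep the makespan small.

    latency[t][p] is the predicted latency of task t on platform p
    (hypothetical model output; the values below are made up).
    Returns (assignment, makespan), where assignment[t] is a platform index.
    """
    num_platforms = len(latency[0])
    busy_until = [0.0] * num_platforms      # accumulated work per platform
    assignment = [-1] * len(latency)

    # Heuristic choice: place the tasks with the largest best-case latency first.
    order = sorted(range(len(latency)), key=lambda t: min(latency[t]), reverse=True)

    for t in order:
        # Pick the platform on which this task would finish earliest.
        best = min(range(num_platforms), key=lambda p: busy_until[p] + latency[t][p])
        assignment[t] = best
        busy_until[best] += latency[t][best]

    return assignment, max(busy_until)


if __name__ == "__main__":
    predicted = [[4.0, 2.0], [3.0, 6.0], [2.0, 2.0]]  # 3 tasks, 2 platforms
    print(greedy_allocate(predicted))
```

    A greedy pass like this gives no optimality guarantee. An exact formulation would instead introduce one binary variable per task and platform pair and minimise the makespan subject to each task being assigned exactly once.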

    3E: Energy-Efficient Elastic Scheduling for Independent Tasks in Heterogeneous Computing Systems

    Reducing energy consumption is a major design constraint for modern heterogeneous computing systems, in order to minimize electricity cost, improve system reliability and protect the environment. Conventional energy-efficient scheduling strategies developed for these systems do not sufficiently exploit system elasticity and adaptability for maximum energy savings, and do not simultaneously take account of the user's expected finish time. In this paper, we develop a novel scheduling strategy named energy-efficient elastic (3E) scheduling for aperiodic, independent and non-real-time tasks with user expected finish times on DVFS-enabled heterogeneous computing systems. The 3E strategy adjusts processors' supply voltages and frequencies according to the system workload, and makes trade-offs between energy consumption and user expected finish times. Compared with other energy-efficient strategies, 3E significantly improves the scheduling quality and effectively enhances the system elasticity.
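
    The trade-off between energy and expected finish time on a DVFS-enabled processor can be illustrated with a minimal sketch. It is not the 3E algorithm. It assumes a task of known cycle count, a discrete set of frequency levels, and the common textbook approximation that dynamic power grows with the cube of frequency, so energy for a fixed amount of work grows with the square. All names and numbers are hypothetical.

```python
from typing import Sequence


def pick_frequency(cycles: float, expected_finish_s: float,
                   levels_hz: Sequence[float]) -> float:
    """Return the lowest frequency that still meets the expected finish time.

    Falls back to the highest available level when no level is fast enough.
    """
    feasible = [f for f in sorted(levels_hz) if cycles / f <= expected_finish_s]
    return feasible[0] if feasible else max(levels_hz)


def relative_energy(freq_hz: float, f_max_hz: float) -> float:
    """Dynamic energy relative to running the same work at f_max_hz.

    Assumption: power ~ f^3 and time ~ 1/f, hence energy ~ f^2.
    """
    return (freq_hz / f_max_hz) ** 2


if __name__ == "__main__":
    levels = [1.0e9, 1.5e9, 2.0e9]          # hypothetical DVFS levels in Hz
    f = pick_frequency(2.0e9, 2.0, levels)  # 2e9 cycles, finish within 2 s
    print(f, relative_energy(f, max(levels)))
```

    Under these assumptions, relaxing the expected finish time lets the processor drop to a lower level and save energy, which is the kind of elasticity the abstract refers to.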

    DNET: A communications facility for distributed heterogeneous computing

    This document describes DNET, a heterogeneous data communications networking facility. DNET allows programs operating on hosts on dissimilar networks to communicate with one another without concern for differences in computer hardware, network protocol, or operating system. The overall DNET network is defined as the collection of host machines and networks on which the DNET software is operating. Each underlying network is considered a DNET 'domain'. Data communications service is provided between any two processes on any two hosts on any of the networks (domains) that may be reached via DNET. DNET provides protocol-transparent, reliable, streaming data transmission between hosts (restricted initially to DECnet and TCP/IP networks). DNET also provides variable-length datagram service with optional return receipts.

    LEGaTO: first steps towards energy-efficient toolset for heterogeneous computing

    LEGaTO is a three-year EU H2020 project which started in December 2017. The LEGaTO project will leverage task-based programming models to provide a software ecosystem for Made-in-Europe heterogeneous hardware composed of CPUs, GPUs, FPGAs and dataflow engines. The aim is to attain one order of magnitude energy savings from the edge to the converged cloud/HPC. Peer Reviewed. Postprint (author's final draft)

    Aggregating Heterogeneous Computing Resources

    The project's core objective is to develop a web portal that acts as a comprehensive platform for optimizing resource allocation and improving system administration tasks. Key goals include streamlining resource management processes, enhancing resource utilization, monitoring hardware performance, and boosting system reliability. The envisioned web portal is designed to merge data from multiple sources into a unified interface, granting administrators a real-time overview of the diverse computing nodes within the openlab network. Automation will play a significant role, enabling the portal to gather data from internal lists and deployment tools, potentially utilizing scripts and databases for comprehensive machine information. This consolidated data could be used to generate reports and notifications about user expirations, thereby aiding administrators in effective resource management. Moreover, the web portal will offer manual functions, allowing administrators to add or remove non-managed machines, include extra users, and contact the users of specific machines directly via broadcast emails. This flexibility contributes to the network's adaptability to changing project needs and resource demands. The significance of this project stems from its potential to enhance the efficiency and effectiveness of system administrators. By centralizing resource management through the portal, administrators can align resources with project requirements, minimize wastage, optimize workload distribution, and ultimately improve system stability. These enhancements, in turn, foster a more resilient and dependable computing environment, which is pivotal for the successful execution of projects within the openlab network.

    Dwarfs on Accelerators: Enhancing OpenCL Benchmarking for Heterogeneous Computing Architectures

    For reasons of both performance and energy efficiency, high-performance computing (HPC) hardware is becoming increasingly heterogeneous. The OpenCL framework supports portable programming across a wide range of computing devices and is gaining influence in programming next-generation accelerators. Characterizing the performance of these devices across a range of applications requires a diverse, portable and configurable benchmark suite, and OpenCL is an attractive programming model for this purpose. We present an extended and enhanced version of the OpenDwarfs OpenCL benchmark suite, with a strong focus on the robustness of applications and the curation of additional benchmarks, and with an increased emphasis on correctness of results and choice of problem size. Preliminary results and analysis are reported for eight benchmark codes on a diverse set of architectures -- three Intel CPUs, five Nvidia GPUs, six AMD GPUs and a Xeon Phi. Comment: 10 pages, 5 figures

    Efficient Machine Learning on Heterogeneous Computing Systems through a Coordinated Runtime System

    Department of Computer Science and Engineering. As machine learning grows, heterogeneous computing systems are actively used to increase its efficiency. Although prior studies have sought to improve the efficiency of machine learning, runtime support for heterogeneous computing systems remains an unexplored field. Our paper presents CEML, a runtime system that enhances the efficiency of machine learning on heterogeneous computing systems. CEML characterizes the machine-learning application in terms of performance and power consumption at runtime, and builds accurate estimation models of the application's performance and power consumption. CEML then dynamically adapts the heterogeneous computing system to the efficient system state that these models estimate will enhance efficiency while satisfying constraints. We demonstrate the effectiveness of CEML by evaluating the accuracy of the estimators, the energy efficiency, the re-adaptation functionality, and the runtime overheads on two full heterogeneous computing systems.