10 research outputs found

    Quality of Service Driven Runtime Resource Allocation in Reconfigurable HPC Architectures

    Heterogeneous System Architectures (HSA) are gaining importance in the High Performance Computing (HPC) domain due to increasing computational requirements coupled with energy consumption concerns, which conventional CPU architectures fail to effectively address. Systems based on Field Programmable Gate Arrays (FPGAs) recently emerged as an effective alternative to Graphics Processing Units (GPUs) for demanding HPC applications, although they lack the abstractions available in conventional CPU-based systems. This work tackles the problem of runtime resource management of a system using FPGA-based co-processors to accelerate multi-programmed HPC workloads. We propose a novel resource manager able to dynamically vary the number of FPGAs allocated to each of the jobs running in a multi-accelerator system, with the goal of meeting a given Quality of Service metric for the running jobs, measured in terms of deadline or throughput. We implement the proposed resource manager in a commercial HPC system, evaluating its behavior with representative workloads.
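
As a rough illustration of deadline-driven FPGA allocation, the sketch below computes how many FPGAs each job would need to finish on time and hands them out in earliest-deadline-first order. It is not the resource manager described in the abstract: the `Job` fields, the linear-scaling assumption, and the earliest-deadline-first ordering are all hypothetical choices made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    remaining_work: float  # work units still to process (hypothetical unit)
    deadline: float        # seconds left until the deadline, must be > 0
    rate_per_fpga: float   # work units per second on one FPGA


def fpgas_needed(job: Job) -> int:
    """Minimum FPGA count to meet the deadline, assuming linear scaling."""
    return max(1, math.ceil(job.remaining_work / (job.rate_per_fpga * job.deadline)))


def allocate(jobs: list[Job], total_fpgas: int) -> dict[str, int]:
    """Serve jobs in earliest-deadline-first order until FPGAs run out."""
    free = total_fpgas
    allocation = {}
    for job in sorted(jobs, key=lambda j: j.deadline):
        granted = min(fpgas_needed(job), free)  # may be below the need, or 0
        allocation[job.name] = granted
        free -= granted
    return allocation


jobs = [
    Job("A", remaining_work=100, deadline=10, rate_per_fpga=5),
    Job("B", remaining_work=90, deadline=5, rate_per_fpga=6),
]
result = allocate(jobs, total_fpgas=4)
```

A runtime manager would re-run such a computation periodically as `remaining_work` and `deadline` change, which is what makes the allocation dynamic.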

    Scala-based domain-specific language for creating accelerator-based SoCs

    Nowadays, thanks to technology miniaturization and industrial standards, it is possible to create System-on-Chip (SoC) architectures featuring a combination of many components, such as processor cores and specialized hardware accelerators. However, designing an SoC to accelerate an embedded application is particularly complex. After decomposing the application into tasks and assigning each of them to a processing element, the designer must create the required hardware components and integrate them into the final system. Currently, this process is not well supported by commercial tool flows and has to be performed manually, which is time-consuming and error-prone. This paper proposes a Domain-Specific Language (DSL) based on Scala to specify the architecture of accelerator-based SoCs. We leverage this DSL to coordinate commercial High-Level Synthesis (HLS) tools in order to create the corresponding accelerators with proper standard interfaces for system-level integration.

    A system-level simulation framework for evaluating resource management policies for heterogeneous system architectures

    Nowadays, heterogeneous system architectures, integrating CPUs and one or more kinds of accelerators (e.g., GPUs or HW accelerators), are a promising solution to achieve high performance for data-intensive workloads while fulfilling other system-level requirements on the available power/energy budgets. However, heterogeneity comes at the cost of greater design and management complexity, leading to an increasing quest for innovative runtime resource management policies. We propose a system-level simulation framework, implemented in SystemC and Transaction Level Modeling, for the fast evaluation of resource management policies for such systems, providing quick feedback to the middleware designer. A set of case studies shows the efficiency of the proposed framework in supporting a fast analysis of the investigated policies.

    An orchestrated approach to efficiently manage resources in heterogeneous system architectures

    Nowadays, we are witnessing trends in technology, fabrication processes and computing architectures that lead to the design and development of processing systems made up of a large number of independent, heterogeneous execution resources. The aim is to achieve high performance while also addressing other aspects, such as energy consumption. However, heterogeneity comes at the cost of greater design and management complexity. To reach an optimal solution, system architects need to take into account the efficiency of the system's units, i.e., general-purpose processors possibly combined with one or more kinds of accelerators (e.g., GPUs or FPGAs), as well as the workload. This often leads to inefficiency in the exploitation of such resources, and therefore in performance/energy. Within this context, we propose a runtime resource manager able to observe the system execution and to dynamically optimise its behaviour with respect to one or more identified functional parameters, according to the architectural characteristics and to the users' and the applications' needs. This adaptation capability is embedded in the device as a software layer, called Orchestrator, able to adapt the runtime resource management according to the target objectives and to the inputs from the external environment.

    Floorplanning Automation for Partial-Reconfigurable FPGAs via Feasible Placements Generation

    When dealing with partially reconfigurable designs on field-programmable gate arrays, floorplanning represents a critical step that highly impacts the system's performance and reconfiguration overhead. However, current vendor design tools still require the floorplan to be manually defined by the designer. In this paper, we provide a novel floorplanning automation framework, integrated in the Xilinx tool chain, which is based on an explicit enumeration of the possible placements of each region. Moreover, we propose a genetic algorithm (GA), enhanced with a local search strategy, to automate the floorplanning activity on the defined direct problem representation. The proposed approach has been experimentally evaluated with a synthetic benchmark suite and real case studies. We compared the designed solution against both state-of-the-art algorithms and alternative engines based on the same direct problem representation. Experimental results demonstrated the effectiveness of the proposed direct problem representation and the superiority of the defined GA engine over the other approaches in terms of exploration time and quality of the identified solution.
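
To make "explicit enumeration of the possible placements of each region" more concrete, here is a minimal sketch under a simplified device model. It is not the paper's framework: the column-based grid, the one-resource-per-cell assumption, and the function name `feasible_placements` are hypothetical, and the genetic algorithm that would then choose among placements is not shown.

```python
from collections import Counter


def feasible_placements(columns, rows, required):
    """List rectangles (x, y, width, height) that cover the required resources.

    columns:  resource type of each device column, e.g. ["CLB", "BRAM"]
    rows:     number of device rows; each cell provides one unit of its
              column's resource type (simplifying assumption)
    required: mapping from resource type to the number of units needed
    For every anchor (x, y) and height, only the narrowest rectangle that
    satisfies the requirement is kept.
    """
    placements = []
    for y in range(rows):
        for height in range(1, rows - y + 1):
            for x in range(len(columns)):
                have = Counter()
                for width in range(1, len(columns) - x + 1):
                    # Extending the rectangle by one column adds `height` cells.
                    have[columns[x + width - 1]] += height
                    if all(have[r] >= n for r, n in required.items()):
                        placements.append((x, y, width, height))
                        break
    return placements


found = feasible_placements(["CLB", "BRAM", "CLB"], rows=2,
                            required={"CLB": 2, "BRAM": 1})
```

On this toy device the call yields four candidate rectangles. A search engine such as a GA could then pick one placement per region while rejecting overlapping combinations.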

    On How to improve FPGA-based systems design productivity via SDAccel

    Custom hardware accelerators are widely used to improve the performance of software applications in terms of execution time and to reduce energy consumption. However, the realization of a hardware accelerator and its integration in the final system is a difficult and error-prone task. For this reason, both industry and academia are continuously developing Computer Aided Design (CAD) tools to assist the designer in the development process. Even though many of the steps have now been automated, system integration, the definition of SW/HW interfaces, and driver generation are still almost completely manual tasks. The latest tool released by Xilinx, however, aims at improving the hardware design experience by leveraging the OpenCL standard to enhance overall productivity and to enable code portability. This paper provides an overview of SDAccel's potential, comparing its design flow with other methodologies using two case studies from the bioinformatics field: brain network and protein folding analysis.

    Workload-aware power optimization strategy for asymmetric multiprocessors

    Asymmetric multi-core architectures, such as the ARM big.LITTLE, are emerging as successful solutions for the embedded and mobile markets due to their ability to trade off performance and power consumption. However, neither the Heterogeneous Multi-Processing (HMP) scheduler integrated in commercial products nor previous research approaches are able to fully exploit this potential. We propose a new runtime resource management policy for the big.LITTLE architecture, integrated in Linux, aimed at optimizing power consumption while fulfilling performance requirements specified for the running applications. Experimental results show an improvement of 11% in performance and, at the same time, of 8% in peak power consumption w.r.t. the current Linux HMP solution.
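
The general idea of minimizing power subject to a performance requirement can be sketched as a selection over operating points. This is only an illustration, not the policy proposed in the paper: the operating-point table, its numbers, and the function `pick_operating_point` are invented for the example.

```python
def pick_operating_point(points, required_perf):
    """Return the lowest-power point that meets the performance requirement.

    If no point is fast enough, fall back to the fastest one available.
    """
    fast_enough = [p for p in points if p["perf"] >= required_perf]
    if fast_enough:
        return min(fast_enough, key=lambda p: p["power"])
    return max(points, key=lambda p: p["perf"])


# Hypothetical operating points; perf and power values are illustrative only.
points = [
    {"cluster": "LITTLE", "freq_mhz": 600, "perf": 1.0, "power": 0.2},
    {"cluster": "LITTLE", "freq_mhz": 1200, "perf": 1.8, "power": 0.5},
    {"cluster": "big", "freq_mhz": 1000, "perf": 2.5, "power": 1.4},
    {"cluster": "big", "freq_mhz": 2000, "perf": 4.0, "power": 3.5},
]

choice = pick_operating_point(points, required_perf=1.5)
```

With these made-up values, a requirement of 1.5 is served by the LITTLE cluster at its higher frequency, since that is the cheapest point that is still fast enough.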

    Runtime Resource Management in Heterogeneous System Architectures: The SAVE Approach

    The increasing availability of different kinds of processing resources in heterogeneous system architectures, combined with today's fast-changing, unpredictable workloads, has driven interest in systems able to dynamically and autonomously adapt how computing resources are exploited to optimize a given goal. Self-adaptiveness and hardware-assisted virtualization are the two key enabling technologies for this kind of architecture, allowing the efficient exploitation of the available resources based on the current working context. The SAVE project will develop HW/SW/OS components that allow for deciding at runtime the mapping of the computation kernels on the appropriate type of resource, based on the current system context and requirements.

    Using just-in-time code generation for transparent resource management in heterogeneous systems

    Hardware accelerators are becoming popular in academia and industry. To move one step beyond state-of-the-art multicore-plus-accelerator approaches, we present in this paper our innovative SAVEHSA architecture. It comprises a heterogeneous hardware platform with three different high-end accelerators attached over PCIe (GPGPU, FPGA and Intel MIC). Such systems can process parallel workloads very efficiently whilst being more energy efficient than regular CPU systems. To leverage the heterogeneity, the workload has to be distributed among the computing units so that each unit is well suited to its assigned task, and executable code must be available for it. To tackle this problem we present two software components: the first can perform resource allocation at runtime while respecting system and application goals (in terms of throughput, energy, latency, etc.), and the second is able to analyze an application and generate executable code for an accelerator at runtime. We demonstrate the first proof-of-concept implementation of our framework on the heterogeneous platform, discuss different runtime policies and measure the introduced overheads.