4 research outputs found

    Exploring heterogeneous scheduling for edge computing with CPU and FPGA MPSoCs

    This paper presents a framework targeted at low-cost, low-power heterogeneous multiprocessors that combine FPGAs and multicore CPUs, with the overarching goal of providing developers with a productive programming model and runtime support to fully use all the processing resources available. FPGA productivity is achieved using a high-level programming model based on OpenCL, the standard for cross-platform parallel heterogeneous programming. In this work, we focus on the parallel for pattern, and as part of the runtime support for this pattern, we leverage a new scheduler that strives to maximize the number of iterations per joule by dynamically and adaptively partitioning the iteration space between the multicore and the accelerator when they work simultaneously. A total of 7 benchmarks are ported and optimized for a low-cost DE1 board. The results show that the heterogeneous solution can improve performance by up to 2.9x and energy efficiency by up to 2.7x compared to the traditional approach of keeping all the CPU cores idle while the accelerator computes the workload. Our results also demonstrate two interesting insights: first, an adaptive scheduler able to find at runtime the right chunk size for each type of application and device configuration is an essential component for these kinds of heterogeneous platforms; and second, device configurations that provide higher throughput do not always achieve better energy efficiency when only the running power (excluding the idle power component) is considered.
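The idea of partitioning a parallel-for iteration space between an accelerator and a CPU core can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the scheduler from the paper: the function name `partition_iterations`, the fixed per-device rates, and the rule of scaling the CPU chunk by the throughput ratio are all hypothetical, and the sketch balances time rather than iterations per joule.

```python
def partition_iterations(total, accel_chunk, accel_rate, cpu_rate):
    """Simulate splitting `total` loop iterations between two devices.

    Assumptions (hypothetical, for illustration only):
    - each device processes iterations at a fixed, known rate (iterations/s);
    - the accelerator always takes `accel_chunk` iterations at a time;
    - the CPU chunk is scaled by the rate ratio so that both devices
      need roughly the same time per chunk.
    Returns (iterations done per device, simulated finish time).
    """
    cpu_chunk = max(1, round(accel_chunk * cpu_rate / accel_rate))
    rates = {"accel": accel_rate, "cpu": cpu_rate}
    chunks = {"accel": accel_chunk, "cpu": cpu_chunk}
    next_free = {"accel": 0.0, "cpu": 0.0}  # time at which each device is idle
    done = {"accel": 0, "cpu": 0}           # iterations completed per device

    remaining = total
    while remaining > 0:
        # The device that becomes idle first grabs the next chunk.
        dev = min(next_free, key=next_free.get)
        size = min(chunks[dev], remaining)
        remaining -= size
        done[dev] += size
        next_free[dev] += size / rates[dev]

    return done, max(next_free.values())


if __name__ == "__main__":
    done, makespan = partition_iterations(1000, 100, 300.0, 100.0)
    print(done, makespan)
```

A scheduler of the kind the abstract describes would additionally measure throughput and power at runtime and adjust the chunk sizes as it goes; here the rates are fixed inputs to keep the sketch short.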

    Dataflow programming of continuous data stream applications for heterogeneous systems (Programação dataflow de aplicações de fluxo de dados contínuo para sistemas heterogêneos)

    Stream processing applications have demanding performance requirements that are hard to meet using traditional parallel models on modern many-core architectures, such as GPUs. On the other hand, recent dataflow computing models can naturally exploit parallelism for a wide class of applications. This work presents an extension to an existing dataflow library for Java. The library extension implements high-level constructs with multiple command queues to enable the overlapping of memory operations and kernel executions on GPUs. Experimental results show that significant speedup can be achieved for a subset of well-known stream processing applications: Volume Ray-Casting, Path-Tracing and Sobel Filter.