633 research outputs found

    Empowering parallel computing with field programmable gate arrays

    Get PDF
    After more than 30 years, reconfigurable computing has grown from a concept to a mature field of science and technology. The cornerstone of this evolution is the field programmable gate array, a building block enabling the configuration of a custom hardware architecture. The departure from static von Neumannlike architectures opens the way to eliminate the instruction overhead and to optimize the execution speed and power consumption. FPGAs now live in a growing ecosystem of development tools, enabling software programmers to map algorithms directly onto hardware. Applications abound in many directions, including data centers, IoT, AI, image processing and space exploration. The increasing success of FPGAs is largely due to an improved toolchain with solid high-level synthesis support as well as a better integration with processor and memory systems. On the other hand, long compile times and complex design exploration remain areas for improvement. In this paper we address the evolution of FPGAs towards advanced multi-functional accelerators, discuss different programming models and their HLS language implementations, as well as high-performance tuning of FPGAs integrated into a heterogeneous platform. We pinpoint fallacies and pitfalls, and identify opportunities for language enhancements and architectural refinements

    Securing Critical Infrastructures

    Get PDF
    1noL'abstract è presente nell'allegato / the abstract is in the attachmentopen677. INGEGNERIA INFORMATInoopenCarelli, Albert

    LEGaTO: first steps towards energy-efficient toolset for heterogeneous computing

    Get PDF
    LEGaTO is a three-year EU H2020 project which started in December 2017. The LEGaTO project will leverage task-based programming models to provide a software ecosystem for Made-in-Europe heterogeneous hardware composed of CPUs, GPUs, FPGAs and dataflow engines. The aim is to attain one order of magnitude energy savings from the edge to the converged cloud/HPC.Peer ReviewedPostprint (author's final draft

    Enabling virtual radio functions on software defined radio for future wireless networks

    Get PDF
    Today's wired networks have become highly flexible, thanks to the fact that an increasing number of functionalities are realized by software rather than dedicated hardware. This trend is still in its early stages for wireless networks, but it has the potential to improve the network's flexibility and resource utilization regarding both the abundant computational resources and the scarce radio spectrum resources. In this work we provide an overview of the enabling technologies for network reconfiguration, such as Network Function Virtualization, Software Defined Networking, and Software Defined Radio. We review frequently used terminology such as softwarization, virtualization, and orchestration, and how these concepts apply to wireless networks. We introduce the concept of Virtual Radio Function, and illustrate how softwarized/virtualized radio functions can be placed and initialized at runtime, allowing radio access technologies and spectrum allocation schemes to be formed dynamically. Finally we focus on embedded Software-Defined Radio as an end device, and illustrate how to realize the placement, initialization and configuration of virtual radio functions on such kind of devices

    Design of OpenCL-compatible multithreaded hardware accelerators with dynamic support for embedded FPGAs

    Full text link
    ARTICo3 is an architecture that permits to dynamically set an arbitrary number of reconfigurable hardware accelerators, each containing a given number of threads fixed at design time according to High Level Synthesis constraints. However, the replication of these modules can be decided at runtime to accelerate kernels by increasing the overall number of threads, add modular redundancy to increase fault tolerance, or any combination of the previous. An execution scheduler is used at kernel invocation to deliver the appropriate data transfers, optimizing memory transactions, and sequencing or parallelizing execution according to the configuration specified by the resource manager of the architecture. The model of computation is compatible with the OpenCL kernel execution model, and memory transfers and architecture are arranged to match the same optimization criteria as for kernel execution in GPU architectures but, differently to other approaches, with dynamic hardware execution support. In this paper, a novel design methodology for multithreaded hardware accelerators is presented. The proposed framework provides OpenCL compatibility by implementing a memory model based on shared memory between host and compute device, which removes the overhead imposed by data transferences at global memory level, and local memories inside each accelerator, i.e. compute unit, which are connected to global memory through optimized DMA links. These local memories provide unified access, i.e. a continuous memory map, from the host side, but are divided in a configurable number of independent banks (to increase available ports) from the processing elements side to fully exploit data-level parallelism. Experimental results show OpenCL model compliance using multithreaded hardware accelerators and enhanced dynamic adaptation capabilities

    A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems

    Full text link
    Recent technological advances have greatly improved the performance and features of embedded systems. With the number of just mobile devices now reaching nearly equal to the population of earth, embedded systems have truly become ubiquitous. These trends, however, have also made the task of managing their power consumption extremely challenging. In recent years, several techniques have been proposed to address this issue. In this paper, we survey the techniques for managing power consumption of embedded systems. We discuss the need of power management and provide a classification of the techniques on several important parameters to highlight their similarities and differences. This paper is intended to help the researchers and application-developers in gaining insights into the working of power management techniques and designing even more efficient high-performance embedded systems of tomorrow

    Enhancing Microcomputer Edge Computing for Autonomous IoT Motion Control

    Get PDF
    Devices microprocessors, microcontrollers, and Field Programmable Gate Arrays (FPGA) play the core rule at the IoT edge level and it should be right provisioned. For proper controller performance, control algorithms should be implemented near the actuator eliminating the delay effects. In the IoT domain, this means to implement the mentioned algorithm at the edge level and prior data transmitting. The efficient IoT-enabled motion control can be obtained by considering two main factors; the first factor is from the actuator design point of view and the second factor is from the controller performance point of view. Therefore, in this article, the two mentioned factors are treated concerning the microprocessor rule and importance as a core for proper prototype design and as the main platform to implement the control algorithms. A comparison of controller performance indices for both prototypes is done using previously distributed motion control schemes and newly developed schemes after tuning the respective schemes gains in an optimal manner. The scheme with better behavior of both prototypes are selected for the IoT integration process, this scheme ensures optimal edge computing for the distributed motion control, making the implementation of all control computation take place at the IoT-edge level. As a result, the dynamic pipeline stages (DPS) based prototype gives better controller performance indices for most strategies, less power consumption, and optimally utilized resources encouraging the use of the microprocessors with reconfigurable components at the IoT-edge level
    corecore