627 research outputs found

    Coarse-grained reconfigurable array architectures

    Get PDF
    Coarse-Grained Reconfigurable Array (CGRA) architectures accelerate the same inner loops that benefit from the high ILP support in VLIW architectures. By executing non-loop code on other cores, however, CGRAs can focus on such loops to execute them more efficiently. This chapter discusses the basic principles of CGRAs, and the wide range of design options available to a CGRA designer, covering a large number of existing CGRA designs. The impact of different options on flexibility, performance, and power-efficiency is discussed, as well as the need for compiler support. The ADRES CGRA design template is studied in more detail as a use case to illustrate the need for design space exploration, for compiler support and for the manual fine-tuning of source code

    Homogeneous and heterogeneous MPSoC architectures with network-on-chip connectivity for low-power and real-time multimedia signal processing

    Get PDF
    Two multiprocessor system-on-chip (MPSoC) architectures are proposed and compared in the paper with reference to audio and video processing applications. One architecture exploits a homogeneous topology; it consists of 8 identical tiles, each made of a 32-bit RISC core enhanced by a 64-bit DSP coprocessor with local memory. The other MPSoC architecture exploits a heterogeneous-tile topology with on-chip distributed memory resources; the tiles act as application specific processors supporting a different class of algorithms. In both architectures, the multiple tiles are interconnected by a network-on-chip (NoC) infrastructure, through network interfaces and routers, which allows parallel operations of the multiple tiles. The functional performances and the implementation complexity of the NoC-based MPSoC architectures are assessed by synthesis results in submicron CMOS technology. Among the large set of supported algorithms, two case studies are considered: the real-time implementation of an H.264/MPEG AVC video codec and of a low-distortion digital audio amplifier. The heterogeneous architecture ensures a higher power efficiency and a smaller area occupation and is more suited for low-power multimedia processing, such as in mobile devices. The homogeneous scheme allows for a higher flexibility and easier system scalability and is more suited for general-purpose DSP tasks in power-supplied devices

    Blocks:Challenging SIMDs and VLIWs With a Reconfigurable Architecture

    Get PDF
    Demand for coarse grain reconfigurable architectures (CGRAs) has significantly increased in recent years as architectures need to be both energy efficient and flexible. However, most CGRAs are optimized for performance instead of energy efficiency. In this work, a novel paradigm for reconfigurable architectures, Blocks, is presented. Blocks uses two separate circuit-switched networks, one for control and one for the data path. This enables the runtime construction of energy-efficient application-specific VLIW-SIMD processors on a reconfigurable fabric. Its energy efficiency is demonstrated by comparing Blocks to four reference architectures, a VLIW, an SIMD, a commercial low-power microprocessor, and a traditional CGRA. All comparisons are based on commercial low-power 40-nm CMOS layout, including memories. Results show that Blocks can achieve a mean total energy reduction of 2.05 × , 1.84 × , 8.01 × , and 1.22 × over a VLIW, an SIMD, an energy-efficient microprocessor and a traditional CGRA, respectively. At the same time, Blocks delivers equal or higher performance per area due to its ability to adapt to applications by reconfiguration.</p

    VLSI architecture design approaches for real-time video processing

    Get PDF
    This paper discusses the programmable and dedicated approaches for real-time video processing applications. Various VLSI architecture including the design examples of both approaches are reviewed. Finally, discussions of several practical designs in real-time video processing applications are then considered in VLSI architectures to provide significant guidelines to VLSI designers for any further real-time video processing design works

    A pipelined configurable gate array for embedded processors

    Get PDF
    In recent years the challenge of high performance, low power retargettable embedded system has been faced with different technological and architectural solutions. In this paper we present a new configurable unit explicitly designed to imple-ment additional reconfigurable pipelined datapaths, suitable for the design of reconfigurable processors. A VLIW recon-figurable processor has been implemented on silicon in a standard 0.18 µm CMOS technology to prove the effective-ness of the proposed unit. Testing on a signal processing algorithms benchmark showed speedups from 4.3x to 13.5x and energy consumption reduction up to 92%

    Low power architectures for streaming applications

    Get PDF

    Domain-specific and reconfigurable instruction cells based architectures for low-power SoC

    Get PDF
    corecore