3,796 research outputs found

    MORA - an architecture and programming model for a resource efficient coarse grained reconfigurable processor

    Get PDF
    This paper presents an architecture and implementation details for MORA, a novel coarse grained reconfigurable processor for accelerating media processing applications. The MORA architecture involves a 2-D array of several such processors, to deliver low cost, high throughput performance in media processing applications. A distinguishing feature of the MORA architecture is the co-design of hardware architecture and low-level programming language throughout the design cycle. The implementation details for the single MORA processor, and benchmark evaluation using a cycle accurate simulator are presented

    Towards generic satellite payloads: software radio

    Get PDF
    Satellite payloads are becoming much more complex with the evolution towards multimedia applications. Moreover satellite lifetime increases while standard and services evolve faster, necessitating a hardware platform that can evolves for not developing new systems on each change. The same problem occurs in terrestrial systems like mobile networks and a foreseen solution is the software defined radio technology. In this paper we describe a way of introducing this concept at satellite level to offer to operators the required flexibility in the system. The digital functions enabling this technology, the hardware components implementing the functions and the reconfiguration processes are detailed. We show that elements of the software radio for satellites exist and that this concept is feasible

    A fully parameterized virtual coarse grained reconfigurable array for high performance computing applications

    Get PDF
    Field Programmable Gate Arrays (FPGAs) have proven their potential in accelerating High Performance Computing (HPC) Applications. Conventionally such accelerators predominantly use, FPGAs that contain fine-grained elements such as LookUp Tables (LUTs), Switch Blocks (SB) and Connection Blocks (CB) as basic programmable logic blocks. However, the conventional implementation suffers from high reconfiguration and development costs. In order to solve this problem, programmable logic components are defined at a virtual higher abstraction level. These components are called Processing Elements (PEs) and the group of PEs along with the inter-connection network form an architecture called a Virtual Coarse-Grained Reconfigurable Array (VCGRA). The abstraction helps to reconfigure the PEs faster at the intermediate level than at the lower-level of an FPGA. Conventional VCGRA implementations (built on top of the lower levels of the FPGA) use functional resources such as LUTs to establish required connections (intra-connect) within a PE. In this paper, we propose to use the parameterized reconfiguration technique to implement the intra-connections of each PE with the aim to reduce the FPGA resource utilization (LUTs). The technique is used to parameterize the intra-connections with parameters that only change their value infrequently (whenever a new VCGRA function has to be reconfigured) and that are implemented as constants. Since the design is optimized for these constants at every moment in time, this reduces the resource utilization. Further, interconnections (network between the multiple PEs) of the VCGRA grid can also be parameterized so that both the inter- and intraconnect network of the VCGRA grid can be mapped onto the physical switch blocks of the FPGA. For every change in parameter values a specialized bitstream is generated on the fly and the FPGA is reconfigured using the parameterized run-time reconfiguration technique. Our results show a drastic reduction in FPGA LUT resource utilization in the PE by at least 30% and in the intra-network of the PE by 31% when implementing an HPC application

    Fault-tolerant fpga for mission-critical applications.

    Get PDF
    One of the devices that play a great role in electronic circuits design, specifically safety-critical design applications, is Field programmable Gate Arrays (FPGAs). This is because of its high performance, re-configurability and low development cost. FPGAs are used in many applications such as data processing, networks, automotive, space and industrial applications. Negative impacts on the reliability of such applications result from moving to smaller feature sizes in the latest FPGA architectures. This increases the need for fault-tolerant techniques to improve reliability and extend system lifetime of FPGA-based applications. In this thesis, two fault-tolerant techniques for FPGA-based applications are proposed with a built-in fault detection region. A low cost fault detection scheme is proposed for detecting faults using the fault detection region used in both schemes. The fault detection scheme primarily detects open faults in the programmable interconnect resources in the FPGAs. In addition, Stuck-At faults and Single Event Upsets (SEUs) fault can be detected. For fault recovery, each scheme has its own fault recovery approach. The first approach uses a spare module and a 2-to-1 multiplexer to recover from any fault detected. On the other hand, the second approach recovers from any fault detected using the property of Partial Reconfiguration (PR) in the FPGAs. It relies on identifying a Partially Reconfigurable block (P_b) in the FPGA that is used in the recovery process after the first faulty module is identified in the system. This technique uses only one location to recover from faults in any of the FPGA’s modules and the FPGA interconnects. Simulation results show that both techniques can detect and recover from open faults. In addition, Stuck-At faults and Single Event Upsets (SEUs) fault can also be detected. Finally, both techniques require low area overhead

    Towards a triple mode common operator FFT for Software Radio systems

    Get PDF
    International audienceA scenario to design a Triple Mode FFT is addressed. Based on a Dual Mode FFT structure, we present a methodology to reach a triple mode FFT operator (TMFFT) able to operate over three different fields: complex number domain C, Galois Fields GF(Ft) and GF(2m). We propose a reconfigurable Triple mode Multiplier that constitutes the core of the Butterflybased FFT. A scalable and flexible unit for the polynomial reduction needed in the GF(2m) multiplication is also proposed. An FPGA implementation of the proposed multiplier is given and the measures show a gain of 18%in terms of performance-to-cost ratio compared to a "Velcro" approach where two self-contained operators are implemented separately

    A Micro Power Hardware Fabric for Embedded Computing

    Get PDF
    Field Programmable Gate Arrays (FPGAs) mitigate many of the problemsencountered with the development of ASICs by offering flexibility, faster time-to-market, and amortized NRE costs, among other benefits. While FPGAs are increasingly being used for complex computational applications such as signal and image processing, networking, and cryptology, they are far from ideal for these tasks due to relatively high power consumption and silicon usage overheads compared to direct ASIC implementation. A reconfigurable device that exhibits ASIC-like power characteristics and FPGA-like costs and tool support is desirable to fill this void. In this research, a parameterized, reconfigurable fabric model named as domain specific fabric (DSF) is developed that exhibits ASIC-like power characteristics for Digital Signal Processing (DSP) style applications. Using this model, the impact of varying different design parameters on power and performance has been studied. Different optimization techniques like local search and simulated annealing are used to determine the appropriate interconnect for a specific set of applications. A design space exploration tool has been developed to automate and generate a tailored architectural instance of the fabric.The fabric has been synthesized on 160 nm cell-based ASIC fabrication process from OKI and 130 nm from IBM. A detailed power-performance analysis has been completed using signal and image processing benchmarks from the MediaBench benchmark suite and elsewhere with comparisons to other hardware and software implementations. The optimized fabric implemented using the 130 nm process yields energy within 3X of a direct ASIC implementation, 330X better than a Virtex-II Pro FPGA and 2016X better than an Intel XScale processor

    A Multi-layer Fpga Framework Supporting Autonomous Runtime Partial Reconfiguration

    Get PDF
    Partial reconfiguration is a unique capability provided by several Field Programmable Gate Array (FPGA) vendors recently, which involves altering part of the programmed design within an SRAM-based FPGA at run-time. In this dissertation, a Multilayer Runtime Reconfiguration Architecture (MRRA) is developed, evaluated, and refined for Autonomous Runtime Partial Reconfiguration of FPGA devices. Under the proposed MRRA paradigm, FPGA configurations can be manipulated at runtime using on-chip resources. Operations are partitioned into Logic, Translation, and Reconfiguration layers along with a standardized set of Application Programming Interfaces (APIs). At each level, resource details are encapsulated and managed for efficiency and portability during operation. An MRRA mapping theory is developed to link the general logic function and area allocation information to the device related physical configuration level data by using mathematical data structure and physical constraints. In certain scenarios, configuration bit stream data can be read and modified directly for fast operations, relying on the use of similar logic functions and common interconnection resources for communication. A corresponding logic control flow is also developed to make the entire process autonomous. Several prototype MRRA systems are developed on a Xilinx Virtex II Pro platform. The Virtex II Pro on-chip PowerPC core and block RAM are employed to manage control operations while multiple physical interfaces establish and supplement autonomous reconfiguration capabilities. Area, speed and power optimization techniques are developed based on the developed Xilinx prototype. Evaluations and analysis of these prototype and techniques are performed on a number of benchmark and hashing algorithm case studies. The results indicate that based on a variety of test benches, up to 70% reduction in the resource utilization, up to 50% improvement in power consumption, and up to 10 times increase in run-time performance are achieved using the developed architecture and approaches compared with Xilinx baseline reconfiguration flow. Finally, a Genetic Algorithm (GA) for a FPGA fault tolerance case study is evaluated as a ultimate high-level application running on this architecture. It demonstrated that this is a hardware and software infrastructure that enables an FPGA to dynamically reconfigure itself efficiently under the control of a soft microprocessor core that is instantiated within the FPGA fabric. Such a system contributes to the observed benefits of intelligent control, fast reconfiguration, and low overhead

    Object-oriented domain specific compilers for programming FPGAs

    No full text
    Published versio
    • 

    corecore