1,137 research outputs found

    Virtual Runtime Application Partitions for Resource Management in Massively Parallel Architectures

    This thesis presents a novel design paradigm, called Virtual Runtime Application Partitions (VRAP), to judiciously utilize on-chip resources. As the dark silicon era approaches, where power considerations will allow only a fraction of the chip to be powered on, judicious resource management will become a key consideration in future designs. Most work on resource management treats only the physical components (i.e. computation, communication, and memory blocks) as resources and manipulates the component-to-application mapping to optimize various parameters (e.g. energy efficiency). To further enhance the optimization potential, in addition to the physical resources we propose to manipulate abstract resources (i.e. the voltage/frequency operating point, the fault-tolerance strength, the degree of parallelism, and the configuration architecture). The proposed framework (i.e. VRAP) encapsulates methods, algorithms, and hardware blocks to provide each application with the abstract resources tailored to its needs. To test the efficacy of this concept, we have developed three distinct self-adaptive environments: (i) Private Operating Environment (POE), (ii) Private Reliability Environment (PRE), and (iii) Private Configuration Environment (PCE), which collectively ensure that each application meets its deadlines using minimal platform resources. In this work several novel architectural enhancements, algorithms, and policies are presented to realize the virtual runtime application partitions efficiently. Considering future design trends, we have chosen Coarse Grained Reconfigurable Architectures (CGRAs) and Networks-on-Chip (NoCs) to test the feasibility of our approach. Specifically, we have chosen the Dynamically Reconfigurable Resource Array (DRRA) and McNoC as the representative CGRA and NoC platforms. The proposed techniques are compared and evaluated using a variety of quantitative experiments. Synthesis and simulation results demonstrate that VRAP significantly enhances energy and power efficiency compared to the state of the art.

    Adaptive reconfigurable voting for enhanced reliability in medium-grained fault tolerant architectures

    The impact of SRAM-based FPGAs is constantly growing in the aerospace industry despite the fact that their volatile configuration memory is highly susceptible to radiation effects. Therefore, strong fault-handling mechanisms have to be developed in order to protect the design and make it capable of withstanding both soft and permanent errors. In this paper, a fully reconfigurable medium-grained triple modular redundancy (TMR) architecture, which forms part of a runtime adaptive on-board processor (OBP), is presented. Fault mitigation is extended to the voting mechanism by applying our reconfiguration methodology not only to domain replicas but also to the voter itself. The proposed approach takes advantage of the adaptive configuration placement and modular property of the OBP, thus allowing on-line creation of different medium-grained TMRs and selection of their granularity level. Consequently, we are able to narrow down the fault-affected area, making the error recovery process faster and less power-consuming. The conventional hardware-based voting is supported by ICAP-based voting in order to additionally strengthen the reconfigurable intermediate voting. In addition, the implementation methodology ensures that only one memory footprint is used for all voters and their voting adaptations, thus saving storage resources in expensive rad-hard memories.
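The basic voting step behind TMR can be illustrated in software. The following is a minimal sketch, assuming three replica outputs represented as integers. The function names `majority_vote` and `faulty_replicas` are hypothetical and are not taken from the paper, which implements voting in reconfigurable hardware.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three replica outputs."""
    # Each output bit is 1 when at least two of the three inputs have it set.
    return (a & b) | (a & c) | (b & c)


def faulty_replicas(a: int, b: int, c: int) -> list[int]:
    """Return the indices of replicas that disagree with the voted value."""
    voted = majority_vote(a, b, c)
    return [i for i, value in enumerate((a, b, c)) if value != voted]


if __name__ == "__main__":
    # Replica 2 has suffered a (simulated) upset in two of its bits.
    print(bin(majority_vote(0b1010, 0b1010, 0b0110)))
    print(faulty_replicas(0b1010, 0b1010, 0b0110))
```

Identifying which replica disagrees is what would let a recovery step target only the affected region. How the paper maps that information to reconfiguration is not shown here.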

    DeSyRe: on-Demand System Reliability

    The DeSyRe project builds on-demand adaptive and reliable Systems-on-Chips (SoCs). As fabrication technology scales down, chips are becoming less reliable, thereby incurring increased power and performance costs for fault tolerance. To make matters worse, power density is becoming a significant limiting factor in SoC design in general. In the face of such changes in the technological landscape, current solutions for fault tolerance are expected to introduce excessive overheads in future systems. Moreover, attempting to design and manufacture a totally defect-free and fault-free system would heavily, even prohibitively, impact the design, manufacturing, and testing costs, as well as the system performance and power consumption. In this context, DeSyRe delivers a new generation of systems that are reliable by design at well-balanced power, performance, and design costs. In our attempt to reduce the overheads of fault tolerance, only a small fraction of the chip is built to be fault-free. This fault-free part is then employed to manage the remaining fault-prone resources of the SoC. The DeSyRe framework is applied to two medical systems with high safety requirements (measured using the IEC 61508 functional safety standard) and tight power and performance constraints.

    Partially reconfigurable SDR solution on FPGA

    Software-defined radios (SDR) have become more common in response to the increasing complexity of wireless communication standards. The flexibility offered by SDR technology in turn makes it possible to create and implement even more complex standards, so there is a mutual evolution cycle. One of the technological opportunities pursued in SDR is changing waveforms on the fly. The standards challenge SDR development: computing throughput needs to be high enough, the end product has to be energy efficient, and all of this must be accomplished as cheaply as possible. SDRs have a wide range of implementation options, from complete software designs to more hardware-oriented designs with higher-level software control. The extreme ends of this range suffer from energy dissipation and design cost issues, respectively. The compromises include application-specific architectures and reconfigurable hardware. Solutions vary between software and hardware from case to case, depending on the needs. This thesis concentrates on investigating partial reconfigurability on a field-programmable gate array (FPGA) in an SDR application. Based on the results, partial reconfigurability is an attractive means to bolster SDR functionality. Although the energy efficiency of the employed FPGA solution is inferior to that of an application-specific integrated circuit (ASIC), its flexibility and design cost set it apart. This study focuses on partial reconfiguration on Xilinx FPGA devices, but it may show benefits for other devices that can utilize partial reconfiguration in their designs.

    Reconfigurable architectures for beyond 3G wireless communication systems

    Improving low latency applications for reconfigurable devices

    This thesis seeks to improve low latency application performance via architectural improvements in reconfigurable devices. This is achieved by improving resource utilisation and access, and by exploiting the different environments within which reconfigurable devices are deployed. Our first contribution leverages devices deployed at the network level to enable the low latency processing of financial market data feeds. Financial exchanges transmit messages via two identical data feeds to reduce the chance of message loss. We present an approach to arbitrate these redundant feeds at the network level using a Field-Programmable Gate Array (FPGA). With support for any messaging protocol, we evaluate our design using the NASDAQ TotalView-ITCH, OPRA, and ARCA data feed protocols, and provide two simultaneous outputs: one prioritising low latency, and one prioritising high reliability with three dynamically configurable windowing methods. Our second contribution is a new ring-based architecture for low latency, parallel access to FPGA memory. Traditional FPGA memory is formed by grouping block memories (BRAMs) together and accessing them as a single device. Our architecture accesses these BRAMs independently and in parallel. Targeting memory-based computing, which stores pre-computed function results in memory, we benefit low latency applications that rely on highly complex functions, iterative computation, or many parallel accesses to a shared resource. We assess square root, power, trigonometric, and hyperbolic functions within the FPGA, and provide a tool to convert Python functions to our new architecture. Our third contribution extends the ring-based architecture to support any FPGA processing element. We unify E heterogeneous processing elements within compute pools, with each element implementing the same function, and the pool serving D parallel function calls. Our implementation-agnostic approach supports processing elements with different latencies, implementations, and pipeline lengths, as well as non-deterministic latencies. Compute pools evenly balance access to processing elements across the entire application, and are evaluated by implementing eight different neural network activation functions within an FPGA.
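The idea of arbitrating two redundant feeds can be sketched in software. This is a minimal illustration only, assuming each message carries a sequence number and that the first copy to arrive is forwarded. The `arbitrate` function, the `(feed, (sequence, payload))` tuple layout, and the unbounded `seen` set are all assumptions made for the example. The thesis implements arbitration in FPGA hardware, and its windowing methods are not modelled here.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical message shape: (sequence number, payload).
Message = Tuple[int, bytes]


def arbitrate(arrivals: Iterable[Tuple[str, Message]]) -> Iterator[Message]:
    """Forward the first copy of each sequence number; drop later duplicates.

    `arrivals` is the interleaved stream of (feed name, message) pairs in the
    order they reach the arbiter.
    """
    seen = set()  # unbounded here; a hardware design would need bounded state
    for _feed, (seq, payload) in arrivals:
        if seq in seen:
            continue  # duplicate already forwarded from the other feed
        seen.add(seq)
        yield seq, payload


if __name__ == "__main__":
    stream = [
        ("A", (1, b"x")),
        ("B", (1, b"x")),  # duplicate of message 1
        ("B", (2, b"y")),  # feed B is first for message 2
        ("A", (3, b"z")),
        ("A", (2, b"y")),  # late duplicate of message 2
    ]
    for message in arbitrate(stream):
        print(message)
```

Forwarding on first arrival favours latency. An output that favours reliability would instead buffer and reorder messages within some window, which this sketch deliberately leaves out.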