4 research outputs found

    Desynchronization: Synthesis of asynchronous circuits from synchronous specifications

    Get PDF
    Asynchronous implementation techniques, which measure logic delays at run time and activate registers accordingly, are inherently more robust than their synchronous counterparts, which estimate worst-case delays at design time, and constrain the clock cycle accordingly. De-synchronization is a new paradigm to automate the design of asynchronous circuits from synchronous specifications, thus permitting widespread adoption of asynchronicity, without requiring special design skills or tools. In this paper, we first of all study different protocols for de-synchronization and formally prove their correctness, using techniques originally developed for distributed deployment of synchronous language specifications. We also provide a taxonomy of existing protocols for asynchronous latch controllers, covering in particular the four-phase handshake protocols devised in the literature for micro-pipelines. We then propose a new controller which exhibits provably maximal concurrency, and analyze the performance of desynchronized circuits with respect to the original synchronous optimized implementation. We finally prove the feasibility and effectiveness of our approach, by showing its application to a set of real designs, including a complete implementation of the DLX microprocessor architectur

    Asynchronous techniques for new generation variation-tolerant FPGA

    Get PDF
    PhD ThesisThis thesis presents a practical scenario for asynchronous logic implementation that would benefit the modern Field-Programmable Gate Arrays (FPGAs) technology in improving reliability. A method based on Asynchronously-Assisted Logic (AAL) blocks is proposed here in order to provide the right degree of variation tolerance, preserve as much of the traditional FPGAs structure as possible, and make use of asynchrony only when necessary or beneficial for functionality. The newly proposed AAL introduces extra underlying hard-blocks that support asynchronous interaction only when needed and at minimum overhead. This has the potential to avoid the obstacles to the progress of asynchronous designs, particularly in terms of area and power overheads. The proposed approach provides a solution that is complementary to existing variation tolerance techniques such as the late-binding technique, but improves the reliability of the system as well as reducing the design’s margin headroom when implemented on programmable logic devices (PLDs) or FPGAs. The proposed method suggests the deployment of configurable AAL blocks to reinforce only the variation-critical paths (VCPs) with the help of variation maps, rather than re-mapping and re-routing. The layout level results for this method's worst case increase in the CLB’s overall size only of 6.3%. The proposed strategy retains the structure of the global interconnect resources that occupy the lion’s share of the modern FPGA’s soft fabric, and yet permits the dual-rail iv completion-detection (DR-CD) protocol without the need to globally double the interconnect resources. Simulation results of global and interconnect voltage variations demonstrate the robustness of the method

    Architectural Exploration of KeyRing Self-Timed Processors

    Get PDF
    RÉSUMÉ Les dernières décennies ont vu l’augmentation des performances des processeurs contraintes par les limites imposées par la consommation d’énergie des systèmes électroniques : des très basses consommations requises pour les objets connectés, aux budgets de dépenses électriques des serveurs, en passant par les limitations thermiques et la durée de vie des batteries des appareils mobiles. Cette forte demande en processeurs efficients en énergie, couplée avec les limitations de la réduction d’échelle des transistors—qui ne permet plus d’améliorer les performances à densité de puissance constante—, conduit les concepteurs de circuits intégrés à explorer de nouvelles microarchitectures permettant d’obtenir de meilleures performances pour un budget énergétique donné. Cette thèse s’inscrit dans cette tendance en proposant une nouvelle microarchitecture de processeur, appelée KeyRing, conçue avec l’intention de réduire la consommation d’énergie des processeurs. La fréquence d’opération des transistors dans les circuits intégrés est proportionnelle à leur consommation dynamique d’énergie. Par conséquent, les techniques de conception permettant de réduire dynamiquement le nombre de transistors en opération sont très largement adoptées pour améliorer l’efficience énergétique des processeurs. La technique de clock-gating est particulièrement usitée dans les circuits synchrones, car elle réduit l’impact de l’horloge globale, qui est la principale source d’activité. La microarchitecture KeyRing présentée dans cette thèse utilise une méthode de synchronisation décentralisée et asynchrone pour réduire l’activité des circuits. Elle est dérivée du processeur AnARM, un processeur développé par Octasic sur la base d’une microarchitecture asynchrone ad hoc. Bien qu’il soit plus efficient en énergie que des alternatives synchrones, le AnARM est essentiellement incompatible avec les méthodes de synthèse et d’analyse temporelle statique standards. De plus, sa technique de conception ad hoc ne s’inscrit que partiellement dans les paradigmes de conceptions asynchrones. Cette thèse propose une approche rigoureuse pour définir les principes généraux de cette technique de conception ad hoc, en faisant levier sur la littérature asynchrone. La microarchitecture KeyRing qui en résulte est développée en association avec une méthode de conception automatisée, qui permet de s’affranchir des incompatibilités natives existant entre les outils de conception et les systèmes asynchrones. La méthode proposée permet de pleinement mettre à profit les flots de conception standards de l’industrie microélectronique pour réaliser la synthèse et la vérification des circuits KeyRing. Cette thèse propose également des protocoles expérimentaux, dont le but est de renforcer la relation de causalité entre la microarchitecture KeyRing et une réduction de la consommation énergétique des processeurs, comparativement à des alternatives synchrones équivalentes.----------ABSTRACT Over the last years, microprocessors have had to increase their performances while keeping their power envelope within tight bounds, as dictated by the needs of various markets: from the ultra-low power requirements of the IoT, to the electrical power consumption budget in enterprise servers, by way of passive cooling and day-long battery life in mobile devices. This high demand for power-efficient processors, coupled with the limitations of technology scaling—which no longer provides improved performances at constant power densities—, is leading designers to explore new microarchitectures with the goal of pulling more performances out of a fixed power budget. This work enters into this trend by proposing a new processor microarchitecture, called KeyRing, having a low-power design intent. The switching activity of integrated circuits—i.e. transistors switching on and off—directly affects their dynamic power consumption. Circuit-level design techniques such as clock-gating are widely adopted as they dramatically reduce the impact of the global clock in synchronous circuits, which constitutes the main source of switching activity. The KeyRing microarchitecture presented in this work uses an asynchronous clocking scheme that relies on decentralized synchronization mechanisms to reduce the switching activity of circuits. It is derived from the AnARM, a power-efficient ARM processor developed by Octasic using an ad hoc asynchronous microarchitecture. Although it delivers better power-efficiency than synchronous alternatives, it is for the most part incompatible with standard timing-driven synthesis and Static Timing Analysis (STA). In addition, its design style does not fit well within the existing asynchronous design paradigms. This work lays the foundations for a more rigorous definition of this rather unorthodox design style, using circuits and methods coming from the asynchronous literature. The resulting KeyRing microarchitecture is developed in combination with Electronic Design Automation (EDA) methods that alleviate incompatibility issues related to ad hoc clocking, enabling timing-driven optimizations and verifications of KeyRing circuits using industry-standard design flows. In addition to bridging the gap with standard design practices, this work also proposes comprehensive experimental protocols that aims to strengthen the causal relation between the reported asynchronous microarchitecture and a reduced power consumption compared with synchronous alternatives. The main achievement of this work is a framework that enables the architectural exploration of circuits using the KeyRing microarchitecture

    A Coarse-Grain Phased Logic CPU Abstract: A five-stage pipelined CPU based on the MIPs ISA

    No full text
    is mapped to a self-timed implementation scheme known as Phased Logic (PL). The mapping is performed automatically from a netlist of D-Flip-Flops (DFFs) and 4-input Lookup Tables (LUT4s) to a netlist of PL blocks. Each PL block is composed of control logic wrapped around a collection of DFFs and LUT4s to form a multi-input/output PL gate. PL offers a speedup technique known as early evaluation that can be used to boost performance at the cost of additional logic within each block. In addition to early evaluation, this implementation uses bypass paths in the ALU for shift and logical instructions and buffering stages for increased dataflow to further improve performance. Additional speedup is gained by reordering instructions to provide more opportunity for early evaluation. Simulation results show an average speedup of 41 % compared to the clocked netlist over a suite of five benchmark programs
    corecore