19 research outputs found

    Automatic rapid prototyping of semi-custom VLSI circuits using FPGAs

    Get PDF
    Journal ArticleWe describe a technique for translating semi-custom VLSI circuits automatically, integrating two design environments, into field programmable gate arrays (FPGAs) for rapid and inexpensive prototyping. The VLSI circuits are designed using a cell-matrix based environment that produces chips with density comparable to full custom VLSI design. These circuits are translated automatically into FPGAs for testing and system development. A four-bit pipelined array multiplier is used as an example of this translation. The multiplier is implemented in CMOS in both synchronous and asynchronous pipelined versions, and translated into Actel FPGAs both automatically, and by hand for comparison. The six test chips were all found to be fully functional, and the translation efficiency in terms of chip speed and area is shown. This result demonstrates the potential of this approach to system development

    Fred: an architecture for a self-timed decoupled computer

    Get PDF
    Journal ArticleDecoupled computer architectures provide an effective means of exploiting instruction level parallelism. Self-timed micropipeline systems are inherently decoupled due to the elastic nature of the basic FIFO structure, and may be ideally suited for constructing decoupled computer architectures. Fred is a self-timed decoupled, pipelined computer architecture based on micropipelines. We present the architecture of Fred, with specific details on a micropipelined implementation that includes support for multiple functional units and out-of- order instruction completion due to the self-timed decoupling

    Self-timed field programmmable gate array architectures

    Get PDF

    Practical advances in asynchronous design

    Get PDF
    Journal ArticleRecent practical advances in asynchronous circuit and system design have resulted in renewed interest by circuit designers. Asynchronous systems are being viewed as in increasingly viable alternative to globally synchronous system organization. This tutorial will present the current state of the art in asynchronous circuit and system design in three different areas. The first section details asynchronous control systems. The second describes a variety of approaches to asynchronous datapaths. The third section is on asynchronous and self-timed circuits applied to the design of general purpose processors

    Architectural Exploration of KeyRing Self-Timed Processors

    Get PDF
    RÉSUMÉ Les derniĂšres dĂ©cennies ont vu l’augmentation des performances des processeurs contraintes par les limites imposĂ©es par la consommation d’énergie des systĂšmes Ă©lectroniques : des trĂšs basses consommations requises pour les objets connectĂ©s, aux budgets de dĂ©penses Ă©lectriques des serveurs, en passant par les limitations thermiques et la durĂ©e de vie des batteries des appareils mobiles. Cette forte demande en processeurs efficients en Ă©nergie, couplĂ©e avec les limitations de la rĂ©duction d’échelle des transistors—qui ne permet plus d’amĂ©liorer les performances Ă  densitĂ© de puissance constante—, conduit les concepteurs de circuits intĂ©grĂ©s Ă  explorer de nouvelles microarchitectures permettant d’obtenir de meilleures performances pour un budget Ă©nergĂ©tique donnĂ©. Cette thĂšse s’inscrit dans cette tendance en proposant une nouvelle microarchitecture de processeur, appelĂ©e KeyRing, conçue avec l’intention de rĂ©duire la consommation d’énergie des processeurs. La frĂ©quence d’opĂ©ration des transistors dans les circuits intĂ©grĂ©s est proportionnelle Ă  leur consommation dynamique d’énergie. Par consĂ©quent, les techniques de conception permettant de rĂ©duire dynamiquement le nombre de transistors en opĂ©ration sont trĂšs largement adoptĂ©es pour amĂ©liorer l’efficience Ă©nergĂ©tique des processeurs. La technique de clock-gating est particuliĂšrement usitĂ©e dans les circuits synchrones, car elle rĂ©duit l’impact de l’horloge globale, qui est la principale source d’activitĂ©. La microarchitecture KeyRing prĂ©sentĂ©e dans cette thĂšse utilise une mĂ©thode de synchronisation dĂ©centralisĂ©e et asynchrone pour rĂ©duire l’activitĂ© des circuits. Elle est dĂ©rivĂ©e du processeur AnARM, un processeur dĂ©veloppĂ© par Octasic sur la base d’une microarchitecture asynchrone ad hoc. Bien qu’il soit plus efficient en Ă©nergie que des alternatives synchrones, le AnARM est essentiellement incompatible avec les mĂ©thodes de synthĂšse et d’analyse temporelle statique standards. De plus, sa technique de conception ad hoc ne s’inscrit que partiellement dans les paradigmes de conceptions asynchrones. Cette thĂšse propose une approche rigoureuse pour dĂ©finir les principes gĂ©nĂ©raux de cette technique de conception ad hoc, en faisant levier sur la littĂ©rature asynchrone. La microarchitecture KeyRing qui en rĂ©sulte est dĂ©veloppĂ©e en association avec une mĂ©thode de conception automatisĂ©e, qui permet de s’affranchir des incompatibilitĂ©s natives existant entre les outils de conception et les systĂšmes asynchrones. La mĂ©thode proposĂ©e permet de pleinement mettre Ă  profit les flots de conception standards de l’industrie microĂ©lectronique pour rĂ©aliser la synthĂšse et la vĂ©rification des circuits KeyRing. Cette thĂšse propose Ă©galement des protocoles expĂ©rimentaux, dont le but est de renforcer la relation de causalitĂ© entre la microarchitecture KeyRing et une rĂ©duction de la consommation Ă©nergĂ©tique des processeurs, comparativement Ă  des alternatives synchrones Ă©quivalentes.----------ABSTRACT Over the last years, microprocessors have had to increase their performances while keeping their power envelope within tight bounds, as dictated by the needs of various markets: from the ultra-low power requirements of the IoT, to the electrical power consumption budget in enterprise servers, by way of passive cooling and day-long battery life in mobile devices. This high demand for power-efficient processors, coupled with the limitations of technology scaling—which no longer provides improved performances at constant power densities—, is leading designers to explore new microarchitectures with the goal of pulling more performances out of a fixed power budget. This work enters into this trend by proposing a new processor microarchitecture, called KeyRing, having a low-power design intent. The switching activity of integrated circuits—i.e. transistors switching on and off—directly affects their dynamic power consumption. Circuit-level design techniques such as clock-gating are widely adopted as they dramatically reduce the impact of the global clock in synchronous circuits, which constitutes the main source of switching activity. The KeyRing microarchitecture presented in this work uses an asynchronous clocking scheme that relies on decentralized synchronization mechanisms to reduce the switching activity of circuits. It is derived from the AnARM, a power-efficient ARM processor developed by Octasic using an ad hoc asynchronous microarchitecture. Although it delivers better power-efficiency than synchronous alternatives, it is for the most part incompatible with standard timing-driven synthesis and Static Timing Analysis (STA). In addition, its design style does not fit well within the existing asynchronous design paradigms. This work lays the foundations for a more rigorous definition of this rather unorthodox design style, using circuits and methods coming from the asynchronous literature. The resulting KeyRing microarchitecture is developed in combination with Electronic Design Automation (EDA) methods that alleviate incompatibility issues related to ad hoc clocking, enabling timing-driven optimizations and verifications of KeyRing circuits using industry-standard design flows. In addition to bridging the gap with standard design practices, this work also proposes comprehensive experimental protocols that aims to strengthen the causal relation between the reported asynchronous microarchitecture and a reduced power consumption compared with synchronous alternatives. The main achievement of this work is a framework that enables the architectural exploration of circuits using the KeyRing microarchitecture

    Petri nets based components within globally asynchronous locally synchronous systems

    Get PDF
    Dissertação apresentada na Faculdade de CiĂȘncias e Tecnologias da Universidade Nova de Lisboa para a obtenção do grau de Mestre em Engenharia ElectrotĂ©cnica e ComputadoresThe main goal is to develop a solution for the interconnection of components constituent of a GALS - Globally Asynchronous, Locally Synchronous – system. The components are implemented in parallel obtained as a result of the partition of a model expressed a Petri net (PN), performed using the PNs editor SNOOPY-IOPT in conjunction with the Split tool and the tools to automatically generate the VHDL code from the representations of the PNML models resulting from the partition (these tools were developed under the project FORDESIGN and are available at http://www.uninova.pt/FORDESIGN). Typical solutions will be analyzed to ensure proper communication between components of the GALS system, as well as characterized and developed an appropriate solution for the interconnection of the components associated with the PN sub-models. The final goal (not attained with this thesis) would be to acquire a tool that allows generation of code for the interconnection solution from the associated components, considering a specific application. The solution proposed for componentes interconnection was coded in VHDL and the implementation platforms used for testing include the Xilinx FPGA Spartan-3 and Virtex-II

    Asynchronous techniques for new generation variation-tolerant FPGA

    Get PDF
    PhD ThesisThis thesis presents a practical scenario for asynchronous logic implementation that would benefit the modern Field-Programmable Gate Arrays (FPGAs) technology in improving reliability. A method based on Asynchronously-Assisted Logic (AAL) blocks is proposed here in order to provide the right degree of variation tolerance, preserve as much of the traditional FPGAs structure as possible, and make use of asynchrony only when necessary or beneficial for functionality. The newly proposed AAL introduces extra underlying hard-blocks that support asynchronous interaction only when needed and at minimum overhead. This has the potential to avoid the obstacles to the progress of asynchronous designs, particularly in terms of area and power overheads. The proposed approach provides a solution that is complementary to existing variation tolerance techniques such as the late-binding technique, but improves the reliability of the system as well as reducing the design’s margin headroom when implemented on programmable logic devices (PLDs) or FPGAs. The proposed method suggests the deployment of configurable AAL blocks to reinforce only the variation-critical paths (VCPs) with the help of variation maps, rather than re-mapping and re-routing. The layout level results for this method's worst case increase in the CLB’s overall size only of 6.3%. The proposed strategy retains the structure of the global interconnect resources that occupy the lion’s share of the modern FPGA’s soft fabric, and yet permits the dual-rail iv completion-detection (DR-CD) protocol without the need to globally double the interconnect resources. Simulation results of global and interconnect voltage variations demonstrate the robustness of the method

    Introducing KeyRing self‐timed microarchitecture and timing‐driven design flow

    Get PDF
    ABSTRACT: A self-timed microarchitecture called KeyRing is presented, and a method for implementing KeyRing circuits compatible with a timing-driven electronic design automation (EDA) flow is discussed. The KeyRing microarchitecture is derived from the AnARM, a low-power self-timed ARM processor based on ad hoc design principles. First, the unorthodox design style and circuit structures are revisited. A theoretical model that can support the design of generic circuits and the elaboration of EDA methods is then presented. Also addressed are the compatibility issues between KeyRing circuits and timing-driven EDA flows. The proposed method leverages relative timing constraints to translate the timing relations in a KeyRing circuit into a set of timing constraints that enable timing-driven synthesis and static timing analysis. Finally, two 32-bit RISC-V processors are presented; called KeyV and based on KeyRing microarchitectures, they are synthesized in a 65 nm technology using the proposed EDA flow. Postsynthesis results demonstrate the effectiveness of the design methodology and allow comparisons with a synchronous alternative called SynV. Performance and power consumption evaluations show that KeyV has a power efficiency that lies between SynV with clock-gating and SynV without clock-gating

    Design And Synthesis Of Clockless Pipelines Based On Self-resetting Stage Logic

    Get PDF
    For decades, digital design has been primarily dominated by clocked circuits. With larger scales of integration made possible by improved semiconductor manufacturing techniques, relying on a clock signal to orchestrate logic operations across an entire chip became increasingly difficult. Motivated by this problem, designers are currently considering circuits which can operate without a clock. However, the wide acceptance of these circuits by the digital design community requires two ingredients: (i) a unified design methodology supported by widely available CAD tools, and (ii) a granularity of design techniques suitable for synthesizing large designs. Currently, there is no unified established design methodology to support the design and verification of these circuits. Moreover, the majority of clockless design techniques is conceived at circuit level, and is subsequently so fine-grain, that their application to large designs can have unacceptable area costs. Given these considerations, this dissertation presents a new clockless technique, called self-resetting stage logic (SRSL), in which the computation of a block is reset periodically from within the block itself. SRSL is used as a building block for three coarse-grain pipelining techniques: (i) Stage-controlled self-resetting stage logic (S-SRSL) Pipelines: In these pipelines, the control of the communication between stages is performed locally between each pair of stages. This communication is performed in a uni-directional manner in order to simplify its implementation. (ii) Pipeline-controlled self-resetting stage logic (P-SRSL) Pipelines: In these pipelines, the communication between each pair of stages in the pipeline is driven by the oscillation of the last pipeline stage. Their communication scheme is identical to the one used in S-SRSL pipelines. (iii) Delay-tolerant self-resetting stage logic (D-SRSL) Pipelines: While communication in these pipelines is local in nature in a manner similar to the one used in S-SRL pipelines, this communication is nevertheless extended in both directions. The result of this bi-directional approach is an increase in the capability of the pipeline to handle stages with random delay. Based on these pipelining techniques, a new design methodology is proposed to synthesize clockless designs. The synthesis problem consists of synthesizing an SRSL pipeline from a gate netlist with a minimum area overhead given a specified data rate. A two-phase heuristic algorithm is proposed to solve this problem. The goal of the algorithm is to pipeline a given datapath by minimizing the area occupied by inter-stage latches without violating any timing constraints. Experiments with this synthesis algorithm show that while P-SRSL pipelines can reach high throughputs in shallow pipelines, D-SRSL pipelines can achieve comparable throughputs in deeper pipelines
    corecore