111 research outputs found

    Leakage Power Modeling and Reduction Techniques for Field Programmable Gate Arrays

    Get PDF
    FPGAs have become quite popular for implementing digital circuits and systems because of reduced costs and fast design cycles. This has led to increased complexity of FPGAs, and with technology scaling, many new challenges have come up for the FPGA industry, leakage power being one of the key challenges. The current generation FPGAs are being implemented in 90nm technology, therefore, managing leakage power in deep-submicron FPGAs has become critical for the FPGA industry to remain competitive in the semiconductor market and to enter the mobile applications domain. In this work an analytical state dependent leakage power model for FPGAs is developed, followed by dual-Vt based designs of the FPGA architecture for reducing leakage power. The leakage power model computes subthreshold and gate leakage in FPGAs, since these are the two dominant components of total leakage power in the scaled nanometer technologies. The leakage power model takes into account the dependency of gate and subthreshold leakage on the state of the circuit inputs. The leakage power model has two main components, one which computes the probability of a state for a particular FPGA circuit element, and the other which computes the leakage of the FPGA circuit element for a given input using analytical equations. This FPGA power model is particularly important for rapidly analyzing various FPGA architectures across different technology nodes. Dual-Vt based designs of the FPGA architecture are proposed, developed, and evaluated, for reducing the leakage power using a CAD framework. The logic and the routing resources of the FPGA are considered for dual-Vt assignment. The number of the logic elements that can be assigned high-Vt in the ideal case by using a dual-Vt assignment algorithm in the CAD framework is estimated. Based upon this estimate two kinds of architectures are developed and evaluated, homogeneous and heterogeneous architectures. Results indicate that leakage power savings of up to 50% can be obtained from these architectures. The analytical state dependent leakage power model developed has been used for estimating the leakage power savings from the dual-Vt FPGA architectures. The CAD framework that has been developed can also be used for developing and evaluating different dual-Vt FPGA architectures, other than the ones proposed in this work

    Predicting power scalability in a reconfigurable platform

    Get PDF
    This thesis focuses on the evolution of digital hardware systems. A reconfigurable platform is proposed and analysed based on thin-body, fully-depleted silicon-on-insulator Schottky-barrier transistors with metal gates and silicide source/drain (TBFDSBSOI). These offer the potential for simplified processing that will allow them to reach ultimate nanoscale gate dimensions. Technology CAD was used to show that the threshold voltage in TBFDSBSOI devices will be controllable by gate potentials that scale down with the channel dimensions while remaining within appropriate gate reliability limits. SPICE simulations determined that the magnitude of the threshold shift predicted by TCAD software would be sufficient to control the logic configuration of a simple, regular array of these TBFDSBSOI transistors as well as to constrain its overall subthreshold power growth. Using these devices, a reconfigurable platform is proposed based on a regular 6-input, 6-output NOR LUT block in which the logic and configuration functions of the array are mapped onto separate gates of the double-gate device. A new analytic model of the relationship between power (P), area (A) and performance (T) has been developed based on a simple VLSI complexity metric of the form ATσ = constant. As σ defines the performance “return” gained as a result of an increase in area, it also represents a bound on the architectural options available in power-scalable digital systems. This analytic model was used to determine that simple computing functions mapped to the reconfigurable platform will exhibit continuous power-area-performance scaling behavior. A number of simple arithmetic circuits were mapped to the array and their delay and subthreshold leakage analysed over a representative range of supply and threshold voltages, thus determining a worse-case range for the device/circuit-level parameters of the model. Finally, an architectural simulation was built in VHDL-AMS. The frequency scaling described by σ, combined with the device/circuit-level parameters predicts the overall power and performance scaling of parallel architectures mapped to the array

    An Ultra-Low-Energy, Variation-Tolerant FPGA Architecture Using Component-Specific Mapping

    Get PDF
    As feature sizes scale toward atomic limits, parameter variation continues to increase, leading to increased margins in both delay and energy. Parameter variation both slows down devices and causes devices to fail. For applications that require high performance, the possibility of very slow devices on critical paths forces designers to reduce clock speed in order to meet timing. For an important and emerging class of applications that target energy-minimal operation at the cost of delay, the impact of variation-induced defects at very low voltages mandates the sizing up of transistors and operation at higher voltages to maintain functionality. With post-fabrication configurability, FPGAs have the opportunity to self-measure the impact of variation, determining the speed and functionality of each individual resource. Given that information, a delay-aware router can use slow devices on non-critical paths, fast devices on critical paths, and avoid known defects. By mapping each component individually and customizing designs to a component's unique physical characteristics, we demonstrate that we can eliminate delay margins and reduce energy margins caused by variation. To quantify the potential benefit we might gain from component-specific mapping, we first measure the margins associated with parameter variation, and then focus primarily on the energy benefits of FPGA delay-aware routing over a wide range of predictive technologies (45 nm--12 nm) for the Toronto20 benchmark set. We show that relative to delay-oblivious routing, delay-aware routing without any significant optimizations can reduce minimum energy/operation by 1.72x at 22 nm. We demonstrate how to construct an FPGA architecture specifically tailored to further increase the minimum energy savings of component-specific mapping by using the following techniques: power gating, gate sizing, interconnect sparing, and LUT remapping. With all optimizations considered we show a minimum energy/operation savings of 2.66x at 22 nm, or 1.68--2.95x when considered across 45--12 nm. As there are many challenges to measuring resource delays and mapping per chip, we discuss methods that may make component-specific mapping more practical. We demonstrate that a simpler, defect-aware routing achieves 70% of the energy savings of delay-aware routing. Finally, we show that without variation tolerance, scaling from 16 nm to 12 nm results in a net increase in minimum energy/operation; component-specific mapping, however, can extend minimum energy/operation scaling to 12 nm and possibly beyond.</p

    Low power architectures for streaming applications

    Get PDF

    Adaptive Voltage Scaling with In-Situ Detectors in Commercial FPGAs

    Get PDF

    Demonstration of monolithically integrated graphene interconnects for low-power CMOS applications

    Get PDF
    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011.Cataloged from PDF version of thesis.Includes bibliographical references (p. 129-141).In recent years, interconnects have become an increasingly difficult design challenge as their relative performance has not improved at the same pace with transistor scaling. The specifications for complex features, clock frequency, supply current, and number of I/O resources have added even greater demands for interconnect performance. Furthermore, the resistivity of copper begins to degrade at smaller line widths due to increased scattering effects. Graphene has gathered much interest as an interconnect material due to its high mobility, high current carrying capacity, and high thermal conductivity. DC characterization of sub-50 nm graphene interconnects has been reported but very few studies exist on evaluating their performance when integrated with CMOS. Integrating graphene with CMOS is a critical step in establishing a path for graphene electronics. In this thesis, we characterize the performance of integrated graphene interconnects and demonstrate two prototype CMOS chips. A 0.35 prm CMOS chip implements an array of transmitter/receivers to analyze end-to-end data communication on graphene wires. Graphene sheets are synthesized by chemical vapor deposition, which are then subsequently transferred and patterned into narrow wires up to 1 mm in length. A low-swing signaling technique is applied, which results in a transmitter energy of 0.3-0.7 pJ/bit/mm, and a total energy of 2.4-5.2 pJ/bit/mm. We demonstrate a minimum voltage swing of 100 mV and bit error rates below 2x10-10. Despite the high sheet resistivity of graphene, integrated graphene links run at speeds up to 50 Mbps. Finally, a subthreshold FPGA was implemented in 0.18 pm CMOS. We demonstrate reliable signal routing on 4-layer graphene wires which replaces parts of the interconnect fabric. The FPGA test chip includes a 5x5 logic array and a TDC-based tester to monitor the delay of graphene wires. The graphene wires have 2.8x lower capacitance than the reference metal wires, resulting in up to 2.11x faster speeds and 1.54x lower interconnect energy when driven by a low-swing voltage of 0.4 V. This work presents the first graphene-based system application and demonstrates the potential of using low capacitance graphene wires for ultra-low power electronics.by Kyeong-Jae Lee.Ph.D

    A neural probe with up to 966 electrodes and up to 384 configurable channels in 0.13 μm SOI CMOS

    Get PDF
    In vivo recording of neural action-potential and local-field-potential signals requires the use of high-resolution penetrating probes. Several international initiatives to better understand the brain are driving technology efforts towards maximizing the number of recording sites while minimizing the neural probe dimensions. We designed and fabricated (0.13-μm SOI Al CMOS) a 384-channel configurable neural probe for large-scale in vivo recording of neural signals. Up to 966 selectable active electrodes were integrated along an implantable shank (70 μm wide, 10 mm long, 20 μm thick), achieving a crosstalk of −64.4 dB. The probe base (5 × 9 mm2) implements dual-band recording and a 1

    Run-time power and performance scaling in 28 nm FPGAs

    Get PDF
    corecore