901 research outputs found

    Energy-efficient hardware architecture and vlsi implementation of a polyphase channelizer with applications to subband adaptive filtering

    Get PDF
    Abstract Polyphase channelizer is an important component of subband adaptive filtering systems. This paper presents an energy-efficient hardware architecture and VLSI implementation of polyphase channelizer, integrating algorithmic, architectural and circuit level design techniques. At algorithm level, low complexity polyphase channelizer architecture is derived using multirate signal processing approach. To reduce the computational complexity in polyphase filters, computation sharing differential coefficient (CSDC) method is effectively used as an architectural level technique. The main idea of CSDC is to combine the strength of augmented differential coefficient method and subexpression sharing. Efficient circuitlevel techniques: low power commutator implementation, dual-VDD scheme and novel level-converting flip-flop (LCFF), are also used to further reduce the power dissipation. The proposed polyphase channelizer consumes 352 mW power with throughput of 480 million samples per second (MSPS). A test chip has been fabricated in 0.18 μm CMOS technology and its functionality is verified. Chip measurement results show that the dual-VDD implementation achieves a total power saving of 2.7 X

    Design and Analysis of Metastable-Hardened, High-Performance, Low-Power Flip-Flops

    Get PDF
    With rapid technology scaling, flip-flops are becoming more susceptible to metastability due to tighter timing budgets and the more prominent effects of process, temperature, and voltage variation that can result in frequent setup and hold time violations. This thesis presents a detailed methodology and analysis on the design of metastable-hardened, high-performance, and low-power flip-flops. The design of metastable-hardened flip-flops is focused on optimizing the value of τ mainly due to its exponential relationship with the metastability window δ and the mean-time-between-failure (MTBF). Through small-signal modeling, τ is determined to be a function of the load capacitance and the transconductance in the cross-coupled inverter pair for a given flip-flop architecture. In most cases, the reduction of τ comes at the expense of increased delay and power. Hence, two new design metrics, the metastability-delay-product (MDP) and the metastability-power-delay-product (MPDP), are proposed to analyze the tradeoffs between delay, power and τ. Post-layout simulation results have shown that the proposed optimum MPDP design can reduce the metastability window δ by at least an order of magnitude depending on the value of the settling time and the flip-flop architecture. In this work, we have proposed two new flip-flop designs: the pre-discharge flip-flop (PDFF) and the sense-amplifier-transmission-gate (SATG) based flip-flop. Both flip-flop architectures facilitate the usage in both single and dual-supply systems as reduced clock-swing flip-flop and level-converting flip-flop. With a cross-coupled inverter in the master-stage that increases the overall transconductance and a small load transistor associated with the critical node, the architecture of both the PDFF and the SATG is very attractive for the design of metastable-hardened, high-performance, and low-power flip-flops. The amount of overhead in delay, power, and area is all less than 10% under the optimum MPDP design scheme when compared to the traditional optimum PDP design. In designing for metastable-hardened and soft-error tolerant flip-flops, the main methodology is to improve the metastability performance in the master-stage while applying the soft-error tolerant cell in the slave-stage for protection against soft-error. The proposed flip-flops, PDFF-SE and SATG-SE, both utilize a cross-coupled inverter on the critical path in the master-stage and generate the required differential signals to facilitate the usage of the Quatro soft-error tolerant cell in the slave-stage

    Low-power digital processor for wireless sensor networks

    Get PDF
    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2005.Includes bibliographical references (p. 69-72).In order to make sensor networks cost-effective and practical, the electronic components of a wireless sensor node need to run for months to years on the same battery. This thesis explores the design of a low-power digital processor for these sensor nodes, employing techniques such as hardwired algorithms, lowered supply voltages, clock gating and subsystem shutdown. Prototypes were built on both a FPGA and ASIC platform, in order to verify functionality and characterize power consumption. The resulting 0.18[micro]m silicon fabricated in National Semiconductor Corporation's process was operational for supply voltages ranging from 0.5V to 1.8V. At the lowest operating voltage of 0.5V and a frequency of 100KHz, the chip performs 8 full-accuracy FFT computations per second and draws 1.2nJ of total energy per cycle. Although this energy/cycle metric does not surpass existing low-energy processors demonstrated in literature or commercial products, several low-power techniques are suggested that could drastically improve the energy metrics of a future implementation.by Daniel Frederic Finchelstein.S.M

    Asynchronous Circuit Synthesis Using Multi-Threshold NULL Convention Logic

    Get PDF
    As the demand for an energy-efficient alternative to traditional synchronous circuit design grows, hardware designers must reconsider the traditional clock tree. By doing away with the constrains of a clock, asynchronous sequential circuit designs can achieve a much greater level of efficiency. The utilization of asynchronous logic synthesis flows has enabled researchers to better implement asynchronous circuit designs which have been optimized using the same industry standard tools that are already used in sequential synchronous designs. This thesis offers a new flow for such tools which implements the MTNCL asynchronous circuit architecture

    Design of variation-tolerant synchronizers for multiple clock and voltage domains

    Get PDF
    PhD ThesisParametric variability increasingly affects the performance of electronic circuits as the fabrication technology has reached the level of 32nm and beyond. These parameters may include transistor Process parameters (such as threshold voltage), supply Voltage and Temperature (PVT), all of which could have a significant impact on the speed and power consumption of the circuit, particularly if the variations exceed the design margins. As systems are designed with more asynchronous protocols, there is a need for highly robust synchronizers and arbiters. These components are often used as interfaces between communication links of different timing domains as well as sampling devices for asynchronous inputs coming from external components. These applications have created a need for new robust designs of synchronizers and arbiters that can tolerate process, voltage and temperature variations. The aim of this study was to investigate how synchronizers and arbiters should be designed to tolerate parametric variations. All investigations focused mainly on circuit-level and transistor level designs and were modeled and simulated in the UMC90nm CMOS technology process. Analog simulations were used to measure timing parameters and power consumption along with a “Monte Carlo” statistical analysis to account for process variations. Two main components of synchronizers and arbiters were primarily investigated: flip-flop and mutual-exclusion element (MUTEX). Both components can violate the input timing conditions, setup and hold window times, which could cause metastability inside their bistable elements and possibly end in failures. The mean-time between failures is an important reliability feature of any synchronizer delay through the synchronizer. The MUTEX study focused on the classical circuit, in addition to a number of tolerance, based on increasing internal gain by adding current sources, reducing the capacitive loading, boosting the transconductance of the latch, compensating the existing Miller capacitance, and adding asymmetry to maneuver the metastable point. The results showed that some circuits had little or almost no improvements, while five techniques showed significant improvements by reducing τ and maintaining high tolerance. Three design approaches are proposed to provide variation-tolerant synchronizers. wagging synchronizer proposed to First, the is significantly increase reliability over that of the conventional two flip-flop synchronizer. The robustness of the wagging technique can be enhanced by using robust τ latches or adding one more cycle of synchronization. The second approach is the Metastability Auto-Detection and Correction (MADAC) latch which relies on swiftly detecting a metastable event and correcting it by enforcing the previously stored logic value. This technique significantly reduces the resolution time down from uncertain synchronization technique is proposed to transfer signals between Multiple- Voltage Multiple-Clock Domains (MVD/MCD) that do not require conventional level-shifters between the domains or multiple power supplies within each domain. This interface circuit uses a synchronous set and feedback reset protocol which provides level-shifting and synchronization of all signals between the domains, from a wide range of voltage-supplies and clock frequencies. Overall, synchronizer circuits can tolerate variations to a greater extent by employing the wagging technique or using a MADAC latch, while MUTEX tolerance can suffice with small circuit modifications. Communication between MVD/MCD can be achieved by an asynchronous handshake without a need for adding level-shifters.The Saudi Arabian Embassy in London, Umm Al-Qura University, Saudi Arabi

    Latch-based RISC-V core with popcount instruction for CNN acceleration

    Get PDF
    Energy-efficiency is essential for vast majority of mobile and embedded battery-powered systems. Internet-of-Things paradigm combines requirements for high computational capabilities, extreme energy-efficiency and low-cost. Increasing manufacturing process variations pose formidable challenges for deep-submicron integrated circuit designs. The effects of variation are further exacerbated by lowered voltages in energy-efficient designs. Compared to traditional flip-flop-based design, latch-based design offers area, energy-efficiency and variation tolerance benefits at the cost of increased timing behavior complexity. A method for converting flip-flop-based processor core to latch-based core at register-transfer-level is presented in this work. Convolutional neural networks have enabled image recognition in the field of computer vision at unprecedented accuracy. Performance and memory requirements of canonical convolutional neural networks have been out of reach for low-cost IoT devices. In collaboration with Tampere University, a custom popcount instruction was added to the cores for accelerating IoT optimized vehicle classification convolutional neural network. This work compares simulation results from synthesized flip-flop-based and latch-based versions of a SCR1 RISC-V processor core and the effects of custom instruction for CNN acceleration. The latch core achieved roughly 50\% smaller energy per operation than the flip-flop core and 2.1x speedup was observed in the execution of the CNN when using the custom instruction

    Introduction to Logic Circuits & Logic Design with VHDL

    Get PDF
    The overall goal of this book is to fill a void that has appeared in the instruction of digital circuits over the past decade due to the rapid abstraction of system design. Up until the mid-1980s, digital circuits were designed using classical techniques. Classical techniques relied heavily on manual design practices for the synthesis, minimization, and interfacing of digital systems. Corresponding to this design style, academic textbooks were developed that taught classical digital design techniques. Around 1990, large-scale digital systems began being designed using hardware description languages (HDL) and automated synthesis tools. Broad-scale adoption of this modern design approach spread through the industry during this decade. Around 2000, hardware description languages and the modern digital design approach began to be taught in universities, mainly at the senior and graduate level. There were a variety of reasons that the modern digital design approach did not penetrate the lower levels of academia during this time. First, the design and simulation tools were difficult to use and overwhelmed freshman and sophomore students. Second, the ability to implement the designs in a laboratory setting was infeasible. The modern design tools at the time were targeted at custom integrated circuits, which are cost- and time-prohibitive to implement in a university setting. Between 2000 and 2005, rapid advances in programmable logic and design tools allowed the modern digital design approach to be implemented in a university setting, even in lower-level courses. This allowed students to learn the modern design approach based on HDLs and prototype their designs in real hardware, mainly field programmable gate arrays (FPGAs). This spurred an abundance of textbooks to be authored teaching hardware description languages and higher levels of design abstraction. This trend has continued until today. While abstraction is a critical tool for engineering design, the rapid movement toward teaching only the modern digital design techniques has left a void for freshman- and sophomore-level courses in digital circuitry. Legacy textbooks that teach the classical design approach are outdated and do not contain sufficient coverage of HDLs to prepare the students for follow-on classes. Newer textbooks that teach the modern digital design approach move immediately into high-level behavioral modeling with minimal or no coverage of the underlying hardware used to implement the systems. As a result, students are not being provided the resources to understand the fundamental hardware theory that lies beneath the modern abstraction such as interfacing, gate-level implementation, and technology optimization. Students moving too rapidly into high levels of abstraction have little understanding of what is going on when they click the “compile and synthesize” button of their design tool. This leads to graduates who can model a breadth of different systems in an HDL but have no depth into how the system is implemented in hardware. This becomes problematic when an issue arises in a real design and there is no foundational knowledge for the students to fall back on in order to debug the problem

    Low-Power and Low-Noise Clock Generator for High-Speed ADCs

    Get PDF
    The rapid development of high-performance communication technologies reflects a clear trend in demanding requirements imposed on analog-to-digital converters (ADCs). Thus, it appears that these requirements imply higher frequencies not only for the input signal but also higher sampling frequencies, which translates into a higher sensitivity of the circuit to thermal noise and consequent increase in phase-noise. This arises as to the main purpose of this document, which will seek, as its main objective, the development of an architecture that allows the generation of multiple clock signals at high input frequencies with low jitter and low power dissipation to make ADCs more efficient and faster. This dissertation proposes an architecture implemented by a Clock Buffer that converts a differential input signal into a single-ended output signal, a Digital Buffer that transforms a sine wave into a square wave, and finally a Multi Clock Phase Generator (MPCG), consisting of Shift Registers. Both architectures are implemented in 130 nm CMOS technology. The architecture is powered by a LVDS signal with an amplitude of 200 mV and a frequency of 1 GHz, in order to output 8 square wave clock signals with an amplitude of 1.2 V and with a frequency of 125 MHz. The signals obtained at the output later will feed an architecture of 8 Time-Interleaved ADCs. The total area of the implemented circuit is about 8054.3 μm2, for a dissipated power of 5.3 mW and a jitter value of 1.13 ps. This new architecture will be aimed at all types of entities that work with devices that are made up of high-speed performance ADCs, to improve the operation of these same devices, making the processing from a continuous signal to a discrete signal as efficiently as possible.O rápido desenvolvimento das tecnologias de comunicação de alto desempenho, reflete uma tendência clara na exigência dos requisitos impostos aos conversores analógico-digital (ADCs). Deste modo, verifica-se que estes requisitos implicam elevadas frequências não só sinal de entrada, como também frequências elevadas de amostragem o que se traduz numa maior sensibilidade do circuito ao ruído térmico e consequente aumento ruído de fase. Esta problemática, surge como propósito principal deste documento, no qual se procurará, como objetivo principal, o desenvolvimento de uma arquitetura que permita gerar múltiplos sinais de relógio a altas frequências de entrada e períodos de amostragem, com um baixo jitter e baixa energia consumida de forma a tornar mais eficiente e rápido o funcionamento de ADCs. Ruido térmico. Esta dissertação propõe uma arquitetura composta por um amplificador de sinal de relógio que converte o duplo sinal de entrada num único sinal de saída, um amplificador digital que transforma uma onda sinusoidal numa onda quadrada e por fim um gerador de fase múltipla de sinais de relógio (MPCG), constituído por registos de deslocamento. Ambas as arquiteturas são implementadas em tecnologia CMOS de 130 nm. A arquitetura é alimentada com um sinal LVDS de 200 mV de amplitude e com uma frequência de 1 GHz, de forma a obter à saída 8 sinais de relógio de onda quadrada com uma amplitude de 1,2 V e com 125 MHz de frequência. Os sinais obtidos à saída posteriormente alimentarão uma arquitetura de 8 canais com multiplexagem temporal. A área total do circuito implementado é cerca de 8054,3 μm2, para uma potência dissipada de 5,3 mW e para um valor de jitter de 1,13 ps. Esta nova arquitetura será direcionada para todo o tipo de entidades que trabalham com dispositivos que são constituídos por ADCs de alta velocidade de desempenho, de forma a poder melhorar o funcionamento desses mesmos dispositivos, tornando o processamento de sinal continuo para sinal discreto o mais eficiente possível

    CAD Tools for Synthesis of Sleep Convention Logic

    Get PDF
    This dissertation proposes an automated flow for the Sleep Convention Logic (SCL) asynchronous design style. The proposed flow synthesizes synchronous RTL into an SCL netlist. The flow utilizes commercial design tools, while supplementing missing functionality using custom tools. A method for determining the performance bottleneck in an SCL design is proposed. A constraint-driven method to increase the performance of linear SCL pipelines is proposed. Several enhancements to SCL are proposed, including techniques to reduce the number of registers and total sleep capacitance in an SCL design

    Introduction to Logic Circuits & Logic Design with Verilog

    Get PDF
    The overall goal of this book is to fill a void that has appeared in the instruction of digital circuits over the past decade due to the rapid abstraction of system design. Up until the mid-1980s, digital circuits were designed using classical techniques. Classical techniques relied heavily on manual design practices for the synthesis, minimization, and interfacing of digital systems. Corresponding to this design style, academic textbooks were developed that taught classical digital design techniques. Around 1990, large-scale digital systems began being designed using hardware description languages (HDL) and automated synthesis tools. Broad-scale adoption of this modern design approach spread through the industry during this decade. Around 2000, hardware description languages and the modern digital design approach began to be taught in universities, mainly at the senior and graduate level. There were a variety of reasons that the modern digital design approach did not penetrate the lower levels of academia during this time. First, the design and simulation tools were difficult to use and overwhelmed freshman and sophomore students. Second, the ability to implement the designs in a laboratory setting was infeasible. The modern design tools at the time were targeted at custom integrated circuits, which are cost- and time-prohibitive to implement in a university setting. Between 2000 and 2005, rapid advances in programmable logic and design tools allowed the modern digital design approach to be implemented in a university setting, even in lower-level courses. This allowed students to learn the modern design approach based on HDLs and prototype their designs in real hardware, mainly fieldprogrammable gate arrays (FPGAs). This spurred an abundance of textbooks to be authored, teaching hardware description languages and higher levels of design abstraction. This trend has continued until today. While abstraction is a critical tool for engineering design, the rapid movement toward teaching only the modern digital design techniques has left a void for freshman- and sophomore-level courses in digital circuitry. Legacy textbooks that teach the classical design approach are outdated and do not contain sufficient coverage of HDLs to prepare the students for follow-on classes. Newer textbooks that teach the modern digital design approach move immediately into high-level behavioral modeling with minimal or no coverage of the underlying hardware used to implement the systems. As a result, students are not being provided the resources to understand the fundamental hardware theory that lies beneath the modern abstraction such as interfacing, gate-level implementation, and technology optimization. Students moving too rapidly into high levels of abstraction have little understanding of what is going on when they click the “compile and synthesize” button of their design tool. This leads to graduates who can model a breadth of different systems in an HDL but have no depth into how the system is implemented in hardware. This becomes problematic when an issue arises in a real design and there is no foundational knowledge for the students to fall back on in order to debug the problem
    corecore