6,728 research outputs found

    YodaNN: An Architecture for Ultra-Low Power Binary-Weight CNN Acceleration

    Get PDF
    Convolutional neural networks (CNNs) have revolutionized the world of computer vision over the last few years, pushing image classification beyond human accuracy. The computational effort of today's CNNs requires power-hungry parallel processors or GP-GPUs. Recent developments in CNN accelerators for system-on-chip integration have reduced energy consumption significantly. Unfortunately, even these highly optimized devices are above the power envelope imposed by mobile and deeply embedded applications and face hard limitations caused by CNN weight I/O and storage. This prevents the adoption of CNNs in future ultra-low power Internet of Things end-nodes for near-sensor analytics. Recent algorithmic and theoretical advancements enable competitive classification accuracy even when limiting CNNs to binary (+1/-1) weights during training. These new findings bring major optimization opportunities in the arithmetic core by removing the need for expensive multiplications, as well as reducing I/O bandwidth and storage. In this work, we present an accelerator optimized for binary-weight CNNs that achieves 1510 GOp/s at 1.2 V on a core area of only 1.33 MGE (Million Gate Equivalent) or 0.19 mm2^2 and with a power dissipation of 895 {\mu}W in UMC 65 nm technology at 0.6 V. Our accelerator significantly outperforms the state-of-the-art in terms of energy and area efficiency achieving 61.2 TOp/s/[email protected] V and 1135 GOp/s/[email protected] V, respectively

    CMOS design of chaotic oscillators using state variables: a monolithic Chua's circuit

    Get PDF
    This paper presents design considerations for monolithic implementation of piecewise-linear (PWL) dynamic systems in CMOS technology. Starting from a review of available CMOS circuit primitives and their respective merits and drawbacks, the paper proposes a synthesis approach for PWL dynamic systems, based on state-variable methods, and identifies the associated analog operators. The GmC approach, combining quasi-linear VCCS's, PWL VCCS's, and capacitors is then explored regarding the implementation of these operators. CMOS basic building blocks for the realization of the quasi-linear VCCS's and PWL VCCS's are presented and applied to design a Chua's circuit IC. The influence of GmC parasitics on the performance of dynamic PWL systems is illustrated through this example. Measured chaotic attractors from a Chua's circuit prototype are given. The prototype has been fabricated in a 2.4- mu m double-poly n-well CMOS technology, and occupies 0.35 mm/sup 2/, with a power consumption of 1.6 mW for a +or-2.5-V symmetric supply. Measurements show bifurcation toward a double-scroll Chua's attractor by changing a bias current

    Systematic Comparison of HF CMOS Transconductors

    Get PDF
    Transconductors are commonly used as active elements in high-frequency (HF) filters, amplifiers, mixers, and oscillators. This paper reviews transconductor design by focusing on the V-I kernel that determines the key transconductor properties. Based on bandwidth considerations, simple V-I kernels with few or no internal nodes are preferred. In a systematic way, virtually all simple kernels published in literature are generated. This is done in two steps: 1) basic 3-terminal transconductors are covered and 2) then five different techniques to combine two of them in a composite V-I kernel. In order to compare transconductors in a fair way, a normalized signal-to-noise ratio (NSNR) is defined. The basic V-I kernels and the five classes of composite V-I kernels are then compared, leading to insight in the key mechanisms that affect NSNR. Symbolic equations are derived to estimate NSNR, while simulations with more advanced MOSFET models verify the results. The results show a strong tradeoff between NSNR and transconductance tuning range. Resistively generated MOSFETs render the best NSNR results and are robust for future technology developments

    Integrated chaos generators

    Get PDF
    This paper surveys the different design issues, from mathematical model to silicon, involved on the design of integrated circuits for the generation of chaotic behavior.Comisión Interministerial de Ciencia y Tecnología 1FD97-1611(TIC)European Commission ESPRIT 3110

    CMOS OTA-C high-frequency sinusoidal oscillators

    Get PDF
    Several topology families are given to implement practical CMOS sinusoidal oscillators by using operational transconductance amplifier-capacitor (OTA-C) techniques. Design techniques are proposed taking into account the CMOS OTA's dominant nonidealities. Building blocks are presented for amplitude control, both by automatic gain control (AGC) schemes and by limitation schemes. Experimental results from 3- and 2- mu m CMOS (MOSIS) prototypes that exhibit oscillation frequencies of up to 69 MHz are obtained. The amplitudes can be adjusted between 1 V peak to peak and 100 mV peak to peak. Total harmonic distortions from 2.8% down to 0.2% have been measured experimentally.Comisión Interministerial de Ciencia y Tecnología ME87-000

    A 13-bit, 2.2-MS/s, 55-mW multibit cascade ΣΔ modulator in CMOS 0.7-μm single-poly technology

    Get PDF
    This paper presents a CMOS 0.7-μm ΣΔ modulator IC that achieves 13-bit dynamic range at 2.2 MS/s with an oversampling ratio of 16. It uses fully differential switched-capacitor circuits with a clock frequency of 35.2 MHz, and has a power consumption of 55 mW. Such a low oversampling ratio has been achieved through the combined usage of fourth-order filtering and multibit quantization. To guarantee stable operation for any input signal and/or initial condition, the fourth-order shaping function has been realized using a cascade architecture with three stages; the first stage is a second-order modulator, while the others are first-order modulators - referred to as a 2-1-1mb architecture. The quantizer of the last stage is 3 bits, while the other quantizers are single bit. The modulator architecture and coefficients have been optimized for reduced sensitivity to the errors in the 3-bit quantization process. Specifically, the 3-bit digital-to-analog converter tolerates 2.8% FS nonlinearity without significant degradation of the modulator performance. This makes the use of digital calibration unnecessary, which is a key point for reduced power consumption. We show that, for a given oversampling ratio and in the presence of 0.5% mismatch, the proposed modulator obtains a larger signal-to-noise-plus-distortion ratio than previous multibit cascade architectures. On the other hand, as compared to a 2-1-1single-bit modulator previously designed for a mixed-signal asymmetrical digital subscriber line modem in the same technology, the modulator in this paper obtains one more bit resolution, enhances the operating frequency by a factor of two, and reduces the power consumption by a factor of four.Comisión Interministerial de Ciencia y Tecnología TIC97-0580European Commission ESPRIT 879

    Statistical Mechanics and Visual Signal Processing

    Full text link
    The nervous system solves a wide variety of problems in signal processing. In many cases the performance of the nervous system is so good that it apporaches fundamental physical limits, such as the limits imposed by diffraction and photon shot noise in vision. In this paper we show how to use the language of statistical field theory to address and solve problems in signal processing, that is problems in which one must estimate some aspect of the environment from the data in an array of sensors. In the field theory formulation the optimal estimator can be written as an expectation value in an ensemble where the input data act as external field. Problems at low signal-to-noise ratio can be solved in perturbation theory, while high signal-to-noise ratios are treated with a saddle-point approximation. These ideas are illustrated in detail by an example of visual motion estimation which is chosen to model a problem solved by the fly's brain. In this problem the optimal estimator has a rich structure, adapting to various parameters of the environment such as the mean-square contrast and the correlation time of contrast fluctuations. This structure is in qualitative accord with existing measurements on motion sensitive neurons in the fly's brain, and we argue that the adaptive properties of the optimal estimator may help resolve conlficts among different interpretations of these data. Finally we propose some crucial direct tests of the adaptive behavior.Comment: 34pp, LaTeX, PUPT-143

    Nonlinear Performance of BAW Filters Including BST Capacitors

    Get PDF
    This paper evaluates the nonlinear effects occurring in a bulk acoustic wave (BAW) filter which includes barium strontium titanate (BST) capacitors to cancel the electrostatic capacitance of the BAW resonators. To do that we consider the nonlinear effects on the BAW resonators by use of a nonlinear Mason model. This model accounts for the distributed nonlinearities inherent in the materials forming the resonator. The whole filter is then implemented by properly connecting the resonators in a balanced configuration. Additional BST capacitors are included in the filter topology. The nonlinear behavior of the BST capacitors is also accounted in the overall nonlinear assessment. The whole circuit is then used to evaluate its nonlinear behavior. It is found that the nonlinear contribution arising from the ferroelectric nature of the BST capacitors makes it impractical to fulfill the linearity requirements of commercial filters

    Discrete-Time Mixing Receiver Architecture for RF-Sampling Software-Defined Radio

    Get PDF
    A discrete-time (DT) mixing architecture for RF-sampling receivers is presented. This architecture makes RF sampling more suitable for software-defined radio (SDR) as it achieves wideband quadrature demodulation and wideband harmonic rejection. The paper consists of two parts. In the first part, different downconversion techniques are classified and compared, leading to the definition of a DT mixing concept. The suitability of CT-mixing and RF-sampling receivers to SDR is also discussed. In the second part, we elaborate the DT-mixing architecture, which can be realized by de-multiplexing. Simulation shows a wideband 90° phase shift between I and Q outputs without systematic channel bandwidth limitation. Oversampling and harmonic rejection relaxes RF pre-filtering and reduces noise and interference folding. A proof-of-concept DT-mixing downconverter has been built in 65 nm CMOS, for 0.2 to 0.9 GHz RF band employing 8-times oversampling. It can reject 2nd to 6th harmonics by 40 dB typically and without systematic channel bandwidth limitation. Without an LNA, it achieves a gain of -0.5 to 2.5 dB, a DSB noise figure of 18 to 20 dB, an IIP3 = +10 dBm, and an IIP2 = +53 dBm, while consuming less than 19 mW including multiphase clock generation
    corecore