37 research outputs found

    Metal-Oxide-Semiconductor-Only Process Corner Monitoring Circuit

    A process corner monitoring circuit (PCMC) is presented in this work. The circuit generates a signal whose logical value depends only on the process corner. The signal can be used in both digital and analog circuits for testing and for compensation of process variations (PV). The presented circuit uses only metal-oxide-semiconductor (MOS) transistors, which increases its detection accuracy and reduces its power consumption and area. Owing to its simplicity, the presented circuit can easily be modified to monitor parametric variations of only n-type or only p-type MOS (NMOS and PMOS, respectively) transistors, of resistors, or of their combinations. Post-layout simulation results confirm the correct functionality of the proposed circuit, i.e. its ability to monitor the process corner (equivalently, die-to-die variations) even in the presence of within-die variations.

    Study of Radiation Effects on 28nm UTBB FDSOI Technology

    With the evolution of modern Complementary Metal-Oxide-Semiconductor (CMOS) technology, transistor feature size has been scaled down to nanometers. The scaling has brought tremendous advantages to integrated circuits (ICs), such as higher speed, smaller circuit size, and lower operating voltage. However, it also creates some reliability concerns. In particular, small device dimensions and low operating voltages have caused nanoscale ICs to become highly sensitive to operational disturbances, such as signal coupling, supply and substrate noise, and single event effects (SEEs) caused by ionizing particles, like cosmic neutrons and alpha particles. SEEs in ICs can introduce transient pulses in circuit nodes or data upsets in storage cells. In well-designed ICs, SEEs appear to be most troublesome in a space environment or at high altitudes in the terrestrial environment. Techniques from the manufacturing process level up to the system design level have been developed to mitigate radiation effects. Among them, silicon-on-insulator (SOI) technologies have proven to be an effective approach to reducing single-event effects in ICs. So far, 28nm ultra-thin body and buried oxide (UTBB) Fully Depleted SOI (FDSOI) by STMicroelectronics is one of the most advanced SOI technologies in commercial applications. Its resilience to radiation effects has not been fully explored, and it is of considerable interest to the radiation effects community. Therefore, two test chips, namely ST1 and AR0, were designed and tested to study SEEs in logic circuits fabricated with this technology. The ST1 test chip was designed to evaluate single-event transient (SET) pulse widths in logic gates. Three kinds of on-chip pulse-width measurement detectors, namely the Vernier detector, the Pulse Capture detector and the Pulse Filter detector, were implemented in the ST1 chip. Moreover, a Circuit for Radiation Effects Self-Test (CREST) chain with combinational logic was designed to study both SET and single-event upset (SEU) effects. 
The ST1 chip was tested using a heavy-ion irradiation beam source at the Radiation Effects Facility (RADEF), Finland. The experimental results showed that the cross-section of the 28nm UTBB-FDSOI technology is two orders of magnitude lower than that of its bulk competitors. Laser tests were also performed on this chip to investigate the pulse distortion effects and the relationship between SET, SEU and the clock frequency. Total Ionizing Dose experiments were carried out at the University of Saskatchewan and the European Space Agency with Co-60 gammacell radiation sources. The test results showed that devices implemented in the 28nm UTBB-FDSOI technology can maintain their functionality up to 1 Mrad(Si). In the AR0 chip, we designed five ARM Cortex-M0 cores with different logic protection levels to investigate the performance of approximate logic protection methods. There are three custom-designed SRAM blocks in the test chip, which can also be used to measure the SEU rate. From the simulation results, we concluded that the approximate logic methodology can protect the digital logic efficiently. This research comprehensively evaluates the radiation effects in the 28nm UTBB-FDSOI technology, which provides the baseline for later radiation-hardened system designs in this technology.

    LOW POWER AND HIGH SIGNAL TO NOISE RATIO BIO-MEDICAL AFE DESIGN TECHNIQUES

    The research work described in this thesis focused on finding novel techniques to implement a low-power, low-noise Bio-Medical Analog Front End (BMEF) circuit to enable high-quality Electrocardiography (ECG) sensing. Usually, an ECG signal and several other bio-medical signals are sensed from the human body through a pair of electrodes. These very small amplitude signals (1 µV to 10 mV) are corrupted by random noise and have a significant dc offset. 50/60 Hz power supply coupling noise is one of the largest cross-talk signals compared to the thermally generated random noise. These signals are processed by an analog front end (AFE) built around an Instrumentation Amplifier (IA), which provides a better Common-Mode Rejection Ratio (CMRR). The main function of the AFE is to convert the weak electrical signal into a signal whose amplitude is large enough for an Analog-to-Digital Converter (ADC) to detect without errors. A Variable Gain Amplifier (VGA) is sometimes required to adjust the signal amplitude to maintain the dynamic range of the ADC. Also, the bio-medical transceiver needs an accurate and temperature-independent reference voltage and current for the ADC, commonly provided by a Bandgap Reference circuit (BGR). These circuits need to consume as little power as possible so that they can be powered from a battery. The work started with analysing the existing circuit techniques for the circuits mentioned above and identifying the key improvements required to reach the target specifications. Previously proposed IAs are based on voltage-mode signal processing. To improve the CMRR (119 dB), we proposed a current-mode IA with an embedded DC cancellation technique. State-of-the-art VGA circuits are built on the degeneration principle of the differential pair, which provides the variable gain, but none of these techniques addresses linearity improvement, which is very important in modern CMOS technologies. 
This work improves the total harmonic distortion (THD) by 21 dB in the worst case by exploiting feedback techniques around the differential pair. Also, this work proposes a low-power curvature-compensated bandgap with 2 ppm/°C temperature sensitivity while consuming 12.5 µW from a 1.2 V dc power supply. All circuits were built in 45nm TSMC CMOS technology and simulated for all the performance metrics with the Cadence Spectre simulator. The circuit layout was carried out to study the sensitivity to post-layout parasitic effects.

    Energy-Efficient Neural Network Architectures

    Emerging systems for artificial intelligence (AI) are expected to rely on deep neural networks (DNNs) to achieve high accuracy for a broad variety of applications, including computer vision, robotics, and speech recognition. Due to the rapid growth of network size and depth, however, DNNs typically result in high computational costs and introduce considerable power and performance overheads. Dedicated chip architectures that implement DNNs with high energy efficiency are essential for adding intelligence to interactive edge devices, enabling them to complete increasingly sophisticated tasks while extending battery life. They are also vital for improving performance in cloud servers that support demanding AI computations. This dissertation focuses on architectures and circuit technologies for designing energy-efficient neural network accelerators. First, a deep-learning processor is presented for achieving ultra-low power operation. Using a heterogeneous architecture that includes a low-power always-on front-end and a selectively-enabled high-performance back-end, the processor dynamically adjusts computational resources at runtime to support conditional execution in neural networks and meet performance targets with increased energy efficiency. Featuring a reconfigurable datapath and a memory architecture optimized for energy efficiency, the processor supports multilevel dynamic activation of neural network segments, performing object detection tasks with 5.3x lower energy consumption in comparison with a static execution baseline. Fabricated in 40nm CMOS, the processor test-chip dissipates 0.23mW at 5.3 fps. It demonstrates energy scalability up to 28.6 TOPS/W and can be configured to run a variety of workloads, including severely power-constrained ones such as always-on monitoring in mobile applications. 
To further improve the energy efficiency of the proposed heterogeneous architecture, a new charge-recovery logic family, called zero-short-circuit current (ZSCC) logic, is proposed to decrease the power consumption of the always-on front-end. By relying on dedicated circuit topologies and a four-phase clocking scheme, ZSCC operates with significantly reduced short-circuit currents, realizing order-of-magnitude power savings at relatively low clock frequencies (on the order of a few MHz). The efficiency and applicability of ZSCC are demonstrated through an ANSI S1.11 1/3 octave filter bank chip for binaural hearing aids with two microphones per ear. Fabricated in a 65nm CMOS process, this charge-recovery chip consumes 13.8µW with a 1.75MHz clock frequency, achieving 9.7x power reduction per input in comparison with a 40nm monophonic single-input chip that represents the published state of the art. The ability of ZSCC to further increase the energy efficiency of the heterogeneous neural network architecture is demonstrated through the design and evaluation of a ZSCC-based front-end. Simulation results show 17x power reduction compared with a conventional static CMOS implementation of the same architecture.
PhD, Electrical and Computer Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/147614/1/hsiwu_1.pd
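The power and rate figures quoted in this abstract can be turned into per-event energies with simple division. The short Python sketch below is only a reader's back-of-the-envelope check, not a calculation taken from the dissertation; the function and variable names are illustrative assumptions.

```python
# Back-of-the-envelope energy figures derived from the numbers quoted in the
# abstract. All names here are illustrative, not taken from the dissertation.

def energy_per_event(power_w: float, rate_hz: float) -> float:
    """Energy spent per event (frame, clock cycle, ...) in joules."""
    return power_w / rate_hz

# Deep-learning processor: 0.23 mW at 5.3 frames per second.
energy_per_frame_j = energy_per_event(0.23e-3, 5.3)

# ZSCC filter bank chip: 13.8 uW at a 1.75 MHz clock.
energy_per_cycle_j = energy_per_event(13.8e-6, 1.75e6)

print(f"energy per frame: {energy_per_frame_j * 1e6:.1f} uJ")
print(f"energy per clock cycle: {energy_per_cycle_j * 1e12:.2f} pJ")
```

Under these figures the processor spends roughly 43 µJ per frame and the filter bank chip roughly 8 pJ per clock cycle.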

    Sincronização em sistemas integrados a alta velocidade

    Doctorate in Electrical Engineering. Distributing the clock simultaneously everywhere (low skew) and periodically everywhere (low jitter) in high-performance Integrated Circuits (ICs) has become an increasingly difficult and time-consuming task, due to technology scaling. As transistor dimensions shrink and more functionality is packed into an IC, clock precision becomes increasingly affected by Process, Voltage and Temperature (PVT) variations. This thesis addresses the problem of clock uncertainty in high-performance ICs, in order to determine the limits of the synchronous design paradigm. In pursuit of this main goal, this thesis proposes four new uncertainty models, with different underlying principles and scopes. The first model targets uncertainty in static CMOS inverters. The main advantage of this model is that it depends only on parameters that can easily be obtained. Thus, it can provide information on upcoming constraints very early in the design stage. The second model addresses uncertainty in repeaters with RC interconnects, allowing the designer to optimise the repeater's size and spacing, for a given uncertainty budget, with low computational effort. The third model can be used to predict jitter accumulation in cascaded repeaters, like clock trees or delay lines. 
Because it takes into consideration correlations among variability sources, it can also be useful to promote floorplan-based power and clock distribution design in order to minimise jitter accumulation. A fourth model is proposed to analyse uncertainty in systems with multiple synchronous domains. It can be easily incorporated in an automatic tool to determine the best topology for a given application or to evaluate the system's tolerance to power-supply noise. Finally, using the proposed models, this thesis discusses clock precision trends. Results show that limits in clock precision are ultimately imposed by dynamic uncertainty, which is expected to continue increasing with technology scaling. Therefore, it advocates the search for solutions at other abstraction levels, and not only at the physical level, that may increase system performance with a smaller impact on the assumptions behind the synchronous design paradigm.

    Voltage stacking for near/sub-threshold operation

    Design of High-Speed SerDes Transceiver for Chip-to-Chip Communications in CMOS Process

    With the continuous increase of on-chip computation capacities and the exponential growth of data-intensive applications, high-speed data transmission through serial links has become the backbone of modern communication systems. To satisfy the massive data-exchange requirement, the data rate of such serial links has increased from several Gb/s to tens of Gb/s. Currently, commercial standards such as Ethernet 400GbE, InfiniBand high data rate (HDR), and common electrical interface (CEI)-56G have been developing towards 40+ Gb/s. As the core component within these links, the transceiver chipset plays a fundamental role in balancing the operation speed, power consumption, area occupation, and operation range. Meanwhile, the CMOS process has become the dominant technology in modern transceiver chip fabrication due to its large-scale digital integration capability and aggressive pricing advantage. This research aims to explore advanced techniques that are capable of exploiting the maximum operation speed of the CMOS process, and hence provides potential solutions for 40+ Gb/s CMOS transceiver designs. The major contributions are summarized as follows. A low-jitter ring-oscillator-based injection-locked clock multiplier (RILCM) with a hybrid frequency tracking loop that consists of a traditional phase-locked loop (PLL), a timing-adjusted loop, and a loop selection state machine is implemented in a 65-nm CMOS process. In the ring voltage-controlled oscillator, a full-swing pseudo-differential delay cell is proposed to lower the device noise to phase noise conversion. To obtain high operation speed and high detection accuracy, a compact timing-adjusted phase detector tightly combined with a well-matched charge pump is designed. 
Meanwhile, a lock-loss detection and lock recovery scheme is devised to endow the RILCM with a lock-acquisition ability similar to that of a conventional PLL, thus removing the need for an initial frequency set-up aid and preventing the potential lock-loss risk. The experimental results show that the figure-of-merit of the designed RILCM reaches -247.3 dB, which is better than previous RILCMs and even comparable to the large-area LC-ILCMs. The transmitter (TX) and receiver (RX) chips are separately designed and fabricated in a 65-nm CMOS process. The transmitter chip employs a quarter-rate multi-multiplexer (MUX)-based 4-tap feed-forward equalizer (FFE) to pre-distort the output. To increase the maximum operating speed, a bandwidth-enhanced 4:1 MUX with the capability of eliminating the charge-sharing effect is proposed. To produce the quarter-rate parallel data streams with appropriate delays, a compact latch array associated with an interleaved-retiming technique is designed. The receiver chip employs a two-stage continuous-time linear equalizer (CTLE) as the analog front-end and integrates an improved clock data recovery circuit to extract the sampling clocks and retime the incoming data. To automatically balance jitter tracking and jitter suppression, passive low-pass filters with adaptively adjusted bandwidth are introduced into the data-sampling path. To optimize the linearity of the phase interpolation, a time-averaging-based compensating phase interpolator is proposed. For equalization, a combined TX-FFE and RX-CTLE is applied to compensate for the channel loss, where a low-cost edge-data correlation-based sign zero-forcing adaptation algorithm is proposed to automatically adjust the TX-FFE's tap weights. Measurement results show that the fabricated transmitter/receiver chipset can deliver 40 Gb/s random data at a bit error rate of 16 dB loss at the half-baud frequency, while consuming a total power of 370 mW.
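A 4-tap feed-forward equalizer of the kind mentioned in this abstract is, behaviourally, a short FIR filter applied to the transmitted symbols. The Python sketch below illustrates that idea only; the tap weights are hypothetical placeholders chosen for illustration (one pre-cursor, one main, two post-cursor taps) and are not values from the thesis, whose taps are set by its adaptation algorithm.

```python
# Behavioural sketch of a 4-tap TX feed-forward equalizer (an FIR filter).
# The tap weights below are hypothetical, NOT taken from the thesis.

def ffe(symbols, taps):
    """Return the equalized output: y[n] = sum_k taps[k] * symbols[n - k]."""
    out = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, weight in enumerate(taps):
            if n - k >= 0:  # samples before the start are treated as zero
                acc += weight * symbols[n - k]
        out.append(acc)
    return out

# Hypothetical weights: pre-cursor, main cursor, two post-cursors.
# Their absolute values sum to 1 so the peak output swing stays normalized.
taps = [-0.10, 0.70, -0.15, -0.05]

bits = [1, 1, 1, 0, 1, 0, 0, 0]
symbols = [1.0 if b else -1.0 for b in bits]  # NRZ mapping: 1 -> +1, 0 -> -1
equalized = ffe(symbols, taps)
print([round(v, 2) for v in equalized])
```

Feeding a single unit impulse through `ffe` returns the tap weights themselves, which is a quick way to check such a model.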

    Design and realization of a 2.4 Gbps - 3.2 Gbps clock and data recovery circuit

    This thesis presents the design, verification, system integration and physical realization of a high-speed monolithic phase-locked loop (PLL) based clock and data recovery (CDR) circuit. The architecture of the CDR has been realized as a two-loop structure consisting of coarse and fine loops, which together process the incoming low-speed reference clock and the high-speed random data. At start-up, the coarse loop provides fast locking to the system frequency with the help of the reference clock. After the VCO clock reaches the proximity of the system frequency, the LOCK signal is generated and the coarse loop is turned off, while the fine loop is turned on. The fine loop tracks the phase of the generated clock with respect to the data and aligns the VCO clock such that its rising edge is in the middle of the data eye. The speed and symmetry of the sub-blocks in the fine loop are extremely important, since all asymmetric charging effects, skew and setup/hold problems in this loop translate into a static phase error at the clock output. The entire circuit architecture is built with a special low-voltage circuit design technique. All analogue as well as digital sub-blocks of the CDR architecture presented in this work operate on differential signalling, which makes the design significantly more complex while ensuring more robust performance. Other important features of this CDR include small area, a single power supply, low power consumption, the capability to operate at very high data rates, and the ability to handle data rates between 2.4 Gbps and 3.2 Gbps. The CDR architecture was realized using a conventional 0.13-micrometer digital CMOS technology (Foundry: UMC), which ensures a lower overall cost and better portability for the design. The CDR architecture presented in this work is capable of operating at sampling frequencies of up to 3.2 GHz while still achieving robust phase alignment. 
The entire circuit is designed with a single 1.2 V power supply. The overall power consumption is estimated as 18.6 mW at a 3.2 GHz sampling rate. The overall silicon area of the CDR is approximately 0.3 mm^2 including its internal loop filter capacitors. Other researchers have reported PLL-based clock and data recovery circuits with similar features in terms of operating data rate, architecture and jitter performance. To the best of our knowledge, this is the first high-speed CDR designed in 0.13-micrometer CMOS technology, and it is superior to the others in power consumption and area. The CDR architecture presented in this thesis is intended as a state-of-the-art clock recovery circuit for high-speed applications such as optical communications or high-bandwidth serial wireline communication. It can be used either as a stand-alone single-chip unit, or as an embedded intellectual property (IP) block that can be integrated with other modules on chip.
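The coarse-to-fine handoff described in this abstract can be pictured as a small control loop: the coarse loop pulls the VCO frequency towards the target until it falls inside a lock window, then LOCK is raised and the fine loop takes over. The Python sketch below is a purely behavioural illustration under assumed parameters; the lock window, loop gain, and starting frequency are hypothetical, and the fine loop's phase tracking is not modelled.

```python
# Behavioural sketch of the coarse/fine loop handoff in a two-loop CDR.
# lock_window, coarse_gain and the start frequency are hypothetical values;
# the thesis does not state them, and fine-loop phase tracking is omitted.

def acquire(f_vco, f_target, lock_window=0.01, coarse_gain=0.5, max_steps=100):
    """Run the coarse loop until the VCO is within lock_window of f_target.

    Returns (f_vco, lock, active_loop, steps).
    """
    lock = False
    steps = 0
    while not lock and steps < max_steps:
        error = f_target - f_vco
        if abs(error) <= lock_window * f_target:
            lock = True                    # LOCK asserted: coarse loop off
        else:
            f_vco += coarse_gain * error   # coarse frequency correction
            steps += 1
    active_loop = "fine" if lock else "coarse"
    return f_vco, lock, active_loop, steps

f_final, lock, active_loop, steps = acquire(f_vco=2.0e9, f_target=3.2e9)
print(f"lock={lock}, active loop={active_loop}, f_vco={f_final / 1e9:.3f} GHz")
```

Keeping the handoff as an explicit LOCK condition mirrors the abstract's description, in which only one of the two loops is active at a time.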

    Design of Frequency divider with voltage controlled oscillator for 60 GHz low power phase-locked loops in 65 nm RF CMOS

    Increasing memory capacity in mobile devices is driving the need for high-data-rate equipment. The 7 GHz band around 60 GHz provides the opportunity for multi-gigabit/sec wireless communication. It is a real opportunity for developing the next generation of High-Definition (HD) devices. In the last two decades there has been a great proliferation of Voltage Controlled Oscillator (VCO) and Frequency Divider (FD) topologies in RF ICs on silicon, but achieving high-performance VCOs and FDs operating at 60 GHz is a great challenge in today's technology. A key reason is the inaccuracy of CMOS active and passive device models at millimetre-wave (mm-W) frequencies. Three critical issues still constitute research objectives at 60 GHz in CMOS: generation of the Local Oscillator (LO) signal (1), division of the LO signal for the Phase-Locked Loop (PLL) closed loop (2) and distribution of the LO signal (3). In this Thesis, all three critical issues are addressed and tackled experimentally: a divide-by-2 FD for a PLL of a direct-conversion transceiver operating at mm-W frequencies in 65 nm RF CMOS technology has been designed. Critical issues such as Process, Voltage and Temperature (PVT) variations, Electromagnetic (EM) simulations and power consumption are addressed to select and design an FD with a wide frequency dividing range. A 60 GHz VCO is co-designed and integrated on the same die, in order to provide the FD with a mm-W input signal. VCOs and FDs play critical roles in the PLL. Both are core components of the PLL and need to be co-designed, as they have a big impact on the overall performance, especially because they work at the highest frequency in the PLL. The Injection Locking FD (ILFD) has been chosen as the optimum FD topology to be inserted in the control loop of a mm-W PLL for a direct-conversion transceiver, due to the high speed requirements and the power consumption constraint. 
The drawback of such a topology is the limited bandwidth, resulting in a narrow Locking Range (LR) for WirelessHD™ applications considering the impact of PVT variations. A simulation methodology is presented in order to analyze the ILFD locking state, proposing a first divide-by-2 ILFD design with continuous tuning. In order to design a wide-LR, low-power ILFD, the impacts of various alternatives of low/high-Q tank and injection scheme are analysed in depth, since the ILFD locking range depends on the Q of the tank and the injection efficiency. The proposed 3-bit dual-mixing 60 GHz divide-by-2 LC-ILFD is designed with a bank of binary-scaled switched varactors to compensate for PVT variations. It is integrated on the same die with a 4-bit 60 GHz LC-VCO. The overall circuit is designed to allow measurements of the single blocks both stand-alone and working together. The co-layout is carried out together with the EM modelling of passive devices, parasitics and transmission lines extracted from the layout. The inductor models provided by the foundry are qualified only up to 40 GHz, therefore EM analysis is a must for post-layout simulation. The PVT variations have been simulated before manufacturing and, based on the results achieved, a PVT-robust PLL scheme, considering frequency calibration, has been patented. The test chip was measured at CEA-Leti (Grenoble) during a one-week stay. The operation principle and the optimization trade-offs between power consumption and locking range of the final selected ILFD topology have been demonstrated. Even though the experimental results are not completely in agreement with the simulations, due to modelling error and inaccuracy, the proposed technique has been validated with post-measurement simulations. As demonstrated, the locking range of a low-power, discretely tuned divide-by-2 ILFD can be enhanced by increasing the injection efficiency, without the drawbacks of higher power consumption and chip area. 
A 4-bit wide-tuning-range LC-VCO for mm-W applications has been co-designed using the selected 65 nm CMOS process. Postprint (published version)