49 research outputs found

    Phase Noise in CMOS Phase-Locked Loop Circuits

    Get PDF
    Phase-locked loops (PLLs) have been widely used in mixed-signal integrated circuits. With the continuously increasing demand of market for high speed, low noise devices, PLLs are playing a more important role in communications. In this dissertation, phase noise and jitter performances are investigated in different types of PLL designs. Hot carrier and negative bias temperature instability effects are analyzed from simulations and experiments. Phase noise of a CMOS phase-locked loop as a frequency synthesizer circuit is modeled from the superposition of noises from its building blocks: voltage-controlled oscillator, frequency divider, phase-frequency detector, loop filter and auxiliary input reference clock. A linear time invariant model with additive noise sources in frequency domain is presented to analyze the phase noise. The modeled phase noise results are compared with the corresponding experimentally measured results on phase-locked loop chips fabricated in 0.5 m n-well CMOS process. With the scaling of CMOS technology and the increase of electrical field, MOS transistors have become very sensitive to hot carrier effect (HCE) and negative bias temperature instability (NBTI). These two reliability issues pose challenges to designers for designing of chips in deep submicron CMOS technologies. A new strategy of switchable CMOS phase-locked loop frequency synthesizer is proposed to increase its tuning range. The switchable PLL which integrates two phase-locked loops with different tuning frequencies are designed and fabricated in 0.5 µm CMOS process to analyze the effects under HCE and NBTI. A 3V 1.2 GHz programmable phase-locked loop frequency synthesizer is designed in 0.5 μm CMOS technology. The frequency synthesizer is implemented using LC voltage-controlled oscillator (VCO) and a low power dual-modulus prescaler. The LC VCO working range is from 900MHz to 1.4GHz. Current mode logic (CML) is used in designing high speed D flip-flop in the dual-modulus prescaler circuits for low power consumption. The power consumption of the PLL chip is under 30mW. Fully differential LC VCO is used to provide high oscillation frequency. A new design of LC VCO using carbon nanotube (CNT) wire inductor has been proposed. The PLL design using CNT-LC VCO shows significant improvement in phase noise due to high-Q LC circuit

    Within-Die Delay Variation Measurement And Analysis For Emerging Technologies Using An Embedded Test Structure

    Get PDF
    Both random and systematic within-die process variations (PV) are growing more severe with shrinking geometries and increasing die size. Escalation in the variations in delay and power with reductions in feature size places higher demands on the accuracy of variation models. Their availability can be used to improve yield, and the corresponding profitability and product quality of the fabricated integrated circuits (ICs). Sources of within-die variations include optical source limitations, and layout-based systematic effects (pitch, line-width variability, and microscopic etch loading). Unfortunately, accurate models of within-die PVs are becoming more difficult to derive because of their increasingly sensitivity to design-context. Embedded test structures (ETS) continue to play an important role in the development of models of PVs and as a mechanism to improve correlations between hardware and models. Variations in path delays are increasing with scaling, and are increasingly affected by neighborhood\u27 interactions. In order to fully characterize within-die variations, delays must be measured in the context of actual core-logic macros. Doing so requires the use of an embedded test structure, as opposed to traditional scribe line test structures such as ring oscillators (RO). Accurate measurements of within-die variations can be used, e.g., to better tune models to actual hardware (model-to-hardware correlations). In this research project, I propose an embedded test structure called REBEL (Regional dELay BEhavior) that is designed to measure path delays in a minimally invasive fashion; and its architecture measures the path delays more accurately. Design for manufacture-ability (DFM) analysis is done on the on 90 nm ASIC chips and 28nm Zynq 7000 series FPGA boards. I present ASIC results on within-die path delay variations in a floating-point unit (FPU) fabricated in IBM\u27s 90 nm technology, with 5 pipeline stages, used as a test vehicle in chip experiments carried out at nine different temperature/voltage (TV) corners. Also experimental data has been analyzed for path delay variations in short vs long paths. FPGA results on within-die variation and die-to-die variations on Advanced Encryption System (AES) using single pipelined stage are also presented. Other analysis that have been performed on the calibrated path delays are Flip Flop propagation delays for both rising and falling edge (tpHL and tpLH), uncertainty analysis, path distribution analysis, short versus long path variations and mid-length path within-die variation. I also analyze the impact on delay when the chips are subjected to industrial-level temperature and voltage variations. From the experimental results, it has been established that the proposed REBEL provides capabilities similar to an off-chip logic analyzer, i.e., it is able to capture the temporal behavior of the signal over time, including any static and dynamic hazards that may occur on the tested path. The ASIC results further show that path delays are correlated to the launch-capture (LC) interval used to time them. Therefore, calibration as proposed in this work must be carried out in order to obtain an accurate analysis of within-die variations. Results on ASIC chips show that short paths can vary up to 35% on average, while long paths vary up to 20% at nominal temperature and voltage. A similar trend occurs for within-die variations of mid-length paths where magnitudes reduced to 20% and 5%, respectively. The magnitude of delay variations in both these analyses increase as temperature and voltage are changed to increase performance. The high level of within-die delay variations are undesirable from a design perspective, but they represent a rich source of entropy for applications that make use of \u27secrets\u27 such as authentication, hardware metering and encryption. Physical unclonable functions (PUFs) are a class of primitives that leverage within-die-variations as a means of generating random bit strings for these types of applications, including hardware security and trust. Zynq FPGAs Die-to-Die and within-die variation study shows that on average there is 5% of within-Die variation and the range of die-to-Die variation can go upto 3ns. The die-to-Die variations can be explored in much further detail to study the variations spatial dependance. Additionally, I also carried out research in the area data mining to cater for big data by focusing the work on decision tree classification (DTC) to speed-up the classification step in hardware implementation. For this purpose, I devised a pipelined architecture for the implementation of axis parallel binary decision tree classification for meeting up with the requirements of execution time and minimal resource usage in terms of area. The motivation for this work is that analyzing larger data-sets have created abundant opportunities for algorithmic and architectural developments, and data-mining innovations, thus creating a great demand for faster execution of these algorithms, leading towards improving execution time and resource utilization. Decision trees (DT) have since been implemented in software programs. Though, the software implementation of DTC is highly accurate, the execution times and the resource utilization still require improvement to meet the computational demands in the ever growing industry. On the other hand, hardware implementation of DT has not been thoroughly investigated or reported in detail. Therefore, I propose a hardware acceleration of pipelined architecture that incorporates the parallel approach in acquiring the data by having parallel engines working on different partitions of data independently. Also, each engine is processing the data in a pipelined fashion to utilize the resources more efficiently and reduce the time for processing all the data records/tuples. Experimental results show that our proposed hardware acceleration of classification algorithms has increased throughput, by reducing the number of clock cycles required to process the data and generate the results, and it requires minimal resources hence it is area efficient. This architecture also enables algorithms to scale with increasingly large and complex data sets. We developed the DTC algorithm in detail and explored techniques for adapting it to a hardware implementation successfully. This system is 3.5 times faster than the existing hardware implementation of classification.\u2

    Manifold Learning Side-Channel Attacks against Masked Cryptographic Implementations

    Get PDF
    Masking, as a common countermeasure, has been widely utilized to protect cryptographic implementations against power side-channel attacks. It significantly enhances the difficulty of attacks, as the sensitive intermediate values are randomly partitioned into multiple parts and executed on different times. The adversary must amalgamate information across diverse time samples before launching an attack, which is generally accomplished by feature extraction (e.g., Points-Of-Interest (POIs) combination and dimensionality reduction). However, traditional POIs combination methods, machine learning and deep learning techniques are often too time consuming, and necessitate a significant amount of computational resources. In this paper, we undertake the first study on manifold learning and their applications against masked cryptographic implementations. The leaked information, which manifests as the manifold of high-dimensional power traces, is mapped into a low-dimensional space and achieves feature extraction through manifold learning techniques like ISOMAP, Locally Linear Embedding (LLE), and Laplacian Eigenmaps (LE). Moreover, to reduce the complexity, we further construct explicit polynomial mappings for manifold learning to facilitate the dimensionality reduction. Compared to the classical machine learning and deep learning techniques, our schemes built from manifold learning techniques are faster, unsupervised, and only require very simple parameter tuning. Their effectiveness has been fully validated by our detailed experiments

    Single event upset hardened embedded domain specific reconfigurable architecture

    Get PDF

    A Survey of Fault-Tolerance Techniques for Embedded Systems from the Perspective of Power, Energy, and Thermal Issues

    Get PDF
    The relentless technology scaling has provided a significant increase in processor performance, but on the other hand, it has led to adverse impacts on system reliability. In particular, technology scaling increases the processor susceptibility to radiation-induced transient faults. Moreover, technology scaling with the discontinuation of Dennard scaling increases the power densities, thereby temperatures, on the chip. High temperature, in turn, accelerates transistor aging mechanisms, which may ultimately lead to permanent faults on the chip. To assure a reliable system operation, despite these potential reliability concerns, fault-tolerance techniques have emerged. Specifically, fault-tolerance techniques employ some kind of redundancies to satisfy specific reliability requirements. However, the integration of fault-tolerance techniques into real-time embedded systems complicates preserving timing constraints. As a remedy, many task mapping/scheduling policies have been proposed to consider the integration of fault-tolerance techniques and enforce both timing and reliability guarantees for real-time embedded systems. More advanced techniques aim additionally at minimizing power and energy while at the same time satisfying timing and reliability constraints. Recently, some scheduling techniques have started to tackle a new challenge, which is the temperature increase induced by employing fault-tolerance techniques. These emerging techniques aim at satisfying temperature constraints besides timing and reliability constraints. This paper provides an in-depth survey of the emerging research efforts that exploit fault-tolerance techniques while considering timing, power/energy, and temperature from the real-time embedded systems’ design perspective. In particular, the task mapping/scheduling policies for fault-tolerance real-time embedded systems are reviewed and classified according to their considered goals and constraints. Moreover, the employed fault-tolerance techniques, application models, and hardware models are considered as additional dimensions of the presented classification. Lastly, this survey gives deep insights into the main achievements and shortcomings of the existing approaches and highlights the most promising ones

    Configurable pseudo noise radar imaging system enabling synchronous MIMO channel extension

    Get PDF
    In this article, we propose an evolved system design approach to ultra-wideband (UWB) radar based on pseudo-random noise (PRN) sequences, the key features of which are its user-adaptability to meet the demands provided by desired microwave imaging applications and its multichannel scalability. In light of providing a fully synchronized multichannel radar imaging system for short-range imaging as mine detection, non-destructive testing (NDT) or medical imaging, the advanced system architecture is presented with a special focus put on the implemented synchronization mechanism and clocking scheme. The core of the targeted adaptivity is provided by means of hardware, such as variable clock generators and dividers as well as programmable PRN generators. In addition to adaptive hardware, the customization of signal processing is feasible within an extensive open-source framework using the Red Pitaya ® data acquisition platform. A system benchmark in terms of signal-to-noise ratio (SNR), jitter, and synchronization stability is conducted to determine the achievable performance of the prototype system put into practice. Furthermore, an outlook on the planned future development and performance improvement is provided
    corecore