12,352 research outputs found

    ポータビリティを意識したCMOSミックスドシグナルVLSI回路設計手法に関する研究

    Get PDF
    本研究は、半導体上に集積されたアナログ・ディジタル・メモリ回路から構成されるミクストシグナルシステムを別の製造プロセスへ移行することをポーティングとして定義し、効率的なポーティングを行うための設計方式と自動回路合成アルゴリズムを提案し、いくつかの典型的な回路に対する設計事例を示し、提案手法の妥当性を立証している。北九州市立大

    Within-Die Delay Variation Measurement And Analysis For Emerging Technologies Using An Embedded Test Structure

    Get PDF
    Both random and systematic within-die process variations (PV) are growing more severe with shrinking geometries and increasing die size. Escalation in the variations in delay and power with reductions in feature size places higher demands on the accuracy of variation models. Their availability can be used to improve yield, and the corresponding profitability and product quality of the fabricated integrated circuits (ICs). Sources of within-die variations include optical source limitations, and layout-based systematic effects (pitch, line-width variability, and microscopic etch loading). Unfortunately, accurate models of within-die PVs are becoming more difficult to derive because of their increasingly sensitivity to design-context. Embedded test structures (ETS) continue to play an important role in the development of models of PVs and as a mechanism to improve correlations between hardware and models. Variations in path delays are increasing with scaling, and are increasingly affected by neighborhood\u27 interactions. In order to fully characterize within-die variations, delays must be measured in the context of actual core-logic macros. Doing so requires the use of an embedded test structure, as opposed to traditional scribe line test structures such as ring oscillators (RO). Accurate measurements of within-die variations can be used, e.g., to better tune models to actual hardware (model-to-hardware correlations). In this research project, I propose an embedded test structure called REBEL (Regional dELay BEhavior) that is designed to measure path delays in a minimally invasive fashion; and its architecture measures the path delays more accurately. Design for manufacture-ability (DFM) analysis is done on the on 90 nm ASIC chips and 28nm Zynq 7000 series FPGA boards. I present ASIC results on within-die path delay variations in a floating-point unit (FPU) fabricated in IBM\u27s 90 nm technology, with 5 pipeline stages, used as a test vehicle in chip experiments carried out at nine different temperature/voltage (TV) corners. Also experimental data has been analyzed for path delay variations in short vs long paths. FPGA results on within-die variation and die-to-die variations on Advanced Encryption System (AES) using single pipelined stage are also presented. Other analysis that have been performed on the calibrated path delays are Flip Flop propagation delays for both rising and falling edge (tpHL and tpLH), uncertainty analysis, path distribution analysis, short versus long path variations and mid-length path within-die variation. I also analyze the impact on delay when the chips are subjected to industrial-level temperature and voltage variations. From the experimental results, it has been established that the proposed REBEL provides capabilities similar to an off-chip logic analyzer, i.e., it is able to capture the temporal behavior of the signal over time, including any static and dynamic hazards that may occur on the tested path. The ASIC results further show that path delays are correlated to the launch-capture (LC) interval used to time them. Therefore, calibration as proposed in this work must be carried out in order to obtain an accurate analysis of within-die variations. Results on ASIC chips show that short paths can vary up to 35% on average, while long paths vary up to 20% at nominal temperature and voltage. A similar trend occurs for within-die variations of mid-length paths where magnitudes reduced to 20% and 5%, respectively. The magnitude of delay variations in both these analyses increase as temperature and voltage are changed to increase performance. The high level of within-die delay variations are undesirable from a design perspective, but they represent a rich source of entropy for applications that make use of \u27secrets\u27 such as authentication, hardware metering and encryption. Physical unclonable functions (PUFs) are a class of primitives that leverage within-die-variations as a means of generating random bit strings for these types of applications, including hardware security and trust. Zynq FPGAs Die-to-Die and within-die variation study shows that on average there is 5% of within-Die variation and the range of die-to-Die variation can go upto 3ns. The die-to-Die variations can be explored in much further detail to study the variations spatial dependance. Additionally, I also carried out research in the area data mining to cater for big data by focusing the work on decision tree classification (DTC) to speed-up the classification step in hardware implementation. For this purpose, I devised a pipelined architecture for the implementation of axis parallel binary decision tree classification for meeting up with the requirements of execution time and minimal resource usage in terms of area. The motivation for this work is that analyzing larger data-sets have created abundant opportunities for algorithmic and architectural developments, and data-mining innovations, thus creating a great demand for faster execution of these algorithms, leading towards improving execution time and resource utilization. Decision trees (DT) have since been implemented in software programs. Though, the software implementation of DTC is highly accurate, the execution times and the resource utilization still require improvement to meet the computational demands in the ever growing industry. On the other hand, hardware implementation of DT has not been thoroughly investigated or reported in detail. Therefore, I propose a hardware acceleration of pipelined architecture that incorporates the parallel approach in acquiring the data by having parallel engines working on different partitions of data independently. Also, each engine is processing the data in a pipelined fashion to utilize the resources more efficiently and reduce the time for processing all the data records/tuples. Experimental results show that our proposed hardware acceleration of classification algorithms has increased throughput, by reducing the number of clock cycles required to process the data and generate the results, and it requires minimal resources hence it is area efficient. This architecture also enables algorithms to scale with increasingly large and complex data sets. We developed the DTC algorithm in detail and explored techniques for adapting it to a hardware implementation successfully. This system is 3.5 times faster than the existing hardware implementation of classification.\u2

    Reconfigurable negative bit line collapsed supply write-assist for 9T-ST static random access memory cell

    Get PDF
    This paper presents a reconfigurable negative bit line collapsed supply (RNBLCS) write driver circuit for the 9T Schmitt trigger-based static random-access memory (SRAM) cell (9T-ST), significantly improving write performance for real-time memory applications. In deep sub-micron technology, increasing device parameter deviations significantly reduce SRAM cells' write-ability. The proposed RNBLCS write-assist driver for 9T-ST SRAM cell has 0.84×, 0.48×, 0.27× optimized write access delay and 1.05×, 1.08×, 1.19× improvement in write static noise margin (WSNM), 1.05×, 1.13×, and 1.39× improvement in write margin (WM), 0.96×, 0.89× and 0.72× minimum write trip-point (WTP) from transient-negative bit line (Tran-NBL), capacitive charge sharing (CCS), and conventional write circuits respectively. The proposed RNBLCS is functionally verified using a synopsys custom compiler with a 16 nm BSIM4 model card for bulk complementary metal-oxide semiconductor (CMOS)

    Static noise margin analysis for CMOS logic cells in near-threshold

    Get PDF
    The advancement of semiconductor technology enabled the fabrication of devices with faster switching activity and chips with higher integration density. However, these advances are facing new impediments related to energy and power dissipation. Besides, the increasing demand for portable devices leads the circuit design paradigm to prioritize energy efficiency instead of performance. Altogether, this scenario motivates engineers towards reducing the supply voltage to the near and subthreshold regime to increase the lifespan of battery-powered devices. Even though operating in these regime offer interesting energy-frequency trade-offs, it brings challenges concerning noise tolerance. As the supply voltage reduces, the available noise margins decrease, and circuits become more prone to functional failures. In addition, near and subthreshold circuits are more susceptible to manufacturing variability, hence further aggravating noise issues. Other issues, such as wire minimization and gate fan-out, also contribute to the relevance of evaluating the noise margin of circuits early in the design Accordingly, this work investigates how to improve the static noise margin of digital synchronous circuits that will operate at the near/subthreshold regime. This investigation produces a set of three original contributions. The first is an automated tool to estimate the static noise margin of CMOS combinational cells. The second contribution is a realistic static noise margin estimation methodology that considers process-voltage-temperature variations. Results show that the proposed methodology allows to reduce up to 70% of the static noise margin pessimism. Finally, the third contribution is the noise-aware cell design methodology and the inclusion of a noise evaluation of complex circuits during the logic synthesis. The resulting library achieved higher static noise margin (up to 24%) and less spread among different cells (up to 62%).Os avanços na tecnologia de semicondutores possibilitou que se fabricasse dispositivos com atividade de chaveamento mais rápida e com maior capacidade de integração de transistores. Estes avanços, todavia, impuseram novos empecilhos relacionados com a dissipação de potência e energia. Além disso, a crescente demanda por dispositivos portáteis levaram à uma mudança no paradigma de projeto de circuitos para que se priorize energia ao invés de desempenho. Este cenário motivou à reduzir a tensão de alimentação com qual os dispositivos operam para um regime próximo ou abaixo da tensão de limiar, com o objetivo de aumentar sua duração de bateria. Apesar desta abordagem balancear características de performance e energia, ela traz novos desafios com relação a tolerância à ruído. Ao reduzirmos a tensão de alimentação, também reduz-se a margem de ruído disponível e, assim, os circuitos tornam-se mais suscetíveis à falhas funcionais. Somado à este efeito, circuitos com tensões de alimentação nestes regimes são mais sensíveis à variações do processo de fabricação, logo agravando problemas com ruído. Existem também outros aspectos, tais como a miniaturização das interconexões e a relação de fan-out de uma célula digital, que incentivam a avaliação de ruído nas fases iniciais do projeto de circuitos integrados Por estes motivos, este trabalho investiga como aprimorar a margem de ruído estática de circuitos síncronos digitais que irão operar em tensões no regime de tensão próximo ou abaixo do limiar. Esta investigação produz um conjunto de três contribuições originais. A primeira é uma ferramenta capaz de avaliar automaticamente a margem de ruído estática de células CMOS combinacionais. A segunda contribuição é uma metodologia realista para estimar a margem de ruído estática considerando variações de processo, tensão e temperatura. Os resultados obtidos mostram que a metodologia proposta permitiu reduzir até 70% do pessimismo das margens de ruído estática, Por último, a terceira contribuição é um fluxo de projeto de células combinacionais digitais considerando ruído, e uma abordagem para avaliar a margem de ruído estática de circuitos complexos durante a etapa de síntese lógica. A biblioteca de células resultante deste fluxo obteve maior margem de ruído (até 24%) e menor variação entre diferentes células (até 62%)

    Statistical circuit simulations - from ‘atomistic’ compact models to statistical standard cell characterisation

    Get PDF
    This thesis describes the development and application of statistical circuit simulation methodologies to analyse digital circuits subject to intrinsic parameter fluctuations. The specific nature of intrinsic parameter fluctuations are discussed, and we explain the crucial importance to the semiconductor industry of developing design tools which accurately account for their effects. Current work in the area is reviewed, and three important factors are made clear: any statistical circuit simulation methodology must be based on physically correct, predictive models of device variability; the statistical compact models describing device operation must be characterised for accurate transient analysis of circuits; analysis must be carried out on realistic circuit components. Improving on previous efforts in the field, we posit a statistical circuit simulation methodology which accounts for all three of these factors. The established 3-D Glasgow atomistic simulator is employed to predict electrical characteristics for devices aimed at digital circuit applications, with gate lengths from 35 nm to 13 nm. Using these electrical characteristics, extraction of BSIM4 compact models is carried out and their accuracy in performing transient analysis using SPICE is validated against well characterised mixed-mode TCAD simulation results for 35 nm devices. Static d.c. simulations are performed to test the methodology, and a useful analytic model to predict hard logic fault limitations on CMOS supply voltage scaling is derived as part of this work. Using our toolset, the effect of statistical variability introduced by random discrete dopants on the dynamic behaviour of inverters is studied in detail. As devices scaled, dynamic noise margin variation of an inverter is increased and higher output load or input slew rate improves the noise margins and its variation. Intrinsic delay variation based on CV/I delay metric is also compared using ION and IEFF definitions where the best estimate is obtained when considering ION and input transition time variations. Critical delay distribution of a path is also investigated where it is shown non-Gaussian. Finally, the impact of the cell input slew rate definition on the accuracy of the inverter cell timing characterisation in NLDM format is investigated

    Identification and Rejuvenation of NBTI-Critical Logic Paths in Nanoscale Circuits

    Get PDF
    The Negative Bias Temperature Instability (NBTI) phenomenon is agreed to be one of the main reliability concerns in nanoscale circuits. It increases the threshold voltage of pMOS transistors, thus, slows down signal propagation along logic paths between flip-flops. NBTI may cause intermittent faults and, ultimately, the circuit’s permanent functional failures. In this paper, we propose an innovative NBTI mitigation approach by rejuvenating the nanoscale logic along NBTI-critical paths. The method is based on hierarchical identification of NBTI-critical paths and the generation of rejuvenation stimuli using an Evolutionary Algorithm. A new, fast, yet accurate model for computation of NBTI-induced delays at gate-level is developed. This model is based on intensive SPICE simulations of individual gates. The generated rejuvenation stimuli are used to drive those pMOS transistors to the recovery phase, which are the most critical for the NBTI-induced path delay. It is intended to apply the rejuvenation procedure to the circuit, as an execution overhead, periodically. Experimental results performed on a set of designs demonstrate reduction of NBTI-induced delays by up to two times with an execution overhead of 0.1 % or less. The proposed approach is aimed at extending the reliable lifetime of nanoelectronics

    Embracing Low-Power Systems with Improvement in Security and Energy-Efficiency

    Get PDF
    As the economies around the world are aligning more towards usage of computing systems, the global energy demand for computing is increasing rapidly. Additionally, the boom in AI based applications and services has already invited the pervasion of specialized computing hardware architectures for AI (accelerators). A big chunk of research in the industry and academia is being focused on providing energy efficiency to all kinds of power hungry computing architectures. This dissertation adds to these efforts. Aggressive voltage underscaling of chips is one the effective low power paradigms of providing energy efficiency. This dissertation identifies and deals with the reliability and performance problems associated with this paradigm and innovates novel energy efficient approaches. Specifically, the properties of a low power security primitive have been improved and, higher performance has been unlocked in an AI accelerator (Google TPU) in an aggressively voltage underscaled environment. And, novel power saving opportunities have been unlocked by characterizing the usage pattern of a baseline TPU with rigorous mathematical analysis
    corecore