
    Layout regularity metric as a fast indicator of process variations

    Integrated circuit design faces increasing challenges as technology scales down, owing to the growing sensitivity to process variations. Systematic variations induced by different steps of the lithography process affect both the parametric and functional yields of a design, and these variations are themselves known to be affected by layout topology. Design for Manufacturability (DFM) aims to define techniques that mitigate variations and improve yield. Layout regularity is one of the trending techniques suggested by DFM to mitigate the effect of process variations. Several solutions exist for creating regular designs, such as restricted design rules and regular fabrics, and these regular solutions have raised the need for a regularity metric. Metrics in the literature are insufficient for different reasons: they are either qualitative or computationally intensive. Furthermore, no study relates either lithographic or electrical variations to layout regularity. In this work, layout regularity is studied in detail and a new geometry-based layout regularity metric is derived. The metric is verified against lithographic simulations and shows good correlation. Computing the metric takes only a few minutes on a 1mm x 1mm design, which is fast compared to the time taken by the simulations; this makes it a good candidate for pre-processing layout data and selecting areas of interest for lithographic simulation, improving throughput. The layout regularity metric is also compared against a model that measures electrical variations due to systematic lithographic variations, and the validity of using the regularity metric to flag circuits with high variability under that model is shown. The regularity metric matches the electrical variability model in up to 80% of cases, which means the metric can be used as a fast indicator of designs more susceptible to lithographic, and hence electrical, variations.
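    As an illustration of the pre-filtering flow this abstract describes, the Python sketch below scores layout tiles with a cheap geometric regularity proxy and forwards only the least regular tiles to full lithography simulation. The pitch-uniformity score used here is a hypothetical stand-in, not the metric derived in the work.

        # A minimal sketch: score tiles cheaply, simulate only the irregular ones.
        from statistics import pstdev, mean

        def regularity_score(edge_positions):
            """Return a 0..1 score; 1.0 means perfectly uniform pitch."""
            pitches = [b - a for a, b in zip(edge_positions, edge_positions[1:])]
            if len(pitches) < 2 or mean(pitches) == 0:
                return 1.0
            # Lower pitch spread relative to mean pitch -> more regular.
            return max(0.0, 1.0 - pstdev(pitches) / mean(pitches))

        def select_for_simulation(tiles, threshold=0.8):
            """Flag tiles whose regularity falls below the threshold."""
            return [name for name, edges in tiles.items()
                    if regularity_score(edges) < threshold]

        tiles = {
            "regular_fabric": [0, 100, 200, 300, 400],   # uniform 100nm pitch
            "adhoc_block":    [0, 70, 210, 260, 450],    # irregular pitches
        }
        print(select_for_simulation(tiles))  # -> ['adhoc_block']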

    Addressing Manufacturing Challenges in NoC-based ULSI Designs

    Hernández Luz, C. (2012). Addressing Manufacturing Challenges in NoC-based ULSI Designs [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/1669

    AI/ML Algorithms and Applications in VLSI Design and Technology

    An evident challenge ahead for the integrated circuit (IC) industry in the nanometer regime is the investigation and development of methods that can reduce the design complexity ensuing from growing process variations and curtail the turnaround time of chip manufacturing. Conventional methodologies employed for such tasks are largely manual, and thus time-consuming and resource-intensive. In contrast, the unique learning strategies of artificial intelligence (AI) provide numerous exciting automated approaches for handling complex and data-intensive tasks in very-large-scale integration (VLSI) design and testing. Employing AI and machine learning (ML) algorithms in VLSI design and manufacturing reduces the time and effort needed to understand and process data within and across different abstraction levels via automated learning algorithms. This, in turn, improves IC yield and reduces manufacturing turnaround time. This paper thoroughly reviews the automated AI/ML approaches introduced to date for VLSI design and manufacturing. Moreover, we discuss the scope of future AI/ML applications at various abstraction levels to revolutionize the field of VLSI design, aiming for high-speed, highly intelligent, and efficient implementations.

    Driving the Network-on-Chip Revolution to Remove the Interconnect Bottleneck in Nanoscale Multi-Processor Systems-on-Chip

    The sustained demand for faster, more powerful chips has been met by the availability of chip manufacturing processes allowing for the integration of increasing numbers of computation units onto a single die. The resulting outcome, especially in the embedded domain, has often been called SYSTEM-ON-CHIP (SoC) or MULTI-PROCESSOR SYSTEM-ON-CHIP (MPSoC). MPSoC design brings to the foreground a large number of challenges, one of the most prominent of which is the design of the chip interconnection. With the number of on-chip blocks presently ranging in the tens and quickly approaching the hundreds, the novel issue of how best to provide on-chip communication resources is clearly felt. NETWORKS-ON-CHIP (NoCs) are the most comprehensive and scalable answer to this design concern. By bringing large-scale networking concepts to the on-chip domain, they guarantee a structured answer to present and future communication requirements. The point-to-point connection and packet switching paradigms they involve are also of great help in minimizing wiring overhead and physical routing issues. However, as with any technology of recent inception, NoC design is still an evolving discipline. Several main areas of interest require deep investigation for NoCs to become viable solutions:
    • The design of the NoC architecture needs to strike the best tradeoff among performance, features, and the tight area and power constraints of the on-chip domain.
    • Simulation and verification infrastructure must be put in place to explore, validate, and optimize NoC performance.
    • NoCs offer a huge design space, thanks to their extreme customizability in terms of topology and architectural parameters. Design tools are needed to prune this space and pick the best solutions.
    • Even more so given their global, distributed nature, it is essential to evaluate the physical implementation of NoCs, their suitability for next-generation designs, and their area and power costs.
    This dissertation performs a design space exploration of network-on-chip architectures in order to point out the trade-offs associated with the design of each individual network building block and with the design of the network topology overall. The design space exploration is preceded by a comparative analysis of state-of-the-art interconnect fabrics with themselves and with early network-on-chip prototypes. The ultimate objective is to point out the key advantages that NoC realizations provide with respect to state-of-the-art communication infrastructures, and the challenges that lie ahead in order to make this new interconnect technology come true. Among the latter, technology-related challenges are emerging that call for dedicated design techniques at all levels of the design hierarchy, in particular leakage power dissipation and the containment of process variations and their effects. The achievement of the above objectives was enabled by a NoC simulation environment for cycle-accurate modelling and simulation, and by a back-end facility for the study of NoC physical implementation effects. Overall, all the results provided by this work have been validated on actual silicon layout.

    Techniques of Energy-Efficient VLSI Chip Design for High-Performance Computing

    How to deliver quality computing within a limited power budget is the key factor in moving very-large-scale integration (VLSI) chip design forward. This work introduces various low-power VLSI design techniques used for state-of-the-art computing. From the viewpoint of power supply, conventional in-chip voltage regulators based on analog blocks bring a large power and area overhead to computational chips. Motivated by this, a digital, switchable-pin method to dynamically regulate power at low circuit cost is proposed so that computation executes with a stable supply voltage. For the multiplier, one of the most widely used and time-consuming arithmetic units, operation in the logarithmic domain shows advantageous performance compared to the binary domain in terms of computation latency, power, and area. However, the introduced conversion error reduces the reliability of the subsequent computation (e.g., multiplication and division). In this work, a fast calibration method suppressing the conversion error and its VLSI implementation are proposed. The proposed logarithmic converter can be supplied by DC power to achieve fast conversion and by clocked power to reduce the power dissipated during conversion. Going beyond traditional computation methods and widely used static logic, a neuron-like cell is also studied in this work. Using multiple-input floating-gate (MIFG) metal-oxide-semiconductor field-effect transistor (MOSFET) based logic, a 32-bit, 16-operation arithmetic logic unit (ALU) with zipped decoding and a feedback loop is designed. The proposed ALU reduces switching power and, due to its coupling capacitors, has a strong drive-in capability compared to a static-logic-based ALU. Furthermore, recent neural computations pose serious challenges to digital VLSI implementation due to heavy matrix multiplications and non-linear functions. An analog VLSI design compatible with an external digital environment is proposed for long short-term memory (LSTM) networks. The entire analog network computes much faster and with higher energy efficiency than its digital counterpart.
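    The logarithmic-domain multiplier trades accuracy for latency, power, and area by replacing multiplication with the addition of approximate logarithms; the conversion error mentioned above comes from that approximation. The sketch below uses the classic Mitchell piecewise-linear conversion to show where the error originates; it is a textbook illustration, not the calibration method proposed in the work.

        import math

        def mitchell_log2(n):
            """Approximate log2(n) as k + x, where n = 2**k * (1 + x) and
            log2(1 + x) is approximated by x (piecewise-linear conversion)."""
            k = n.bit_length() - 1          # integer part of log2 for n >= 1
            x = n / (1 << k) - 1.0          # fractional mantissa in [0, 1)
            return k + x

        def mitchell_antilog2(v):
            """Approximate 2**v using the same linear approximation in reverse."""
            k = math.floor(v)
            x = v - k
            return (1 << k) * (1.0 + x) if k >= 0 else 2.0 ** k * (1.0 + x)

        def log_domain_multiply(a, b):
            """Multiply by adding approximate logarithms, then converting back."""
            return mitchell_antilog2(mitchell_log2(a) + mitchell_log2(b))

        for a, b in [(13, 9), (100, 7), (255, 255)]:
            approx, exact = log_domain_multiply(a, b), a * b
            print(f"{a}*{b}: exact={exact}, approx={approx:.1f}, "
                  f"error={100 * (exact - approx) / exact:.2f}%")

    The approximation always underestimates the true product, with a worst-case error of about 11%; a calibration stage such as the one proposed here would correct this residual error before subsequent computation.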

    Manufacturability Aware Design

    The aim of this work is to provide solutions that optimize the tradeoffs among design, manufacturability, and cost of ownership posed by technology scaling and sub-wavelength lithography. These solutions may take the form of robust circuit designs, cost-effective resolution technologies, accurate modeling considering process variations, and design rule assessment. We first establish a framework for assessing the impact of process variation on circuit performance, product value, and return on investment for alternative processes. Key features include comprehensive modeling with separate handling of die-to-die and within-die variation, accurate models of variation correlations, realistic and quantified projections to future process nodes, and performance sensitivity analysis with respect to improved control of individual device parameters and variation sources. Then we describe a novel minimum-cost-of-correction methodology which determines the level of correction of each layout feature such that the prescribed parametric yield is attained with minimum RET (Resolution Enhancement Technology) cost. This timing-driven OPC (Optical Proximity Correction) insertion flow uses a mathematical-programming-based slack budgeting algorithm to determine the OPC level for all polysilicon gate geometries. Designs adopting this methodology show up to 20% MEBES (Manufacturing Electron Beam Exposure System) data volume reduction and 39% OPC runtime improvement. When systematic correction residual errors become unavoidable, we analyze their impact on a state-of-the-art microprocessor's speedpath skew. A platform is created for diagnosing and improving OPC quality on gates with specific functionality, such as critical gates or matching transistors. Significant changes in full-chip timing analysis indicate the necessity of a post-OPC performance verification design flow. Finally, we quantify the performance, manufacturability, and mask cost impact of globally applying several common restrictive design rules. Novel approaches such as locally adapting FDRs (flexible design rules) based on image parameter ranges, and DRC Plus (preferred design rule enforcement with 2D pattern matching), are also described. Ph.D. Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/57676/2/jiey_1.pd
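    To illustrate the idea of timing-driven OPC insertion, the following toy sketch assigns each gate the cheapest OPC level whose residual delay error still fits within that gate's timing slack. The actual flow uses a mathematical-programming-based slack budgeting algorithm; the greedy assignment, gate names, and cost/error numbers here are invented for illustration.

        # OPC levels: stronger correction -> lower residual delay error, higher mask cost.
        OPC_LEVELS = [
            {"name": "none",        "mask_cost": 0, "delay_error_ps": 12.0},
            {"name": "rule_based",  "mask_cost": 2, "delay_error_ps": 5.0},
            {"name": "model_based", "mask_cost": 5, "delay_error_ps": 1.0},
        ]

        def assign_opc(gates):
            """Pick the cheapest OPC level per gate that keeps error within slack."""
            plan, total_cost = {}, 0
            for gate, slack_ps in gates.items():
                # Try cheapest level first; fall back to the strongest correction.
                for level in OPC_LEVELS:
                    if level["delay_error_ps"] <= slack_ps:
                        break
                plan[gate] = level["name"]
                total_cost += level["mask_cost"]
            return plan, total_cost

        gates = {"g1_critical": 0.5, "g2_near_critical": 6.0, "g3_relaxed": 20.0}
        plan, cost = assign_opc(gates)
        print(plan, "total mask cost:", cost)
        # -> {'g1_critical': 'model_based', 'g2_near_critical': 'rule_based',
        #     'g3_relaxed': 'none'} total mask cost: 7

    The design choice mirrors the methodology's goal: mask data volume and OPC runtime are spent only where timing actually demands it.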

    Power efficient, event driven data acquisition and processing using asynchronous techniques

    PhD Thesis. Data acquisition systems used in remote environmental monitoring equipment and biological sensor nodes rely on a limited energy supply sourced from either energy harvesters or batteries to perform their functions. Among the building blocks of these systems are power-hungry Analogue-to-Digital Converters (ADCs) and Digital Signal Processors, which acquire and process samples at predetermined rates regardless of the monitored signal's behavior. In this work we investigate power-efficient, event-driven data acquisition and processing techniques by implementing an asynchronous ADC and an event-driven, power-gated Finite Impulse Response (FIR) filter. We present an event-driven single-slope ADC capable of generating asynchronous digital samples based on the input signal's rate of change. It utilizes a rate-of-change detection circuit, known as the slope detector, to determine at what point the input signal is to be sampled. After a sample has been obtained, its absolute voltage value is time-encoded and passed on to a Time-to-Digital Converter (TDC) as part of a pulse stream. The digital samples generated by the TDC are produced at a rate that exhibits the same rate-of-change profile as the input signal. The ADC is realized in a 0.35µm CMOS process, occupies a silicon area of 340µm by 218µm, and consumes power in proportion to the input signal's frequency. The samples from the ADC are asynchronous in nature and exhibit varying time periods between adjacent samples. In order to process such asynchronous samples, we present a FIR filter able to operate on them successfully and produce the desired result. The filter can also turn itself off between samples with longer sample periods, saving power in the process.
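    The following Python sketch illustrates the slope-detector idea behind the event-driven ADC: a sample is emitted only when the input's rate of change since the last emitted sample exceeds a threshold, so the sample rate tracks signal activity. It is a behavioral illustration with invented parameters, not a model of the implemented circuit.

        import math

        def slope_driven_samples(signal, times, slope_threshold):
            """Emit a sample only when the rate of change since the last emitted
            sample exceeds the threshold, mimicking a slope-detector front end."""
            samples = [(times[0], signal[0])]           # always take the first point
            for t, v in zip(times[1:], signal[1:]):
                t0, v0 = samples[-1]
                if abs(v - v0) / (t - t0) >= slope_threshold:
                    samples.append((t, v))
            return samples

        # Fast-changing regions of a sine wave yield dense samples; flat regions
        # near the peaks yield sparse ones, so sample rate tracks signal activity.
        times = [i * 1e-4 for i in range(1000)]                   # 100 ms at a 10 kHz grid
        signal = [math.sin(2 * math.pi * 20 * t) for t in times]  # 20 Hz tone
        samples = slope_driven_samples(signal, times, slope_threshold=50.0)
        print(f"{len(samples)} event-driven samples from {len(times)} grid points")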

    Regular cell design approach considering lithography-induced process variations

    The deployment delays for EUVL force IC design to continue using 193nm wavelength lithography, with innovative and costly techniques, in order to faithfully print sub-wavelength features and combat lithography-induced process variations. In current and upcoming technologies, this lithography gap causes severe distortions in the printed patterns due to optical diffraction, degrading manufacturing yield. Therefore, a paradigm shift in layout design is mandatory, towards more regular, litho-friendly cell designs that improve line-pattern resolution. However, it remains unclear how much layout regularity should be introduced and how to measure the benefits and weaknesses of regular layouts. This dissertation focuses on finding the degree of layout regularity necessary to combat lithography variability and improve the layout quality of a design. The main contributions addressed to accomplish this objective are: (1) the definition of several layout design guidelines to mitigate lithography variability; (2) the proposal of a parametric yield estimation model to evaluate the impact of lithography on layout design; (3) the development of a global Layout Quality Metric (LQM), including a Regularity Metric (RM) to capture the degree of regularity of a layout implementation; and (4) the creation of different layout architectures exploiting the benefits of layout regularity to improve line-pattern resolution, referred to as Adaptive Lithography Aware Regular Cell Designs (ALARCs). The first part of this thesis provides several regular layout design guidelines, derived from lithography simulations, that minimize several important lithography-related variation sources. Moreover, a design-level methodology referred to as gate biasing is proposed to overcome systematic layout-dependent variations, across-field variations, and the non-rectilinear gate (NRG) effect in regular fabrics by properly configuring the drawn transistor channel length. The second part of this dissertation proposes a lithography yield estimation model to predict, with a reduced set of lithography simulations, the amount of lithography distortion expected in a printed layout due to lithography hotspots. An efficient lithography hotspot framework is presented that identifies the different layout pattern configurations, simplifies them to ease pattern analysis, and classifies them according to the lithography degradation predicted by lithography simulations. The yield model is calibrated with delay measurements of a reduced set of identical test circuits implemented in a 40nm CMOS technology, so that actual silicon data is used to obtain a more realistic yield estimate. The third part of this thesis presents a configurable Layout Quality Metric (LQM) that, considering several layout aspects, provides a global evaluation of a layout design as a single score. The LQM can be tuned by assigning different weights to each evaluation metric or by modifying the parameters under analysis; here it is configured with two different sets of partial metrics. Note that the LQM includes the regularity metric (RM) to capture the degree of layout regularity applied in a layout design. Lastly, this thesis presents different ALARC designs for a 40nm technology using different degrees of layout regularity and different area overheads. The quality of the gridded regular templates is demonstrated by automatically creating a library of 266 cells, including combinational and sequential cells, and synthesizing several ITC'99 benchmark circuits. Note that the regular cell libraries present only a 9% area penalty compared to the 2D standard cell designs used for comparison, thus providing area-competitive designs. The layout evaluation of the benchmark circuits using the LQM shows that regular layouts can outperform other 2D standard cell designs depending on the layout implementation.
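    As an illustration of a configurable, weight-driven quality score in the spirit of the LQM described above, the sketch below combines several normalized partial metrics into a single value through user-assigned weights; changing the weights changes which layout wins. The metric names and numbers are invented for illustration.

        def layout_quality(partial_metrics, weights):
            """Weighted average of normalized partial metrics (result in 0..1)."""
            total_weight = sum(weights[name] for name in partial_metrics)
            return sum(partial_metrics[name] * weights[name]
                       for name in partial_metrics) / total_weight

        cell_a = {"regularity": 0.9, "litho_yield": 0.8, "area_efficiency": 0.6}
        cell_b = {"regularity": 0.5, "litho_yield": 0.7, "area_efficiency": 0.9}

        # Emphasizing regularity favors cell_a; emphasizing area favors cell_b.
        litho_weights = {"regularity": 3.0, "litho_yield": 2.0, "area_efficiency": 1.0}
        area_weights  = {"regularity": 1.0, "litho_yield": 1.0, "area_efficiency": 3.0}

        for name, cell in [("cell_a", cell_a), ("cell_b", cell_b)]:
            print(name,
                  f"litho-weighted LQM={layout_quality(cell, litho_weights):.2f}",
                  f"area-weighted LQM={layout_quality(cell, area_weights):.2f}")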

    POWER AND PERFORMANCE STUDIES OF THE EXPLICIT MULTI-THREADING (XMT) ARCHITECTURE

    Power and thermal constraints have gained critical importance in the design of microprocessors over the past decade. Chipmakers failed to keep power at bay while sustaining the performance growth of serial computers at the rate expected by consumers. As an alternative, they turned to fitting an increasing number of simpler cores on a single die. While this is a step forward in relaxing the constraints, the issue of power is far from resolved, and it is joined by new challenges, which we explain next. As we move into the era of many-cores, processors consisting of hundreds, even thousands, of cores, single-task parallelism is the natural path to building faster general-purpose computers. Alas, the introduction of parallelism to the mainstream general-purpose domain brings another long-elusive problem into focus: ease of parallel programming. The result is a dual challenge in which power efficiency and ease of programming are both vital for the prevalence of upcoming many-core architectures. The observations above led to the lead goal of this dissertation: a first-order validation of the claim that, even under power and thermal constraints, ease of programming and competitive performance need not be conflicting objectives for a massively parallel general-purpose processor. As our platform, we choose the eXplicit Multi-Threading (XMT) many-core architecture for fine-grained parallel programs, developed at the University of Maryland. We hope that our findings will be a trailblazer for future commercial products. XMT scales up to a thousand or more lightweight cores and aims at improving single-task execution time while making the programmer's task as easy as possible. The performance advantages and ease of programming of XMT have been shown in a number of publications, including a study presented in this dissertation. The feasibility of the hardware concept has been demonstrated via FPGA and ASIC (per our partial involvement) prototypes. Our contributions target the study of the power and thermal envelopes of an envisioned 1024-core XMT chip (XMT1024) under programs from popular parallel benchmark suites. First, we compare XMT against an area- and power-equivalent commercial high-end many-core GPU. We demonstrate that XMT can provide an average speedup of 8.8x on irregular parallel programs that are common and important in general-purpose computing. Even under the worst-case power estimation assumptions for XMT, the average speedup is only reduced by half. We extend this study by experimentally evaluating the performance advantages of Dynamic Thermal Management (DTM) applied to XMT1024. DTM techniques are frequently used in current single- and multi-core processors; however, until now their effects on single-tasked many-cores have not been examined in detail. Our purpose is to explore how existing techniques can be tailored to XMT to improve performance. Performance improvements of up to 46% over a generic global management technique have been demonstrated. The insights we provide can guide designers of other, similar many-core architectures. A significant infrastructure contribution of this dissertation is a highly configurable cycle-accurate simulator, XMTSim. To our knowledge, XMTSim is currently the only publicly available shared-memory many-core simulator with extensive capabilities for estimating power and temperature, as well as for evaluating dynamic power and thermal management algorithms. As a major component of the XMT programming toolchain, it is not only used as the infrastructure in this work but has also contributed to other publications and dissertations.
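    As a flavor of the per-cluster DTM policies of the kind evaluated on the envisioned XMT1024, the toy loop below throttles only the clusters approaching the thermal cap and restores their frequency once they cool, in contrast to a generic global policy that slows every cluster at once. The thermal model and all constants are invented for illustration.

        T_MAX, T_AMBIENT = 85.0, 45.0          # deg C
        FREQS = [0.5, 0.75, 1.0]               # normalized frequency levels

        def step_temperature(temp, freq, activity):
            """Crude first-order thermal model: heating scales with freq * activity."""
            return temp + 0.5 * (T_AMBIENT + 50.0 * freq * activity - temp)

        def local_dtm(temps, freqs, activities):
            """Throttle clusters near the cap; speed the cool ones back up."""
            for i, temp in enumerate(temps):
                if temp > T_MAX and freqs[i] > FREQS[0]:
                    freqs[i] = FREQS[FREQS.index(freqs[i]) - 1]   # step down
                elif temp < T_MAX - 10 and freqs[i] < FREQS[-1]:
                    freqs[i] = FREQS[FREQS.index(freqs[i]) + 1]   # step up
                temps[i] = step_temperature(temp, freqs[i], activities[i])

        temps, freqs = [70.0, 70.0], [1.0, 1.0]
        activities = [1.0, 0.3]                # busy cluster vs. mostly idle cluster
        for _ in range(20):
            local_dtm(temps, freqs, activities)
        print("temps:", [round(t, 1) for t in temps], "freqs:", freqs)
        # The busy cluster settles at a reduced frequency under the cap;
        # the lightly loaded cluster keeps full speed.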