12 research outputs found

    An Improved Lagrangian Relaxation Method for VLSI Combinational Circuit Optimization

    Get PDF
    Gate sizing and threshold voltage (Vt) assignment are very popular and useful techniques in current very large scale integration (VLSI) design flow for timing and power optimization. Lagrangian relaxation (LR) is a common method for handling multi-objectives and proven to reach optimal solution under continuous solution space. However, it is more complex to use Lagrangian relaxation under discrete solution space. The Lagrangian dual problem is non-convex and previously a sub-gradient method was used to solve it. The sub-gradient method is a greedy approach for substituting gradient method in the deepest descent method, and has room for further improvement. In addition, Lagrangian sub-problem cannot be solved directly by mathematical approaches under discrete solution space. Here we propose a new Lagrangian relaxation-based method for simultaneous gate sizing and Vt assignment under discrete solution space. In this work, some new approaches are provided to solve the Lagrangian dual problem considering not only slack but also the relationship between Lagrangian multipliers and circuit timing. We want to solve the Lagrangian dual problem more precisely than did previous methods, such as the sub-gradient method. In addition, a table-lookup method is provided to replace mathematical approaches for solving the Lagrangian sub-problem under discrete size and Vt options. The experimental results show that our method can lead to about 50 percent and 58 percent power reduction subject to the same timing constraints compared with a Lagrangian relaxation method using sub-gradient method and a state-of-the-art previous work. These two methods are implemented by us for comparison. Our method also results in better circuit timing subject to tight timing constraints

    Lagrangian relaxation-based multi-threaded discrete gate sizer

    Get PDF
    In integrated circuit design gate sizing is one of the key optimization techniques which is repeatedly invoked to trade-off delays for area and/or power of the gates during logic design and physical design stages. With increasing design sizes of a million gates and larger, discrete gate sizes and non-convex delay models the gate sizing algorithms that were designed for continuous sizes and convex delay models are slow and timing inaccurate. Of the several published discrete gate sizing algorithms, recent works have shown that Lagrangian relaxation based gate sizers have produced designs with the lowest power on average with high timing accuracy. But they are also very slow due to a large number of expensive timing updates spread across hundreds of iterations of solving the Lagrangian sub-problem. In this thesis we present a Lagrangian relaxation based multi-threaded discrete gate sizer for fast timing and power reduction by swapping the gate sizes and the threshold voltages. We developed two parallelization enabling techniques to reduce the runtime of Lagrangian sub-problem solver, namely, mutual exclusion edge (MEE) assignment and directed acyclic graph (DAG) based netlist traversal. MEEs are dummy edges assigned to reduce computational dependencies among gates sharing one or more common fan-ins. DAG based netlist traversal facilitates simultaneous resizing of gates belonging to different topological levels. We designed a Lagrange multiplier update framework that enables rapid convergence of the timing recovery and power recovery algorithms. To reduce the runtime of timing updates, we proposed a simple and fast-to-compute effective capacitance model and several mechanisms to calibrate the timing models to improve their accuracy. Compared to the state-of-the-art gate sizer, our proposed gate sizer is on average 15x faster and the optimized designs have only 1.7\% higher power. In digital synchronous designs simultaneous gate sizing and clock skew scheduling provides significantly more power saving. We extend the gate sizer to simultaneously schedule the clock skew. It can achieve an average of 18.8\% more reduction in power with only 20\% increase in the runtime

    Parallel Acceleration for Timing Analysis and Optimization of Adaptive Integrated Circuits

    Get PDF
    Adaptive circuit design is a power-efficient approach to handle variations. Compared to conventional circuits, its implementation is more complicated especially when we deal with the fine-grained adaptivity. The unconventional and sophisticated nature of adaptive design further requires timing verification to validate the design. However timing analysis becomes more complicated due to complexities arising from nanometer VLSI technologies. A well-known challenge is process variations, which need to be addressed in timing analysis at least by considering different process corners. Adaptive circuit design further needs statistical static timing analysis (SSTA), which is much more time consuming than variation-oblivious timing analysis. Besides timing analysis, gate implementation selections of the adaptive design process are also computational expensive. This research focuses on parallel acceleration techniques for timing analysis and optimization of adaptive circuit. General purpose graphic processing units (GPGPU) and multithreading techniques are used in this work. Previous works on GPU acceleration for SSTA are mostly based on Monte Carlo based SSTA. By contrast, the parallelization techniques for principle component analysis (PCA) based SSTA are explored in this work, which is intrinsically more efficient. We develop a batch-based scheduling algorithm to partition the circuit graph into topological levels for GPU processing and investigate other techniques such as latency hiding. We propose a multithreading based acceleration method for the process of gate implementation selection and use the same batch-based scheduling result. The experiment result shows effectiveness of our parallel acceleration for timing analysis and for optimization with the performance up to 130× and 5× speedup respectively

    Discrete Gate Sizing Methodologies for Delay, Area and Power Optimization

    Get PDF
    The modeling of an individual gate and the optimization of circuit performance has long been a critical issue in the VLSI industry. In this work, we first study of the gate sizing problem for today\u27s industrial designs, and explore the contributions and limitations of all the existing approaches, which mainly suffer from producing only continuous solutions, using outdated timing models or experiencing performance inefficiency. In this dissertation, we present our new discrete gate sizing technique which optimizes different aspects of circuit performance, including delay, area and power consumption. And our method is fast and efficient as it applies the local search instead of global exhaustive search during gate size selection process, which greatly reduces the search space and improves the computation complexity. In addition to that, it is also flexible with different timing models, and it is able to deal with the constraints of input/output slew and output load capacitance, under which very few previous research works were reported. We then propose a new timing model, which is derived from the classic Elmore delay model, but takes the features of modern timing models from standard cell library. With our new timing model, we are able to formulate the combinatorial discrete sizing problem as a simplified mathematical expression and apply it to existing Lagrangian relaxation method, which is shown to converge to optimal solution. We demonstrate that the classic Elmore delay model based gate sizing approaches can still be valid. Therefore, our work might provide a new look into the numerous Elmore delay model based research works in various areas (such as placement, routing, layout, buffer insertion, timing analysis, etc.)

    Adaptive Integrated Circuit Design for Variation Resilience and Security

    Get PDF
    The past few decades witness the burgeoning development of integrated circuit in terms of process technology scaling. Along with the tremendous benefits coming from the scaling, challenges are also presented in various stages. During the design time, the complexity of developing a circuit with millions to billions of smaller size transistors is extended after the variations are taken into account. The difficulty of analyzing these nondeterministic properties makes the allocation scheme of redundant resource hardly work in a cost-efficient way. Besides fabrication variations, analog circuits are suffered from severe performance degradations owing to their physical attributes which are vulnerable to aging effects. As such, the post-silicon calibration approach gains increasing attentions to compensate the performance mismatch. For the user-end applications, additional system failures result from the pirated and counterfeited devices provided by the untrusted semiconductor supply chain. Again analog circuits show their weakness to this threat due to the shortage of piracy avoidance techniques. In this dissertation, we propose three adaptive integrated circuit designs to overcome these challenges respectively. The first one investigates the variability-aware gate implementation with the consideration of the overhead control of adaptivity assignment. This design improves the variation resilience typically for digital circuits while optimizing the power consumption and timing yield. The second design is implemented as a self-validation system for the calibration of diverse analog circuits. The system is completely integrated on chip to enhance the convenience without external assistance. In the last design, a classic analog component is further studied to establish the configurable locking mechanism for analog circuits. The use of Satisfiability Modulo Theories addresses the difficulty of searching the unique unlocking pattern of non-Boolean variables

    Performance-Based Optical Proximity Correction

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    A New Algorithm for Simultaneous Gate Sizing and Threshold Voltage Assignment

    No full text
    Gate sizing and threshold voltage (Vt) assignment are popular tech-niques for circuit timing and power optimization. Existing meth-ods, by and large, are either sensitivity-driven heuristics or based on rounding continuous optimization solutions. Sensitivity-driven heuristics are easily trapped in local optimum and the rounding is subject to remarkable errors. In this paper, we propose a system-atic combinatorial approach for simultaneous gate sizing and Vt assignment. The core ideas of this approach include consistency relaxation and coupled bi-directional search. Our algorithm is com-pared with a state-of-the-art previous work on benchmark circuits. The results from our algorithm can lead to about 24 % less power dissipation subject to the same timing constraints

    Modeling and Simulation of Advanced Nano-Scale Very Large Scale Integration Circuits

    Get PDF
    With VLSI(very large scale integration) technology shrinking and frequency increasing, the minimum feature size is smaller than sub-wavelength lithography wavelength, and the manufacturing cost is significantly increasing in order to achieve a good yield. Consequently design companies need to further lower power consumption. All these factors bring new challenges; simulation and modeling need to handle more design constraints, and need to work with modern manufacturing processes. In this dissertation, algorithms and new methodology are presented for these problems: (1) fast and accurate capacitance extraction, (2) capacitance extraction considering lithography effect, (3) BEOL(back end of line) impact on SRAM(static random access memory) performance and yield, and (4) new physical synthesis optimization flow is used to shed area and reduce the power consumption. Interconnect parasitic extraction plays an important role in simulation, verification, optimization. A fast and accurate parasitic extraction algorithm is always important for a current design automation tool. In this dissertation, we propose a new algorithm named HybCap to efficiently handle multiple planar, conformal or embedded dielectric media. From experimental results, the new method is significantly faster than the previous one, 77X speedup, and has a 99% memory savings compared with FastCap and 2X speedup, and has an 80% memory savings compared with PHiCap for complex dielectric media. In order to consider lithography effect in the existing LPE(Layout Parasitic Extraction) flow, a modified LPE flow and fast algorithms for interconnect parasitic extraction are proposed in this dissertation. Our methodology is efficient, compatible with the existing design flow and has high accuracy. With the new enhanced parasitic extraction flow, simulation of BEOL effect on SRAM performance becomes possible. A SRAM simulation model with internal cell interconnect RC parasitics is proposed in order to study the BEOL lithography impact. The impact of BEOL variations on memory designs are systematically evaluated in this dissertation. The results show the power estimation with our SRAM model is more accurate. Finally, a new optimization flow to shed area blow in the design synthesis flow is proposed, which is one level beyond simulation and modeling to directly optimize design, but is also built upon accurate simulations and modeling. Two simple, yet efficient, buffering and gate sizing techniques are presented. On 20 industrial designs in 45nm and 65nm, our new work achieves 12.5% logic area growth reduction, 5.8% total area reduction, 10% wirelength reduction and 770 ps worst slack improvement on average

    Algorithms for VLSI Circuit Optimization and GPU-Based Parallelization

    Get PDF
    This research addresses some critical challenges in various problems of VLSI design automation, including sophisticated solution search on DAG topology, simultaneous multi-stage design optimization, optimization on multi-scenario and multi-core designs, and GPU-based parallel computing for runtime acceleration. Discrete optimization for VLSI design automation problems is often quite complex, due to the inconsistency and interference between solutions on reconvergent paths in directed acyclic graph (DAG). This research proposes a systematic solution search guided by a global view of the solution space. The key idea of the proposal is joint relaxation and restriction (JRR), which is similar in spirit to mathematical relaxation techniques, such as Lagrangian relaxation. Here, the relaxation and restriction together provides a global view, and iteratively improves the solution. Traditionally, circuit optimization is carried out in a sequence of separate optimization stages. The problem with sequential optimization is that the best solution in one stage may be worse for another. To overcome this difficulty, we take the approach of performing multiple optimization techniques simultaneously. By searching in the combined solution space of multiple optimization techniques, a broader view of the problem leads to the overall better optimization result. This research takes this approach on two problems, namely, simultaneous technology mapping and cell placement, and simultaneous gate sizing and threshold voltage assignment. Modern processors have multiple working modes, which trade off between power consumption and performance, or to maintain certain performance level in a powerefficient way. As a result, the design of a circuit needs to accommodate different scenarios, such as different supply voltage settings. This research deals with this multi-scenario optimization problem with Lagrangian relaxation technique. Multiple scenarios are taken care of simultaneously through the balance by Lagrangian multipliers. Similarly, multiple objective and constraints are simultaneously dealt with by Lagrangian relaxation. This research proposed a new method to calculate the subgradients of the Lagrangian function, and solve the Lagrangian dual problem more effectively. Multi-core architecture also poses new problems and challenges to design automation. For example, multiple cores on the same chip may have identical design in some part, while differ from each other in the rest. In the case of buffer insertion, the identical part have to be carefully optimized for all the cores with different environmental parameters. This problem has much higher complexity compared to buffer insertion on single cores. This research proposes an algorithm that optimizes the buffering solution for multiple cores simultaneously, based on critical component analysis. Under the intensifying time-to-market pressure, circuit optimization not only needs to find high quality solutions, but also has to come up with the result fast. Recent advance in general purpose graphics processing unit (GPGPU) technology provides massive parallel computing power. This research turns the complex computation task of circuit optimization into many subtasks processed by parallel threads. The proposed task partitioning and scheduling methods take advantage of the GPU computing power, achieve significant speedup without sacrifice on the solution quality

    Sizing discreto baseado em relaxação lagrangeana para minimização de leakage em circuitos digitais

    Get PDF
    Dissertação (mestrado) - Universidade Federal de Santa Catarina, Centro Tecnológico, Programa de Pós-Graduação em Ciência da Computação, Florianópolis, 2013.A minimização da corrente de leakage é um passo essencial do projeto de circuitos digitais, uma vez que nas tecnologias CMOS recentes a potência de leakage tornou-se comparável à potência dinâmica. Gate sizing é uma técnica amplamente utilizada para minimização da potência de leakage devido à sua eficácia e ao baixo impacto que ele causa no fluxo standard cell. Em tal fluxo, o problema de sizing corresponde a selecionar, para cada porta do circuito, uma combinação de largura de porta e tensão de threshold disponível na biblioteca de células, de modo a satisfazer as restrições de projeto. A natureza discreta do problema, a qual o torna NP-difícil, e o grande número de portas nos circuitos contemporâneos têm motivado a busca por heurísticas eficientes, que sejam capazes de resolvê-lo em tempo de execução aceitável. Este trabalho apresenta três contribuições principais ao estado da arte. A primeira é uma formulação aperfeiçoada para o problema de sizing discreto baseada em Relaxação Lagrangeana (LR), a qual considera valores máximos de slew de entrada e de capacitância de saída das portas, impostas pelas bibliotecas standard cell. A segunda é uma heurística topológica gulosa para resolver a formulação LR proposta utilizando informações locais para guiar as decisões do algoritmo. A terceira contribuição reside em uma técnica híbrida de três passos para superar algumas das limitações da heurística topológica gulosa. Tal técnica híbrida inicia resolvendo a formulação LR assumindo um atraso crítico ligeiramente maior do que o atraso crítico-alvo e em seguida, aplica uma heurística rápida de recuperação de atraso para que o atraso crítico-alvo original seja satisfeito. Como terceiro passo, é usada uma heurística de recuperação de potência para reduzir ainda mais a potência de leakage explorando o espaço para otimização deixado pelos dois passos anteriores. Os experimentos práticos foram gerados utilizando-se a infraestrutura da Competição de Sizing Discreto do ISPD2012, a qual provê uma base comum para comparações justas com os trabalhos correlates mais recentes. Os resultados experimentais para a formulação LR usando a heurística topológica gulosa foram comparados com os resultados obtidos pelas três equipes melhor classificadas na Competição do ISPD 2012, os quais representavam o estado da arte no momento em que tais experimentos foram realizados. A potência de leakage obtida é, em média, 18,9%, 16,7% e 43,8% menor do que aquelas obtidas pelas três melhores equipes da Competição do ISPD2012, respectivamente, ao passo que o tempo de execução total é 38, 31 e 39 vezes menor. Com relação à técnica híbrida, a potência de leakage obtida é, em média, 8,15\\\\% menor do que aquela relatada pelo trabalho que representa o estado da arte na ocasião em que estes experimentos foram realizados, sendo o tempo total de execução uma ordem de magnitude menor. É Importante ressaltar que o trabalho estado da arte referido já havia superado as três melhores equipes da Competição do ISPD2012. 2013-12-05T23:12:19
    corecore