Search CORE

27 research outputs found

Beyond the arithmetic constraint: depth-optimal mapping of logic chains in reconfigurable fabrics

Author: Frederick Michael Todd
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2008
Field of study

Look-up table based FPGAs have migrated from a niche technology for design prototyping to a valuable end-product component and, in some cases, a replacement for general purpose processors and ASICs alike. One way architects have bridged the performance gap between FPGAs and ASICs is through the inclusion of specialized components such as multipliers, RAM modules, and microcontrollers. Another dedicated structure that has become standard in reconfigurable fabrics is the arithmetic carry chain. Currently, it is only used to map arithmetic operations as identified by HDL macros. For non-arithmetic operations, it is an idle but potentially powerful resource.;Obstacles to using the carry chain for generic logic operations include lack of architectural and computer-aided design support. Current carry-select architectures facilitate carry chain reuse, although they do so only for (K-1)-input operations. Additionally, hardware description language (HDL) macros are the only recourse for a designer wishing to map generic logic chains in a carry-select architecture. A novel architecture that allows the full K-input operational capacity of the carry chain to be harnessed is presented as a solution to current architectural limitations. It is shown to have negligible impact on logic element area and delay. Using only two additional 2:1 pass transistor multiplexers, it enables the transmission of a K-input operation to the carry chain and general routing simultaneously. To successfully identify logic chains in an arbitrary Boolean network, ChainMap is presented as a novel technology mapping algorithm. ChainMap creates delay-optimal generic logic chains in polynomial time without HDL macros. It maps both arithmetic and non-arithmetic logic chains whenever depth increasing nodes, which increase logic depth but not routing depth, are encountered. Use of the chain is not reserved for arithmetic, but rather any set of gates exhibiting similar characteristics. By using the carry chain as a generic, near zero-delay adjacent cell interconnection structure a potential average optimal speedup of 1.4x is revealed. Post place and route experiments indicate that ChainMap solutions perform similarly to HDL chains when cluster resources are abundant and significantly better in cluster-constrained arrays

Digital Repository @ Iowa State University (ISU)

Handling the complexity of routing problem in modern VLSI design

Author: Zhang Yanheng
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2011
Field of study

In VLSI physical design, the routing task consists of using over-the-cell metal wires to connect pins and ports of circuit gates and blocks. Traditionally, VLSI routing is an important design step in the sense that the quality of routing solution has great impact on various design metrics such as circuit timing, power consumption, chip reliability and manufacturability etc. As the advancing VLSI design enters the nanometer era, the routing success (routability issue) has been arising as one of the most critical problems in back-end design. In one aspect, the degree of design complexity is increasing dramatically as more and more modules are integrated into the chip. Much higher chip density leads to higher routing demands and potentially more risks in routing failure. In another aspect, with decreasing design feature size, there are more complex design rules imposed to ensure manufacturability. These design rules are hard to satisfy and they usually create more barriers for achieving routing closure (i.e., generate DRC free routing solution) and thus affect chip time to market (TTM) plan. In general, the behavior and performance of routing are affected by three consecutive phases: placement phase, global routing phase and detailed routing phase in a typical VLSI physical design flow. Traditional CAD tools handle each of the three phases independently and the global picture of the routability issue is neglected. Different from conventional approaches which propose tools and algorithms for one particular design phase, this thesis investigates the routability issue from all three phases and proposes a series of systematic solutions to build a more generic flow and improve quality of results (QoR). For the placement phase, we will introduce a mixed-sized placement refinement tool for alleviating congestion after placement. The tool shifts and relocates modules based on a global routing estimation. For the global routing phase, a very fast and effective global router is developed. Its performance surpasses many peer works as verified by ISPD 2008 global routing contest results. In the detailed routing phase, a tool is proposed to perform detailed routing using regular routing patterns based on a correct-by-construction methodology to improve routability as well as satisfy most design rules. Finally, the tool which integrates global routing and detailed routing is developed to remedy the inconsistency between global routing and detailed routing. To verify the algorithms we proposed, three sets of testcases derived from ISPD98 and ISPD05/06 placement benchmark suites are proposed. The results indicate that our proposed methods construct an integrated and systematic flow for routability improvement which is better than conventional methods

Digital Repository @ Iowa State University (ISU)

Lexicographic path searches for FPGA routing

Author: So Keith Kam-Ho
Publication venue: UNSW, Sydney
Publication date: 01/01/2008
Field of study

This dissertation reports on studies of the application of lexicographic graph searches to solve problems in FPGA detailed routing. Our contributions include the derivation of iteration limits for scalar implementations of negotiation congestion for standard floating point types and the identification of pathological cases for path choice. In the study of the routability-driven detailed FPGA routing problem, we show universal detailed routability is NP-complete based on a related proof by Lee and Wong. We describe the design of a lexicographic composition operator of totally-ordered monoids as path cost metrics and show its optimality under an adapted A* search. Our new router, CornNC, based on lexicographic composition of congestion and wirelength, established a new minimum track count for the FPGA Place and Route Challenge. For the problem of long-path timing-driven FPGA detailed routing, we show that long-path budgeted detailed routability is NP-complete by reduction to universal detailed routability. We generalise the lexicographic composition to any finite length and verify its optimality under A* search. The application of the timing budget solution of Ghiasi et al. is used to solve the long-path timing budget problem for FPGA connections. Our delay-clamped spiral lexicographic composition design, SpiralRoute, ensures connection based budgets are always met, thus achieves timing closure when it successfully routes. For 113 test routing instances derived from standard benchmarks, SpiralRoute found 13 routable instances with timing closure that were unroutable by a scalar negotiated congestion router and achieved timing closure in another 27 cases when the scalar router did not, at the expense of increased runtime. We also study techniques to improve SpiralRoute runtimes, including a data structure of a trie augmented by data stacks for minimum element retrieval, and the technique of step tomonoid elimination in reducing the retrieval depth in a trie of stacks structure

UNSWorks

An adaptive method to tolerate soft errors in SRAM-based FPGAs

Author: Bahramnejad S.
Zarandi H.R.
Publication venue: Sharif University of Technology. Production and hosting by Elsevier B.V.Production and hosting by Elsevier B.V.
Publication date: 31/12/2011
Field of study

AbstractIn this paper, we present an adaptive method that is a combination of SEU-avoidance in CAD flow and adaptive redundancy to tolerate soft error effects in SRAM-based FPGAs. This method is based on the modification of T-VPack and VPR tools. Three different steps of these tools are modified for SEU-awareness: (1) clustering, (2) placement and (3) routing. Then we use the unused resources as redundancy. We have investigated the effect of this method on several MCNC benchmarks. This investigation has been performed using three experiments: (1) SEU-awareness in clustering with redundancy, (2) SEU-awareness in clustering and placement with redundancy and (3) SEU-awareness in clustering, placement and routing with redundancy. With a confidence level of 95%, the results show that, using each of these three experiments, the system failure rate of ten MCNC circuits has been decreased between 4.52% and 10.42%, between 10.25% and 21.63%, and between 10.48% and 24.39%, respectively

Elsevier - Publisher Connector

Routing in Optical and Non-Optical Networks using Boolean Satisfiability

Author: Bashar Al Rawi
Fadi A. Aloul
Mokhtar Aboelaze
Publication venue: 'Academy Publisher'
Publication date
Field of study

Crossref

물리적 설계 자동화에서 표준셀 합성 및 최적화와 설계 품질 예측 방법론

Author: 백경현
Publication venue: 서울대학교 대학원
Publication date: 01/02/2023
Field of study

학위논문(박사) -- 서울대학교대학원 : 공과대학 전기·정보공학부, 2023. 2. 김태환.In the physical design of chip implementation, designing high-quality standard cell layout and accurately predicting post-route DRV (design rule violation) at an early stage is an important problem, especially in advanced technology nodes. This dissertation presents two methodologies that can contribute to improving the design quality and design turnaround time of physical design flow. Firstly, we propose an integrated approach to the two problems of transistor folding and placement in standard cell layout synthesis. Precisely, we propose a globally optimal algorithm of search tree based design space exploration, devising a set of effective speeding up techniques as well as dynamic programming based fast cost computation. In addition, our algorithm incorporates the minimum oxide diffusion jog constraint, which closely relies on both of transistor folding and placement. Through experiments with the transistor netlists and design rules in advanced node, our proposed method is able to synthesize fully routable cell layouts of minimal size within a very fast time for each netlist, outperforming the cell layout quality in the manual design. Secondly, we propose a novel ML based DRC hotspot prediction technique, which is able to accurately capture the combined impact of pin accessibility and routing congestion on DRC hotspots. Precisely, we devise a graph, called pin proximity graph, that effectively models the spatial information on cell I/O pins and the information on pin-to-pin disturbance relation. Then, we propose a new ML model, which tightly combines GNN (graph neural network) and U-net in a way that GNN is used to embed pin accessibility information abstracted from our pin proximity graph while U-net is used to extract routing congestion information from grid-based features. Through experiments with a set of benchmark designs using advanced node, our model outperforms the existing ML models on all benchmark designs within the fast inference time in comparison with that of the state-of-the-art techniques.칩 구현의 물리적 설계 단계에서, 높은 성능의 표준 셀 설계와 배선 연결 이후 조기에 설계 규칙 위반을 정확히 예측하는 것은 최신 공정에서 특히 중요한 문제이다. 본 논문에서는 물리적 설계에서의 설계 품질과 총 설계 시간 향상을 달성할 수 있는 두 가지 방법론을 제안한다. 먼저, 본 논문에서는 표준 셀 레이아웃 합성에서 트랜지스터 폴딩과 배치를 종합적으로 진행할 수 있는 방법론을 논한다. 구체적으로 탐색 트리 기반의 최적화 알고리즘과 동적 프로그래밍 기반 빠른 비용 계산 방법과 여러 속도 개선 기법을 제안한다. 여기에 더해, 최신 공정에서 트랜지스터 폴딩과 배치로 인해 발생할 수 있는 최소 산화물 확산 영역 설계 규칙을 고려하였다. 최신 공정에 대한 표준 셀 합성 실험 결과, 본 논문에서 제안한 방법이 설계 전문가가 수동으로 설계한 것 대비 높은 성능을 보이고, 설계 시간도 매우 짧음을 보인다. 두번째로, 본 논문에서는 셀 배치 단계에서 핀 접근성과 연결 혼잡으로 인한 영향을 종합적으로 고려할 수 있는 머신 러닝 기반 설계 규칙 위반 구역 예측 방법론을 제안한다. 먼저 표준 셀의 입/출력 핀의 물리적 정보와 핀과 핀 사이 방해 관계를 효과적으로 표현할 수 있는 핀 근접 그래프를 제안하고, 그래프 신경망과 유넷 신경망을 효과적으로 결합한 새로운 형태의 머신 러닝 모델을 제안한다. 이 모델에서 그래프 신경망은 핀 근접 그래프로부터 핀 접근성 정보를 추출하고, 유넷 신경망은 격자 기반 특징으로부터 연결 혼잡 정보를 추출한다. 실험 결과 본 논문에서 제안한 방법은 이전 연구들 대비 더 빠른 예측 시간에 더 높은 예측 성능을 달성함을 보인다.1 Introduction 1 1.1 Standard Cell Layout Synthesis 1 1.2 Machine Learning for Electronic Design Automation 6 1.3 Prediction of Design Rule Violation 8 1.4 Contributions of This Dissertation 11 2 Standard Cell Layout Synthesis of Advanced Nodes with Simultaneous Transistor Folding and Placement 14 2.1 Motivations 14 2.2 Algorithm for Standard Cell Layout Synthesis 16 2.2.1 Problem Definition 16 2.2.2 Overall Flow 18 2.2.3 Step 1: Generation of Folding Shapes 18 2.2.4 Step 2: Search-tree Based Design Space Exploration 20 2.2.5 Speeding up Techniques 23 2.2.6 In-cell Routability Estimation 28 2.2.7 Step 3: In-cell Routing 30 2.2.8 Step 4: Splitting Folding Shapes 35 2.2.9 Step 5: Relaxing Minimum-area Constraints 37 2.3 Experimental Results 38 2.3.1 Comparison with ASAP 7nm Cell Layouts 40 2.3.2 Effectiveness of Dynamic Folding 42 2.3.3 Effectiveness of Speeding Up Techniques 43 2.3.4 Impact of Splitting Folding Shape 48 2.3.5 Runtime Analysis According to Area Relaxation 51 2.3.6 Comparison with Previous Works 52 3 Pin Accessibility and Routing Congestion Aware DRC Hotspot Prediction using Graph Neural Network and U-Net 54 3.1 Preliminary 54 3.1.1 Graph Neural Network 54 3.1.2 Fully Convolutional Network 56 3.2 Proposed Prediction Methodology 57 3.2.1 Overall Flow 57 3.2.2 Pin Proximity Graph 58 3.2.3 Grid-based Features 61 3.2.4 Overall Architecture of PGNN 64 3.2.5 GNN Architecture in PGNN 64 3.2.6 U-net Architecture in PGNN 66 3.2.7 Final Prediction in PGNN 66 3.3 Experimental Results 68 3.3.1 Experimental Setup 68 3.3.2 Analysis on PGNN Performance 71 3.3.3 Comparison with Previous Works 72 3.3.4 Adaptation to Real-world Designs 81 3.3.5 Handling Data Imbalance Problem in Regression Model 86 4 Conclusions 92 4.1 Chapter 2 92 4.2 Chapter 3 93박

SNU Open Repository and Archive

A complete design path for the layout of flexible macros

Author: Huijbregts E.P.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/1996
Field of study

XIV+172hlm.;24c

Repository TU/e

Pure OAI Repository

uilis.unsyiah.ac.id

Performance Comparison of Static CMOS and Domino Logic Style in VLSI Design: A Review

Author: Dr. S. M. Ramesh Dr. M. Jagdissh Chandra Prasad, Dr. T. V. P. Sundararajan, Dr. P. Senthilkumar
Publication venue: Auricle Global Society of Education and Research
Publication date: 31/08/2019
Field of study

Of late, there is a steep rise in the usage of handheld gadgets and high speed applications. VLSI designers often choose static CMOS logic style for low power applications. This logic style provides low power dissipation and is free from signal noise integrity issues. However, designs based on this logic style often are slow and cannot be used in high performance circuits. On the other hand designs based on Domino logic style yield high performance and occupy less area. Yet, they have more power dissipation compared to their static CMOS counterparts. As a practice, designers during circuit synthesis, mix more than one logic style judiciously to obtain the advantages of each logic style. Carefully designing a mixed static Domino CMOS circuit can tap the advantages of both static and Domino logic styles overcoming their own short comings

International Journal on Future Revolution in Computer Science & Communication Engineering