2,318 research outputs found
Microarchitecture optimization for timing and layout
In recent years, the drive to produce more complex integrated circuits in less design time has increased the demand for design automation tools. The search for design automation methods has resulted in numerous behavioral synthesis and logic synthesis tools. This report describes a system that fills the gap between traditional behavioral synthesis and logic synthesis tools. Techniques are introduced for improving the microarchitecture structure and using feedback from lower-level optimization tools to guide design optimizations while attempting to meet user-specified area and time constraints. These techniques include the capability for mixing layout styles, such as custom layout for random-logic components and bit-slicing for regularly structured components. In this manner the entire design, both control logic and datapath, can be optimized at the same time. Further, this paper presents a new methodology for microarchitecture-level optimization that greatly reduces the amount of technology-specific knowledge necessary to perform the optimizations.
Extending systems-on-chip to the third dimension: performance, cost and technological tradeoffs.
Because of today's market demand for high-performance, high-density portable hand-held applications, electronic system design technology has shifted its focus from 2-D planar single-chip SoC solutions to alternative options such as tiled silicon and single-level embedded modules, as well as 3-D integration. Among the various choices, finding an optimal solution for system implementation usually involves cost, performance, and other technological trade-off analysis at the system conceptual level. It has been identified that the decisions made within the first 20% of the total design cycle time ultimately determine up to 80% of the final product cost. In this paper, we discuss appropriate and realistic metrics for performance and cost trade-off analysis, both at the system conceptual level (up-front in the design phase) and at the implementation phase, for verification in three-dimensional integration. To validate the methodology, two ubiquitous electronic systems are analyzed under various implementation schemes, and the pros and cons of each are discussed.
A system for microarchitecture and logic optimization
This thesis spans two levels of the design process by examining optimization at both the register-transfer level and the logic level. More specifically, this thesis addresses the following two problems: 1) performing logic synthesis for custom layout rather than the traditional approach that focuses on synthesis for standard cells, and 2) performing optimization for custom layout from register-transfer level netlists. Thus optimization is performed on the microarchitecture design and at a lower level for individual microarchitecture components. First, techniques are introduced for generating gate-level netlists that take advantage of custom layout capabilities. Such techniques include limiting serial/parallel transistor chains, transistor sizes, and capacitive loads in forming complex gates. These considerations have not been incorporated in previous logic synthesis systems. Second, techniques are introduced for improving the microarchitecture structure and using estimates from lower-level optimization tools to guide microarchitecture design optimizations that attempt to meet user-specified area and time constraints. These techniques include the capability for mixing layout styles, such as custom layout for random-logic components and bit-slicing for regularly structured components. In this manner the entire design, both control logic and datapath, can be optimized at the same time. Further, this paper presents a new methodology for microarchitecture-level optimization that greatly reduces the amount of technology-specific knowledge necessary to perform the optimizations.
Behavioral synthesis from VHDL using structured modeling
This dissertation describes work in behavioral synthesis involving the development of a VHDL Synthesis System (VSS), which accepts a VHDL behavioral input specification and performs technology-independent synthesis to generate a circuit netlist of generic components. The VHDL language is used for input and output descriptions. An intermediate representation that incorporates signal typing and component attributes simplifies compilation and facilitates design optimization. A Structured Modeling methodology has been developed to suggest standard VHDL modeling practices for synthesis. Structured Modeling provides recommendations for the use of available VHDL description styles so that optimal designs will be synthesized. A design composed of generic components is synthesized from the input description through a process of Graph Compilation, Graph Criticism, and Design Compilation. Experiments were performed to demonstrate the effects of different modeling styles on the quality of the design produced by VSS. Several alternative VHDL models were examined for each benchmark, illustrating the improvements in design quality achieved when Structured Modeling guidelines were followed.
Novel dual-threshold voltage FinFETs for circuit design and optimization
A great research effort has been invested in finding alternatives to CMOS that have better process-variation and subthreshold-leakage characteristics. Among the possible candidates, the FinFET is the most compatible with CMOS, and it has shown promising leakage and speed performance. This thesis introduces the basic characteristics of FinFETs and explains quantitatively the effects of FinFET physical parameters on their performance. I show how dual-Vth independent-gate FinFETs can be fabricated by optimizing their physical parameters. Optimum values for these physical parameters are derived using the physics-based University of Florida SPICE model for double-gate devices, and the optimized FinFETs are simulated and validated using Sentaurus TCAD simulations. Dual-Vth FinFETs with independent gates enable series and parallel merge transformations in logic gates, realizing compact low-power alternative gates with competitive performance and reduced input capacitance in comparison to conventional FinFET gates. Furthermore, they also enable the design of a new class of compact logic gates with higher expressive power and flexibility than CMOS gates. Synthesis results for 16 benchmark circuits from the ISCAS and OpenSPARC suites indicate that, on average at 2 GHz and 75°C, the library that contains the novel gates reduces total power and the number of fins by 36% and 37%, respectively, over a conventional library that does not have the novel gates in 32 nm technology.
Design Optimization of Double Gate Based Full Adder
The full adder is an essential building block of the arithmetic circuits found in the ALU (arithmetic and logic unit) of microprocessors and microcontrollers. Improving the performance of the adder is very important for upgrading the performance of the digital electronic circuits in which it is used. The main concern in designing an arithmetic circuit is power consumption. In an arithmetic circuit, the adder is a critical module for addition and is also the core of many other arithmetic operations. It is therefore necessary to reduce the power consumption of the adder circuit in order to reduce the consumption of the arithmetic module as a whole. In this paper, a review of double-gate-based full adders is presented. Existing work on this topic is surveyed, and the problems associated with it are discussed. Keywords: DG-MOSFET, ALU, XOR, Full Adder, PTI, GDI, Diffused GDI, FinFET, ELK, Power Gating, Stacking
Dual-Vth Independent-Gate FinFETs for Low Power Logic Circuits
This paper describes the electrode work-function, oxide thickness, gate-source/drain underlap, and silicon thickness optimization required to realize dual-Vth independent-gate FinFETs. Optimum values for these FinFET design parameters are derived using the physics-based University of Florida SPICE model for double-gate devices, and the optimized FinFETs are simulated and validated using Sentaurus TCAD simulations. Dual-Vth FinFETs with independent gates enable series and parallel merge transformations in logic gates, realizing compact low power alternative gates with competitive performance and reduced input capacitance in comparison to conventional FinFET gates. Furthermore, they also enable the design of a new class of compact logic gates with higher expressive power and flexibility than conventional CMOS gates, e.g., implementing 12 unique Boolean functions using only four transistors. Circuit designs that balance and improve the performance of the novel gates are described. The gates are designed and calibrated using the University of Florida double-gate model into conventional and enhanced technology libraries. Synthesis results for 16 benchmark circuits from the ISCAS and OpenSPARC suites indicate that on average at 2GHz, the enhanced library reduces total power and the number of fins by 36% and 37%, respectively, over a conventional library designed using shorted-gate FinFETs in 32 nm technology.
Improving multithreading performance for clustered VLIW architectures.
Very Long Instruction Word (VLIW) processors are very popular in the embedded and mobile computing domains. Uses of VLIW processors range from Digital Signal Processors (DSPs) found in a plethora of communication and multimedia devices to Graphics Processing Units (GPUs) used in gaming and high-performance computing devices. The advantage of VLIWs is their low-complexity, low-power design, which enables high performance at a low cost. The scalability of VLIWs is limited by the scalability of register file ports. It is not viable to have a VLIW processor with a single large register file because of the area and power consumption implications of the register file.
Clustered VLIWs solve the register file scalability issue by partitioning the register file into multiple clusters, each with a set of functional units attached to the register file of that cluster. Using a clustered approach, a higher issue width can be achieved while keeping the cost of the register file within reasonable limits. Several commercial VLIW processors have been designed using the clustered VLIW model.
VLIW processors can be used to run a larger set of applications. Many of these applications have good Instruction Level Parallelism (ILP), which can be efficiently utilized. However, several applications, especially those dominated by control code, do not exhibit good ILP, and the processor is underutilized. Cache misses are another major source of resource underutilization. Multithreading is a popular technique to improve processor utilization. Interleaved MultiThreading (IMT) hides cache miss latencies by scheduling a different thread each cycle but cannot fill unused instruction slots. Simultaneous MultiThreading (SMT) can also remove ILP under-utilization by issuing multiple threads to fill the empty instruction slots. However, SMT has a higher implementation cost than IMT. The thesis presents Cluster-level Simultaneous MultiThreading (CSMT), which supports a limited form of SMT where VLIW instructions from different threads are merged at cluster-level granularity. This lowers the hardware implementation cost to a level comparable to the cheap IMT technique. The more complex SMT combines VLIW instructions at individual operation-level granularity, which is quite expensive, especially for a mobile solution. We refer to SMT at the operation level as OpSMT to reduce ambiguity. While previous studies restricted OpSMT on a VLIW to 2 threads, CSMT scales better, and up to 8 threads can be supported at a reasonable cost.
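The cluster-level merging idea can be illustrated with a minimal sketch. This is not the thesis's hardware design: the bitmask encoding, the fixed thread priority order, and the name `csmt_merge` are all assumptions made for illustration. Each thread offers one pending VLIW instruction, described only by the set of clusters it uses, and an instruction is issued in the current cycle only if none of its clusters has already been claimed by a higher-priority thread.

```python
from typing import List, Optional, Tuple


def csmt_merge(cluster_masks: List[Optional[int]]) -> Tuple[int, List[int]]:
    """Greedily merge one pending VLIW instruction per thread (illustrative).

    cluster_masks[t] is a bitmask of the clusters used by thread t's next
    instruction (bit i set means cluster i is used), or None if thread t
    has nothing to issue this cycle (for example, it is stalled).
    Threads are visited in a fixed priority order, index 0 first; this
    ordering is an assumption of the sketch.
    Returns the combined mask of occupied clusters and the issued threads.
    """
    used = 0
    issued: List[int] = []
    for tid, mask in enumerate(cluster_masks):
        if mask is None:
            continue  # stalled thread offers no instruction
        if mask & used == 0:  # no cluster overlap, so it can be merged
            used |= mask
            issued.append(tid)
    return used, issued


# Four clusters, three threads:
#   thread 0 uses clusters {0, 1}, thread 1 uses {1}, thread 2 uses {2, 3}.
used, issued = csmt_merge([0b0011, 0b0010, 0b1100])
print(issued)  # threads 0 and 2 are merged; thread 1 conflicts on cluster 1
```

Because the check is a single AND over per-cluster bits rather than a comparison of individual operation slots, the sketch reflects why merging at cluster granularity is expected to be cheaper than operation-level merging.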
The thesis proposes several other techniques to further improve CSMT performance. In particular, cluster renaming remaps the clusters used by instructions of different threads to reduce resource conflicts. Cluster renaming is quite effective in reducing issue-slot under-utilization and significantly improves CSMT performance. The thesis also proposes: a hybrid between IMT and CSMT, which increases the number of supported threads; heterogeneous instruction merging, where some instructions are combined using SMT and the rest using CSMT; and finally, split-issue, a technique that allows an instruction to be issued partially, making it easier to combine with others.
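A rough sketch of how cluster renaming might reduce conflicts follows. The abstract does not specify the remapping scheme, so the per-thread rotation used here, along with the names `rename_clusters` and `conflicts`, is purely an assumption for illustration.

```python
def rename_clusters(mask: int, offset: int, num_clusters: int) -> int:
    """Rotate a cluster bitmask left by `offset` positions (illustrative).

    Bit i of `mask` means the instruction uses cluster i. After renaming,
    the work planned for cluster i runs on cluster (i + offset) mod
    num_clusters. Rotation is an assumed remapping, not the thesis's scheme.
    """
    offset %= num_clusters
    full = (1 << num_clusters) - 1
    return ((mask << offset) | (mask >> (num_clusters - offset))) & full


def conflicts(mask_a: int, mask_b: int) -> bool:
    """Two instructions conflict if they need at least one common cluster."""
    return (mask_a & mask_b) != 0


# Hypothetical case: two threads both want cluster 0 of a 4-cluster machine.
a = 0b0001
b = 0b0001
print(conflicts(a, b))                         # True: cannot be merged
print(conflicts(a, rename_clusters(b, 1, 4)))  # False: thread b moved to cluster 1
```

Giving each thread a different fixed offset is one simple way to spread threads across clusters; other remappings would fit the same description.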