Search CORE

64,038 research outputs found

Throughput-driven floorplanning with wire pipelining

Author: Casu Mario Roberto
Macchiarulo Luca
Publication venue: IEEE
Publication date: 01/01/2005
Field of study

The size of future high-performance SoC is such that the time-of-flight of wires connecting distant pins in the layout can be much higher than the clock period. In order to keep the frequency as high as possible, the wires may be pipelined. However, the insertion of flip-flops may alter the throughput of the system due to the presence of loops in the logic netlist. In this paper, we address the problem of floorplanning a large design where long interconnects are pipelined by inserting the throughput in the cost function of a tool based on simulated annealing. The results obtained on a series of benchmarks are then validated using a simple router that breaks long interconnects by suitably placing flip-flops along the wires

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Physical Design and Clock Tree Synthesis Methods For A 8-Bit Processor

Author: Daxiniray Debaprasad
Publication venue
Publication date: 01/01/2016
Field of study

Now days a number of processors are available with a lot kind of feature from different industries. A processor with similar kind of architecture of the current processors only missing the memory stuffs like the RAM and ROM has been designed here with the help of Verilog style of coding. This processor contains architecturally the program counter, instruction register, ALU, ALU latch, General Purpose Registers, control state module, flag registers and the core module containing all the modules. And a test module is designed for testing the processor. After the design of the processor with successful functionality, the processor is synthesized with 180nm technology. The synthesis is performed with the data path optimization like the selection of proper adders and multipliers for timing optimization in the data path while the ALU operations are performed. During synthesis how to take care of the worst negative slack (WNS), how to include the clock gating cells, how to define the cost and path groups etc. have been covered. After the proper synthesis we get the proper net list and the synthesized constraint file for carrying out the physical design. In physical design the steps like floor-planning, partitioning, placement, legalization of the placement, clock tree synthesis, and routing etc. have been performed. At all the stages the static timing analysis is performed for the timing meet of the design for better performance in terms of timing or frequency. Each steps of physical design are discussed with special effort towards the concepts behind the step. Out of all the steps of physical design the clock tree synthesis is performed with some improvement in the performance of the clock tree by creating a symmetrical clock tree and maintaining more common clock paths. A special algorithm has been framed for creating a symmetrical clock tree and thereby making the power consumption of the clock tree low

ethesis@nitr

Temperature Dependent Wire Delay Estimation in Floorplanning

Author: Liu Wei
Nannarelli Alberto
Vrudhula Sarma
Winther Andreas Thor
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011
Field of study

Crossref

Online Research Database In Technology

A novel scan segmentation design method for avoiding shift timing failure in scan testing

Author: Kajihara Seiji
Michael A. Kochte
Miyase Kohei
Wang Laung-Terng
Wen Xiaoqing
Yamato Yuta
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2011
Field of study

ITC : 2011 IEEE International Test Conference , 20-22 Sep. 2011 , Anaheim, CA, USAHigh power consumption in scan testing can cause undue yield loss which has increasingly become a serious problem for deep-submicron VLSI circuits. Growing evidence attributes this problem to shift timing failures, which are primarily caused by excessive switching activity in the proximities of clock paths that tends to introduce severe clock skew due to IR-drop-induced delay increase. This paper is the first of its kind to address this critical issue with a novel layout-aware scheme based on scan segmentation design, called LCTI-SS (Low-Clock-Tree-Impact Scan Segmentation). An optimal combination of scan segments is identified for simultaneous clocking so that the switching activity in the proximities of clock trees is reduced while maintaining the average power reduction effect on conventional scan segmentation. Experimental results on benchmark and industrial circuits have demonstrated the advantage of the LCTI-SS scheme

NAIST Academic Repository

Clock Distribution Network Building Algorithm For Multiple Ips In System On A Chip

Author: Tan Tze Liang
Publication venue
Publication date: 01/01/2017
Field of study

In this project, an algorithm is proposed and developed to build the global CDN that is used to distribute the clocks to all partitions in the SoC using the channels available between partitions. The conventional method of building the global CDN involves manual interventions which decrease the global CDN building efficiency and increase the overall SoC design cycle. To solve this issue, an algorithm is proposed to automate the global CDN building process and at the same time obtain a balanced overall CDN not achieved by the conventional method. Other researches have proposed different CDN structures to simplify the design process but the proposals often sacrifice placement resources to achieve this. The algorithm first collects the partition clock latency numbers and other constraints needed as setup. When the setup is done, the global CDN is build and routed. The algorithm checks for clock skew and scenic routing issues before proceeding to shield the global CDN to prevent cross-talk issues. The algorithm is done when a final checking on the clock skew is done. The algorithm is tested on two different floorplans with varying size and available channels using three different clocks for each floorplan to ensure the accuracy of the algorithm. Finally, the global CDN build using the algorithm is evaluated based on the time needed to build the global CDN and the clock buffer numbers and areas used. The algorithm is shown to be able to reduce 50% of the global CDN design cycle and save 5% of clock buffer numbers and areas. The improvement achieved by the algorithm in this project shows the efficiency in designing the global CDN improved tremendously compared to conventional method

Repository@USM

Adaptive Latency Insensitive Protocols

Author: Casu Mario Roberto
Macchiarulo Luca
Publication venue: IEEE
Publication date: 01/01/2007
Field of study

Latency-insensitive design copes with excessive delays typical of global wires in current and future IC technologies. It achieves its goal via encapsulation of synchronous logic blocks in wrappers that communicate through a latency-insensitive protocol (LIP) and pipelined interconnects. Previously proposed solutions suffer from an excessive performance penalty in terms of throughput or from a lack of generality. This article presents an adaptive LIP that outperforms previous static implementations, as demonstrated by two relevant cases — a microprocessor and an MPEG encoder — whose components we made insensitive to the latencies of their interconnections through a newly developed wrapper. We also present an informal exposition of the theoretical basis of adaptive LIPs, as well as implementation detail

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Course grained low power design flow using UPF

Author: Varanasi Archana
Publication venue: RIT Scholar Works
Publication date: 01/08/2009
Field of study

Increased system complexity has led to the substitution of the traditional bottom-up design flow by systematic hierarchical design flow. The main motivation behind the evolution of such an approach is the increasing difficulty in hardware realization of complex systems. With decreasing channel lengths, few key problems such as timing closure, design sign-off, routing complexity, signal integrity, and power dissipation arise in the design flows. Specifically, minimizing power dissipation is critical in several high-end processors. In high-end processors, the design complexity contributes to the overall dynamic power while the decreasing transistor size results in static power dissipation. This research aims at optimizing the design flow for power and timing using the unified power format (UPF). UPF provides a strategic format to specify power-aware design information at every stage in the flow. The low power reduction techniques enforced in this research are multi-voltage, multi-threshold voltage (Vth), and power gating with state retention. An inherent design challenge addressed in this research is the choice of power optimization techniques as the flow advances from synthesis to physical design. A top-down digital design flow for a 32 bit MIPS RISC processor has been implemented with and without UPF synthesis flow for 65nm technology. The UPF synthesis is implemented with two voltages, 1.08V and 0.864V (Multi-VDD). Area, power and timing metrics are analyzed for the flows developed. Power savings of about 20 % are achieved in the design flow with \u27multi-threshold\u27 power technique compared to that of the design flow with no low power techniques employed. Similarly, 30 % power savings are achieved in the design flow with the UPF implemented when compared to that of the design flow with \u27multi-threshold\u27 power technique employed. Thus, a cumulative power savings of 42% has been achieved in a complete power efficient design flow (UPF) compared to that of the generic top-down standard flow with no power saving techniques employed. This is substantiated by the low voltage operation of modules in the design, reduction in clock switching power by gating clocks in the design and extensive use of HVT and LVT standard cells for implementation. The UPF synthesis flow saw the worst timing slack and more area when compared to those of the `multi-threshold\u27 or the generic flow. Percentage increase in the area with UPF is approximately 15%; a significant source for this increase being the additional power controlling logic added

RIT Scholar Works