2,911 research outputs found

    Standard Transistor Array (STAR). Volume 1: Placement technique

    Get PDF
    A large scale integration (LSI) technology, the standard transistor array uses a prefabricated understructure of transistors and a comprehensive library of digital logic cells to allow efficient fabrication of semicustom digital LSI circuits. The cell placement technique for this technology involves formation of a one dimensional cell layout and "folding" of the one dimensional placement onto the chip. It was found that, by use of various folding methods, high quality chip layouts can be achieved. Methods developed to measure of the "goodness" of the generated placements include efficient means for estimating channel usage requirements and for via counting. The placement and rating techniques were incorporated into a placement program (CAPSTAR). By means of repetitive use of the folding methods and simple placement improvement strategies, this program provides near optimum placements in a reasonable amount of time. The program was tested on several typical LSI circuits to provide performance comparisons both with respect to input parameters and with respect to the performance of other placement techniques. The results of this testing indicate that near optimum placements can be achieved by use of the procedures incurring severe time penalties

    Impact of Complex-Logic Cell Layout on the Single-Event Transient Sensitivity

    Get PDF
    International audienceThe design methodology based on standard cells is widely used in a broad range of VLSI applications. Further, several optimization algorithms can be employed to address different constraints such as power consumption or reliability. This work evaluates the implications of the usage of complex-logic cells from a 45 nm Standard-Cell library to the Single-Event Transient sensitivity under heavy ions. Results show that even though a reduction in the layout area is obtained when adopting complex-logic gates, a slight reduction in the total sensitive area of the circuit is observed. Moreover, the effectiveness of logical masking can be suppressed, leading to a higher SET cross-section

    Exploiting Application Behaviors for Resilient Static Random Access Memory Arrays in the Near-Threshold Computing Regime

    Get PDF
    Near-Threshold Computing embodies an intriguing choice for mobile processors due to the promise of superior energy efficiency, extending the battery life of these devices while reducing the peak power draw. However, process, voltage, and temperature variations cause a significantly high failure rate of Level One cache cells in the near-threshold regime a stark contrast to designs in the super-threshold regime, where fault sites are rare. This thesis work shows that faulty cells in the near-threshold regime are highly clustered in certain regions of the cache. In addition, popular mobile benchmarks are studied to investigate the impact of run-time workloads on timing faults manifestation. A technique to mitigate the run-time faults is proposed. This scheme maps frequently used data to healthy cache regions by exploiting the application cache behaviors. The results show up to 78% gain in performance over two other state-of-the-art techniques

    SCALABLE INTEGRATED CIRCUIT SIMULATION ALGORITHMS FOR ENERGY-EFFICIENT TERAFLOP HETEROGENEOUS PARALLEL COMPUTING PLATFORMS

    Get PDF
    Integrated circuit technology has gone through several decades of aggressive scaling.It is increasingly challenging to analyze growing design complexity. Post-layout SPICE simulation can be computationally prohibitive due to the huge amount of parasitic elements, which can easily boost the computation and memory cost. As the decrease in device size, the circuits become more vulnerable to process variations. Designers need to statistically simulate the probability that a circuit does not meet the performance metric, which requires millions times of simulations to capture rare failure events. Recent, multiprocessors with heterogeneous architecture have emerged as mainstream computing platforms. The heterogeneous computing platform can achieve highthroughput energy efficient computing. However, the application of such platform is not trivial and needs to reinvent existing algorithms to fully utilize the computing resources. This dissertation presents several new algorithms to address those aforementioned two significant and challenging issues on the heterogeneous platform. Harmonic Balance (HB) analysis is essential for efficient verification of large postlayout RF and microwave integrated circuits (ICs). However, existing methods either suffer from excessively long simulation time and prohibitively large memory consumption or exhibit poor stability. This dissertation introduces a novel transient-simulation guided graph sparsification technique, as well as an efficient runtime performance modeling approach tailored for heterogeneous manycore CPU-GPU computing system to build nearly-optimal subgraph preconditioners that can lead to minimum HB simulation runtime. Additionally, we propose a novel heterogeneous parallel sparse block matrix algorithm by taking advantages of the structure of HB Jacobian matrices as well as GPU’s streaming multiprocessors to achieve optimal workload balancing during the preconditioning phase of HB analysis. We also show how the proposed preconditioned iterative algorithm can efficiently adapt to heterogeneous computing systems with different CPU and GPU computing capabilities. Extensive experimental results show that our HB solver can achieve up to 20X speedups and 5X memory reduction when compared with the state-of-the-art direct solver highly optimized for twelve-core CPUs. In nowadays variation-aware IC designs, cell characterizations and SRAM memory yield analysis require many thousands or even millions of repeated SPICE simulations for relatively small nonlinear circuits. In this dissertation, for the first time, we present a massively parallel SPICE simulator on GPU, TinySPICE, for efficiently analyzing small nonlinear circuits. TinySPICE integrates a highly-optimized shared-memory based matrix solver and fast parametric three-dimensional (3D) LUTs based device evaluation method. A novel circuit clustering method is also proposed to improve the stability and efficiency of the matrix solver. Compared with CPU-based SPICE simulator, TinySPICE achieves up to 264X speedups for parametric SRAM yield analysis without loss of accuracy

    VLSI Design

    Get PDF
    This book provides some recent advances in design nanometer VLSI chips. The selected topics try to present some open problems and challenges with important topics ranging from design tools, new post-silicon devices, GPU-based parallel computing, emerging 3D integration, and antenna design. The book consists of two parts, with chapters such as: VLSI design for multi-sensor smart systems on a chip, Three-dimensional integrated circuits design for thousand-core processors, Parallel symbolic analysis of large analog circuits on GPU platforms, Algorithms for CAD tools VLSI design, A multilevel memetic algorithm for large SAT-encoded problems, etc

    A Low-Voltage, Low-Power 4-bit BCD Adder, designed using the Clock Gated Power Gating, and the DVT Scheme

    Full text link
    This paper proposes a Low-Power, Energy Efficient 4-bit Binary Coded Decimal (BCD) adder design where the conventional 4-bit BCD adder has been modified with the Clock Gated Power Gating Technique. Moreover, the concept of DVT (Dual-vth) scheme has been introduced while designing the full adder blocks to reduce the Leakage Power, as well as, to maintain the overall performance of the entire circuit. The reported architecture of 4-bit BCD adder is designed using 45 nm technology and it consumes 1.384 {\mu}Watt of Average Power while operating with a frequency of 200 MHz, and a Supply Voltage (Vdd) of 1 Volt. The results obtained from different simulation runs on SPICE, indicate the superiority of the proposed design compared to the conventional 4-bit BCD adder. Considering the product of Average Power and Delay, for the operating frequency of 200 MHz, a fair 47.41 % reduction compared to the conventional design has been achieved with this proposed scheme.Comment: To appear in the proceedings of 2013 IEEE International Conference on Signal Processing, Computing and Control (ISPCC,13
    • …
    corecore