Search CORE

653 research outputs found

Programmable rate modem utilizing digital signal processing techniques

Author: Bunya George K.
Wallace Robert L.
Publication venue
Publication date
Field of study

The engineering development study to follow was written to address the need for a Programmable Rate Digital Satellite Modem capable of supporting both burst and continuous transmission modes with either binary phase shift keying (BPSK) or quadrature phase shift keying (QPSK) modulation. The preferred implementation technique is an all digital one which utilizes as much digital signal processing (DSP) as possible. Here design tradeoffs in each portion of the modulator and demodulator subsystem are outlined, and viable circuit approaches which are easily repeatable, have low implementation losses and have low production costs are identified. The research involved for this study was divided into nine technical papers, each addressing a significant region of concern in a variable rate modem design. Trivial portions and basic support logic designs surrounding the nine major modem blocks were omitted. In brief, the nine topic areas were: (1) Transmit Data Filtering; (2) Transmit Clock Generation; (3) Carrier Synthesizer; (4) Receive AGC; (5) Receive Data Filtering; (6) RF Oscillator Phase Noise; (7) Receive Carrier Selectivity; (8) Carrier Recovery; and (9) Timing Recovery

NASA Technical Reports Server

Hardware, Software and Data Analysis Techniques for SRAM-Based Field Programmable Gate Array Circuits

Author: Hockenberry Eugene B.
Publication venue: AFIT Scholar
Publication date: 01/01/2008
Field of study

This work presents a built, tested, and demonstrated test structure that is low-cost, flexible, and re-usable for robust radiation experimentation, primarily to investigate memory, in this case SRAMs and SRAM-based FPGAs. The space environment can induce many kinds of failures due to radiation effects. These failures result in a loss of money, time, intelligence, and information. In order to evaluate technologies for potential failures, a detailed test methodology and associated structure are required. In this solution, an FPGA board was used as the controller platform, with multiple VHDL circuit controllers, data collection and reporting modules. The structure was demonstrated by programming an SRAM-based FPGA board as the device under test (DUT) with various types of adders, counters and RAM modules. The controllers, hardware, and data collection operations were tested and validated using gamma radiation from a Co-60 source at the Ohio State University Nuclear Reactor to irradiate the DUT. The test structure is easily modified to allow for a broad range of experiments on the same DUT. In addition, this structure is easily adaptable for other memory types, such as DRAM, FlashRam, and MRAM. These additions will be discussed further in this document. The system fits in a backpack and costs less than $1000

AFTI Scholar (Air Force Institute of Technology)

CiteSeerX

Timing speculation and adaptive reliable overclocking techniques for aggressive computer systems

Author: Subramanian Viswanathan
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2009
Field of study

Computers have changed our lives beyond our own imagination in the past several decades. The continued and progressive advancements in VLSI technology and numerous micro-architectural innovations have played a key role in the design of spectacular low-cost high performance computing systems that have become omnipresent in today\u27s technology driven world. Performance and dependability have become key concerns as these ubiquitous computing machines continue to drive our everyday life. Every application has unique demands, as they run in diverse operating environments. Dependable, aggressive and adaptive systems improve efficiency in terms of speed, reliability and energy consumption. Traditional computing systems run at a fixed clock frequency, which is determined by taking into account the worst-case timing paths, operating conditions, and process variations. Timing speculation based reliable overclocking advocates going beyond worst-case limits to achieve best performance while not avoiding, but detecting and correcting a modest number of timing errors. The success of this design methodology relies on the fact that timing critical paths are rarely exercised in a design, and typical execution happens much faster than the timing requirements dictated by worst-case design methodology. Better-than-worst-case design methodology is advocated by several recent research pursuits, which exploit dependability techniques to enhance computer system performance. In this dissertation, we address different aspects of timing speculation based adaptive reliable overclocking schemes, and evaluate their role in the design of low-cost, high performance, energy efficient and dependable systems. We visualize various control knobs in the design that can be favorably controlled to ensure different design targets. As part of this research, we extend the SPRIT3E, or Superscalar PeRformance Improvement Through Tolerating Timing Errors, framework, and characterize the extent of application dependent performance acceleration achievable in superscalar processors by scrutinizing the various parameters that impact the operation beyond worst-case limits. We study the limitations imposed by short-path constraints on our technique, and present ways to exploit them to maximize performance gains. We analyze the sensitivity of our technique\u27s adaptiveness by exploring the necessary hardware requirements for dynamic overclocking schemes. Experimental analysis based on SPEC2000 benchmarks running on a SimpleScalar Alpha processor simulator, augmented with error rate data obtained from hardware simulations of a superscalar processor, are presented. Even though reliable overclocking guarantees functional correctness, it leads to higher power consumption. As a consequence, reliable overclocking without considering on-chip temperatures will bring down the lifetime reliability of the chip. In this thesis, we analyze how reliable overclocking impacts the on-chip temperature of a microprocessor and evaluate the effects of overheating, due to such reliable dynamic frequency tuning mechanisms, on the lifetime reliability of these systems. We then evaluate the effect of performing thermal throttling, a technique that clamps the on-chip temperature below a predefined value, on system performance and reliability. Our study shows that a reliably overclocked system with dynamic thermal management achieves 25% performance improvement, while lasting for 14 years when being operated within 353K. Over the past five decades, technology scaling, as predicted by Moore\u27s law, has been the bedrock of semiconductor technology evolution. The continued downscaling of CMOS technology to deep sub-micron gate lengths has been the primary reason for its dominance in today\u27s omnipresent silicon microchips. Even as the transition to the next technology node is indispensable, the initial cost and time associated in doing so presents a non-level playing field for the competitors in the semiconductor business. As part of this thesis, we evaluate the capability of speculative reliable overclocking mechanisms to maximize performance at a given technology level. We evaluate its competitiveness when compared to technology scaling, in terms of performance, power consumption, energy and energy delay product. We present a comprehensive comparison for integer and floating point SPEC2000 benchmarks running on a simulated Alpha processor at three different technology nodes in normal and enhanced modes. Our results suggest that adopting reliable overclocking strategies will help skip a technology node altogether, or be competitive in the market, while porting to the next technology node. Reliability has become a serious concern as systems embrace nanometer technologies. In this dissertation, we propose a novel fault tolerant aggressive system that combines soft error protection and timing error tolerance. We replicate both the pipeline registers and the pipeline stage combinational logic. The replicated logic receives its inputs from the primary pipeline registers while writing its output to the replicated pipeline registers. The organization of redundancy in the proposed Conjoined Pipeline system supports overclocking, provides concurrent error detection and recovery capability for soft errors, intermittent faults and timing errors, and flags permanent silicon defects. The fast recovery process requires no checkpointing and takes three cycles. Back annotated post-layout gate-level timing simulations, using 45nm technology, of a conjoined two-stage arithmetic pipeline and a conjoined five-stage DLX pipeline processor, with forwarding logic, show that our approach, even under a severe fault injection campaign, achieves near 100% fault coverage and an average performance improvement of about 20%, when dynamically overclocked

Digital Repository @ Iowa State University (ISU)

Low Power Processor Architectures and Contemporary Techniques for Power Optimization – A Review

Author: Gujarathi Hemal S
McDonald-Maier Klaus D
Qadri Muhammad Yasir
Publication venue: 'Academy Publisher'
Publication date: 01/01/2009
Field of study

The technological evolution has increased the number of transistors for a given die area significantly and increased the switching speed from few MHz to GHz range. Such inversely proportional decline in size and boost in performance consequently demands shrinking of supply voltage and effective power dissipation in chips with millions of transistors. This has triggered substantial amount of research in power reduction techniques into almost every aspect of the chip and particularly the processor cores contained in the chip. This paper presents an overview of techniques for achieving the power efficiency mainly at the processor core level but also visits related domains such as buses and memories. There are various processor parameters and features such as supply voltage, clock frequency, cache and pipelining which can be optimized to reduce the power consumption of the processor. This paper discusses various ways in which these parameters can be optimized. Also, emerging power efficient processor architectures are overviewed and research activities are discussed which should help reader identify how these factors in a processor contribute to power consumption. Some of these concepts have been already established whereas others are still active research areas. © 2009 ACADEMY PUBLISHER

University of Essex Research Repository

CiteSeerX

Crossref

Development of an image converter of radical design

Author: Farnsworth D. L.
Irwin E. L.
Publication venue
Publication date
Field of study

A long term investigation of thin film sensors, monolithic photo-field effect transistors, and epitaxially diffused phototransistors and photodiodes to meet requirements to produce acceptable all solid state, electronically scanned imaging system, led to the production of an advanced engineering model camera which employs a 200,000 element phototransistor array (organized in a matrix of 400 rows by 500 columns) to secure resolution comparable to commercial television. The full investigation is described for the period July 1962 through July 1972, and covers the following broad topics in detail: (1) sensor monoliths; (2) fabrication technology; (3) functional theory; (4) system methodology; and (5) deployment profile. A summary of the work and conclusions are given, along with extensive schematic diagrams of the final solid state imaging system product

NASA Technical Reports Server

Heuristics Based Test Overhead Reduction Techniques in VLSI Circuits

Author: Chakraborty Avijit
Publication venue
Publication date: 07/02/2023
Field of study

The electronic industry has evolved at a mindboggling pace over the last five decades. Moore’s Law [1] has enabled the chip makers to push the limits of the physics to shrink the feature sizes on Silicon (Si) wafers over the years. A constant push for power-performance-area (PPA) optimization has driven the higher transistor density trends. The defect density in advanced process nodes has posed a challenge in achieving sustainable yield. Maintaining a low Defect-per-Million (DPM) target for a product to be viable with stringent Time-to-Market (TTM) has become one of the most important aspects of the chip manufacturing process. Design-for-Test (DFT) plays an instrumental role in enabling low DPM. DFT however impacts the PPA of a chip. This research describes an approach of minimizing the scan test overhead in a chip based on circuit topology heuristics. These heuristics are applied on a full-scan design to convert a subset of the scan flip-flops (SFF) into D flip-flops (DFF). The K Longest Path per Gate (KLPG) [2] automatic test pattern generation (ATPG) algorithm is used to generate tests for robust paths in the circuit. Observability driven multi cycle path generation [3][4] and test are used in this work to minimize coverage loss caused by the SFF conversion process. The presence of memory arrays in a design exacerbates the coverage loss due to the shadow cast by the array on its neighboring logic. A specialized behavioral modeling for the memory array is required to enable test coverage of the shadow logic. This work develops a memory model integrated into the ATPG engine for this purpose. Multiple clock domains pose challenges in the path generation process. The inter-domain clocking relationship and corresponding logic sensitization are modeled in our work to generate synchronous inter-domain paths over multiple clock cycles. Results are demonstrated on ISCAS89 and ITC99 benchmark circuits. Power saving benefit is quantified using an open-source standard-cell library

Texas A&M Repository

Radiation Hardened by Design Methodologies for Soft-Error Mitigated Digital Architectures

Author
Publication venue
Publication date: 01/01/2017
Field of study

abstract: Digital architectures for data encryption, processing, clock synthesis, data transfer, etc. are susceptible to radiation induced soft errors due to charge collection in complementary metal oxide semiconductor (CMOS) integrated circuits (ICs). Radiation hardening by design (RHBD) techniques such as double modular redundancy (DMR) and triple modular redundancy (TMR) are used for error detection and correction respectively in such architectures. Multiple node charge collection (MNCC) causes domain crossing errors (DCE) which can render the redundancy ineffectual. This dissertation describes techniques to ensure DCE mitigation with statistical confidence for various designs. Both sequential and combinatorial logic are separated using these custom and computer aided design (CAD) methodologies. Radiation vulnerability and design overhead are studied on VLSI sub-systems including an advanced encryption standard (AES) which is DCE mitigated using module level coarse separation on a 90-nm process with 99.999% DCE mitigation. A radiation hardened microprocessor (HERMES2) is implemented in both 90-nm and 55-nm technologies with an interleaved separation methodology with 99.99% DCE mitigation while achieving 4.9% increased cell density, 28.5 % reduced routing and 5.6% reduced power dissipation over the module fences implementation. A DMR register-file (RF) is implemented in 55 nm process and used in the HERMES2 microprocessor. The RF array custom design and the decoders APR designed are explored with a focus on design cycle time. Quality of results (QOR) is studied from power, performance, area and reliability (PPAR) perspective to ascertain the improvement over other design techniques. A radiation hardened all-digital multiplying pulsed digital delay line (DDL) is designed for double data rate (DDR2/3) applications for data eye centering during high speed off-chip data transfer. The effect of noise, radiation particle strikes and statistical variation on the designed DDL are studied in detail. The design achieves the best in class 22.4 ps peak-to-peak jitter, 100-850 MHz range at 14 pJ/cycle energy consumption. Vulnerability of the non-hardened design is characterized and portions of the redundant DDL are separated in custom and auto-place and route (APR). Thus, a range of designs for mission critical applications are implemented using methodologies proposed in this work and their potential PPAR benefits explored in detail.Dissertation/ThesisDoctoral Dissertation Electrical Engineering 201

ASU Digital Repository

Advanced Timing and Synchronization Methodologies for Digital VLSI Integrated Circuits

Author: Taskin Baris
Publication venue
Publication date: 01/01/2005
Field of study

This dissertation addresses timing and synchronization methodologies that are critical to the design, analysis and optimization of high-performance, integrated digital VLSI systems. As process sizes shrink and design complexities increase, achieving timing closure for digital VLSI circuits becomes a significant bottleneck in the integrated circuit design flow. Circuit designers are motivated to investigate and employ alternative methods to satisfy the timing and physical design performance targets. Such novel methods for the timing and synchronization of complex circuitry are developed in this dissertation and analyzed for performance and applicability.Mainstream integrated circuit design flow is normally tuned for zero clock skew, edge-triggered circuit design. Non-zero clock skew or multi-phase clock synchronization is seldom used because the lack of design automation tools increases the length and cost of the design cycle. For similar reasons, level-sensitive registers have not become an industry standard despite their superior size, speed and power consumption characteristics compared to conventional edge-triggered flip-flops.In this dissertation, novel design and analysis techniques that fully automate the design and analysis of non-zero clock skew circuits are presented. Clock skew scheduling of both edge-triggered and level-sensitive circuits are investigated in order to exploit maximum circuit performances. The effects of multi-phase clocking on non-zero clock skew, level-sensitive circuits are investigated leading to advanced synchronization methodologies. Improvements in the scalability of the computational timing analysis process with clock skew scheduling are explored through partitioning and parallelization.The integration of the proposed design and analysis methods to the physical design flow of integrated circuits synchronized with a next-generation clocking technology-resonant rotary clocking technology-is also presented. Based on the design and analysis methods presented in this dissertation, a computer-aided design tool for the design of rotary clock synchronized integrated circuits is developed

CiteSeerX

D-Scholarship@Pitt

High Peformance and Low Power On-Die Interconnect Fabrics.

Author: Satpathy Sudhir Kumar
Publication venue
Publication date
Field of study

Increasing power density with technology scaling has caused stagnation in operating frequency of modern day microprocessors. This has led designers to prefer multicore architectures over complex monolithic processors to keep up with the demand for rising computing throughput. Although processing units are getting smaller and simpler, the dramatic rise of their count on a single die has made the fabric that connects these processing units increasingly complex. These interconnect fabrics have become a bottleneck in improving overall system effciency. As a result, the design paradigm for multi-core chips is gradually shifting from a core-centric architecture towards an interconnect-centric architecture, where system efficiency is limited by the fabric rather than the processing ability of any individual core. This dissertation introduces three novel and synergistic circuit techniques to improve scalability of switch fabrics to make on-die integration of hundreds to thousands of cores feasible. 1) A matrix topology is proposed for designing a fully connected switch fabric that re-uses output buses for programming, and stores shue congurations at cross points. This significantly reduces routing congestion, lowers area/power, and improves per- formance. Silicon measurements demonstrate 47% energy savings in a 64-lane SIMD processor fabricated in 65nm CMOS over a conventional implementation. 2) A novel approach to handle high radix arbitration along with data routing is proposed. It optimally uses existing cross-bar interconnect resources without requiring any additional overhead. Bandwidth exceeding 2Tb/s is recorded in a test prototype fabricated in 65nm. 3) Building on the later, a new circuit topology to manage and update priority adaptively within the switch fabric without incurring additional delay or area is then proposed. Several assist circuit techniques, such as a thyristor based sense amplifier and self regenerating bi-directional repeaters are proposed for high speed energy efficient signaling to and from the switch fabric to improve overall routing efficiency. Using these techniques a 64 x 64 switch fabric with 128b data bus fabricated in 45nm achieves a throughput of 4.5Tb/s at single cycle latency while operating at 559MHz.Ph.D.Electrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/91506/1/sudhirks_1.pd

Deep Blue Documents at the University of Michigan

Gallium arsenide bit-serial integrated circuits

Author: Lam S.C.K.
Publication venue: The University of Edinburgh
Publication date: 01/01/1990
Field of study

Edinburgh Research Archive