Search CORE

530 research outputs found

Recommended from our members

Investigation into the wafer-scale integration of fine-grain parallel processing computer systems

Author: Jones Simon Richard
Publication venue: Brunel University
Publication date: 01/01/1986
Field of study

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.This thesis investigates the potential of wafer-scale integration (WSI) for the implementation of low-cost fine-grain parallel processing computer systems. As WSI is a relatively new subject, there was little work on which to base investigations. Indeed, most WSI architectures existed only as untried and sometimes vague proposals. Accordingly, the research strategy approached this problem by identifying a representative WSI structure and architecture on which to base investigations. An analysis of architectural proposals identified associative memory to be general purpose parallel processing component used in a wide range of WSI architectures. Furthermore, this analysis provided a set of WSI-level design requirements to evaluate the sustainability of different architectures as research vehicles. The WSI-ASP (WASP) device, which has a large associative memory as its main component is shown to meet these requirements and hence was chosen as the research vehicle. Consequently, this thesis addresses WSI potential through an in-depth investigation into the feasibility of implementing a large associative memory for the WASP device that meets the demanding technological constraints of WSI. Overall, the thesis concludes that WSI offers significant potential for the implementation of low-cost fine-grain parallel processing computer systems. However, due to the dual constraints of thermal management and the area required for the power distribution network, power density is a major design constraint in WSI. Indeed, it is shown that WSI power densities need to be an order of magnitude lower than VLSI power densities. The thesis demonstrates that for associative memories at least, VLSI designs are unsuited to implementation in WSI. Rather, it is shown that WSI circuits must be closely matched to the operational environment to assure suitable power densities. These circuits are significantly larger than their VLSI equivalents. Nonetheless, the thesis demonstrates that by concentrating on the most power intensive circuits, it is possible to achieve acceptable power densities with only a modest increase in area overheads.SER

Brunel University Research Archive

Reconfigurable architecture for very large scale microelectronic systems

Author: Chen Wei
Publication venue: The University of Edinburgh
Publication date: 01/01/1986
Field of study

Edinburgh Research Archive

VLSI architectures for high speed Fourier transform processing

Author: Mactaggart Iain Ross
Publication venue: The University of Edinburgh
Publication date: 01/01/1984
Field of study

Edinburgh Research Archive

Cross-Layer Optimization for Power-Efficient and Robust Digital Circuits and Systems

Author: Huang Yanxiang
Publication venue
Publication date: 15/09/2017
Field of study

With the increasing digital services demand, performance and power-efficiency become vital requirements for digital circuits and systems. However, the enabling CMOS technology scaling has been facing significant challenges of device uncertainties, such as process, voltage, and temperature variations. To ensure system reliability, worst-case corner assumptions are usually made in each design level. However, the over-pessimistic worst-case margin leads to unnecessary power waste and performance loss as high as 2.2x. Since optimizations are traditionally confined to each specific level, those safe margins can hardly be properly exploited. To tackle the challenge, it is therefore advised in this Ph.D. thesis to perform a cross-layer optimization for digital signal processing circuits and systems, to achieve a global balance of power consumption and output quality. To conclude, the traditional over-pessimistic worst-case approach leads to huge power waste. In contrast, the adaptive voltage scaling approach saves power (25% for the CORDIC application) by providing a just-needed supply voltage. The power saving is maximized (46% for CORDIC) when a more aggressive voltage over-scaling scheme is applied. These sparsely occurred circuit errors produced by aggressive voltage over-scaling are mitigated by higher level error resilient designs. For functions like FFT and CORDIC, smart error mitigation schemes were proposed to enhance reliability (soft-errors and timing-errors, respectively). Applications like Massive MIMO systems are robust against lower level errors, thanks to the intrinsically redundant antennas. This property makes it applicable to embrace digital hardware that trades quality for power savings.Comment: 190 page

arXiv.org e-Print Archive

Lirias

On Energy Efficient Computing Platforms

Author: Yin Alexander Wei
Publication venue: Turku Centre for Computer Science
Publication date: 28/08/2013
Field of study

In accordance with the Moore's law, the increasing number of on-chip integrated transistors has enabled modern computing platforms with not only higher processing power but also more affordable prices. As a result, these platforms, including portable devices, work stations and data centres, are becoming an inevitable part of the human society. However, with the demand for portability and raising cost of power, energy efficiency has emerged to be a major concern for modern computing platforms. As the complexity of on-chip systems increases, Network-on-Chip (NoC) has been proved as an efficient communication architecture which can further improve system performances and scalability while reducing the design cost. Therefore, in this thesis, we study and propose energy optimization approaches based on NoC architecture, with special focuses on the following aspects. As the architectural trend of future computing platforms, 3D systems have many bene ts including higher integration density, smaller footprint, heterogeneous integration, etc. Moreover, 3D technology can signi cantly improve the network communication and effectively avoid long wirings, and therefore, provide higher system performance and energy efficiency. With the dynamic nature of on-chip communication in large scale NoC based systems, run-time system optimization is of crucial importance in order to achieve higher system reliability and essentially energy efficiency. In this thesis, we propose an agent based system design approach where agents are on-chip components which monitor and control system parameters such as supply voltage, operating frequency, etc. With this approach, we have analysed the implementation alternatives for dynamic voltage and frequency scaling and power gating techniques at different granularity, which reduce both dynamic and leakage energy consumption. Topologies, being one of the key factors for NoCs, are also explored for energy saving purpose. A Honeycomb NoC architecture is proposed in this thesis with turn-model based deadlock-free routing algorithms. Our analysis and simulation based evaluation show that Honeycomb NoCs outperform their Mesh based counterparts in terms of network cost, system performance as well as energy efficiency.Siirretty Doriast

UTUPub

Advanced space communications architecture study. Volume 2: Technical report

Author: Hadinger Peter J.
Horstein Michael
Publication venue
Publication date
Field of study

The technical feasibility and economic viability of satellite system architectures that are suitable for customer premise service (CPS) communications are investigated. System evaluation is performed at 30/20 GHz (Ka-band); however, the system architectures examined are equally applicable to 14/11 GHz (Ku-band). Emphasis is placed on systems that permit low-cost user terminals. Frequency division multiple access (FDMA) is used on the uplink, with typically 10,000 simultaneous accesses per satellite, each of 64 kbps. Bulk demodulators onboard the satellite, in combination with a baseband multiplexer, convert the many narrowband uplink signals into a small number of wideband data streams for downlink transmission. Single-hop network interconnectivity is accomplished via downlink scanning beams. Each satellite is estimated to weigh 5600 lb and consume 6850W of power; the corresponding payload totals are 1000 lb and 5000 W. Nonrecurring satellite cost is estimated at

110 million, with the first-unit cost at

113 million. In large quantities, the user terminal cost estimate is $25,000. For an assumed traffic profile, the required system revenue has been computed as a function of the internal rate of return (IRR) on invested capital. The equivalent user charge per-minute of 64-kbps channel service has also been determined

NASA Technical Reports Server

Applications for FPGA's on Nanosatellites

Author: Thai Thong
Publication venue
Publication date: 26/01/2015
Field of study

This thesis examines the feasibility of using a Field Programmable Gate Array (FPGA) based design on-board a CubeSat-sized nanosatellite. FPGAs are programmable logic devices that allow for the implementation of custom digital hardware on a single Integrated Circuit (IC). By using these FPGAs in spacecraft, more efficient processing can be done by moving the design onto hardware. A variety of different FPGA-based designs are looked at, including a Watchdog Timer (WDT), a Global Positioning System (GPS) receiver, and a camera interface

YorkSpace

Innovative Techniques for Testing and Diagnosing SoCs

Author: DE CARVALHO Mauricio
Publication venue: Politecnico di Torino
Publication date: 01/01/2015
Field of study

We rely upon the continued functioning of many electronic devices for our everyday welfare, usually embedding integrated circuits that are becoming even cheaper and smaller with improved features. Nowadays, microelectronics can integrate a working computer with CPU, memories, and even GPUs on a single die, namely System-On-Chip (SoC). SoCs are also employed on automotive safety-critical applications, but need to be tested thoroughly to comply with reliability standards, in particular the ISO26262 functional safety for road vehicles. The goal of this PhD. thesis is to improve SoC reliability by proposing innovative techniques for testing and diagnosing its internal modules: CPUs, memories, peripherals, and GPUs. The proposed approaches in the sequence appearing in this thesis are described as follows: 1. Embedded Memory Diagnosis: Memories are dense and complex circuits which are susceptible to design and manufacturing errors. Hence, it is important to understand the fault occurrence in the memory array. In practice, the logical and physical array representation differs due to an optimized design which adds enhancements to the device, namely scrambling. This part proposes an accurate memory diagnosis by showing the efforts of a software tool able to analyze test results, unscramble the memory array, map failing syndromes to cell locations, elaborate cumulative analysis, and elaborate a final fault model hypothesis. Several SRAM memory failing syndromes were analyzed as case studies gathered on an industrial automotive 32-bit SoC developed by STMicroelectronics. The tool displayed defects virtually, and results were confirmed by real photos taken from a microscope. 2. Functional Test Pattern Generation: The key for a successful test is the pattern applied to the device. They can be structural or functional; the former usually benefits from embedded test modules targeting manufacturing errors and is only effective before shipping the component to the client. The latter, on the other hand, can be applied during mission minimally impacting on performance but is penalized due to high generation time. However, functional test patterns may benefit for having different goals in functional mission mode. Part III of this PhD thesis proposes three different functional test pattern generation methods for CPU cores embedded in SoCs, targeting different test purposes, described as follows: a. Functional Stress Patterns: Are suitable for optimizing functional stress during I Operational-life Tests and Burn-in Screening for an optimal device reliability characterization b. Functional Power Hungry Patterns: Are suitable for determining functional peak power for strictly limiting the power of structural patterns during manufacturing tests, thus reducing premature device over-kill while delivering high test coverage c. Software-Based Self-Test Patterns: Combines the potentiality of structural patterns with functional ones, allowing its execution periodically during mission. In addition, an external hardware communicating with a devised SBST was proposed. It helps increasing in 3% the fault coverage by testing critical Hardly Functionally Testable Faults not covered by conventional SBST patterns. An automatic functional test pattern generation exploiting an evolutionary algorithm maximizing metrics related to stress, power, and fault coverage was employed in the above-mentioned approaches to quickly generate the desired patterns. The approaches were evaluated on two industrial cases developed by STMicroelectronics; 8051-based and a 32-bit Power Architecture SoCs. Results show that generation time was reduced upto 75% in comparison to older methodologies while increasing significantly the desired metrics. 3. Fault Injection in GPGPU: Fault injection mechanisms in semiconductor devices are suitable for generating structural patterns, testing and activating mitigation techniques, and validating robust hardware and software applications. GPGPUs are known for fast parallel computation used in high performance computing and advanced driver assistance where reliability is the key point. Moreover, GPGPU manufacturers do not provide design description code due to content secrecy. Therefore, commercial fault injectors using the GPGPU model is unfeasible, making radiation tests the only resource available, but are costly. In the last part of this thesis, we propose a software implemented fault injector able to inject bit-flip in memory elements of a real GPGPU. It exploits a software debugger tool and combines the C-CUDA grammar to wisely determine fault spots and apply bit-flip operations in program variables. The goal is to validate robust parallel algorithms by studying fault propagation or activating redundancy mechanisms they possibly embed. The effectiveness of the tool was evaluated on two robust applications: redundant parallel matrix multiplication and floating point Fast Fourier Transform

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Building Efficient and Reliable Emerging Technology Systems

Author: Amarnath Aporva
Publication venue
Publication date: 01/01/2021
Field of study

The semiconductor industry has been reaping the benefits of Moore’s law powered by Dennard’s voltage scaling for the past fifty years. However, with the end of Dennard scaling, silicon chip manufacturers are facing a widespread plateau in performance improvements. While the architecture community has focused its effort on exploring parallelism, such as with multi-core, many-core and accelerator-based systems, chip manufacturers have been forced to explore beyond-Moore technologies to improve performance while maintaining power density. Examples of such technologies include monolithic 3D integration, carbon nanotube transistors, tunneling-based transistors, spintronics and quantum computing. However, the infancy of the manufacturing process of these new technologies impedes their usage in commercial products. The goal of this dissertation is to combine both architectural and device-level efforts to provide solutions across the computing stack that can overcome the reliability concerns of emerging technologies. This allows for beyond-Moore systems to compete with highly optimized silicon-based processors, thus, enabling faster commercialization of such systems. This dissertation proposes the following key steps: (i) Multifaceted understanding and modeling of variation and yield issues that occur in emerging technologies, such as carbon nanotube transistors (CNFETs). (ii) Design of systems using suitable logic families such as pass transistor logic that provide high performance. (iii) Design of a multi-granular fault-tolerant reconfigurable architecture that enhances yield and performance. (iv) Design of a multi-technology, multi-accelerator heterogeneous system (v) Development of real-time constrained efficient workload scheduling mechanism for heterogeneous systems. This dissertation first presents the use of pass transistor logic family as an alternate to the CMOS logic family for CNFETs to improve performance. It explores various architectural design choices for CNFETs using pass transistor logic (PTL) to create an energy-efficient RISC-V processor. Our results show that while a CNFET RISC-V processor using CMOS logic achieves a 2.9x energy-delay product (EDP) improvement over a silicon design, using PTL along the critical path components of the processor can boost EDP improvement by 5x as well as reduce area by 17% over 16 nm silicon CMOS. This document further builds on providing fault-tolerant and yield enhancing solutions for emerging 3D integration compatible technologies in the context of CNFETs. The proposed framework can efficiently support high-variation technologies by providing protection against manufacturing defects at multiple granularities: module and pipeline-stage levels. Based on the variation observed in a synthesized design, a reliable CNFET-based 3D multi-granular reconfigurable architecture, 3DTUBE, is presented to overcome the manufacturing difficulties. For 0.4-0.7 V, 3DTUBE provides up to 6.0x higher throughput and 3.1x lower EDP compared to a silicon-based multi-core design evaluated at 1 part per billion transistor failure rate, which is 10,000x lower in comparison to CNFET’s failure rate. This dissertation then ventures into building multi-accelerator heterogeneous systems and real-time schedulers that cater to the requirements of the applications while taking advantage of the underlying heterogeneous system. We introduce optimizations like task pruning, hierarchical hetero-ranking and rank update built upon two scheduler policies (MS-static and MS-dynamic), that result in a performance improvement of 3.5x (average) for real-world autonomous vehicle applications, when compared against state-of-the-art schedulers. Adopting insights from the above work, this thesis presents a multi-accelerator, multi-technology heterogeneous system powered by a multi-constrained scheduler that optimizes for varying task requirements to achieve up to 6.1x better energy over a baseline silicon-based system.PHDElectrical and Computer EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/169699/1/aporvaa_1.pd

Deep Blue Documents at the University of Michigan

NASA Tech Briefs, May 2011

Author
Publication venue
Publication date
Field of study

Topics covered include: 1) Method to Estimate the Dissolved Air Content in Hydraulic Fluid; 2) Method for Measuring Collimator-Pointing Sensitivity to Temperature Changes; 3) High-Temperature Thermometer Using Cr-Doped GdAlO3 Broadband Luminescence; 4)Metrology Arrangement for Measuring the Positions of Mirrors of a Submillimeter Telescope; 5) On-Wafer S-Parameter Measurements in the 325-508-GHz Band; 6) Reconfigurable Microwave Phase Delay Element for Frequency Reference and Phase-Shifter Applications; 7) High-Speed Isolation Board for Flight Hardware Testing; 8) High-Throughput, Adaptive FFT Architecture for FPGA-Based Spaceborne Data Processors; 9) 3D Orbit Visualization for Earth-Observing Missions; 10) MaROS: Web Visualization of Mars Orbiting and Landed Assets; 11) RAPID: Collaborative Commanding and Monitoring of Lunar Assets; 12) Image Segmentation, Registration, Compression, and Matching; 13) Image Calibration; 14) Rapid ISS Power Availability Simulator; 15) A Method of Strengthening Composite/Metal Joints; 16) Pre-Finishing of SiC for Optical Applications; 17) Optimization of Indium Bump Morphology for Improved Flip Chip Devices; 18) Measuring Moisture Levels in Graphite Epoxy Composite Sandwich Structures; 19) Marshall Convergent Spray Formulation Improvement for High Temperatures; 20) Real-Time Deposition Monitor for Ultrathin Conductive Films; 21) Optimized Li-Ion Electrolytes Containing Triphenyl Phosphate as a Flame-Retardant Additive; 22) Radiation-Resistant Hybrid Lotus Effect for Achieving Photoelectrocatalytic Self-Cleaning Anticontamination Coatings; 23) Improved, Low-Stress Economical Submerged Pipeline; 24) Optical Fiber Array Assemblies for Space Flight on the Lunar Reconnaissance Orbiter; 25) Local Leak Detection and Health Monitoring of Pressurized Tanks; 26) Dielectric Covered Planar Antennas at Submillimeter Wavelengths for Terahertz Imaging; 27) Automated Cryocooler Monitor and Control System; 28) Broadband Achromatic Phase Shifter for a Nulling Interferometer; 29) Super Dwarf Wheat for Growth in Confined Spaces; 30) Fine Guidance Sensing for Coronagraphic Observatories; 31) Single-Antenna Temperature- and Humidity-Sounding Microwave Receiver; 32) Multi-Wavelength, Multi-Beam, and Polarization-Sensitive Laser Transmitter for Surface Mapping; 33) Optical Communications Link to Airborne Transceiver; 34) Ascent Heating Thermal Analysis on Spacecraft Adaptor Fairings; 35) Entanglement in Self-Supervised Dynamics; 36) Prioritized LT Codes; 37) Fast Image Texture Classification Using Decision Trees; 38) Constraint Embedding Technique for Multibody System Dynamics; 39) Improved Systematic Pointing Error Model for the DSN Antennas; 40) Observability and Estimation of Distributed Space Systems via Local Information-Exchange Networks; 41) More-Accurate Model of Flows in Rocket Injectors; 42) In-Orbit Instrument-Pointing Calibration Using the Moon as a Target; 43) Reliability of Ceramic Column Grid Array Interconnect Packages Under Extreme Temperatures; 44) Six Degrees-of-Freedom Ascent Control for Small-Body Touch and Go; and 45) Optical-Path-Difference Linear Mechanism for the Panchromatic Fourier Transform Spectrometer

NASA Technical Reports Server