Ultrafast Temperature Profile Calculation in IC Chips
One of the crucial steps in the design of an integrated circuit is the minimization of heating and temperature non-uniformity. Current temperature calculation methods, such as finite element analysis (FEA) and resistor networks, have considerable computation times, making them unsuitable for use in routing and placement optimization algorithms. To reduce the computation time, we have developed a new method, termed power blurring, that calculates temperature distributions using a matrix convolution technique, in analogy with image blurring. For steady-state analysis, power blurring predicted hot-spot temperatures to within 1 degree C with computation times three orders of magnitude shorter than FEA. For transient analysis, the computation times were reduced by a factor of 1000 for a single pulse and by around 100 for multiple-frequency excitation, while still predicting hot-spot temperatures to within about 1 degree C. The main strength of the power blurring technique is that it exploits the dominant heat spreading in the silicon substrate and uses the superposition principle. With one or two finite element simulations, the temperature point spread function for a sophisticated package can be calculated. Additional simulations can be used to improve the accuracy of the point spread function at different locations on the chip. In this calculation, we considered the dominant heat transfer path through the back of the IC chip and the heat sink. Heat transfer from the top of the chip through the metallization layers and the board is usually a small fraction of the total heat dissipation and is neglected in this analysis.
Comment: Submitted on behalf of TIMA Editions (http://irevues.inist.fr/tima-editions)
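The core of power blurring is a single 2-D convolution of the power-density map with a thermal point spread function obtained from one or two FEA runs. A minimal NumPy sketch under assumed units; the exponential PSF here is purely illustrative, not a fitted package response:

```python
import numpy as np

def temperature_power_blurring(power, psf):
    """Temperature rise via power blurring: 2-D convolution of the
    power map with the thermal point spread function (PSF); valid
    because steady-state heat conduction is linear (superposition)."""
    H, W = power.shape
    ph, pw = psf.shape
    fh, fw = H + ph - 1, W + pw - 1          # full-convolution size
    T = np.fft.irfft2(np.fft.rfft2(power, (fh, fw)) *
                      np.fft.rfft2(psf, (fh, fw)), (fh, fw))
    r0, c0 = ph // 2, pw // 2                # crop to 'same' (PSF centred)
    return T[r0:r0 + H, c0:c0 + W]

# Illustrative PSF: radially decaying spread in the substrate (assumed, not fitted)
y, x = np.mgrid[-7:8, -7:8]
psf = np.exp(-np.hypot(x, y) / 2.0)

power = np.zeros((32, 32))
power[10, 20] = 1.0                          # a single hot spot
T = temperature_power_blurring(power, psf)   # peak lands at the hot spot
```

Because the whole map is one FFT-based convolution, the cost is O(N log N) in the number of grid cells, which is where the orders-of-magnitude speedup over a full FEA solve comes from.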
NASA Space Engineering Research Center for VLSI System Design
This annual report outlines the activities of the past year at the NASA SERC on VLSI Design. Highlights for this year include the following: a significant breakthrough was achieved in utilizing commercial IC foundries for producing flight electronics; the first two flight-qualified chips were designed, fabricated, and tested and are now being delivered into NASA flight systems; and a new technology transfer mechanism has been established to transfer VLSI advances into NASA and commercial systems.
Analogue neuromorphic systems.
This thesis addresses a new area of science and technology, that of neuromorphic
systems, namely the problems and prospects of analogue neuromorphic systems. The
subject is subdivided into three chapters.
Chapter 1 is an introduction. It formulates the emerging problem of creating highly computationally costly systems for nonlinear information processing (such as artificial neural networks and artificial intelligence systems), and shows that analogue technology could make a vital contribution to the creation of such systems. The basic principles for creating analogue neuromorphic systems are formulated, and the importance of the principle of orthogonality for future highly efficient complex information processing systems is emphasised.
Chapter 2 reviews the basics of neural and neuromorphic systems and surveys the present situation in this field of research, including both experimental and theoretical knowledge gained to date. The chapter provides the necessary background for a correct interpretation of the results reported in Chapter 3 and for a realistic decision on the direction of future work.
Chapter 3 describes my own experimental and computational results within the framework of the subject, obtained at De Montfort University. These include the building of (i) an analogue polynomial approximator/interpolator/extrapolator, (ii) a synthesiser of orthogonal functions, (iii) an analogue real-time video filter (performing homomorphic filtration), (iv) an adaptive polynomial compensator of geometrical distortions of CRT monitors, and (v) an analogue parallel-learning neural network (backpropagation algorithm).
Thus, this thesis makes a dual contribution to the chosen field: it summarises the present knowledge on the possibility of utilising analogue technology in current and future computational systems, and it reports new results within the framework of the subject. The main conclusion is that, due to their promising power characteristics, small size, and high tolerance to degradation, analogue neuromorphic systems will play an increasingly important role in future computational systems (in particular in systems of artificial intelligence).
Method of Images for the Fast Calculation of Temperature Distributions in Packaged VLSI Chips
Thermal-aware routing and placement algorithms are important in industry. Currently, there are reasonably fast Green's function based algorithms that calculate the temperature distribution in a chip made from a stack of different materials. However, the layers are all assumed to have the same size, neglecting the important fact that the thermal mounts placed underneath the chip can be significantly larger than the chip itself. In an earlier publication, we showed that the image blurring technique can be used to quickly calculate the temperature distribution in realistic packages. For this method to be effective, the temperature distributions for several point heat sources at the center, corners, and edges of the chip must be calculated using finite element analysis (FEA) or measured. In addition, more accurate results require correction by a weighting function, which needs several additional FEA simulations. In this paper, we introduce the method of images, which takes the symmetry of the thermal boundary conditions into account. Thus, with only "two" finite element simulations, the steady-state temperature distribution for an arbitrarily complex power dissipation profile in a packaged chip can be calculated. Several simulation results are presented. It is shown that the power blurring technique together with the method of images can reproduce the temperature profile with an error of less than 0.5%.
Comment: Submitted on behalf of TIMA Editions (http://irevues.inist.fr/tima-editions)
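The idea behind the method of images can be sketched in a few lines: for adiabatic (insulated) lateral boundaries, every heat source acquires mirror images reflected across the chip edges, and the bounded-domain temperature is the superposition of unbounded-medium responses of the source and its images. A hypothetical NumPy sketch, in which the even mirror extension and the free-space PSF are illustrative assumptions rather than the paper's package model:

```python
import numpy as np

def mirror_extend(power):
    """3x3 tiling of the power map with even (mirror) reflections.
    The reflected copies act as image sources, enforcing a no-flux
    (adiabatic) condition at each chip edge."""
    ud = np.flipud(power)        # images across the top/bottom edges
    lr = np.fliplr(power)        # images across the left/right edges
    corner = np.flipud(lr)       # corner images (both reflections)
    return np.block([[corner, ud,    corner],
                     [lr,     power, lr],
                     [corner, ud,    corner]])

def temperature_with_images(power, psf_free):
    """Convolve the mirror-extended map with an unbounded-medium PSF,
    then crop back to the chip region (source + image superposition)."""
    ext = mirror_extend(power)
    ph, pw = psf_free.shape
    fh, fw = ext.shape[0] + ph - 1, ext.shape[1] + pw - 1
    T = np.fft.irfft2(np.fft.rfft2(ext, (fh, fw)) *
                      np.fft.rfft2(psf_free, (fh, fw)), (fh, fw))
    H, W = power.shape
    r0, c0 = H + ph // 2, W + pw // 2    # chip block within the 3x3 tiling
    return T[r0:r0 + H, c0:c0 + W]

y, x = np.mgrid[-7:8, -7:8]
psf_free = np.exp(-np.hypot(x, y) / 2.0)  # illustrative free-space PSF
power = np.zeros((16, 16))
power[8, 8] = 1.0                         # interior hot spot
T = temperature_with_images(power, psf_free)
```

One level of reflection suffices when the PSF decays within one chip width; otherwise the tiling would be repeated further out, which is the trade-off the symmetry argument in the abstract replaces with just two FEA runs.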
NASA Space Engineering Research Center for VLSI systems design
This annual review reports the center's activities and findings on very large scale integration (VLSI) systems design for 1990, including project status, financial support, publications, the NASA Space Engineering Research Center (SERC) Symposium on VLSI Design, research results, and outreach programs. Processor chips completed or under development are listed. Research results summarized include a design technique to harden complementary metal oxide semiconductor (CMOS) memory circuits against single event upset (SEU); improved circuit design procedures; and advances in computer aided design (CAD), communications, computer architectures, and reliability design. Also described is a high school teacher program that exposes teachers to the fundamentals of digital logic design.
Energy efficient hardware acceleration of multimedia processing tools
The world of mobile devices is experiencing an ongoing trend of feature enhancement and general-purpose multimedia platform convergence. This trend poses many grand challenges, the most pressing being limited battery life as a consequence of delivering computationally demanding features. The envisaged mobile application features can be considered to be accelerated by a set of underpinning hardware blocks. Based on the survey that this thesis presents of modern video compression standards and their associated enabling technologies, it is concluded that tight energy and throughput constraints can still be effectively tackled at the algorithmic level in order to design re-usable optimised hardware acceleration cores.
To prove these conclusions, the work in this thesis focuses on two of the basic enabling technologies that support mobile video applications, namely the Shape Adaptive Discrete Cosine Transform (SA-DCT) and its inverse, the SA-IDCT. The hardware architectures presented in this work have been designed with energy efficiency in mind. This goal is achieved by employing high-level techniques such as redundant computation elimination, parallelism, and low-switching computation structures. Both architectures compare favourably against the relevant prior art in the literature.
The SA-DCT/IDCT technologies are instances of a more general computation: both are Constant Matrix Multiplication (CMM) operations. Thus, this thesis also proposes an algorithm for the efficient hardware design of any general CMM-based enabling technology. The proposed algorithm leverages the effective solution-search capability of genetic programming. A bonus feature of the proposed modelling approach is that it is further amenable to hardware acceleration. Another bonus feature is an early-exit mechanism that achieves large search-space reductions. Results show an improvement on state-of-the-art algorithms, with future potential for even greater savings.
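The appeal of treating a transform as a CMM instance is that each multiplication by a known constant can be replaced by shifts and adds; the space of such decompositions is what a genetic-programming style optimiser searches over. A minimal sketch of one standard decomposition, canonical signed-digit (CSD) recoding, shown here only as an illustration and not as the thesis's actual algorithm:

```python
def csd_digits(c):
    """Canonical signed-digit recoding of a positive integer constant:
    returns digits d_i in {-1, 0, +1} with c == sum(d_i * 2**i) and no
    two adjacent nonzero digits, which minimises the adder count."""
    digits = []
    while c:
        if c & 1:
            d = 2 - (c & 3)      # +1 if c % 4 == 1, -1 if c % 4 == 3
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits

def multiply_by_constant(x, c):
    """c * x realised as shifts and adds/subtracts - the form a CMM
    hardware generator would emit instead of a full multiplier."""
    return sum(d * (x << i) for i, d in enumerate(csd_digits(c)))
```

In hardware terms each nonzero digit costs one adder or subtractor plus wiring for the shift, and CSD guarantees roughly half the bit positions are nonzero; a CMM optimiser goes further by sharing intermediate shift-add terms across all the constants in the matrix.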
A PC/AT-based ICT image archiving system.
by Ringo Wai-kit Lam. Thesis (M.Phil.), Chinese University of Hong Kong, 1991. Includes bibliographical references.
Contents: Chapter 1, Introduction (transform coding theory: image transform coder and decoder, transformation, bit allocation, quantization, entropy coding, error of transform coding). Chapter 2, 2D Integer Cosine Transform Chip Set (the integer cosine transform (ICT), LSI implementation, design considerations, architecture, the 2D ICT system). Chapter 3, A PC/AT-Based Image Archiving System (design considerations, storage format of the coded image, hardware architecture, software structure). Chapter 4, System Performance Evaluation (image display results, computation time requirements, comparison to other transform chips and image transform systems). Chapter 5, Conclusion (further development: JPEG scheme, ICT chip set; summary of the image archiving system). References. Appendix.
Implementation of JPEG compression and motion estimation on FPGA hardware
A hardware implementation of JPEG allows for real-time compression in data-intensive applications, such as high-speed scanning, medical imaging, and satellite image transmission. Implementation options include dedicated DSP or media processors, FPGA boards, and ASICs. Factors that affect the choice of platform include cost, speed, memory, size, power consumption, and ease of reconfiguration. The proposed hardware solution is based on a Very high speed integrated circuit Hardware Description Language (VHDL) implementation of the codec, with the preferred realization being an FPGA board due to speed, cost, and flexibility factors. The VHDL language is commonly used to model hardware implementations from a top-down perspective. The VHDL code may be simulated to correct mistakes and subsequently synthesized into hardware using a synthesis tool, such as the Xilinx ISE suite. The same VHDL code may be synthesized into a number of different hardware architectures based on the constraints given. For example, speed was the major constraint when synthesizing the pipeline of JPEG encoding and decoding, while chip area and power consumption were the primary constraints when synthesizing the on-die memory because of its large area. Thus, there is a trade-off between area and speed in logic synthesis.
Concurrent error detection in 2-D separable linear transform
As process technology continues to scale to smaller geometries and lower supply voltages, the reliability of the resulting semiconductors becomes a greater concern. The effects of deep-submicron noise, soft errors, variation, and aging degradation pose challenges to the functional correctness of VLSI systems and place roadblocks on further scaling. On the other hand, as computing moves toward mobile, the energy efficiency of digital systems becomes one of the most important design metrics. However, reliability and energy efficiency are contradictory design requirements. Adding a voltage guard band is the most common method to mitigate the reliability impact in such instances. Low-power design techniques like voltage over-scaling (VOS) go further, reducing power by scaling the supply voltage to just before data-dependent timing errors start to appear. Concurrent error detection is a solution that tackles reliability and energy efficiency in a unified manner. Fault tolerance can be deployed at different design hierarchies; given its low overhead, algorithm-level error detection is an attractive approach. In this work, a generic weighted checksum code based error detection algorithm targeting the generic 2-D separable linear transform is proposed. This technique encodes the input array at the 2-D linear transformation level, and algorithms are designed to operate on encoded data and produce encoded output data. The proposed error detection technique is a system-level method and can therefore be used in existing hardware or software 2-D linear transformation architectures with low overhead. The mathematical proof of the algorithm is provided within the scope of this dissertation. The checksum weighting vectors for several common transforms are derived as examples, and the error detection cost and algorithm effectiveness are analyzed. In traditional fault tolerance studies, errors are often evaluated at the Boolean level.
Many DSP applications, like the 2-D linear transformation used in multimedia compression systems, do not require exactly correct results, but rather that the quality of the output be within an acceptable range. A generic quality-aware error detection scheme for the 2-D separable linear transform is proposed by extending the above property and defining errors at the functional level. As an example, the quality-aware error detection technique is deployed on a low-power wavelet lifting transform architecture in JPEG2000. A low-cost Signal-to-Noise Ratio (SNR) aware detection logic based on the proposed scheme is integrated into the discrete wavelet lifting transform architecture. This detection logic checks whether the image quality degradation caused by voltage over-scaling induced timing errors is acceptable and determines the optimal voltage set point under the operating conditions at run time. This novel quality-based error detection approach is significantly different from traditional error detection schemes, which look for exact data equivalence. A simulation result for one design shows that the supply voltage can be scaled down to 75% of the nominal voltage in a typical process corner without significant image quality degradation, which translates to 9.15 mW power consumption (a 44% power saving).
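The weighted-checksum idea for a separable transform Y = A·X·B rests on associativity: the checksum of the output, w·Y, can be predicted from the input at vector cost, since w(AXB) = ((wA)X)B. A generic ABFT-style sketch, in which the weights and matrices are illustrative and the dissertation's specific encodings are not reproduced:

```python
import numpy as np

def separable_transform(A, X, B):
    """Generic 2-D separable linear transform: Y = A @ X @ B
    (row transform A, column transform B; e.g. a 2-D DCT)."""
    return A @ X @ B

def checksum_ok(A, X, B, Y, w, tol=1e-8):
    """ABFT-style weighted checksum. By associativity,
    w @ (A @ X @ B) == ((w @ A) @ X) @ B, so the expected output
    checksum is computable from the input with only vector-matrix
    products; compare it against the checksum of the actual output."""
    v = w @ A                    # can be precomputed offline
    expected = (v @ X) @ B       # O(n^2) work instead of O(n^3)
    observed = w @ Y             # checksum of the produced output
    return np.allclose(observed, expected, atol=tol)

rng = np.random.default_rng(0)
n = 8
A, B, X = rng.standard_normal((3, n, n))
w = np.ones(n)                   # unit weights; weighted variants localise errors
Y = separable_transform(A, X, B)
Y_bad = Y.copy()
Y_bad[2, 5] += 0.5               # inject a single-element fault
```

A fault-free Y passes the check while the corrupted Y_bad fails it; with several independent weight vectors the faulty row or column can also be located, at modestly higher cost.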