7 research outputs found

ARC 2014 over-clocking KLT designs on FPGAs under process, voltage, and temperature variation

Author: Bouganis C-S
Duarte RP
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 30/11/2015

Spiral - Imperial College Digital Repository

Variation-aware high-level DSP circuit design optimisation framework for FPGAs

Author: Policarpo Duarte Rui
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/08/2014

Constant technology shrinking and the increasing demand for systems that operate under different power profiles at maximum performance have motivated the work in this thesis. Modern design tools that target FPGA devices take a conservative approach when estimating the maximum performance a design can achieve once placed on a device, accounting for any variability in the device fabrication process. The work presented here takes a new view on improving the performance of DSP designs by pushing them into the error-prone regime, as defined by the synthesis tools, and by investigating methodologies that reduce the impact of timing errors at the output of the system. Two novel error reduction techniques are proposed to address this problem. One is based on reduced-precision redundancy and the other on an error optimisation framework that uses information from a prior characterisation of the device. The first is a generic architecture that is appended to existing arithmetic operators. The second defines the high-level parameters of the algorithm without using extra resources. Both methods achieve graceful degradation as variation increases. The new methods are compared against existing methodologies, and conclusions are drawn on the trade-offs between their cost, in terms of resources and errors, and their benefits in terms of throughput. In some cases it is possible to double the performance of the design while still producing valid results. Open Acces
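
The reduced-precision redundancy idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's architecture: the function names, the 8-bit operand width, the 4 retained bits, and the threshold choice are all assumptions made for illustration. A full-precision multiplier (which may suffer timing errors when over-clocked) is paired with a replica that multiplies only the upper operand bits, and the replica's result is used whenever the two disagree by more than the replica's own worst-case truncation error.

```python
def reduced_precision_product(a: int, b: int, width: int, kept_bits: int) -> int:
    """Multiply using only the top kept_bits of each unsigned width-bit operand."""
    drop = width - kept_bits
    return ((a >> drop) * (b >> drop)) << (2 * drop)


def max_truncation_error(width: int, kept_bits: int) -> int:
    """Largest gap between exact and reduced-precision products (all-ones operands)."""
    top = (1 << width) - 1
    return top * top - reduced_precision_product(top, top, width, kept_bits)


def rpr_output(main_out: int, replica_out: int, threshold: int) -> int:
    """Keep the main result unless it deviates from the replica by more than threshold."""
    if abs(main_out - replica_out) > threshold:
        return replica_out
    return main_out


# Hypothetical parameters: 8-bit operands, replica keeps the top 4 bits.
WIDTH, KEPT = 8, 4
a, b = 200, 100
replica = reduced_precision_product(a, b, WIDTH, KEPT)
threshold = max_truncation_error(WIDTH, KEPT)

exact = a * b               # main unit without a timing error
faulty = exact ^ (1 << 14)  # assumed timing error: one high-order bit flipped

print(rpr_output(exact, replica, threshold))   # main result passes through
print(rpr_output(faulty, replica, threshold))  # replica result is substituted
```

With this threshold, a correct main result is never rejected, while a large-magnitude error is replaced by a coarser but bounded-error value, which is one way to obtain graceful degradation.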

Spiral - Imperial College Digital Repository

High-Speed and Low-Energy On-Chip Communication Circuits.

Author: Seo Jae-Sun
Publication date: 01/01/2009

Continuous technology scaling sharply reduces transistor delays, while fixed-length global wire delays have increased because reduced wiring pitch brings higher resistance and coupling capacitance. Due to this ever-growing gap, long on-chip interconnects pose well-known latency, bandwidth, and energy challenges to high-performance VLSI systems. Repeaters effectively mitigate wire RC effects but do little to improve their energy costs. Moreover, increased complexity and a high level of integration require higher wire densities, worsening the crosstalk noise and power consumption of conventionally repeated interconnects. These growing concerns about global on-chip wires motivate circuits that improve wire performance and energy while reducing the number of repeaters. This work presents circuit techniques and investigations for high-performance and energy-efficient on-chip communication, covering encoding, data compression, self-timed current injection, signal pre-emphasis, low-swing signaling, and technology mapping. The improved bus designs also consider the constraints of robust operation and performance/energy gains across process corners and the design space. Measurement results from 5mm links on 65nm and 90nm prototype chips validate a 2.5-3X improvement in energy-delay product. Ph.D., Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/75800/1/jseo_1.pd

Deep Blue Documents at the University of Michigan

A Unified Framework for Over-Clocking Linear Projections on FPGAs under PVT Variation

Author: C.-S. Bouganis
J.-U. Chu
P. Sedcole
S. Das
S. Geman
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

Crossref

Design of asynchronous microprocessor for power proportionality

Author: Rykunov Maxim
Publication venue: Newcastle University
Publication date: 01/01/2014

PhD Thesis. Microprocessors continue to get exponentially cheaper for end users following Moore’s law, while the costs involved in their design keep growing, also at an exponential rate. The reason is the ever-increasing complexity of processors, which modern EDA tools struggle to keep up with. This makes further scaling for performance subject to a high risk to the reliability of the system. To keep this risk low, yet improve performance, CPU designers try to optimise various parts of the processor. The Instruction Set Architecture (ISA) is a significant part of the whole processor design flow, and its optimal design for a particular combination of available hardware resources and software requirements is crucial for building processors with high performance and efficient energy utilisation. This is a challenging task involving many heuristics and high-level design decisions. Another issue impacting CPU reliability is continuous scaling for power consumption. For the last few decades CPU designers have mainly focused on improving performance, while “keeping energy and power consumption in mind”. The consequence was the development of energy-efficient systems, where energy was considered a resource whose consumption should be optimised. As CMOS technology progressed, with feature sizes decreasing and the power delivered to circuit components becoming less stable, energy turned from an optimisation criterion into a constraint, sometimes a critical one. At this point power proportionality becomes one of the most important aspects of system design. Developing methods and techniques that address the problem of designing a power-proportional microprocessor, capable of adapting at runtime to varying operating conditions (such as low or even unstable voltage levels) and application requirements, is one of today’s grand challenges.
In this thesis this challenge is addressed by proposing a new design flow for the development of an ISA for microprocessors, which can be altered to suit a particular hardware platform or a specific operating mode. This flow uses an expressive and powerful formalism for the specification of processor instruction sets called the Conditional Partial Order Graph (CPOG). The CPOG model captures large sets of behavioural scenarios at the microarchitectural level in a computationally efficient form amenable to formal transformations for synthesis, verification and automated derivation of asynchronous hardware for the CPU microcontrol. The feasibility of the methodology, the novel design flow and a number of optimisation techniques was proven in a full-size asynchronous Intel 8051 microprocessor and its demonstrator silicon. The chip showed the ability to work across a wide range of operating voltages and environmental conditions. Depending on application requirements and power budget, our ASIC supports several operating modes: one optimised for energy consumption and another for performance. This was achieved by extending a traditional datapath structure with an auxiliary control layer for adaptable and fault-tolerant operation. These and other optimisations resulted in a reconfigurable and adaptable implementation, which was proven by measurements, analysis and evaluation of the chip. EPSR

Newcastle University eTheses

Investigation of reconfigurable-accuracy approximate adder designs for image processing applications

Author: Al-Ma'aitah Khaled Suleiman
Publication venue: Newcastle University
Publication date: 01/01/2019

Ph.D. Thesis. In recent decades, integrated circuits in CMOS technology have faced progressive scaling challenges of increased power density and power dissipation. Meanwhile, the high-performance requirements of current and future applications place rapidly growing demands on computing resources such as power. This design conflict has driven considerable effort to find high-performance and energy-efficient design approaches, such as approximate computing. Approximate computing exploits the error resilience of compute-intensive applications, such as image processing, to implement approximation techniques at different levels of abstraction and scalability. The basic principle is to relax strict accuracy requirements in favour of lower design complexity, thereby achieving higher computational performance (i.e., speed) and energy savings. The adder is considered one of the essential computational blocks in most applications. As such, much effort has gone into exploring new designs for efficient approximate adders. This thesis presents an investigation into design enhancement, novel approximate adder designs and implementation approaches. The first approach introduces a modification to the error detection technique of a popular configurable-accuracy approximate adder design. The proposed lightweight error detection technique reduces the number of gates required by the error detection circuit, thus mitigating the design area overhead. Furthermore, in the error correction process of the adder, we propose extended error detection that activates more than one correction stage concurrently. As a result, this ensures optimum output accuracy for the worst case of quality requirements. In general, approximate (speculative) adder designs use the segmentation technique to divide the adder into multiple short sub-adders which operate in parallel.
Hence, this limits the long chains of carry propagation and results in better performance. However, the use of overlapping parts of sub-adders for better carry speculation, and hence higher accuracy, poses a significant challenge in the form of a large design area overhead. The second approach continues to mitigate this challenge by presenting a novel and simpler technique for dividing the adder into a number of sub-adders. The new method uses what is known as the carry-kill signal both for limiting carry propagation and for applying adder segmentation. Further, between every two adjacent sub-adders, one AND gate and one XOR gate are used for carry speculation and error (i.e., carry propagation) detection respectively. Thus, a significant reduction of the design overhead is achieved, yet with acceptable levels of output accuracy. In the third and final approach, simple logic OR gates are used to build the approximate adder while compensating for the operation of conventional full adders. The resulting approximate adder design has very low complexity, high speed, and low power consumption. Furthermore, instead of adding an error recovery circuit, short bit-length exact adders are used as correction stages to control the overall level of output quality (i.e., without error detection overhead). At the final correction stage, the proposed design operates the same as an exact adder. To validate the efficiency of these approaches, a number of adders with different bit-widths are designed and synthesized, showing considerable reductions in critical delay and silicon area, and greater savings in energy consumption, compared to other existing approaches, in addition to acceptable levels of output error, which are extensively analysed for each proposed design. In this study, the proposed configurable adder designs exhibit energy/quality trade-offs at different numbers of correction stages.
These trade-offs can be effectively exploited to implement adders in applications where energy can be gracefully minimised within the envelope of quality requirements. As such, an implementation of the designs in an image processing application, the Gaussian blur filter, is introduced, demonstrating the loss in image quality at each error correction stage. The output images show promising results for using the proposed designs in more energy-efficient applications where output quality requirements can be relaxed. Mutah Universit
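
The segmentation idea described in this abstract can be sketched behaviourally as follows. This is a generic illustration, not the thesis's circuit: the exact carry-kill wiring is not given in the abstract, so the sketch assumes that the carry into each sub-adder is speculated by ANDing the top operand bits of the sub-adder below it, and that an error is flagged by XORing that speculated carry with the lower sub-adder's actual carry-out. The function name and the 16-bit/4-bit sizes are hypothetical.

```python
def segmented_add(a: int, b: int, width: int = 16, seg: int = 4):
    """Approximate unsigned addition with independent seg-bit sub-adders.

    Returns (approximate_sum, flags), where flags[i] is True when the
    speculated carry into sub-adder i+1 differs from the actual carry-out
    of sub-adder i. Assumes width is a multiple of seg.
    """
    mask = (1 << seg) - 1
    msb = 1 << (seg - 1)
    result = 0
    flags = []
    prev_a = prev_b = prev_cout = 0
    for i in range(width // seg):
        shift = i * seg
        sub_a = (a >> shift) & mask
        sub_b = (b >> shift) & mask
        carry_in = 0
        if i > 0:
            # Speculation (AND): assume a carry only if both top bits below are 1.
            carry_in = 1 if (prev_a & msb) and (prev_b & msb) else 0
            # Detection (XOR): compare the guess with the real carry-out.
            flags.append(bool(carry_in ^ prev_cout))
        total = sub_a + sub_b + carry_in
        result |= (total & mask) << shift
        prev_a, prev_b, prev_cout = sub_a, sub_b, total >> seg
    result |= prev_cout << width  # carry-out of the top sub-adder
    return result, flags


print(segmented_add(0x1234, 0x1111))  # no carries cross a boundary
print(segmented_add(0x000F, 0x0001))  # missed carry at the first boundary
```

In this sketch, whenever no flag is raised the result equals the exact sum, so the flags indicate where a correction stage would be needed.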

Newcastle University eTheses

Particle Physics Reference Library

Publication venue: 'Springer Science and Business Media LLC'

This second open access volume of the handbook series deals with detectors, large experimental facilities and data handling, for both accelerator and non-accelerator based experiments. It also covers applications in medicine and life sciences. A joint CERN-Springer initiative, the “Particle Physics Reference Library” provides revised and updated contributions based on previously published material in the well-known Landolt-Boernstein series on particle physics, accelerators and detectors (volumes 21A,B1,B2,C), which took stock of the field approximately one decade ago. Central to this new initiative is publication under full open access.

OAPEN Library