1,132 research outputs found
Design of asynchronous microprocessor for power proportionality
PhD ThesisMicroprocessors continue to get exponentially cheaper for end users following Mooreâs
law, while the costs involved in their design keep growing, also at an exponential rate.
The reason is the ever increasing complexity of processors, which modern EDA tools
struggle to keep up with. This makes further scaling for performance subject to a high
risk in the reliability of the system. To keep this risk low, yet improve the performance,
CPU designers try to optimise various parts of the processor. Instruction Set Architecture
(ISA) is a significant part of the whole processor design flow, whose optimal design
for a particular combination of available hardware resources and software requirements
is crucial for building processors with high performance and efficient energy utilisation.
This is a challenging task involving a lot of heuristics and high-level design decisions.
Another issue impacting CPU reliability is continuous scaling for power consumption. For
the last decades CPU designers have been mainly focused on improving performance, but
âkeeping energy and power consumption in mindâ. The consequence of this was a development
of energy-efficient systems, where energy was considered as a resource whose
consumption should be optimised. As CMOS technology was progressing, with feature
size decreasing and power delivered to circuit components becoming less stable, the
energy resource turned from an optimisation criterion into a constraint, sometimes a critical
one. At this point power proportionality becomes one of the most important aspects
in system design. Developing methods and techniques which will address the problem
of designing a power-proportional microprocessor, capable to adapt to varying operating
conditions (such as low or even unstable voltage levels) and application requirements in
the runtime, is one of todayâs grand challenges. In this thesis this challenge is addressed
by proposing a new design flow for the development of an ISA for microprocessors, which
can be altered to suit a particular hardware platform or a specific operating mode. This
flow uses an expressive and powerful formalism for the specification of processor instruction
sets called the Conditional Partial Order Graph (CPOG). The CPOG model captures
large sets of behavioural scenarios for a microarchitectural level in a computationally
efficient form amenable to formal transformations for synthesis, verification and automated
derivation of asynchronous hardware for the CPU microcontrol. The feasibility of
the methodology, novel design flow and a number of optimisation techniques was proven
in a full size asynchronous Intel 8051 microprocessor and its demonstrator silicon. The
chip showed the ability to work in a wide range of operating voltage and environmental
conditions. Depending on application requirements and power budget our ASIC supports
several operating modes: one optimised for energy consumption and the other one for
performance. This was achieved by extending a traditional datapath structure with an
auxiliary control layer for adaptable and fault tolerant operation. These and other optimisations
resulted in a reconfigurable and adaptable implementation, which was proven
by measurements, analysis and evaluation of the chip.EPSR
Interpreted graph models
A model class called an Interpreted Graph Model (IGM) is defined. This class includes a large number of graph-based models that are used in asynchronous circuit design and other applications of concurrecy. The defining characteristic of this model class is an underlying static graph-like structure where behavioural semantics are attached using additional entities, such as tokens or node/arc states. The similarities in notation and expressive power allow a number of operations on these formalisms, such as visualisation, interactive simulation, serialisation, schematic entry and model conversion to be generalised. A software framework called Workcraft was developed to take advantage of these properties of IGMs. Workcraft provides an environment for rapid prototyping of graph-like models and related tools. It provides a large set of standardised functions that considerably facilitate the task of providing tool support for any IGM. The concept of Interpreted Graph Models is the result of research on methods of application of lower level models, such as Petri nets, as a back-end for simulation and verification of higher level models that are more easily manipulated. The goal is to achieve a high degree of automation of this process. In particular, a method for verification of speed-independence of asynchronous circuits is presented. Using this method, the circuit is specified as a gate netlist and its environment is specified as a Signal Transition Graph. The circuit is then automatically translated into a behaviourally equivalent Petri net model. This model is then composed with the specification of the environment. A number of important properties can be established on this compound model, such as the absence of deadlocks and hazards. If a trace is found that violates the required property, it is automatically interpreted in terms of switching of the gates in the original gate-level circuit specification and may be presented visually to the circuit designer. A similar technique is also used for the verification of a model called Static Data Flow Structure (SDFS). This high level model describes the behaviour of an asynchronous data path. SDFS is particularly interesting because it models complex behaviours such as preemption, early evaluation and speculation. Preemption is a technique which allows to destroy data objects in a computation pipeline if the result of computation is no longer needed, reducing the power consumption. Early evaluation allows a circuit to compute the output using a subset of its inputs and preempting the inputs which are not needed. In speculation, all conflicting branches of computation run concurrently without waiting for the selecting condition; once the selecting condition is computed the unneeded branches are preempted. The automated Petri net based verification technique is especially useful in this case because of the complex nature of these features. As a result of this work, a number of cases are presented where the concept of IGMs and the Workcraft tool were instrumental. These include the design of two different types of arbiter circuits, the design and debugging of the SDFS model, synthesis of asynchronous circuits from the Conditional Partial Order Graph model and the modification of the workflow of Balsa asynchronous circuit synthesis system.EThOS - Electronic Theses Online ServiceEPSRCGBUnited Kingdo
Verification and synthesis of asynchronous control circuits using petri net unfoldings
PhD ThesisDesign of asynchronous control circuits has traditionally been associated with application of
formal methods. Event-based models, such as Petri nets, provide a compact and easy to
understand way of specifying asynchronous behaviour. However, analysis of their behavioural
properties is often hindered by the problem of exponential growth of reachable state space.
This work proposes a new method for analysis of asynchronous circuit models based on Petri
nets. The new approach is called PN-unfolding segment. It extends and improves existing
Petri nets unfolding approaches. In addition, this thesis proposes a new analysis technique
for Signal Transition Graphs along with an efficient verification technique which is also based
on the Petri net unfolding. The former is called Full State Graph, the latter - STG-unfolding
segment. The boolean logic synthesis is an integral part of the asynchronous circuit design
process. In many cases, even if the verification of an asynchronous circuit specification has
been performed successfully, it is impossible to obtain its implementation using existing methods
because they are based on the reachability analysis. A new approach is proposed here
for automated synthesis of speed-independent circuits based on the STG-unfolding segment
constructed during the verification of the circuit's specification. Finally, this work presents
experimental results showing the need for the new Petri net unfolding techniques and confirming
the advantages of application of partial order approach to analysis, verification and
synthesis of asynchronous circuits.The Research Committee, Newcastle University:
Overseas Research Studentship Award
Asynchronous techniques for new generation variation-tolerant FPGA
PhD ThesisThis thesis presents a practical scenario for asynchronous logic implementation that would benefit the modern Field-Programmable Gate Arrays (FPGAs) technology in improving reliability. A method based on Asynchronously-Assisted Logic (AAL) blocks is proposed here in order to provide the right degree of variation tolerance, preserve as much of the traditional FPGAs structure as possible, and make use of asynchrony only when necessary or beneficial for functionality. The newly proposed AAL introduces extra underlying hard-blocks that support asynchronous interaction only when needed and at minimum overhead. This has the potential to avoid the obstacles to the progress of asynchronous designs, particularly in terms of area and power overheads. The proposed approach provides a solution that is complementary to existing variation tolerance techniques such as the late-binding technique, but improves the reliability of the system as well as reducing the designâs margin headroom when implemented on programmable logic devices (PLDs) or FPGAs. The proposed method suggests the deployment of configurable AAL blocks to reinforce only the variation-critical paths (VCPs) with the help of variation maps, rather than re-mapping and re-routing. The layout level results for this method's worst case increase in the CLBâs overall size only of 6.3%. The proposed strategy retains the structure of the global interconnect resources that occupy the lionâs share of the modern FPGAâs soft fabric, and yet permits the dual-rail
iv
completion-detection (DR-CD) protocol without the need to globally double the interconnect resources. Simulation results of global and interconnect voltage variations demonstrate the robustness of the method
Store-and-forward CDC packet transmission in digital systems
Data transmission across clock domains is a major point of interest by ASIC designers as a limited bandwidth can be responsible for a processing power bottleneck that affects the entire system. Clock domain crossing is inherently expensive in terms of area and latency as it requires overcoming issues related to the physical nature of integrated circuit latches, in particular metastability. These issues are dangerous as they do not manifest in RTL simulation. Due to this, designers often opt for generic data synchronisation solutions that are not fully suited to the nature of the data being transferred.This dissertation presents a clock domain crossing mechanism that allows two clock domains to share a common memory to transfer packet-based data. The mechanism consists in a memory controller that coordinates commands from a push (write) domain and a pop (read) domain.The main objective of the memory controller project consists in the implementation of efficient synchronisation methods in order to eliminate the need for separate synchronisation and storage memories. In essence, the controller is an extension of an asynchronous FIFO controller with added functionality, supporting multiple virtual FIFOs and variable length data packets
On the Distribution of Control in Asynchronous Processor Architectures
Institute for Computing Systems ArchitectureThe effective performance of computer systems is to a large measure
determined by the synergy between the processor architecture, the
instruction set and the compiler. In the past, the sequencing of
information within processor architectures has normally been
synchronous: controlled centrally by a clock. However, this global
signal could possibly limit the future gains in performance that can
potentially be achieved through improvements in implementation
technology.
This thesis investigates the effects of relaxing this strict synchrony
by distributing control within processor architectures through the use
of a novel asynchronous design model known as a micronet. The impact
of asynchronous control on the performance of a RISC-style processor
is explored at different levels. Firstly, improvements in the
performance of individual instructions by exploiting actual run-time
behaviours are demonstrated. Secondly, it is shown that micronets are
able to exploit further (both spatial and temporal) instructionlevel
parallelism (ILP) efficiently through the distribution of control to
datapath resources. Finally, exposing fine-grain concurrency within a
datapath can only be of benefit to a computer system if it can easily
be exploited by the compiler. Although compilers for micronet-based
asynchronous processors may be considered to be more complex than
their synchronous counterparts, it is shown that the variable
execution time of an instruction does not adversely affect the
compiler's ability to schedule code efficiently. In conclusion, the
modelling of a processor's datapath as a micronet permits the
exploitation of both finegrain ILP and actual run-time delays, thus
leading to the efficient utilisation of functional units and in turn
resulting in an improvement in overall system performance
Compositional approach to design of digital circuits
PhD ThesisIn this work we explore compositional methods for design of digital circuits with
the aim of improving existing methodoligies for desigh reuse. We address compositionality
techniques looking from both structural and behavioural perspectives.
First we consider the existing method of handshake circuit optimisation via control
path resynthesis using Petri nets, an approach using structural composition. In
that approach labelled Petri net parallel composition plays an important role and
we introduce an improvement to the parallel composition algorithm, reducing the
number of redundant places in the resulting Petri net representations. The proposed
algorithm applies to labelled Petri nets in general and can be applied outside of the
handshake circuit optimisation use case.
Next we look at the conditional partial order graph (CPOG) formalism, an approach
that allows for a convenient representation of systems consisting of multiple
alternative system behaviours, a phenomenon we call behavioural composition. We
generalise the notion of CPOG and identify an algebraic structure on a more general
notion of parameterised graph. This allows us to do equivalence-preserving manipulation
of graphs in symbolic form, which simplifies specification and reasoning about
systems defined in this way, as displayed by two case studies.
As a third contribution we build upon the previous work of CPOG synthesis used
to generate binary encoding of microcontroller instruction sets and design the corresponding
instruction decoder logic. The proposed CPOG synthesis technique solves
the optimisation problem for the general case, reducing it to Boolean satisfiability
problem and uses existing SAT solving tools to obtain the result.This work was
supported by a studentship from Newcastle University EECE school, EPSRC grant
EP/G037809/1 (VERDAD) and EPSRC grant EP/K001698/1 (UNCOVER).
i
- âŠ