1,132 research outputs found

    Conditional Partial Order Graphs and Dynamically Reconfigurable Control Synthesis

    Full text link

    Conditional partial order graphs and dynamically reconfigurable control synthesis

    Full text link

    Design of asynchronous microprocessor for power proportionality

    Get PDF
    PhD ThesisMicroprocessors continue to get exponentially cheaper for end users following Moore’s law, while the costs involved in their design keep growing, also at an exponential rate. The reason is the ever increasing complexity of processors, which modern EDA tools struggle to keep up with. This makes further scaling for performance subject to a high risk in the reliability of the system. To keep this risk low, yet improve the performance, CPU designers try to optimise various parts of the processor. Instruction Set Architecture (ISA) is a significant part of the whole processor design flow, whose optimal design for a particular combination of available hardware resources and software requirements is crucial for building processors with high performance and efficient energy utilisation. This is a challenging task involving a lot of heuristics and high-level design decisions. Another issue impacting CPU reliability is continuous scaling for power consumption. For the last decades CPU designers have been mainly focused on improving performance, but “keeping energy and power consumption in mind”. The consequence of this was a development of energy-efficient systems, where energy was considered as a resource whose consumption should be optimised. As CMOS technology was progressing, with feature size decreasing and power delivered to circuit components becoming less stable, the energy resource turned from an optimisation criterion into a constraint, sometimes a critical one. At this point power proportionality becomes one of the most important aspects in system design. Developing methods and techniques which will address the problem of designing a power-proportional microprocessor, capable to adapt to varying operating conditions (such as low or even unstable voltage levels) and application requirements in the runtime, is one of today’s grand challenges. In this thesis this challenge is addressed by proposing a new design flow for the development of an ISA for microprocessors, which can be altered to suit a particular hardware platform or a specific operating mode. This flow uses an expressive and powerful formalism for the specification of processor instruction sets called the Conditional Partial Order Graph (CPOG). The CPOG model captures large sets of behavioural scenarios for a microarchitectural level in a computationally efficient form amenable to formal transformations for synthesis, verification and automated derivation of asynchronous hardware for the CPU microcontrol. The feasibility of the methodology, novel design flow and a number of optimisation techniques was proven in a full size asynchronous Intel 8051 microprocessor and its demonstrator silicon. The chip showed the ability to work in a wide range of operating voltage and environmental conditions. Depending on application requirements and power budget our ASIC supports several operating modes: one optimised for energy consumption and the other one for performance. This was achieved by extending a traditional datapath structure with an auxiliary control layer for adaptable and fault tolerant operation. These and other optimisations resulted in a reconfigurable and adaptable implementation, which was proven by measurements, analysis and evaluation of the chip.EPSR

    An integrated soft- and hard-programmable multithreaded architecture

    Get PDF

    Interpreted graph models

    Get PDF
    A model class called an Interpreted Graph Model (IGM) is defined. This class includes a large number of graph-based models that are used in asynchronous circuit design and other applications of concurrecy. The defining characteristic of this model class is an underlying static graph-like structure where behavioural semantics are attached using additional entities, such as tokens or node/arc states. The similarities in notation and expressive power allow a number of operations on these formalisms, such as visualisation, interactive simulation, serialisation, schematic entry and model conversion to be generalised. A software framework called Workcraft was developed to take advantage of these properties of IGMs. Workcraft provides an environment for rapid prototyping of graph-like models and related tools. It provides a large set of standardised functions that considerably facilitate the task of providing tool support for any IGM. The concept of Interpreted Graph Models is the result of research on methods of application of lower level models, such as Petri nets, as a back-end for simulation and verification of higher level models that are more easily manipulated. The goal is to achieve a high degree of automation of this process. In particular, a method for verification of speed-independence of asynchronous circuits is presented. Using this method, the circuit is specified as a gate netlist and its environment is specified as a Signal Transition Graph. The circuit is then automatically translated into a behaviourally equivalent Petri net model. This model is then composed with the specification of the environment. A number of important properties can be established on this compound model, such as the absence of deadlocks and hazards. If a trace is found that violates the required property, it is automatically interpreted in terms of switching of the gates in the original gate-level circuit specification and may be presented visually to the circuit designer. A similar technique is also used for the verification of a model called Static Data Flow Structure (SDFS). This high level model describes the behaviour of an asynchronous data path. SDFS is particularly interesting because it models complex behaviours such as preemption, early evaluation and speculation. Preemption is a technique which allows to destroy data objects in a computation pipeline if the result of computation is no longer needed, reducing the power consumption. Early evaluation allows a circuit to compute the output using a subset of its inputs and preempting the inputs which are not needed. In speculation, all conflicting branches of computation run concurrently without waiting for the selecting condition; once the selecting condition is computed the unneeded branches are preempted. The automated Petri net based verification technique is especially useful in this case because of the complex nature of these features. As a result of this work, a number of cases are presented where the concept of IGMs and the Workcraft tool were instrumental. These include the design of two different types of arbiter circuits, the design and debugging of the SDFS model, synthesis of asynchronous circuits from the Conditional Partial Order Graph model and the modification of the workflow of Balsa asynchronous circuit synthesis system.EThOS - Electronic Theses Online ServiceEPSRCGBUnited Kingdo

    Verification and synthesis of asynchronous control circuits using petri net unfoldings

    Get PDF
    PhD ThesisDesign of asynchronous control circuits has traditionally been associated with application of formal methods. Event-based models, such as Petri nets, provide a compact and easy to understand way of specifying asynchronous behaviour. However, analysis of their behavioural properties is often hindered by the problem of exponential growth of reachable state space. This work proposes a new method for analysis of asynchronous circuit models based on Petri nets. The new approach is called PN-unfolding segment. It extends and improves existing Petri nets unfolding approaches. In addition, this thesis proposes a new analysis technique for Signal Transition Graphs along with an efficient verification technique which is also based on the Petri net unfolding. The former is called Full State Graph, the latter - STG-unfolding segment. The boolean logic synthesis is an integral part of the asynchronous circuit design process. In many cases, even if the verification of an asynchronous circuit specification has been performed successfully, it is impossible to obtain its implementation using existing methods because they are based on the reachability analysis. A new approach is proposed here for automated synthesis of speed-independent circuits based on the STG-unfolding segment constructed during the verification of the circuit's specification. Finally, this work presents experimental results showing the need for the new Petri net unfolding techniques and confirming the advantages of application of partial order approach to analysis, verification and synthesis of asynchronous circuits.The Research Committee, Newcastle University: Overseas Research Studentship Award

    Asynchronous techniques for new generation variation-tolerant FPGA

    Get PDF
    PhD ThesisThis thesis presents a practical scenario for asynchronous logic implementation that would benefit the modern Field-Programmable Gate Arrays (FPGAs) technology in improving reliability. A method based on Asynchronously-Assisted Logic (AAL) blocks is proposed here in order to provide the right degree of variation tolerance, preserve as much of the traditional FPGAs structure as possible, and make use of asynchrony only when necessary or beneficial for functionality. The newly proposed AAL introduces extra underlying hard-blocks that support asynchronous interaction only when needed and at minimum overhead. This has the potential to avoid the obstacles to the progress of asynchronous designs, particularly in terms of area and power overheads. The proposed approach provides a solution that is complementary to existing variation tolerance techniques such as the late-binding technique, but improves the reliability of the system as well as reducing the design’s margin headroom when implemented on programmable logic devices (PLDs) or FPGAs. The proposed method suggests the deployment of configurable AAL blocks to reinforce only the variation-critical paths (VCPs) with the help of variation maps, rather than re-mapping and re-routing. The layout level results for this method's worst case increase in the CLB’s overall size only of 6.3%. The proposed strategy retains the structure of the global interconnect resources that occupy the lion’s share of the modern FPGA’s soft fabric, and yet permits the dual-rail iv completion-detection (DR-CD) protocol without the need to globally double the interconnect resources. Simulation results of global and interconnect voltage variations demonstrate the robustness of the method

    Store-and-forward CDC packet transmission in digital systems

    Get PDF
    Data transmission across clock domains is a major point of interest by ASIC designers as a limited bandwidth can be responsible for a processing power bottleneck that affects the entire system. Clock domain crossing is inherently expensive in terms of area and latency as it requires overcoming issues related to the physical nature of integrated circuit latches, in particular metastability. These issues are dangerous as they do not manifest in RTL simulation. Due to this, designers often opt for generic data synchronisation solutions that are not fully suited to the nature of the data being transferred.This dissertation presents a clock domain crossing mechanism that allows two clock domains to share a common memory to transfer packet-based data. The mechanism consists in a memory controller that coordinates commands from a push (write) domain and a pop (read) domain.The main objective of the memory controller project consists in the implementation of efficient synchronisation methods in order to eliminate the need for separate synchronisation and storage memories. In essence, the controller is an extension of an asynchronous FIFO controller with added functionality, supporting multiple virtual FIFOs and variable length data packets

    On the Distribution of Control in Asynchronous Processor Architectures

    Get PDF
    Institute for Computing Systems ArchitectureThe effective performance of computer systems is to a large measure determined by the synergy between the processor architecture, the instruction set and the compiler. In the past, the sequencing of information within processor architectures has normally been synchronous: controlled centrally by a clock. However, this global signal could possibly limit the future gains in performance that can potentially be achieved through improvements in implementation technology. This thesis investigates the effects of relaxing this strict synchrony by distributing control within processor architectures through the use of a novel asynchronous design model known as a micronet. The impact of asynchronous control on the performance of a RISC-style processor is explored at different levels. Firstly, improvements in the performance of individual instructions by exploiting actual run-time behaviours are demonstrated. Secondly, it is shown that micronets are able to exploit further (both spatial and temporal) instructionlevel parallelism (ILP) efficiently through the distribution of control to datapath resources. Finally, exposing fine-grain concurrency within a datapath can only be of benefit to a computer system if it can easily be exploited by the compiler. Although compilers for micronet-based asynchronous processors may be considered to be more complex than their synchronous counterparts, it is shown that the variable execution time of an instruction does not adversely affect the compiler's ability to schedule code efficiently. In conclusion, the modelling of a processor's datapath as a micronet permits the exploitation of both finegrain ILP and actual run-time delays, thus leading to the efficient utilisation of functional units and in turn resulting in an improvement in overall system performance

    Compositional approach to design of digital circuits

    Get PDF
    PhD ThesisIn this work we explore compositional methods for design of digital circuits with the aim of improving existing methodoligies for desigh reuse. We address compositionality techniques looking from both structural and behavioural perspectives. First we consider the existing method of handshake circuit optimisation via control path resynthesis using Petri nets, an approach using structural composition. In that approach labelled Petri net parallel composition plays an important role and we introduce an improvement to the parallel composition algorithm, reducing the number of redundant places in the resulting Petri net representations. The proposed algorithm applies to labelled Petri nets in general and can be applied outside of the handshake circuit optimisation use case. Next we look at the conditional partial order graph (CPOG) formalism, an approach that allows for a convenient representation of systems consisting of multiple alternative system behaviours, a phenomenon we call behavioural composition. We generalise the notion of CPOG and identify an algebraic structure on a more general notion of parameterised graph. This allows us to do equivalence-preserving manipulation of graphs in symbolic form, which simplifies specification and reasoning about systems defined in this way, as displayed by two case studies. As a third contribution we build upon the previous work of CPOG synthesis used to generate binary encoding of microcontroller instruction sets and design the corresponding instruction decoder logic. The proposed CPOG synthesis technique solves the optimisation problem for the general case, reducing it to Boolean satisfiability problem and uses existing SAT solving tools to obtain the result.This work was supported by a studentship from Newcastle University EECE school, EPSRC grant EP/G037809/1 (VERDAD) and EPSRC grant EP/K001698/1 (UNCOVER). i
    • 

    corecore