2,042 research outputs found
Generic Pipelined Processor Modeling and High Performance Cycle-Accurate Simulator Generation
Detailed modeling of processors and high performance cycle-accurate
simulators are essential for today's hardware and software design. These
problems are challenging enough by themselves and have seen many previous
research efforts. Addressing both simultaneously is even more challenging, with
many existing approaches focusing on one over another. In this paper, we
propose the Reduced Colored Petri Net (RCPN) model that has two advantages:
first, it offers a very simple and intuitive way of modeling pipelined
processors; second, it can generate high performance cycle-accurate simulators.
RCPN benefits from all the useful features of Colored Petri Nets without
suffering from their exponential growth in complexity. RCPN processor models
are very intuitive since they are a mirror image of the processor pipeline
block diagram. Furthermore, in our experiments on the generated cycle-accurate
simulators for XScale and StrongArm processor models, we achieved an order of
magnitude (~15 times) speedup over the popular SimpleScalar ARM simulator.Comment: Submitted on behalf of EDAA (http://www.edaa.com/
Two-level pipelined systolic array graphics engine
The authors report a VLSI design of an advanced systolic array graphics (SAG) engine built from pipelined functional units which can generate realistic images interactively for high-resolution displays. They introduce a structured frame store system as an environment for the advanced SAG engine and present the principles and architecture of the advanced SAG engine. They introduce pipelined functional units into this SAG engine to meet the performance requirements. This is done by a formal approach where the original systolic array is represented at bit level by a finite, vertex-weighted, edge-weighted, directed graph. Two architectures built from pipelined functional units are described. A prototype containing nine processing elements was fabricated in a 1.6-Âżm CMOS technolog
Putting Instruction Sequences into Effect
An attempt is made to define the concept of execution of an instruction
sequence. It is found to be a special case of directly putting into effect of
an instruction sequence. Directly putting into effect of an instruction
sequences comprises interpretation as well as execution. Directly putting into
effect is a special case of putting into effect with other special cases
classified as indirectly putting into effect
Processor Verification Using Efficient Reductions of the Logic of Uninterpreted Functions to Propositional Logic
The logic of equality with uninterpreted functions (EUF) provides a means of
abstracting the manipulation of data by a processor when verifying the
correctness of its control logic. By reducing formulas in this logic to
propositional formulas, we can apply Boolean methods such as Ordered Binary
Decision Diagrams (BDDs) and Boolean satisfiability checkers to perform the
verification.
We can exploit characteristics of the formulas describing the verification
conditions to greatly simplify the propositional formulas generated. In
particular, we exploit the property that many equations appear only in positive
form. We can therefore reduce the set of interpretations of the function
symbols that must be considered to prove that a formula is universally valid to
those that are ``maximally diverse.''
We present experimental results demonstrating the efficiency of this approach
when verifying pipelined processors using the method proposed by Burch and
Dill.Comment: 46 page
Instruction-Level Abstraction (ILA): A Uniform Specification for System-on-Chip (SoC) Verification
Modern Systems-on-Chip (SoC) designs are increasingly heterogeneous and
contain specialized semi-programmable accelerators in addition to programmable
processors. In contrast to the pre-accelerator era, when the ISA played an
important role in verification by enabling a clean separation of concerns
between software and hardware, verification of these "accelerator-rich" SoCs
presents new challenges. From the perspective of hardware designers, there is a
lack of a common framework for the formal functional specification of
accelerator behavior. From the perspective of software developers, there exists
no unified framework for reasoning about software/hardware interactions of
programs that interact with accelerators. This paper addresses these challenges
by providing a formal specification and high-level abstraction for accelerator
functional behavior. It formalizes the concept of an Instruction Level
Abstraction (ILA), developed informally in our previous work, and shows its
application in modeling and verification of accelerators. This formal ILA
extends the familiar notion of instructions to accelerators and provides a
uniform, modular, and hierarchical abstraction for modeling software-visible
behavior of both accelerators and programmable processors. We demonstrate the
applicability of the ILA through several case studies of accelerators (for
image processing, machine learning, and cryptography), and a general-purpose
processor (RISC-V). We show how the ILA model facilitates equivalence checking
between two ILAs, and between an ILA and its hardware finite-state machine
(FSM) implementation. Further, this equivalence checking supports accelerator
upgrades using the notion of ILA compatibility, similar to processor upgrades
using ISA compatibility.Comment: 24 pages, 3 figures, 3 table
Verification of Synchronous Elastic Pipelined Systems
The constant shrinking of technology has lead to several design challenges that
the synchronous design paradigm is unable to cope with. Elastic design is a novel and
promising design paradigm that overcomes many of these challenges by using
components that are insensitive to the latencies of its inputs.
Verification is a critical problem for any design paradigm. The complexity of
elastic designs arises when the system is pipelined. We develop formal verification
techniques to verify synchronous elastic pipelined systems. Note that the goal of
verification is not to establish the correctness of the algorithm for synthesizing elastic
circuits, but instead, to find bugs and formally prove the correctness of elasticized
designs.
We develop two formal verification procedures. The first procedure checks the
correctness of elastic pipelined systems against their synchronous parent pipelined
systems. The second procedure checks the correctness of elastic pipelined systems
against their high-level non-pipelined specifications (such as an instruction set
architecture). Datatlow through elastic architectures is complicated by the insertion of
any number of elastic buffers in any place in the design. We introduce elastic tokenflow
diagrams, which arc used to track the flow of data in elastic architectures. We
provide a method to construct such diagrams. We also develop highly automated and
systematic procedures based on elastic token-flow diagrams that compute functions that map states of elastic systems to states of their specifications. Such functions, known as
refinement maps, are used to compare behaviors of elastic and synchronous systems and
hence prove their equivalence. We elasticized a 5-stage DLX processor that enables the
insertion of buffers in its data path. We constructed several elastic processors by
introducing up to 5 elastic buffers at various places in the data path and verified
equivalence with both their synchronous parent pipelined systems and also with their
instruction set architecture specifications
Decomposing the proof of correctness of pipelined microprocessors
technical reportWe present a systematic approach to decompose and incrementally build the proof of correctness of pipelined microprocessors. The central idea is to construct the abstraction function using completion functions, one per unfinished instruction, each of which specify the effect (on the observables) of completing the instruction. In addition to avoiding term-size and case explosion as could happen for deep and complex pipelines during flushing and helping localize errors, our method can also handle stages with iterative loops. The technique is illustrated on pipelined- as well as a superscalar pipelined implementations of a subset of the DLX architecture
- …