7,655 research outputs found
Coarse-grained reconfigurable array architectures
Coarse-Grained Reconfigurable Array (CGRA) architectures accelerate the same inner loops that benefit from the high ILP support in VLIW architectures. By executing non-loop code on other cores, however, CGRAs can focus on such loops to execute them more efficiently. This chapter discusses the basic principles of CGRAs, and the wide range of design options available to a CGRA designer, covering a large number of existing CGRA designs. The impact of different options on flexibility, performance, and power-efficiency is discussed, as well as the need for compiler support. The ADRES CGRA design template is studied in more detail as a use case to illustrate the need for design space exploration, for compiler support and for the manual fine-tuning of source code
VLSI Architecture and Design
Integrated circuit technology is rapidly approaching a state where feature sizes of one micron or less are tractable. Chip sizes are increasing slowly. These two developments result in considerably increased complexity in chip design. The physical characteristics of integrated circuit technology are also changing. The cost of communication will be dominating making new architectures and algorithms both feasible and desirable. A large
number of processors on a single chip will be possible. The cost of communication will make
designs enforcing locality superior to other types of designs.
Scaling down feature sizes results in increase of the delay that wires introduce. The delay even of metal wires will become significant. Time tends to be a local property which will make the design of globally synchronous systems more difficult. Self-timed systems will eventually become a necessity.
With the chip complexity measured in terms of logic devices increasing by more than an order of magnitude over the next few years the importance of efficient design methodologies and tools become crucial. Hierarchical and structured design are ways of dealing with the complexity of chip design. Structered design focuses on the information
flow and enforces a high degree of regularity. Both hierarchical and structured design encourage the use of cell libraries. The geometry of the cells in such libraries should be parameterized so that for instance cells can adjust there size to neighboring cells and make the proper interconnection. Cells with this quality can be used as a basis for "Silicon Compilers"
Recommended from our members
Automatic synthesis of analog layout : a survey
A review of recent research in the automatic synthesis of physical geometry for analog integrated circuits is presented. On introduction, an explanation of the difficulties involved in analog layout as opposed to digital layout is covered. Review of the literature then follows. Emphasis is placed on the exposition of general methods for addressing problems specific to analog layout, with the details of specific systems only being given when they surve to illustrate these methods well. The conclusion discusses problems remaining and offers a prediction as to how technology will evolve to solve them. It is argued that although progress has been and will continue to be made in the automation of analog IC layout, due to fundamental differences in the nature of analog IC design as opposed to digital design, it should not be expected that the level of automation of the former will reach that of the latter any time soon
Full-Stack, Real-System Quantum Computer Studies: Architectural Comparisons and Design Insights
In recent years, Quantum Computing (QC) has progressed to the point where
small working prototypes are available for use. Termed Noisy Intermediate-Scale
Quantum (NISQ) computers, these prototypes are too small for large benchmarks
or even for Quantum Error Correction, but they do have sufficient resources to
run small benchmarks, particularly if compiled with optimizations to make use
of scarce qubits and limited operation counts and coherence times. QC has not
yet, however, settled on a particular preferred device implementation
technology, and indeed different NISQ prototypes implement qubits with very
different physical approaches and therefore widely-varying device and machine
characteristics.
Our work performs a full-stack, benchmark-driven hardware-software analysis
of QC systems. We evaluate QC architectural possibilities, software-visible
gates, and software optimizations to tackle fundamental design questions about
gate set choices, communication topology, the factors affecting benchmark
performance and compiler optimizations. In order to answer key cross-technology
and cross-platform design questions, our work has built the first top-to-bottom
toolflow to target different qubit device technologies, including
superconducting and trapped ion qubits which are the current QC front-runners.
We use our toolflow, TriQ, to conduct {\em real-system} measurements on 7
running QC prototypes from 3 different groups, IBM, Rigetti, and University of
Maryland. From these real-system experiences at QC's hardware-software
interface, we make observations about native and software-visible gates for
different QC technologies, communication topologies, and the value of
noise-aware compilation even on lower-noise platforms. This is the largest
cross-platform real-system QC study performed thus far; its results have the
potential to inform both QC device and compiler design going forward.Comment: Preprint of a publication in ISCA 201
Recommended from our members
Behavioral synthesis from VHDL using structured modeling
This dissertation describes work in behavioral synthesis involving the development of a VHDL Synthesis System VSS which accepts a VHDL behavioral input specification and performs technology independent synthesis to generate a circuit netlist of generic components. The VHDL language is used for input and output descriptions. An intermediate representation which incorporates signal typing and component attributes simplifies compilation and facilitates design optimization.A Structured Modeling methodology has been developed to suggest standard VHDL modeling practices for synthesis. Structured modeling provides recommendations for the use of available VHDL description styles so that optimal designs will be synthesized.A design composed of generic components is synthesized from the input description through a process of Graph Compilation, Graph Criticism, and Design Compilation. Experiments were performed to demonstrate the effects of different modeling styles on the quality of the design produced by VSS. Several alternative VHDL models were examined for each benchmark, illustrating the improvements in design quality achieved when Structured Modeling guidelines were followed
Packet Transactions: High-level Programming for Line-Rate Switches
Many algorithms for congestion control, scheduling, network measurement,
active queue management, security, and load balancing require custom processing
of packets as they traverse the data plane of a network switch. To run at line
rate, these data-plane algorithms must be in hardware. With today's switch
hardware, algorithms cannot be changed, nor new algorithms installed, after a
switch has been built.
This paper shows how to program data-plane algorithms in a high-level
language and compile those programs into low-level microcode that can run on
emerging programmable line-rate switching chipsets. The key challenge is that
these algorithms create and modify algorithmic state. The key idea to achieve
line-rate programmability for stateful algorithms is the notion of a packet
transaction : a sequential code block that is atomic and isolated from other
such code blocks. We have developed this idea in Domino, a C-like imperative
language to express data-plane algorithms. We show with many examples that
Domino provides a convenient and natural way to express sophisticated
data-plane algorithms, and show that these algorithms can be run at line rate
with modest estimated die-area overhead.Comment: 16 page
- …