Search CORE

1,191 research outputs found

An automatic tool flow for the combined implementation of multi-mode circuits

Author: Al Farisi Brahim
Bruneel Karel
Cardoso João M. P.
Stroobandt Dirk
Publication venue
Publication date: 01/01/2013
Field of study

A multi-mode circuit implements the functionality of a limited number of circuits, called modes, of which at any given time only one needs to be realised. Using run-time reconfiguration of an FPGA, all the modes can be implemented on the same reconfigurable region, requiring only an area that can contain the biggest mode. Typically, conventional run-time reconfiguration techniques generate a configuration for every mode separately. To switch between modes the complete reconfigurable region is rewritten, which often leads to very long reconfiguration times. In this paper we present a novel, fully automated tool flow that exploits similarities between the modes and uses Dynamic Circuit Specialization to drastically reduce reconfiguration time. Experimental results show that the number of bits that is rewritten in the configuration memory reduces with a factor from 4.6X to 5.1X without significant performance penalties

Ghent University Academic Bibliography

Configuration Sharing Optimized Placement and Routing

Author: Cobb Jonathan E.
Stepien Piotr
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date
Field of study

Reconfigurable systems have been shown to achieve very high computational performance. However, the overhead associated with reconfiguration of hardware remains a critical factor in overall system performance. This paper discusses the development and evaluation of a technique to minimize the delay associated with reconfiguration based upon optimized sharing of configuration bit streams between design contexts. This is achieved through modified placement and routing algorithms

Bournemouth University Research Online

OpenPARF: An Open-Source Placement and Routing Framework for Large-Scale Heterogeneous FPGAs with Deep Learning Toolkit

Author: Di Zhixiong
Liang Yun
Lin Yibo
Luo Guojie
Mai Jing
Wang Jiarui
Publication venue
Publication date: 28/06/2023
Field of study

This paper proposes OpenPARF, an open-source placement and routing framework for large-scale FPGA designs. OpenPARF is implemented with the deep learning toolkit PyTorch and supports massive parallelization on GPU. The framework proposes a novel asymmetric multi-electrostatic field system to solve FPGA placement. It considers fine-grained routing resources inside configurable logic blocks (CLBs) for FPGA routing and supports large-scale irregular routing resource graphs. Experimental results on ISPD 2016 and ISPD 2017 FPGA contest benchmarks and industrial benchmarks demonstrate that OpenPARF can achieve 0.4-12.7% improvement in routed wirelength and more than

2\times

speedup in placement. We believe that OpenPARF can pave the road for developing FPGA physical design engines and stimulate further research on related topics

arXiv.org e-Print Archive

Circuit design and analysis for on-FPGA communication systems

Author: Mak Terrence Sui-Tung
Mak Terrence Sui-Tung
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/04/2010
Field of study

On-chip communication system has emerged as a prominently important subject in Very-Large- Scale-Integration (VLSI) design, as the trend of technology scaling favours logics more than interconnects. Interconnects often dictates the system performance, and, therefore, research for new methodologies and system architectures that deliver high-performance communication services across the chip is mandatory. The interconnect challenge is exacerbated in Field-Programmable Gate Array (FPGA), as a type of ASIC where the hardware can be programmed post-fabrication. Communication across an FPGA will be deteriorating as a result of interconnect scaling. The programmable fabrics, switches and the specific routing architecture also introduce additional latency and bandwidth degradation further hindering intra-chip communication performance. Past research efforts mainly focused on optimizing logic elements and functional units in FPGAs. Communication with programmable interconnect received little attention and is inadequately understood. This thesis is among the first to research on-chip communication systems that are built on top of programmable fabrics and proposes methodologies to maximize the interconnect throughput performance. There are three major contributions in this thesis: (i) an analysis of on-chip interconnect fringing, which degrades the bandwidth of communication channels due to routing congestions in reconfigurable architectures; (ii) a new analogue wave signalling scheme that significantly improves the interconnect throughput by exploiting the fundamental electrical characteristics of the reconfigurable interconnect structures. This new scheme can potentially mitigate the interconnect scaling challenges. (iii) a novel Dynamic Programming (DP)-network to provide adaptive routing in network-on-chip (NoC) systems. The DP-network architecture performs runtime optimization for route planning and dynamic routing which, effectively utilizes the in-silicon bandwidth. This thesis explores a new horizon in reconfigurable system design, in which new methodologies and concepts are proposed to enhance the on-FPGA communication throughput performance that is of vital importance in new technology processes

Spiral - Imperial College Digital Repository

Placement and Routing in 3D Integrated Circuits

Author: B. Goplen
C. Ababei
H. Mogal
K. Bazargan
S. Sapatnekar
Tianpei Zhang
Yan Feng
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date
Field of study

Crossref

A Finite Domain Constraint Approach for Placement and Routing of Coarse-Grained Reconfigurable Architectures

Author: Saraswat Rohit
Publication venue: DigitalCommons@USU
Publication date: 01/05/2010
Field of study

Scheduling, placement, and routing are important steps in Very Large Scale Integration (VLSI) design. Researchers have developed numerous techniques to solve placement and routing problems. As the complexity of Application Specific Integrated Circuits (ASICs) increased over the past decades, so did the demand for improved place and route techniques. The primary objective of these place and route approaches has typically been wirelength minimization due to its impact on signal delay and design performance. With the advent of Field Programmable Gate Arrays (FPGAs), the same place and route techniques were applied to FPGA-based design. However, traditional place and route techniques may not work for Coarse-Grained Reconfigurable Architectures (CGRAs), which are reconfigurable devices offering wider path widths than FPGAs and more flexibility than ASICs, due to the differences in architecture and routing network. Further, the routing network of several types of CGRAs, including the Field Programmable Object Array (FPOA), has deterministic timing as compared to the routing fabric of most ASICs and FPGAs reported in the literature. This necessitates a fresh look at alternative approaches to place and route designs. This dissertation presents a finite domain constraint-based, delay-aware placement and routing methodology targeting an FPOA. The proposed methodology takes advantage of the deterministic routing network of CGRAs to perform a delay aware placement

DigitalCommons@USU

Coarse-grained reconfigurable array architectures

Author: A Lambrechts
B Bougard
B Bougard
B Mei
B Mei
B Mei
B Sutter De
G Venkataramani
H Park
H Park
J Lee
JMP Cardoso
JW Waerdt van de
K Berkel van
K Bondalapati
K Sankaralingam
KE Coons
LH Lee
M Ahn
M Gebhart
M Schlansker
M Taylor
M Woh
MD Galanis
MH Lee
S Friedman
SA Mahlke
T Oh
Y Kim
Y Kim
Y Kim
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010
Field of study

Coarse-Grained Reconﬁgurable Array (CGRA) architectures accelerate the same inner loops that beneﬁt from the high ILP support in VLIW architectures. By executing non-loop code on other cores, however, CGRAs can focus on such loops to execute them more efﬁciently. This chapter discusses the basic principles of CGRAs, and the wide range of design options available to a CGRA designer, covering a large number of existing CGRA designs. The impact of different options on ﬂexibility, performance, and power-efﬁciency is discussed, as well as the need for compiler support. The ADRES CGRA design template is studied in more detail as a use case to illustrate the need for design space exploration, for compiler support and for the manual ﬁne-tuning of source code

Crossref

Ghent University Academic Bibliography

Beyond the arithmetic constraint: depth-optimal mapping of logic chains in reconfigurable fabrics

Author: Frederick Michael Todd
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2008
Field of study

Look-up table based FPGAs have migrated from a niche technology for design prototyping to a valuable end-product component and, in some cases, a replacement for general purpose processors and ASICs alike. One way architects have bridged the performance gap between FPGAs and ASICs is through the inclusion of specialized components such as multipliers, RAM modules, and microcontrollers. Another dedicated structure that has become standard in reconfigurable fabrics is the arithmetic carry chain. Currently, it is only used to map arithmetic operations as identified by HDL macros. For non-arithmetic operations, it is an idle but potentially powerful resource.;Obstacles to using the carry chain for generic logic operations include lack of architectural and computer-aided design support. Current carry-select architectures facilitate carry chain reuse, although they do so only for (K-1)-input operations. Additionally, hardware description language (HDL) macros are the only recourse for a designer wishing to map generic logic chains in a carry-select architecture. A novel architecture that allows the full K-input operational capacity of the carry chain to be harnessed is presented as a solution to current architectural limitations. It is shown to have negligible impact on logic element area and delay. Using only two additional 2:1 pass transistor multiplexers, it enables the transmission of a K-input operation to the carry chain and general routing simultaneously. To successfully identify logic chains in an arbitrary Boolean network, ChainMap is presented as a novel technology mapping algorithm. ChainMap creates delay-optimal generic logic chains in polynomial time without HDL macros. It maps both arithmetic and non-arithmetic logic chains whenever depth increasing nodes, which increase logic depth but not routing depth, are encountered. Use of the chain is not reserved for arithmetic, but rather any set of gates exhibiting similar characteristics. By using the carry chain as a generic, near zero-delay adjacent cell interconnection structure a potential average optimal speedup of 1.4x is revealed. Post place and route experiments indicate that ChainMap solutions perform similarly to HDL chains when cluster resources are abundant and significantly better in cluster-constrained arrays

Digital Repository @ Iowa State University (ISU)