150 research outputs found

    Unifying mesh- and tree-based programmable interconnect

    We examine the traditional, symmetric, Manhattan mesh design for field-programmable gate-array (FPGA) routing along with tree-of-meshes (ToM) and mesh-of-trees (MoT) based designs. All three networks can provide general routing for limited bisection designs (Rent's rule with p<1) and allow locality exploitation. They differ in their detailed topology and use of hierarchy. We show that all three have the same asymptotic wiring requirements. We bound this tightly by providing constructive mappings between routes in one network and routes in another. For example, we show that a (c,p) MoT design can be mapped to a (2c,p) linear population ToM and introduce a corner turn scheme which will make it possible to perform the reverse mapping from any (c,p) linear population ToM to a (2c,p) MoT augmented with a particular set of corner turn switches. One consequence of this latter mapping is a multilayer layout strategy for N-node, linear population ToM designs that requires only Θ(N) two-dimensional area for any p when given sufficient wiring layers. We further show upper and lower bounds for global mesh routes based on recursive bisection width and show these are within a constant factor of each other and within a constant factor of MoT and ToM layout area. In the process we identify the parameters and characteristics which make the networks different, making it clear there is a unified design continuum in which these networks are simply particular regions
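
The (c,p) notation in the abstract above refers to Rent's rule, which is commonly written as IO = c · N^p for a region of N nodes. The following is a minimal sketch of that relationship only, not the paper's mapping construction; the function names `rent_io` and `channel_widths_by_level` and the binary-hierarchy assumption are hypothetical, chosen for illustration.

```python
def rent_io(n_nodes: int, c: float, p: float) -> float:
    """Rent's rule estimate of signals crossing the boundary of an n-node region.

    Assumes the common form IO = c * N**p, with c the base channel count
    and p the Rent exponent (p < 1 for limited-bisection designs).
    """
    return c * n_nodes ** p


def channel_widths_by_level(n_nodes: int, c: float, p: float):
    """List (subtree_size, estimated_width) for each level of a binary hierarchy.

    Hypothetical helper: subtree sizes double at each level until they
    reach n_nodes, and the width at each level follows rent_io.
    """
    widths = []
    size = 1
    while size <= n_nodes:
        widths.append((size, rent_io(size, c, p)))
        size *= 2
    return widths


if __name__ == "__main__":
    # Compare a (c, p) design with a (2c, p) design: the estimated width
    # doubles at every level while the growth exponent p stays unchanged.
    for size, width in channel_widths_by_level(16, c=4, p=0.5):
        print(size, width, rent_io(size, c=8, p=0.5))
```

Under this form, moving from a (c,p) to a (2c,p) network changes wiring only by a constant factor, which is consistent with the abstract's claim that the networks have the same asymptotic wiring requirements.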

    High-Level Annotation of Routing Congestion for Xilinx Vivado HLS Designs

    Ever since transistor cost stopped decreasing, customized programmable platforms, such as field-programmable gate arrays (FPGAs), have become a major way to improve software execution performance and energy consumption. While software developers can use high-level synthesis (HLS) to speed up register-transfer level (RTL) code generation from C++ or OpenCL source code, placement and routing issues, such as congestion, can still prevent achieving an FPGA programming bitstream or dramatically reduce the FPGA implementation performance. Congestion reports from physical design tools refer to thousands of RTL signal names instead of developer-accessible identifiers and statements, which considerably complicates the developer's understanding and resolution of the issues at the source level. We propose a high-level back-annotation flow that summarizes the routing congestion issues at the source level by analyzing the reports from the FPGA physical design tools and the internal debugging files of the HLS tools. Our flow describes congestion using comments back-annotated on the source code and identifies whether the congestion is caused by the on-chip memories or the DSP units (multipliers/adders), which are the shared resources very often associated with routing problems on FPGAs. We demonstrate on realistic large designs how the information provided by our flow helps to quickly spot congestion causes at the source level and to solve them using appropriate HLS directives

    Towards Machine Learning-Based FPGA Backend Flow: Challenges and Opportunities

    Field-Programmable Gate Array (FPGA) is at the core of System on Chip (SoC) design across various Industry 5.0 digital systems—healthcare devices, farming equipment, autonomous vehicles and aerospace gear to name a few. Given that pre-silicon verification using Computer Aided Design (CAD) accounts for about 70% of the time and money spent on the design of modern digital systems, this paper summarizes the machine learning (ML)-oriented efforts in different FPGA CAD design steps. With the recent breakthrough of machine learning, FPGA CAD tasks—high-level synthesis (HLS), logic synthesis, placement and routing—are seeing a renewed interest in their respective decision-making steps. We focus on machine learning-based CAD tasks to suggest some pertinent research areas requiring more focus in CAD design. The development of open-source benchmarks optimized for an end-to-end machine learning experience, intra-FPGA optimization, domain-specific accelerators, lack of explainability and federated learning are the issues reviewed to identify important research spots requiring significant focus. The potential of the new cloud-based architectures to understand the application of the right ML algorithms in FPGA CAD decision-making steps is discussed, together with visualizing the scenario of incorporating more intelligence in the cloud platform, with the help of relatively newer technologies such as CAD as Adaptive OpenPlatform Service (CAOS). Altogether, this research explores several research opportunities linked with modern FPGA CAD flow design, which will serve as a single point of reference for modern FPGA CAD flow design

    A Router for Symmetrical FPGAs based on Exact Routing Density Evaluation

    This paper presents a new performance and routability driven routing algorithm for symmetrical array based field-programmable gate arrays (FPGAs). A key contribution of our work is to overcome one essential limitation of the previous routing algorithms: inaccurate estimations of routing density which were too general for symmetrical FPGAs. To this end, we derive an exact routing density calculation that is based on a precise analysis of the structure (switch block) of symmetrical FPGAs, and utilize it consistently in global and detailed routings. With an introduction of the proposed accurate routing metrics, we design a new routing algorithm called a cost-effective net-decomposition based routing which is fast, and yet produces remarkable routing results in terms of both routability and path/net delays. We performed an extensive experiment to show the effectiveness of our algorithm based on the proposed cost metrics

    Circuit Design of Programmable Logic and Interconnect Blocks using Spin Transfer Torque RAM for Non-Volatile FPGAs

    Most Field-Programmable Gate Arrays (FPGAs) are currently SRAM-based. Conventional SRAM has been the primary choice for memory storage in the Configurable Logic Blocks (CLBs) as well as for the configuration bits of the reconfigurable interconnects. However, SRAM-based FPGAs are volatile and need an external non-volatile memory to store the configuration data. Also, SRAM leakage currents increase as technology scales towards lower nodes. The use of non-volatile memories such as Spin-Transfer Torque (STT)-RAM helps to overcome the drawbacks of SRAM-based FPGAs without a significant speed penalty. In this paper we present the design of simple non-volatile CLBs using STT-RAM technology. To verify the design, these CLBs have been programmed to implement various functions. The design has been simulated and verified using Cadence tools in 40 nm CMOS technology

    Hybrid FPGA: Architecture and Interface

    Hybrid FPGAs (Field Programmable Gate Arrays) are composed of general-purpose logic resources with different granularities, together with domain-specific coarse-grained units. This thesis proposes a novel hybrid FPGA architecture with embedded coarse-grained Floating Point Units (FPUs) to improve the floating point capability of FPGAs. Based on the proposed hybrid FPGA architecture, we examine three aspects to optimise the speed and area for domain-specific applications. First, we examine the interface between large coarse-grained embedded blocks (EBs) and fine-grained elements in hybrid FPGAs. The interface includes parameters for varying: (1) aspect ratio of EBs, (2) position of the EBs in the FPGA, (3) I/O pins arrangement of EBs, (4) interconnect flexibility of EBs, and (5) location of additional embedded elements such as memory. Second, we examine the interconnect structure for hybrid FPGAs. We investigate how large and high-density EBs affect the routing demand for hybrid FPGAs over a set of domain-specific applications. We then propose three routing optimisation methods to meet the additional routing demand introduced by large EBs: (1) identifying the best separation distance between EBs, (2) adding routing switches on EBs to increase routing flexibility, and (3) introducing wider channel width near the edge of EBs. We study and compare the trade-offs in delay, area and routability of these three optimisation methods. Finally, we employ common subgraph extraction to determine the number of floating point adders/subtractors, multipliers and wordblocks in the FPUs. The wordblocks include registers and can implement fixed point operations. We study the area, speed and utilisation trade-offs of the selected FPU subgraphs in a set of floating point benchmark circuits. We develop an optimised coarse-grained FPU, taking into account both architectural and system-level issues. Furthermore, we investigate the trade-offs between granularities and performance by composing small FPUs into a large FPU. The results of this thesis would help design a domain-specific hybrid FPGA to meet user requirements, by optimising for speed, area or a combination of speed and area

    AI/ML Algorithms and Applications in VLSI Design and Technology

    An evident challenge ahead for the integrated circuit (IC) industry in the nanometer regime is the investigation and development of methods that can reduce the design complexity ensuing from growing process variations and curtail the turnaround time of chip manufacturing. Conventional methodologies employed for such tasks are largely manual; thus, time-consuming and resource-intensive. In contrast, the unique learning strategies of artificial intelligence (AI) provide numerous exciting automated approaches for handling complex and data-intensive tasks in very-large-scale integration (VLSI) design and testing. Employing AI and machine learning (ML) algorithms in VLSI design and manufacturing reduces the time and effort for understanding and processing the data within and across different abstraction levels via automated learning algorithms. It, in turn, improves the IC yield and reduces the manufacturing turnaround time. This paper thoroughly reviews the AI/ML automated approaches introduced in the past towards VLSI design and manufacturing. Moreover, we discuss the scope of AI/ML applications in the future at various abstraction levels to revolutionize the field of VLSI design, aiming for high-speed, highly intelligent, and efficient implementations