1,874 research outputs found

    OPTIMIZING LARGE COMBINATIONAL NETWORKS FOR K-LUT BASED FPGA MAPPING

    Get PDF
    Optimizing by partitioning is a central problem in VLSI design automation, addressing circuit’s manufacturability. Circuit partitioning has multiple applications in VLSI design. One of the most common is that of dividing combinational circuits (usually large ones) that will not fit on a single package among a number of packages. Partitioning is of practical importance for k-LUT based FPGA circuit implementation. In this work is presented multilevel a multi-resource partitioning algorithm for partitioning large combinational circuits in order to efficiently use existing and commercially available FPGAs packagestwo-way partitioning, multi-way partitioning, recursive partitioning, flat partitioning, critical path, cutting cones, bottom-up clusters, top-down min-cut

    Experimental Evaluation and Comparison of Time-Multiplexed Multi-FPGA Routing Architectures

    Get PDF
    Emulating large complex designs require multi-FPGA systems (MFS). However, inter-FPGA communication is confronted by the challenge of lack of interconnect capacity due to limited number of FPGA input/output (I/O) pins. Serializing parallel signals onto a single trace effectively addresses the limited I/O pin obstacle. Besides the multiplexing scheme and multiplexing ratio (number of inter-FPGA signals per trace), the choice of the MFS routing architecture also affect the critical path latency. The routing architecture of an MFS is the interconnection pattern of FPGAs, fixed wires and/or programmable interconnect chips. Performance of existing MFS routing architectures is also limited by off-chip interface selection. In this dissertation we proposed novel 2D and 3D latency-optimized time-multiplexed MFS routing architectures. We used rigorous experimental approach and real sequential benchmark circuits to evaluate and compare the proposed and existing MFS routing architectures. This research provides a new insight into the encouraging effects of using off-chip optical interface and three dimensional MFS routing architectures. The vertical stacking results in shorter off-chip links improving the overall system frequency with the additional advantage of smaller footprint area. The proposed 3D architectures employed serialized interconnect between intra-plane and inter-plane FPGAs to address the pin limitation problem. Additionally, all off-chip links are replaced by optical fibers that exhibited latency improvement and resulted in faster MFS. Results indicated that exploiting third dimension provided latency and area improvements as compared to 2D MFS. We also proposed latency-optimized planar 2D MFS architectures in which electrical interconnections are replaced by optical interface in same spatial distribution. Performance evaluation and comparison showed that the proposed architectures have reduced critical path delay and system frequency improvement as compared to conventional MFS. We also experimentally evaluated and compared the system performance of three inter-FPGA communication schemes i.e. Logic Multiplexing, SERDES and MGT in conjunction with two routing architectures i.e. Completely Connected Graph (CCG) and TORUS. Experimental results showed that SERDES attained maximum frequency than the other two schemes. However, for very high multiplexing ratios, the performance of SERDES & MGT became comparable

    Theoretical and algorithmic approaches to field-programmable gate array partitioning

    Get PDF
    Many practical problems dealing with the design of Very Large Scale Integrated (VLSI) circuits can be modeled as graphs in which vertices represent components of the circuit and edges represent a relationship between these components. When expressed as graphs, these problems can then often be solved using graph theoretic methods. Unfortunately, many such problems are NP-complete, hence no practical exact solutions are known to exist. In this dissertation, we study NP-complete problems taken from the realm of partitioning for Field-Programmable Gate Arrays (FPGAs). We adopt a two-pronged approach of theory and practice, developing practical heuristics driven by theoretical study. The theoretical approach is motivated by well-quasi-order (WQO) theory, which can be used to show, among other things, that when some hard problems have fixed parameters, polynomial-time solutions exist. This is of significance in the area of FPGA partitioning, in which practical problems are often characterized by fixed parameter instances. WQO techniques are not generally practical, however, and we develop new methods to solve several important problems in VLSI that are not even amenable to WQO techniques. We begin with a representative partitioning problem, Min Degree Graph Partition (MDGP), the fixed-parameter version of which is closed under the immersion order. \Ve show that the obstruction set ( set of immersion minimal elements) for this problem is computable; we prove both upper and lower bounds on the obstruction set size; and we completely characterize all fixed-parameter MDGP simple tree obstructions. WQO theory tells us only that fixed-parameter MDGP is solvable in (high-degree) polynomial time. We attack the problem using what we refer to as kd-candidate subsets, culminating in linear-time decision and search algorithms. The kd-candidate subset method also paves the way for an efficient heuristic for the FPGA Minimization problem. We then broaden our scope to incorporate delay minimization into FPGA partitioning. We develop, analyze and test a novel method called critical path compression, inspired in part by compiler optimization techniques. We then look at a variety of generalizations of MDGP. Some of these problems are not immersion closed; others are not even defined in a way that WQO theory applies. However, almost all of them are efficiently solvable via the kd-candidate subset approach. Interspersed in these results are many refinements of what is known about the complexity of these problems. We also discuss a few other solution strategies, and present many open problems

    New FPGA design tools and architectures

    Get PDF

    Partitioning of large HDL ASIC designs into multiple FPGA devices for prototyping and verification

    Full text link
    The ASIC (Application specific Integrated Circuit) designs grow continuously bigger and bigger. This causes dramatic increase in the simulation run time. It is very hard to simulate these designs because the simulation time has risen from hours to days and weeks. Hardware Embedded Simulation (HES) is a technology that facilitates incremental design verification of ASICs. The FPGAs (Field Programmable Gate Arrays) can play an important role in ASIC design cycle. But it is not possible to fit an entire ASIC design into a single FPGA device. This problem can be solved by partitioning the given design into multiple small size designs (modules) and fitting those modules into multiple FPGAs. The purpose of my thesis is to take a large RTL (Register Transfer Level) design of an ASIC into consideration, write and test the software ( C code) practically to synthesize each top level module and analyze the size of each module in terms of number of CLBs (Configurable Logic Blocks), I/Os, flip-flops, latches and apply the algorithm to partition it automatically into minimum number of FPGAs

    Real-time simulation of dynamic vehicle models using high performance reconfigurable computing platforms

    Get PDF
    A software-based approach for Real-Time Simulation (RTS) may have difficulties in meeting real-time constraints for complex models. In this thesis, we present a methodology for the design and implementation of RTS algorithms, based on the use of Field-Programmable Gate Array (FPGA) technology to improve the response time of these models. Our methodology utilizes traditional Hardware/Software co-design approaches to generate a heterogeneous architecture for an FPGA-based simulator. We have optimized the hardware design such that it efficiently utilizes the parallel nature of FPGAs and pipelines the independent operations. Further enhancement is obtained through the use of custom accelerators for common non-linear functions. Since the systems we examine have relatively low response time requirements, our approach greatly simplifies the software components by porting the computationally complex regions to hardware. We illustrate the partitioning of a hardware-based simulator design across dual FPGAs, initiate RTS using a system input from a Hardware-in-the-Loop (HIL) framework, and use these simulation results from our FPGA-based platform to perform response analysis. The total simulation time, which includes the time required to receive the system input over a socket (without HIL), software initialization, hardware computation, and transfer of simulation results back over a socket, shows a speedup of 2x as compared to a similar setup with no hardware acceleration. The correctness of the simulation output from the hardware has also been validated with the simulated results from the software-only design

    Developments and experimental evaluation of partitioning algorithms for adaptive computing systems

    Get PDF
    Multi-FPGA systems offer the potential to deliver higher performance solutions than traditional computers for some low-level computing tasks. This requires a flexible hardware substrate and an automated mapping system. CHAMPION is an automated mapping system for implementing image processing applications in multi-FPGA systems under development at the University of Tennessee. CHAMPION will map applications in the Khoros Cantata graphical programming environment to hardware. The work described in this dissertation involves the automation of the CHAMPION backend design flow, which includes the partitioning problem, netlist to structural VHDL conversion, synthesis and placement and routing, and host code generation. The primary goal is to investigate the development and evaluation of three different k-way partitioning approaches. In the first and the second approaches, we discuss the development and implementation of two existing algorithms. The first approach is a hierarchical partitioning method based on topological ordering (HP). The second approach is a recursive algorithm based on the Fiduccia and Mattheyses bipartitioning heuristic (RP). We extend these algorithms to handle the multiple constraints imposed by adaptive computing systems. We also introduce a new recursive partitioning method based on topological ordering and levelization (RPL). In addition to handling the partitioning constraints, the new approach efficiently addresses the problem of minimizing the number of FPGAs used and the amount of computation, thereby overcoming some of the weaknesses of the HP and RP algorithms

    Efficient quadratic placement for FPGAs.

    Get PDF
    Field Programmable Gate Arrays (FPGAs) are widely used in industry because they can implement any digital circuit on site simply by specifying programmable logic and their interconnections. However, this rapid prototyping advantage may be adversely affected because of the long compile time, which is dominated by placement and routing. This issue is of great importance, especially as the logic capacities of FPGAs continue to grow. This thesis focuses on the placement phase of FPGA Computer Aided Design (CAD) flow and presents a fast, high quality, wirelength-driven placement algorithm for FPGAs that is based on the quadratic placement approach. In this thesis, multiple iterations of equation solving process together with a linear wirelength reduction technique are introduced. The proposed algorithm efficiently handles the main problems with the quadratic placement algorithm and produces a fast and high quality placement. Experimental results, using twenty benchmark circuits, show that this algorithm can achieve comparable total wirelength and, on average, 5X faster run time when compared to an existing, state-of-the-art placement tool. This thesis also shows that the proposed algorithm delivers promising preliminary results in minimizing the critical path delay while maintaining high placement quality.Dept. of Electrical and Computer Engineering. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .X86. Source: Masters Abstracts International, Volume: 44-04, page: 1946. Thesis (M.A.Sc.)--University of Windsor (Canada), 2005

    Digital Pulse Width Modulator Techniques For Dc - Dc Converters

    Get PDF
    Recent research activities focused on improving the steady-state as well as the dynamic behavior of DC-DC converters for proper system performance, by proposing different design methods and control approaches with growing tendency to using digital implementation over analog practices. Because of the rapid advancement in semiconductors and microprocessor industry, digital control grew in popularity among PWM converters and is taking over analog techniques due to availability of fast speed microprocessors, flexibility and immunity to noise and environmental variations. Furthermore, increased interest in Field Programmable Gate Arrays (FPGA) makes it a convenient design platform for digitally controlled converters. The objective of this research is to propose new digital control schemes, aiming to improve the steady-state and transient responses of a high switching frequency FPGA-based digitally controlled DC-DC converters. The target is to achieve enhanced performance in terms of tight regulation with minimum power consumption and high efficiency at steady-state, as well as shorter settling time with optimal over- and undershoots during transients. The main task is to develop new and innovative digital PWM techniques in order to achieve: 1. Tight regulation at steady-state: by proposing high resolution DPWM architecture, based on Digital Clock Management (DCM) resources available on FPGA boards. The proposed architecture Window-Masked Segmented Digital Clock Manager-FPGA based Digital Pulse Width Modulator Technique, is designed to achieve high resolution operating at high switching frequencies with minimum power consumption. 2. Enhanced dynamic response: by applying a shift to the basic saw-tooth DPWM signal, in order to benefit from the best linearity and simplest architecture offered by the conventional counter-comparator DPWM. This proposed control scheme will help the compensator reach the steady-state value faster. Dynamically Shifted Ramp Digital Control Technique for Improved Transient Response in DC-DC Converters, is projected to enhance the transient response by dynamically controlling the ramp signal of the DPWM unit
    corecore