21 research outputs found
Recommended from our members
Exploiting fast carry-chains of FPGAs for designing compressor trees
Fast carry chains featuring dedicated adder circuitry are a distinctive feature of modern FPGAs. The carry chains bypass the general routing network and are embedded in the logic blocks of FPGAs for fast addition. Conventional intuition is that such carry chains can be used only for implementing carry-propagate addition; state-of-the-art FPGA synthesizers can exploit the carry chains only for these specific circuits. This paper demonstrates that the carry chains can be used to build compressor trees, i.e., multi-input addition circuits used for parallel accumulation and partial product reduction for parallel multipliers implemented in FPGA logic. The key to our technique is to program the lookup tables (LUTs) in the logic blocks to stop the propagation of carry bits along the carry chain at appropriate points. This approach improves the area of compressor trees significantly compared to previous methods that synthesized compressor trees solely on LUTs, without compromising the performance gain over trees built from ternary carry-propagate adders. ©2009 IEEE
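For readers unfamiliar with compressor trees, the following is a minimal software sketch of the general idea, assuming the textbook 3:2 carry-save compressor as the building block. It is not the paper's LUT and carry-chain mapping; the function names `compress_3_to_2` and `compressor_tree_sum` are hypothetical and chosen only for illustration.

```python
def compress_3_to_2(a: int, b: int, c: int) -> tuple[int, int]:
    """Carry-save step: three operands in, two operands out, same total."""
    partial_sum = a ^ b ^ c                        # per-bit sum, no carry propagation
    carries = ((a & b) | (a & c) | (b & c)) << 1   # per-bit carries, moved up one rank
    return partial_sum, carries


def compressor_tree_sum(operands: list[int]) -> int:
    """Reduce non-negative operands to at most two rows, then add them once."""
    rows = list(operands)
    while len(rows) > 2:
        next_rows = []
        # Compress each full group of three rows; leftover rows pass through.
        full = len(rows) - len(rows) % 3
        for i in range(0, full, 3):
            next_rows.extend(compress_3_to_2(rows[i], rows[i + 1], rows[i + 2]))
        next_rows.extend(rows[full:])
        rows = next_rows
    # The single carry-propagate addition at the end of the tree.
    return sum(rows)
```

The point of the structure is that carries are only propagated once, in the final addition; every earlier level works on each bit position independently.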
Reducing the pressure on routing resources of FPGAs with generic logic chains
Routing resources in modern FPGAs use 50% of the silicon real estate and are significant contributors to critical path delay and power consumption; the situation gets worse with each successive process generation, as transistors scale more effectively than wires. To cope with these challenges, FPGA architects have divided wires into local and global categories and introduced fast dedicated carry chains between adjacent logic cells, which reduce routing resource usage for certain arithmetic circuits (primarily adders and subtractors). Inspired by the carry chains, we generalize the idea to connect lookup tables (LUTs) in adjacent logic cells. By exploiting the fracturable structure of LUTs in current FPGA generations, we increase the utilization of the existing LUTs in the logic cell by providing new inputs along the logic chain, but without increasing the I/O bandwidth from the programmable interconnect. This allows us to increase the logic density of the configurable logic cells while reducing demand for routing resources, as long as the mapping tools are able to exploit the logic chains. Our experiments using the combinational MCNC benchmarks and comparing against an Altera Stratix-III FPGA show that the introduction of logic chains reduces the average usage of local routing wires by 37%, with a 12% reduction in total wiring (local and global); this translates to improvements in dynamic power consumption of 18% in the routing network and 10% overall, while utilizing 4% fewer logic cells, on average. Copyright 2011 ACM
Routing Wire Optimization through Generic Synthesis on FPGA Carry Chains
Improved Synthesis of Compressor Trees on FPGAs by a Hybrid and Systematic Design Approach
Improving arithmetic circuits on FPGAs is one of the main imperatives of FPGA vendors. Augmenting logic cells with dedicated arithmetic components such as adders and carry chains indicates the need for such improvements. In a prior work, we showed how the carry chains in state-of-the-art Altera FPGAs could be exploited for synthesis of compressor trees. In that work, we proposed generalized parallel counters (GPCs) as the building blocks and mapped them to the logic cells of the FPGA using LUTs and carry chains. In this paper, we propose a novel technique to increase the logic density of compressor tree synthesis by sharing the logic cells between two neighboring GPCs in a chain. Moreover, we expand the GPC library with bigger GPCs and we propose a systematic approach to select the right GPCs based on the synthesis optimization targets. Finally, we demonstrate that our framework can be retargeted to Xilinx Virtex-5 FPGAs with minor modifications.
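As a rough illustration of what a generalized parallel counter computes, the sketch below models a GPC purely as arithmetic: it takes input bits of several ranks (weights 2**rank) and returns their weighted sum as a binary number. This assumes the usual definition of a GPC and says nothing about the LUT or carry-chain mapping in the paper. The names `gpc` and `gpc_output_width`, and the rank-0-first list ordering, are conventions invented for this example.

```python
def gpc(column_bits: list[list[int]]) -> int:
    """Weighted count of input bits; column_bits[r] holds the bits of rank r."""
    return sum(sum(bits) << rank for rank, bits in enumerate(column_bits))


def gpc_output_width(inputs_per_rank: list[int]) -> int:
    """Number of output bits needed when every input bit is 1."""
    max_value = sum(count << rank for rank, count in enumerate(inputs_per_rank))
    return max(1, max_value.bit_length())


# Five rank-0 inputs and one rank-1 input: the largest sum is 5 + 2 = 7,
# so three output bits suffice.
width = gpc_output_width([5, 1])
value = gpc([[1, 1, 0, 1, 1], [1]])   # four rank-0 ones plus one rank-1 one
```

A plain parallel counter is the special case with inputs of a single rank, for example `gpc_output_width([6])` for six rank-0 inputs.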