Scaled Quantization for the Vision Transformer
Quantization using a small number of bits shows promise for reducing latency
and memory usage in deep neural networks. However, most quantization methods
cannot readily handle complicated functions such as exponential and square
root, and prior approaches involve complex training processes that must
interact with floating-point values. This paper proposes a robust method for
the full integer quantization of vision transformer networks without requiring
any intermediate floating-point computations. The quantization techniques can
be applied in various hardware or software implementations, including
processor/memory architectures and FPGAs.
Comment: 9 pages, 0 figures
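To illustrate what evaluating a "complicated" function without any floating-point step can look like, here is a minimal sketch of an integer-only square root using Newton's iteration. It is a generic textbook technique, not the method proposed in the paper, and the function name `isqrt_newton` is hypothetical.

```python
def isqrt_newton(n: int) -> int:
    """Return floor(sqrt(n)) using only integer shifts, adds and divides.

    Generic illustration only; NOT the quantization scheme from the paper.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    if n < 2:
        return n
    # Initial guess: a power of two that is guaranteed to be >= sqrt(n).
    x = 1 << ((n.bit_length() + 1) // 2)
    while True:
        # Newton step for f(x) = x^2 - n, carried out in integer arithmetic.
        y = (x + n // x) >> 1
        if y >= x:
            # The sequence has stopped decreasing, so x is floor(sqrt(n)).
            return x
        x = y
```

Every intermediate value in the loop is an integer, which is the property the abstract asks of a fully integer pipeline.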
Optimality of bus-invert coding
Dynamic power dissipation on I/O buses is an important issue for high-speed communication between chips. Coding techniques can reduce the number of signal transitions and thereby the dynamic power. Bus-invert coding is one popular technique for interchip buses, where the dominant contribution is from the self-capacitance of the wires. The algorithm uses an invert line to signal whether the bus data are in their original or inverted form. While the method appears to be a greedy algorithm, we show that it is, in fact, an optimal strategy. To do so, we first represent the bus and invert line using a trellis diagram. Then, we show that applying bus-invert coding to a sequence of words gives the same result as would be obtained by using the Viterbi algorithm, which is known to be optimal. We also show that partitioning an M-bit bus into P subbuses and using bus-invert coding on each subbus can be described as applying the Viterbi algorithm on a 2^P-state trellis.
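The claim in this abstract can be checked numerically with a small sketch. The code below assumes the common formulation in which a transition on the invert line counts the same as a transition on a data wire. It implements the greedy encoder and, separately, a two-state Viterbi-style search over the invert bit. All function names are hypothetical and the code is not taken from the paper.

```python
from typing import List, Tuple


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def bus_invert_encode(words: List[int], width: int, start: int = 0) -> List[Tuple[int, int]]:
    """Greedy bus-invert coding; returns (bus_value, invert_bit) per word."""
    mask = (1 << width) - 1
    prev_bus, prev_inv = start & mask, 0
    out = []
    for w in words:
        w &= mask
        # Wire transitions (data lines plus invert line) for each choice.
        cost_plain = hamming(prev_bus, w) + prev_inv
        cost_inverted = hamming(prev_bus, w ^ mask) + (1 - prev_inv)
        if cost_inverted < cost_plain:
            prev_bus, prev_inv = w ^ mask, 1
        else:
            prev_bus, prev_inv = w, 0
        out.append((prev_bus, prev_inv))
    return out


def bus_invert_decode(encoded: List[Tuple[int, int]], width: int) -> List[int]:
    """Recover the original words from (bus_value, invert_bit) pairs."""
    mask = (1 << width) - 1
    return [bus ^ mask if inv else bus for bus, inv in encoded]


def count_transitions(encoded: List[Tuple[int, int]], start: int = 0) -> int:
    """Total transitions on the data wires and the invert line."""
    prev_bus, prev_inv, total = start, 0, 0
    for bus, inv in encoded:
        total += hamming(prev_bus, bus) + (prev_inv ^ inv)
        prev_bus, prev_inv = bus, inv
    return total


def min_transitions_viterbi(words: List[int], width: int, start: int = 0) -> int:
    """Minimum total transitions via dynamic programming on a 2-state trellis.

    The state is the invert bit; survivors[s] holds (path cost, bus value).
    """
    mask = (1 << width) - 1
    survivors = {0: (0, start & mask)}
    for w in words:
        w &= mask
        nxt = {}
        for inv, bus in ((0, w), (1, w ^ mask)):
            best = min(cost + hamming(pb, bus) + (pi != inv)
                       for pi, (cost, pb) in survivors.items())
            nxt[inv] = (best, bus)
        survivors = nxt
    return min(cost for cost, _ in survivors.values())
```

For any word sequence, `count_transitions(bus_invert_encode(words, width))` should equal `min_transitions_viterbi(words, width)`, which is the equivalence the abstract describes.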
Digit-Serial Reconfigurable FPGA Logic Block Architecture
This paper presents a novel field-programmable gate array (FPGA) logic block architecture which incorporates support for digit-serial DSP architectures on a digit-wide basis, without diminishing the support for random and control logic applications. To efficiently realize a digit-serial DSP design on FPGAs, one must create an FPGA architecture optimized for those types of systems. Key to the suitability of the FPGA for these applications is the fact that each of its basic blocks is capable of processing a digit size of up to 4 bits. A novel digit-serial FPGA logic block architecture has been proposed to satisfy the requirement of rapid prototyping and efficient implementation of digit-serial DSP applications. Digit-serial DSP designs using the digit-serial FPGA (DS-FPGA) are compared to those implemented on a Xilinx FPGA chip. The results show that the normalized area of digit-serial circuits on the DS-FPGA is only 33 to 54% of that required on the Xilinx FPGA.
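As a software illustration of what "digit-serial" processing means, the sketch below adds two words one 4-bit digit at a time, least significant digit first, passing a carry between digits. It models the arithmetic style only, not the logic block proposed in the paper. The function name and the 16-bit default word length are assumptions.

```python
from typing import Tuple


def digit_serial_add(a: int, b: int, word_bits: int = 16, digit_bits: int = 4) -> Tuple[int, int]:
    """Add two word_bits-wide unsigned integers digit by digit.

    Returns (sum modulo 2**word_bits, carry_out). Illustrative model only.
    """
    if word_bits % digit_bits != 0:
        raise ValueError("word_bits must be a multiple of digit_bits")
    digit_mask = (1 << digit_bits) - 1
    carry = 0
    result = 0
    # One loop iteration corresponds to one digit processed per step.
    for shift in range(0, word_bits, digit_bits):
        da = (a >> shift) & digit_mask
        db = (b >> shift) & digit_mask
        s = da + db + carry
        result |= (s & digit_mask) << shift   # keep the low digit of the sum
        carry = s >> digit_bits               # carry into the next digit
    return result, carry
```

With a 4-bit digit, a 16-bit addition takes four steps. This is the trade-off between bit-serial and fully parallel arithmetic that digit-serial designs aim for.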