27 research outputs found

    Scaled Quantization for the Vision Transformer

    Full text link
    Quantization using a small number of bits shows promise for reducing latency and memory usage in deep neural networks. However, most quantization methods cannot readily handle complicated functions such as exponential and square root, and prior approaches involve complex training processes that must interact with floating-point values. This paper proposes a robust method for the full integer quantization of vision transformer networks without requiring any intermediate floating-point computations. The quantization techniques can be applied in various hardware or software implementations, including processor/memory architectures and FPGAs. Comment: 9 pages, 0 figures

    Optimality of bus-invert coding

    Get PDF
    Dynamic power dissipation on I/O buses is an important issue for high-speed communication between chips. Coding techniques can reduce the number of transitions, which in turn reduces the dynamic power. Bus-invert coding is one popular technique for interchip buses, where the dominant contribution is from the self-capacitance of the wires. This algorithm uses an invert line to signal whether the bus data are in their original or inverted form. While the method appears to be a greedy algorithm, we show that it is, in fact, an optimal strategy. To do so, we first represent the bus and invert line using a trellis diagram. Then, we show that applying bus-invert coding to a sequence of words gives the same result as would be obtained by using the Viterbi algorithm, which is known to be optimal. We also show that partitioning an M-bit bus into P subbuses and using bus-invert coding on each subbus can be described as applying the Viterbi algorithm on a 2^P-state trellis.
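
    The greedy rule described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the choice to count the invert line's own transition, and the all-zero initial bus state are assumptions made for the example.

```python
def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def bus_invert_encode(words, width):
    """Greedy bus-invert encoding of a sequence of width-bit words.

    Returns a list of (bus_value, invert_bit) pairs.
    Assumption: the bus and the invert line both start at 0.
    """
    mask = (1 << width) - 1
    prev_bus, prev_inv = 0, 0
    encoded = []
    for word in words:
        word &= mask
        inverted = ~word & mask
        # Transitions on the data lines plus the invert line for each option.
        cost_plain = hamming(word, prev_bus) + (prev_inv ^ 0)
        cost_inverted = hamming(inverted, prev_bus) + (prev_inv ^ 1)
        if cost_inverted < cost_plain:
            prev_bus, prev_inv = inverted, 1
        else:
            prev_bus, prev_inv = word, 0
        encoded.append((prev_bus, prev_inv))
    return encoded


def bus_invert_decode(encoded, width):
    """Recover the original words from (bus_value, invert_bit) pairs."""
    mask = (1 << width) - 1
    return [(~bus & mask) if inv else bus for bus, inv in encoded]


if __name__ == "__main__":
    data = [0x00, 0xFF, 0x0F, 0xF0]
    coded = bus_invert_encode(data, width=8)
    print(coded)
    print(bus_invert_decode(coded, width=8) == data)
```

    The encoder makes a purely local decision for each word; the abstract's claim is that this local choice coincides with the path a Viterbi search over the trellis would select.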

    Advanced C: Techniques and Applications

    No full text
    Minneapolis. 320 p.: illus.; 21 cm

    Advanced C: Techniques and Applications

    No full text

    Digit-Serial Reconfigurable FPGA Logic Block Architecture

    No full text
    This paper presents a novel field-programmable gate array logic block architecture which incorporates support for digit-serial DSP architectures on a digit-wide basis, without diminishing the support for random and control logic applications. To efficiently realize a digit-serial DSP design on FPGAs, one must create an FPGA architecture optimized for those types of systems. Key to the suitability of the FPGA for these applications is the fact that each of its basic blocks is capable of processing a digit size of up to 4 bits. A novel digit-serial FPGA logic block architecture has been proposed to satisfy the requirement of rapid prototyping and efficient implementation of digit-serial DSP applications. Digit-serial DSP designs using the digit-serial FPGA (DS-FPGA) are compared to those implemented on a Xilinx FPGA chip. The results show that the normalized area of digit-serial circuits on the DS-FPGA is only 33 to 54% of that required on the Xilinx FPGA.
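
    For readers unfamiliar with digit-serial arithmetic, the idea of processing a word one 4-bit digit per step can be sketched in software. This is only an illustration of the general technique, not the logic block proposed in the paper; the function name, the 16-bit word length, and the least-significant-digit-first order are assumptions of the example.

```python
def digit_serial_add(a, b, word_bits=16, digit_bits=4):
    """Add two unsigned words one digit per step, least-significant digit first.

    Assumption: word_bits is a multiple of digit_bits.
    Returns (sum modulo 2**word_bits, final carry).
    """
    if word_bits % digit_bits != 0:
        raise ValueError("word_bits must be a multiple of digit_bits")
    digit_mask = (1 << digit_bits) - 1
    carry = 0
    result = 0
    for step in range(word_bits // digit_bits):
        shift = step * digit_bits
        digit_a = (a >> shift) & digit_mask
        digit_b = (b >> shift) & digit_mask
        # One "clock cycle": add a single digit pair plus the saved carry.
        total = digit_a + digit_b + carry
        result |= (total & digit_mask) << shift
        carry = total >> digit_bits
    return result, carry


if __name__ == "__main__":
    total, carry = digit_serial_add(0x1234, 0x0FCD)
    print(hex(total), carry)
```

    A bit-serial design corresponds to digit_bits=1 and a fully parallel design to digit_bits=word_bits; the digit size trades hardware area against the number of cycles per word.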