Hardware implementation of the backpropagation without multiplication

Abstract

The backpropagation algorithm has been modified to work without any multiplications and to tolerate low-resolution computations, which makes it more attractive for a hardware implementation. Numbers are represented in floating-point format with a 1-bit mantissa and a 2-bit exponent for the states, and a 1-bit mantissa and a 4-bit exponent for the gradients, while the weights are 16-bit fixed-point numbers. In this way, all the computations can be executed with shift and add operations. Large networks with over 100,000 weights were trained and demonstrated the same performance as networks computed with full precision. An estimate of a circuit implementation shows that a large network can be placed on a single chip, reaching more than 1 billion weight updates per second. A speedup is also obtained on any machine where a multiplication is slower than a shift operation.
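To illustrate why this number format removes the multiplications, here is a minimal Python sketch. It is not the authors' implementation. It assumes that the 1-bit mantissa acts as a sign, so a state takes the form sign * 2**(-e), and that values are rounded to the nearest exponent; the abstract does not specify the rounding scheme or the exponent range. The function names quantize_pow2, shift_mul and dot_shift_add are hypothetical.

```python
import math


def quantize_pow2(x, exp_bits):
    """Approximate x by sign * 2**(-e) with 0 <= e < 2**exp_bits.

    Assumption: nearest-exponent rounding and no exact zero; the
    abstract does not say how the rounding is actually done.
    """
    sign = -1 if x < 0 else 1
    mag = abs(x)
    max_e = (1 << exp_bits) - 1
    if mag == 0.0:
        # Map zero to the smallest representable magnitude.
        return sign, max_e
    e = round(-math.log2(mag))
    # Clamp the exponent to the range the bit width allows.
    e = min(max(e, 0), max_e)
    return sign, e


def shift_mul(weight_fixed, sign, e):
    """Multiply an integer fixed-point weight by sign * 2**(-e).

    Only a right shift and an optional negation are used.
    """
    shifted = weight_fixed >> e
    return shifted if sign > 0 else -shifted


def dot_shift_add(weights, states):
    """Weighted sum of quantized states using shifts and adds only."""
    acc = 0
    for w, (sign, e) in zip(weights, states):
        acc += shift_mul(w, sign, e)
    return acc


# Two 16-bit fixed-point weights and two states quantized with a 2-bit exponent.
weights = [1024, 1024]
states = [quantize_pow2(0.3, 2), quantize_pow2(-1.0, 2)]
result = dot_shift_add(weights, states)
```

In this sketch, 0.3 is approximated by 2**(-2) and -1.0 by -(2**0), so the weighted sum reduces to one right shift by two positions, one negation, and one addition. The gradients could be handled the same way with exp_bits=4.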
