Edge computing must be capable of executing computationally intensive
algorithms, such as Deep Neural Networks (DNNs), while operating within a
constrained computational resource budget. Such computations involve
Matrix-Vector Multiplications (MVMs), which are the dominant contributor to the
memory and energy budget of DNNs. To alleviate the computational intensity and
storage demand of MVMs, we propose circuit-algorithm co-design techniques with
low-complexity approximate Multiply-Accumulate (MAC) units derived from the
principles of Alphabet Set Multipliers (ASMs). Selecting a few well-chosen
alphabets from the ASM leads to a multiplier-less DNN implementation and enables
encoding of low-precision weights and input activations into fewer bits.
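
As a rough illustration of the ASM principle, the sketch below is a toy construction (not the authors' circuit), assuming 4-bit weight digits and the example alphabet set {1, 3, 5, 7}: the alphabets of an input x are precomputed once, and each weight digit then selects one alphabet plus a shift, so every product reduces to a handful of shift-and-add operations.

```python
# Toy sketch of an Alphabet Set Multiplier (illustrative assumptions:
# 4-bit digits, alphabet set {1, 3, 5, 7}); not the paper's exact circuit.

ALPHABET = (1, 3, 5, 7)  # example alphabet set; fewer alphabets -> cheaper MAC

def encode_digit(d):
    """Express a digit d as (alphabet, shift) with d == alphabet << shift."""
    if d == 0:
        return (0, 0)
    shift = 0
    while d % 2 == 0:          # strip trailing zeros: d = odd_part * 2^shift
        d //= 2
        shift += 1
    if d not in ALPHABET:      # unrepresentable digits need re-encoded weights
        raise ValueError(f"digit {d} requires alphabet-aware weight encoding")
    return (d, shift)

def asm_multiply(x, w, digit_bits=4, n_digits=2):
    """Multiplier-less x*w: shift-and-add over precomputed alphabets of x."""
    alphabets = {a: a * x for a in ALPHABET}   # computed once per input x
    acc = 0
    for i in range(n_digits):
        d = (w >> (i * digit_bits)) & ((1 << digit_bits) - 1)
        a, s = encode_digit(d)
        if a:                                  # zero digits cost nothing
            acc += alphabets[a] << (s + i * digit_bits)
    return acc

assert asm_multiply(9, 0x35) == 9 * 0x35       # digits 5 and 3 are in the set
```

Shrinking the alphabet set makes each MAC cheaper but leaves some digits unrepresentable, which is why the weights must be encoded, and the network trained, with the chosen alphabets in mind.
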
To maintain accuracy under alphabet-set approximations, we develop a novel
ASM-alphabet-aware training scheme.
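
One plausible realization of such alphabet-aware encoding, sketched below under the same 4-bit-digit and {1, 3, 5, 7} assumptions, is a projection step that snaps each integer weight to the nearest value whose digits are all representable; the grid construction and function names are illustrative, and the paper's exact training procedure may differ.

```python
import numpy as np

# Illustrative alphabet-aware projection (our assumption, not necessarily
# the authors' training step): build the grid of representable weights once,
# then snap each quantized weight to its nearest grid point.

ALPHABET = (1, 3, 5, 7)
DIGIT_BITS, N_DIGITS = 4, 2

def _ok_digit(d):
    """True if digit d equals some alphabet value times a power of two."""
    while d and d % 2 == 0:
        d //= 2
    return d == 0 or d in ALPHABET

# Every N_DIGITS-digit integer whose digits are all ASM-representable.
MASK = (1 << DIGIT_BITS) - 1
GRID = np.array([w for w in range(1 << (DIGIT_BITS * N_DIGITS))
                 if all(_ok_digit((w >> (i * DIGIT_BITS)) & MASK)
                        for i in range(N_DIGITS))])

def project_weights(w_int):
    """Snap integer weights to the nearest alphabet-representable value."""
    w_int = np.asarray(w_int)
    idx = np.abs(w_int[..., None] - GRID[None, ...]).argmin(-1)
    return GRID[idx]

print(project_weights([9, 57, 200]))  # -> [8 56 200]; 9 and 57 are off-grid
```
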
The proposed low-complexity, multiplication-aware
algorithm is implemented in-memory and near-memory with efficient shift
operations to further reduce the data-movement cost between memory and the
processing unit.
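
To see why shifts are the natural primitive for such a mapping, the illustrative sketch below (again our own construction, with an assumed digit layout) computes one output of an MVM by accumulating partial sums per weight-digit column and applying the column shift once, rather than once per product.

```python
# Toy near-memory-style MVM row (illustrative only): partial sums are
# gathered per digit column, so the large positional shift is amortized
# across the whole column instead of being paid per product.

ALPHABET = (1, 3, 5, 7)          # same example alphabet set as above

def _digit(d):
    """Split a digit into (alphabet, shift): d == alphabet << shift."""
    s = 0
    while d and d % 2 == 0:
        d //= 2
        s += 1
    return d, s

def asm_mvm_row(xs, ws, digit_bits=4, n_digits=2):
    """Dot product sum(x*w) with one column shift per digit position."""
    total = 0
    for i in range(n_digits):                      # one pass per digit column
        col = 0
        for x, w in zip(xs, ws):
            d = (w >> (i * digit_bits)) & ((1 << digit_bits) - 1)
            a, s = _digit(d)
            if a:                                  # zero digits cost nothing
                col += (a * x) << s                # small alphabet product
        total += col << (i * digit_bits)           # column shift applied once
    return total

assert asm_mvm_row([2, 3], [0x11, 0x22]) == 2 * 0x11 + 3 * 0x22
```

The actual in-/near-memory datapath in the paper may organize the accumulation differently; the point of the sketch is only that the per-product work reduces to small alphabet products and shifts.
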
ResNet and MobileNet models and attain <1-2% accuracy degradation against full
precision with energy benefits of >50% compared to standard Von-Neumann
counterpart.Comment: Some results have been found incorrect through new experiments. Will
upload the correct one once this paper has been withdraw