Precision Scaling of Neural Networks for Efficient Audio Processing
While deep neural networks have shown powerful performance in many audio
applications, their large computational and memory demands are a challenge
for real-time processing. In this paper, we study the impact of scaling the
precision of neural networks on the performance of two common audio processing
tasks, namely, voice-activity detection and single-channel speech enhancement.
We determine the optimal weight/neuron bit-precision pair by exploring its
impact on both performance and processing time. Through experiments
conducted with real user data, we demonstrate that deep neural networks that
use lower bit precision significantly reduce the processing time (up to 30x).
However, the accompanying performance degradation is small (< 3.14%) only in
the case of classification tasks such as those present in voice-activity detection.
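The abstract does not specify the quantization scheme, but the idea of trading weight bit precision against accuracy can be sketched with plain uniform quantization; the function name and the 8-bit/2-bit comparison below are illustrative assumptions, not the paper's method:

```python
import numpy as np

def quantize_uniform(x, bits):
    """Uniformly quantize an array to signed `bits`-bit levels and
    dequantize back, so the rounding error can be inspected directly.
    (A generic post-training scheme; the paper's exact method is unspecified.)"""
    qmax = 2 ** (bits - 1) - 1          # e.g. 127 for 8 bits
    scale = np.max(np.abs(x)) / qmax    # step size between levels
    q = np.round(x / scale).astype(np.int32)  # integer codes
    return q * scale                    # dequantized approximation

np.random.seed(0)
w = np.random.randn(256, 256).astype(np.float32)
w8 = quantize_uniform(w, 8)  # 8-bit weights: small rounding error
w2 = quantize_uniform(w, 2)  # 2-bit weights: much coarser approximation
```

Lower bit widths shrink storage and speed up integer arithmetic, at the cost of a larger worst-case rounding error (bounded by half the step size).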
A study on speech enhancement using exponent-only floating point quantized neural network (EOFP-QNN)
Numerous studies have investigated the effectiveness of neural network
quantization on pattern classification tasks. The present study, for the first
time, investigated the performance of speech enhancement (a regression task in
speech processing) using a novel exponent-only floating-point quantized neural
network (EOFP-QNN). The proposed EOFP-QNN consists of two stages:
mantissa-quantization and exponent-quantization. In the mantissa-quantization
stage, EOFP-QNN learns how to quantize the mantissa bits of the model
parameters while preserving the regression accuracy using the least mantissa
precision. In the exponent-quantization stage, the exponent part of the
parameters is further quantized without causing any additional performance
degradation. We evaluated the proposed EOFP quantization technique on two types
of neural networks, namely, bidirectional long short-term memory (BLSTM) and
fully convolutional neural network (FCN), on a speech enhancement task.
Experimental results showed that the model sizes can be significantly reduced
(the model sizes of the quantized BLSTM and FCN models were only 18.75% and
21.89%, respectively, compared to those of the original models) while
maintaining satisfactory speech-enhancement performance.
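The mantissa-quantization idea can be illustrated by bit-masking an IEEE 754 float32, whose 23-bit mantissa can be truncated while the sign and exponent are kept intact; the function below is a minimal sketch under that assumption (EOFP-QNN instead *learns* how many mantissa bits to keep, and quantizes the exponent in a second stage):

```python
import numpy as np

def truncate_mantissa(x, keep_bits):
    """Zero all but the top `keep_bits` of the 23-bit float32 mantissa.
    With keep_bits=0 only sign and exponent survive, so every value
    collapses to a signed power of two (the 'exponent-only' extreme)."""
    raw = x.astype(np.float32).view(np.uint32)       # reinterpret bits
    mask = np.uint32(0xFFFFFFFF & ~((1 << (23 - keep_bits)) - 1))
    return (raw & mask).view(np.float32)             # back to float

# pi = 1.5707...b x 2^1; dropping the whole mantissa leaves 1.0 x 2^1 = 2.0
vals = np.array([np.pi, 3.0], dtype=np.float32)
```

Because only the retained bits need storing, fewer mantissa bits directly shrink the model size, which matches the abstract's reported reductions.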
Faster Convolution Inference Through Using Pre-Calculated Lookup Tables
Low-cardinality activations permit an algorithm based on fetching the
inference values from pre-calculated lookup tables instead of calculating them
every time. This algorithm can have extensions, some of which offer abilities
beyond those of the currently used algorithms. It also allows for simpler and
more effective CNN-specialized hardware.
Comment: 11 pages, 7 figures