Real-time audio processing on a Raspberry Pi using deep neural networks


Over the past years, deep neural networks (DNNs) have quickly grown into the state-of-the-art technology for various machine learning tasks such as image and speech recognition or natural language processing. However, as DNN-based applications typically require significant amounts of computation, running DNNs on resource-constrained devices still constitutes a challenge, especially for real-time applications such as low-latency audio processing. In this paper, we aimed to perform real-time noise suppression on a low-cost embedded platform with limited resources, using a pre-trained DNN-based speech enhancement model. A portable setup was employed, consisting of a Raspberry Pi 3 Model B+ fitted with a soundcard and headphones. A (basic) low-latency Python framework was developed to accommodate an audio processing algorithm operating in a real-time environment. Various layouts and trainable parameters of the DNN-based model, as well as different processing time intervals (from 64 down to 8 ms), were tested and compared using objective metrics (e.g. PESQ, segSNR) to achieve the best possible trade-off between noise suppression performance and audio latency. We show that 10-layer DNNs with up to 350,000 trainable parameters can successfully be implemented on the Raspberry Pi 3 Model B+ and yield latencies below 16 ms for real-time audio applications.
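The abstract describes block-based processing in which the chosen block duration (64 ms down to 8 ms) sets the minimum achievable latency. The following is a minimal, hypothetical Python sketch of that idea, not the authors' actual framework: audio arrives in fixed-size blocks, each block passes through an enhancement function (here a placeholder for the DNN model), and the block length determines the buffering latency. The sampling rate and function names are assumptions for illustration.

```python
SAMPLE_RATE = 16000  # Hz; assumed sampling rate, typical for speech models


def block_latency_ms(block_size, sample_rate=SAMPLE_RATE):
    """Latency contributed by buffering one block, in milliseconds."""
    return 1000.0 * block_size / sample_rate


def enhance(block):
    """Placeholder for the DNN-based speech enhancement step."""
    return block  # identity pass-through in this sketch


def process_stream(samples, block_size):
    """Split the incoming sample sequence into fixed-size blocks
    and enhance each block in turn, as a real-time loop would."""
    out = []
    for start in range(0, len(samples) - block_size + 1, block_size):
        out.extend(enhance(samples[start:start + block_size]))
    return out


# A 128-sample block at 16 kHz buffers 8 ms of audio,
# matching the shortest processing interval tested in the paper:
print(block_latency_ms(128))  # 8.0
```

In a deployed system the loop body would be driven by an audio I/O callback rather than iterating over a pre-recorded list, and the total latency would also include the model's inference time, which is why the paper reports latencies below 16 ms rather than the 8 ms buffering floor alone.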

This paper was published in Ghent University Academic Bibliography.
