1,042 research outputs found

Denoising Deep Neural Networks Based Voice Activity Detection

Authors: Wu Ji, Zhang Xiao-Lei
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 04/03/2013

Recently, deep-belief-network (DBN) based voice activity detection (VAD) has been proposed. It is powerful in fusing the advantages of multiple features and achieves state-of-the-art performance. However, the deep layers of the DBN-based VAD do not show an apparent superiority over the shallower layers. In this paper, we propose a denoising-deep-neural-network (DDNN) based VAD to address this problem. Specifically, we pre-train a deep neural network in a special unsupervised, denoising, greedy layer-wise mode, and then fine-tune the whole network in a supervised way with the common back-propagation algorithm. In the pre-training phase, we take the noisy speech signals as the visible layer and try to extract a new feature that minimizes the reconstruction cross-entropy loss between the noisy speech signals and their corresponding clean speech signals. Experimental results show that the proposed DDNN-based VAD not only outperforms the DBN-based VAD but also shows an apparent performance improvement of the deep layers over the shallower layers.
Comment: This paper has been accepted by IEEE ICASSP-2013, and will be published online after May, 201
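The pre-training step described in this abstract can be illustrated with a minimal sketch of a single denoising layer: the noisy input is encoded into a hidden feature, decoded again, and the reconstruction is scored against the clean signal with a cross-entropy loss. This is not the authors' implementation. The layer sizes, learning rate, toy data, and all function names below are assumptions chosen for illustration, and the greedy stacking of several layers and the supervised fine-tuning are omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(noisy, W_enc, b_enc, W_dec, b_dec):
    """Encode the noisy input into a hidden feature, then decode it."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, noisy)) + b)
              for row, b in zip(W_enc, b_enc)]
    recon = [sigmoid(sum(w * h for w, h in zip(row, hidden)) + b)
             for row, b in zip(W_dec, b_dec)]
    return hidden, recon


def cross_entropy(clean, recon, eps=1e-12):
    """Reconstruction cross-entropy between the clean target and the output."""
    return -sum(c * math.log(r + eps) + (1.0 - c) * math.log(1.0 - r + eps)
                for c, r in zip(clean, recon))


def train_denoising_layer(pairs, n_hidden, lr=0.5, epochs=200, seed=0):
    """Plain SGD on (noisy, clean) pairs; hyperparameters are illustrative."""
    rng = random.Random(seed)
    n_in = len(pairs[0][0])
    W_enc = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    b_enc = [0.0] * n_hidden
    W_dec = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_in)]
    b_dec = [0.0] * n_in
    history = []
    for _ in range(epochs):
        total = 0.0
        for noisy, clean in pairs:
            hidden, recon = forward(noisy, W_enc, b_enc, W_dec, b_dec)
            total += cross_entropy(clean, recon)
            # Sigmoid output with cross-entropy loss: output delta is (recon - clean).
            d_out = [r - c for r, c in zip(recon, clean)]
            # Back-propagate through the decoder before its weights are updated.
            d_hid = [sum(W_dec[i][j] * d_out[i] for i in range(n_in))
                     * hidden[j] * (1.0 - hidden[j]) for j in range(n_hidden)]
            for i in range(n_in):
                for j in range(n_hidden):
                    W_dec[i][j] -= lr * d_out[i] * hidden[j]
                b_dec[i] -= lr * d_out[i]
            for j in range(n_hidden):
                for k in range(n_in):
                    W_enc[j][k] -= lr * d_hid[j] * noisy[k]
                b_enc[j] -= lr * d_hid[j]
        history.append(total / len(pairs))
    return (W_enc, b_enc, W_dec, b_dec), history


# Toy (noisy, clean) feature pairs scaled to [0, 1]; made up for illustration.
PAIRS = [
    ([0.8, 0.3, 0.7, 0.1], [0.9, 0.1, 0.8, 0.2]),
    ([1.0, 0.0, 0.9, 0.4], [0.9, 0.1, 0.8, 0.2]),
    ([0.7, 0.2, 0.3, 0.9], [0.9, 0.2, 0.1, 0.8]),
    ([1.0, 0.4, 0.0, 0.6], [0.9, 0.2, 0.1, 0.8]),
]

params, history = train_denoising_layer(PAIRS, n_hidden=3)
print("mean loss, first epoch:", history[0])
print("mean loss, last epoch:", history[-1])
```

In a layer-wise scheme, the `hidden` vector returned by `forward` would serve as the input to the next layer's pre-training.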

arXiv.org e-Print Archive

Crossref

Non-intrusive speech quality assessment using context-aware neural networks

Authors: Dubey Rajesh Kumar, Jaiswal Rahul Kumar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2022

To meet the human-perceived quality of experience (QoE) when communicating over Voice over Internet Protocol (VoIP) applications such as Google Meet, Microsoft Skype, and Apple FaceTime, a precise speech quality assessment metric is needed. The metric should be able to detect and segregate the different types of noise degradation present in the surroundings before measuring and monitoring the quality of speech in real time. Our research is motivated by the lack of clear evidence of a speech quality metric that first distinguishes different types of noise degradation before making a speech quality prediction. To that end, this paper presents a novel non-intrusive speech quality assessment metric using context-aware neural networks, in which the noise class (context) of the degraded or noisy speech signal is first identified using a classifier, and then deep neural network (DNN) based speech quality metrics (SQMs) are trained and optimized for each noise class to obtain noise class-specific (context-specific) optimized speech quality predictions (MOS scores). The noisy speech signals, that is, clean speech signals degraded by different types of background noise, are taken from the NOIZEUS speech corpus. Results demonstrate that, even with the small number of speech samples available from the NOIZEUS speech corpus, the proposed metric outperforms, in different contexts, a metric in which the contexts are not classified before speech quality prediction.
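The two-stage structure described in this abstract (classify the noise context first, then apply a predictor trained for that context) can be sketched as a simple dispatch. Everything below is a hypothetical stand-in: the class name, the threshold classifier, the linear per-class predictors, and the noise class labels are assumptions for illustration, not the paper's trained models.

```python
from typing import Callable, Dict, Sequence, Tuple

Features = Sequence[float]


def clamp_mos(value: float) -> float:
    """Keep a predicted score on the 1-5 MOS scale."""
    return max(1.0, min(5.0, value))


class ContextAwareQualityMetric:
    """Route features to a noise-class-specific quality predictor."""

    def __init__(self,
                 classify_noise: Callable[[Features], str],
                 predictors: Dict[str, Callable[[Features], float]]) -> None:
        self.classify_noise = classify_noise
        self.predictors = predictors

    def predict_mos(self, features: Features) -> Tuple[str, float]:
        # Stage 1: identify the noise class (the "context").
        noise_class = self.classify_noise(features)
        if noise_class not in self.predictors:
            raise KeyError(f"no predictor trained for noise class {noise_class!r}")
        # Stage 2: apply the predictor optimized for that class.
        return noise_class, clamp_mos(self.predictors[noise_class](features))


# Hypothetical stand-ins for a trained classifier and per-class DNN predictors.
def toy_classifier(features: Features) -> str:
    return "babble" if features[0] > 0.5 else "car"


toy_predictors = {
    "car": lambda f: 4.5 - 2.0 * f[1],
    "babble": lambda f: 4.0 - 3.0 * f[1],
}

metric = ContextAwareQualityMetric(toy_classifier, toy_predictors)
print(metric.predict_mos([0.2, 0.5]))
print(metric.predict_mos([0.9, 0.5]))
```

Keeping the classifier and the predictors as separate callables means each per-class model can be retrained or replaced without touching the routing logic.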

Agder University Research Archive

A Novel Method for Classification and Modelling of Underwater Acoustic Communication through Machine Learning and Image Processing Technique

Authors: K SAIKUMAR MAKKAPATI HIMAJA, D. V. DIVAKARA RAO, DR. P. CHANDRA KANTH, NP LAVANYA KUMARI, R REVATHI
Publication venue: ASSOC ADVANCEMENT ZOOLOGY, AZADANAGAR COLONY RUSTAMPUR, GORAKHPUR, INDIA, 273001
Publication date: 09/10/2023

The increasing prevalence of underwater activities has highlighted the urgent need for reliable underwater acoustic communication systems. However, the challenging nature of the underwater environment poses significant obstacles to the implementation of conventional voice communication methods. To better understand and improve upon these systems, simulations of the underwater audio channel have been developed using mathematical models and assumptions. In this study, we use real-world data gathered from both a measured water reservoir and a lake to evaluate the ability of machine learning methods, specifically Long Short-Term Memory (LSTM) and Deep Neural Network (DNN) models, to accurately reconstruct the underwater audio channel. The outcomes validate the efficiency of machine learning methods, particularly LSTM, in accurately simulating the underwater acoustic communication channel with a low mean absolute percentage error. Additionally, this research includes an image processing step to identify the objects present in the acoustic environment.
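The mean absolute percentage error (MAPE) mentioned in this abstract is the average of |actual - predicted| / |actual|, expressed as a percentage. A minimal sketch follows; the function name and the sample values are made up for illustration and are not taken from the study.

```python
from typing import Sequence


def mean_absolute_percentage_error(actual: Sequence[float],
                                   predicted: Sequence[float]) -> float:
    """Return MAPE in percent; undefined when any actual value is zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Illustrative values only: each prediction is off by 10% of the measured value.
measured = [100.0, 200.0]
simulated = [110.0, 180.0]
print(mean_absolute_percentage_error(measured, simulated))
```

Because each error is divided by the measured value, MAPE is scale-independent but becomes unstable when measured values are close to zero.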

Journal Of Advanced Zoology