665 research outputs found

    Speech Waveform Reconstruction using Convolutional Neural Networks with Noise and Periodic Inputs

    Get PDF

    Study of efficient methods of detection and reconstruction of gravitational waves from nonrotating 3D general relativistic core collapse supernovae explosion using multilayer signal estimation method

    Get PDF
    In the post-detection era of gravitational wave (GW) astronomy, core collapse supernovae (CCSNe) are among the most interesting potential sources of signals arriving at the Advanced LIGO detectors. Mukherjee et al. have developed and implemented a new method to search for GW signals from CCSNe based on multistage, high-accuracy spectral estimation that effectively achieves a higher detection signal-to-noise ratio (SNR). The study has been further enhanced by incorporating a convolutional neural network (CNN) to significantly reduce false alarm rates (FAR). The combined pipeline, termed multilayer signal estimation (MuLaSE), works in an integrative manner with the coherent wave burst (cWB) pipeline. To compare the performance of this new search pipeline, termed “MuLaSECC”, with cWB, an extensive analysis was performed with two families of core collapse supernova waveforms corresponding to two different three-dimensional (3D) general relativistic CCSN explosion models, viz. Kuroda 2017 and Ott 2013. The performance of the pipeline has been characterized through receiver operating characteristics (ROC) and the reconstruction of the detected signals. MuLaSECC is found to have higher efficiency in the low false alarm range, a higher detection probability for weak signals, and improved reconstruction, especially in the lower frequency domain.
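    As an illustration of the ROC characterisation mentioned in the abstract, the sketch below (not the MuLaSECC pipeline; the detection statistic, signal strength, and trial counts are all invented for illustration) computes false-alarm and detection probabilities at several thresholds from toy background and injection trials:

    ```python
    import random

    random.seed(0)

    def detection_statistic(strength, noise_sigma=1.0):
        """Toy detection statistic: signal strength plus Gaussian noise."""
        return strength + random.gauss(0.0, noise_sigma)

    # Simulated trials: background (noise only) and injected signals.
    background = [detection_statistic(0.0) for _ in range(5000)]
    signals = [detection_statistic(2.5) for _ in range(5000)]

    def roc_point(threshold):
        """False-alarm probability and detection probability at a threshold."""
        far = sum(s > threshold for s in background) / len(background)
        eff = sum(s > threshold for s in signals) / len(signals)
        return far, eff

    # Sweeping the threshold traces out the ROC curve: lowering it raises
    # both the detection probability and the false alarm rate.
    for thr in (0.0, 1.0, 2.0, 3.0):
        far, eff = roc_point(thr)
        print(f"threshold={thr:.1f}  FAR={far:.3f}  detection prob={eff:.3f}")
    ```

    Comparing two pipelines then amounts to comparing their detection probabilities at matched false alarm rates, which is the sense in which the abstract reports higher efficiency in the low false alarm range.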

    A review of differentiable digital signal processing for music and speech synthesis

    Get PDF
    The term “differentiable digital signal processing” describes a family of techniques in which loss function gradients are backpropagated through digital signal processors, facilitating their integration into neural networks. This article surveys the literature on differentiable audio signal processing, focusing on its use in music and speech synthesis. We catalogue applications to tasks including music performance rendering, sound matching, and voice transformation, discussing the motivations for and implications of the use of this methodology. This is accompanied by an overview of digital signal processing operations that have been implemented differentiably, which is further supported by a web book containing practical advice on differentiable synthesiser programming (https://intro2ddsp.github.io/). Finally, we highlight open challenges, including optimisation pathologies, robustness to real-world conditions, and design trade-offs, and discuss directions for future research.
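    The core idea, backpropagating a loss gradient through a signal processor to optimise its parameters, can be sketched minimally. The example below is not drawn from any surveyed system: it is a toy "sound matching" task in which a one-parameter sine oscillator is fitted to a target signal by gradient descent, with the gradient derived by hand rather than by an autodiff framework:

    ```python
    import math

    # 256 samples of phase for a 440 Hz tone at a 16 kHz sample rate.
    N = 256
    phase = [2 * math.pi * 440 * n / 16000 for n in range(N)]

    # Target sound: a sine with an amplitude the optimiser must recover.
    target_amp = 0.7
    target = [target_amp * math.sin(p) for p in phase]

    amp = 0.0   # the differentiable synthesiser parameter
    lr = 0.5
    for step in range(100):
        # Forward pass through the "synthesiser": a scaled sine oscillator.
        out = [amp * math.sin(p) for p in phase]
        # Mean squared error against the target sound.
        loss = sum((o - y) ** 2 for o, y in zip(out, target)) / N
        # Analytic gradient of the loss w.r.t. the amplitude parameter.
        grad = sum(2 * (o - y) * math.sin(p)
                   for o, y, p in zip(out, target, phase)) / N
        amp -= lr * grad

    print(f"recovered amplitude: {amp:.4f}")  # converges toward 0.7
    ```

    In practice the gradient is obtained automatically (e.g. via an autodiff framework) and the processor is far richer than a single oscillator, but the optimisation loop has this shape.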

    Speech Recognition in noisy environment using Deep Learning Neural Network

    Get PDF
    Recent research in the field of automatic speaker recognition has shown that methods based on deep neural networks provide better performance than other statistical classifiers. On the other hand, these methods usually require the adjustment of a significant number of parameters. The goal of this thesis is to show that selecting appropriate parameter values can significantly improve the speaker recognition performance of methods based on deep neural networks. The reported study introduces an approach to automatic speaker recognition based on deep neural networks and the stochastic gradient descent algorithm. It particularly focuses on three parameters of the stochastic gradient descent algorithm: the learning rate, and the hidden and input layer dropout rates. Additional attention was devoted to the research question of speaker recognition under noisy conditions. Thus, two experiments were conducted in the scope of this thesis. The first experiment was intended to demonstrate that optimizing the observed parameters of the stochastic gradient descent algorithm can improve speaker recognition performance in the absence of noise. This experiment was conducted in two phases. In the first phase, the recognition rate was observed while the hidden layer dropout rate and the learning rate were varied and the input layer dropout rate was held constant. In the second phase, the recognition rate was observed while the input layer dropout rate and the learning rate were varied and the hidden layer dropout rate was held constant. The second experiment was intended to show that optimizing these parameters can improve speaker recognition performance even under noisy conditions; to that end, different noise levels were artificially applied to the original speech signal.
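    The kind of hyper-parameter sweep the abstract describes can be sketched on a toy scale. The example below is not the thesis's deep network: it uses a plain logistic-regression "recogniser" with input-feature dropout on invented two-speaker data, and varies the learning rate and input dropout rate while holding everything else fixed:

    ```python
    import math
    import random

    random.seed(1)

    # Toy two-speaker dataset: 4-D features drawn around speaker-specific means.
    def make_data(n_per_speaker=200):
        data = []
        for label, mean in ((0, [-1.0] * 4), (1, [1.0] * 4)):
            for _ in range(n_per_speaker):
                data.append(([random.gauss(m, 1.5) for m in mean], label))
        random.shuffle(data)
        return data

    def train(data, lr, input_dropout):
        """Logistic regression trained with SGD and input-feature dropout."""
        w, b = [0.0] * 4, 0.0
        for epoch in range(5):
            for x, y in data:
                # Input dropout: randomly zero features, rescale the survivors.
                keep = 1.0 - input_dropout
                xd = [xi / keep if random.random() < keep else 0.0 for xi in x]
                z = sum(wi * xi for wi, xi in zip(w, xd)) + b
                z = max(min(z, 30.0), -30.0)        # guard against overflow
                p = 1.0 / (1.0 + math.exp(-z))
                g = p - y                           # log-loss gradient w.r.t. z
                w = [wi - lr * g * xi for wi, xi in zip(w, xd)]
                b -= lr * g
        return w, b

    def accuracy(data, w, b):
        correct = 0
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            correct += (z > 0) == (y == 1)
        return correct / len(data)

    # Grid over learning rate and input dropout, dropout disabled at test time.
    train_set, test_set = make_data(), make_data()
    for lr in (0.001, 0.01, 0.1):
        for drop in (0.0, 0.2):
            w, b = train(train_set, lr, drop)
            acc = accuracy(test_set, w, b)
            print(f"lr={lr:<6} input_dropout={drop}  accuracy={acc:.3f}")
    ```

    The thesis's two-phase design corresponds to fixing one axis of such a grid (e.g. the input dropout rate) while sweeping the others, then swapping which axis is held constant.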