Search CORE

90,966 research outputs found

Rejecting noise in Baikal-GVD data with neural networks

Author: Kharuk I.
Rubtsov G.
Safronov G.
Publication venue
Publication date: 10/10/2022
Field of study

Baikal-GVD is a large (

\sim

1 km

^3

) underwater neutrino telescope installed in the fresh waters of Lake Baikal. The deep lake water environment is pervaded by background light, which produces detectable signals in the Baikal-GVD photosensors. We introduce a neural network for an efficient separation of these noise hits from the signal ones, stemming from the propagation of relativistic particles through the detector. The neural network has a U-net like architecture and employs temporal (causal) structure of events. On Monte-Carlo simulated data, it reaches 99% signal purity (precision) and 98% survival efficiency (recall). The benefits of using neural network for data analysis are discussed, and other possible architectures of neural networks, including graph based, are examined

arXiv.org e-Print Archive

Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation

Author: Gu Rongzhi
Yu Dong
Zhang Shi-Xiong
Zou Yuexian
Publication venue
Publication date: 23/12/2022
Field of study

Recently, frequency domain all-neural beamforming methods have achieved remarkable progress for multichannel speech separation. In parallel, the integration of time domain network structure and beamforming also gains significant attention. This study proposes a novel all-neural beamforming method in time domain and makes an attempt to unify the all-neural beamforming pipelines for time domain and frequency domain multichannel speech separation. The proposed model consists of two modules: separation and beamforming. Both modules perform temporal-spectral-spatial modeling and are trained from end-to-end using a joint loss function. The novelty of this study lies in two folds. Firstly, a time domain directional feature conditioned on the direction of the target speaker is proposed, which can be jointly optimized within the time domain architecture to enhance target signal estimation. Secondly, an all-neural beamforming network in time domain is designed to refine the pre-separated results. This module features with parametric time-variant beamforming coefficient estimation, without explicitly following the derivation of optimal filters that may lead to an upper bound. The proposed method is evaluated on simulated reverberant overlapped speech data derived from the AISHELL-1 corpus. Experimental results demonstrate significant performance improvements over frequency domain state-of-the-arts, ideal magnitude masks and existing time domain neural beamforming methods

arXiv.org e-Print Archive

A Digital Neuromorphic Architecture Efficiently Facilitating Complex Synaptic Response Functions Applied to Liquid State Machines

Author: Aimone James B.
Carlson Kristofor D.
Donaldson Jonathon
Follett David R.
Follett Pamela L.
Hill Aaron J.
James Conrad D.
Naegle John H.
Smith Michael R.
Vineyard Craig M.
Publication venue
Publication date: 21/03/2017
Field of study

Information in neural networks is represented as weighted connections, or synapses, between neurons. This poses a problem as the primary computational bottleneck for neural networks is the vector-matrix multiply when inputs are multiplied by the neural network weights. Conventional processing architectures are not well suited for simulating neural networks, often requiring large amounts of energy and time. Additionally, synapses in biological neural networks are not binary connections, but exhibit a nonlinear response function as neurotransmitters are emitted and diffuse between neurons. Inspired by neuroscience principles, we present a digital neuromorphic architecture, the Spiking Temporal Processing Unit (STPU), capable of modeling arbitrary complex synaptic response functions without requiring additional hardware components. We consider the paradigm of spiking neurons with temporally coded information as opposed to non-spiking rate coded neurons used in most neural networks. In this paradigm we examine liquid state machines applied to speech recognition and show how a liquid state machine with temporal dynamics maps onto the STPU-demonstrating the flexibility and efficiency of the STPU for instantiating neural algorithms.Comment: 8 pages, 4 Figures, Preprint of 2017 IJCN

arXiv.org e-Print Archive

Crossref

Deep Learning for Audio Signal Processing

Author: Chang Shuo-yiin
Li Bo
Purwins Hendrik
Sainath Tara
Schlüter Jan
Virtanen Tuomas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/05/2019
Field of study

Given the recent surge in developments of deep learning, this article provides a review of the state-of-the-art deep learning techniques for audio signal processing. Speech, music, and environmental sound processing are considered side-by-side, in order to point out similarities and differences between the domains, highlighting general methods, problems, key references, and potential for cross-fertilization between areas. The dominant feature representations (in particular, log-mel spectra and raw waveform) and deep learning models are reviewed, including convolutional neural networks, variants of the long short-term memory architecture, as well as more audio-specific neural network models. Subsequently, prominent deep learning application areas are covered, i.e. audio recognition (automatic speech recognition, music information retrieval, environmental sound detection, localization and tracking) and synthesis and transformation (source separation, audio enhancement, generative models for speech, sound, and music synthesis). Finally, key issues and future questions regarding deep learning applied to audio signal processing are identified.Comment: 15 pages, 2 pdf figure

arXiv.org e-Print Archive

VBN

Multi-Resolution Fully Convolutional Neural Networks for Monaural Audio Source Separation

Author: Grais Emad M.
Plumbley Mark D.
Ward Dominic
Wierstorf Hagen
Publication venue
Publication date: 28/10/2017
Field of study

In deep neural networks with convolutional layers, each layer typically has fixed-size/single-resolution receptive field (RF). Convolutional layers with a large RF capture global information from the input features, while layers with small RF size capture local details with high resolution from the input features. In this work, we introduce novel deep multi-resolution fully convolutional neural networks (MR-FCNN), where each layer has different RF sizes to extract multi-resolution features that capture the global and local details information from its input features. The proposed MR-FCNN is applied to separate a target audio source from a mixture of many audio sources. Experimental results show that using MR-FCNN improves the performance compared to feedforward deep neural networks (DNNs) and single resolution deep fully convolutional neural networks (FCNNs) on the audio source separation problem.Comment: arXiv admin note: text overlap with arXiv:1703.0801

arXiv.org e-Print Archive

University of Surrey

Surrey Research Insight