Search CORE

21,097 research outputs found

Deep Learning for Audio Signal Processing

Author: Chang Shuo-yiin
Li Bo
Purwins Hendrik
Sainath Tara
Schlüter Jan
Virtanen Tuomas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/05/2019
Field of study

Given the recent surge in developments of deep learning, this article provides a review of the state-of-the-art deep learning techniques for audio signal processing. Speech, music, and environmental sound processing are considered side-by-side, in order to point out similarities and differences between the domains, highlighting general methods, problems, key references, and potential for cross-fertilization between areas. The dominant feature representations (in particular, log-mel spectra and raw waveform) and deep learning models are reviewed, including convolutional neural networks, variants of the long short-term memory architecture, as well as more audio-specific neural network models. Subsequently, prominent deep learning application areas are covered, i.e. audio recognition (automatic speech recognition, music information retrieval, environmental sound detection, localization and tracking) and synthesis and transformation (source separation, audio enhancement, generative models for speech, sound, and music synthesis). Finally, key issues and future questions regarding deep learning applied to audio signal processing are identified.Comment: 15 pages, 2 pdf figure

arXiv.org e-Print Archive

VBN

Reflection-Aware Sound Source Localization

Author: An Inkyu
Manocha Dinesh
Son Myungbae
Yoon Sung-eui
Publication venue
Publication date: 21/11/2017
Field of study

We present a novel, reflection-aware method for 3D sound localization in indoor environments. Unlike prior approaches, which are mainly based on continuous sound signals from a stationary source, our formulation is designed to localize the position instantaneously from signals within a single frame. We consider direct sound and indirect sound signals that reach the microphones after reflecting off surfaces such as ceilings or walls. We then generate and trace direct and reflected acoustic paths using inverse acoustic ray tracing and utilize these paths with Monte Carlo localization to estimate a 3D sound source position. We have implemented our method on a robot with a cube-shaped microphone array and tested it against different settings with continuous and intermittent sound signals with a stationary or a mobile source. Across different settings, our approach can localize the sound with an average distance error of 0.8m tested in a room of 7m by 7m area with 3m height, including a mobile and non-line-of-sight sound source. We also reveal that the modeling of indirect rays increases the localization accuracy by 40% compared to only using direct acoustic rays.Comment: Submitted to ICRA 2018. The working video is available at (https://youtu.be/TkQ36lMEC-M

arXiv.org e-Print Archive

Crossref

Acoustical Ranging Techniques in Embedded Wireless Sensor Networked Devices

Author: Jha Sanjay
Kottege Navinda
Kusy Branislav
Misra Prasant
Ostry Diethelm
Publication venue
Publication date: 01/01/2013
Field of study

Location sensing provides endless opportunities for a wide range of applications in GPS-obstructed environments; where, typically, there is a need for higher degree of accuracy. In this article, we focus on robust range estimation, an important prerequisite for fine-grained localization. Motivated by the promise of acoustic in delivering high ranging accuracy, we present the design, implementation and evaluation of acoustic (both ultrasound and audible) ranging systems.We distill the limitations of acoustic ranging; and present efficient signal designs and detection algorithms to overcome the challenges of coverage, range, accuracy/resolution, tolerance to Doppler’s effect, and audible intensity. We evaluate our proposed techniques experimentally on TWEET, a low-power platform purpose-built for acoustic ranging applications. Our experiments demonstrate an operational range of 20 m (outdoor) and an average accuracy 2 cm in the ultrasound domain. Finally, we present the design of an audible-range acoustic tracking service that encompasses the benefits of a near-inaudible acoustic broadband chirp and approximately two times increase in Doppler tolerance to achieve better performance

Crossref

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

A Geometric Approach to Sound Source Localization from Time-Delay Estimates

Author: Alameda-Pineda Xavier
Horaud Radu
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 05/11/2013
Field of study

This paper addresses the problem of sound-source localization from time-delay estimates using arbitrarily-shaped non-coplanar microphone arrays. A novel geometric formulation is proposed, together with a thorough algebraic analysis and a global optimization solver. The proposed model is thoroughly described and evaluated. The geometric analysis, stemming from the direct acoustic propagation model, leads to necessary and sufficient conditions for a set of time delays to correspond to a unique position in the source space. Such sets of time delays are referred to as feasible sets. We formally prove that every feasible set corresponds to exactly one position in the source space, whose value can be recovered using a closed-form localization mapping. Therefore we seek for the optimal feasible set of time delays given, as input, the received microphone signals. This time delay estimation problem is naturally cast into a programming task, constrained by the feasibility conditions derived from the geometric analysis. A global branch-and-bound optimization technique is proposed to solve the problem at hand, hence estimating the best set of feasible time delays and, subsequently, localizing the sound source. Extensive experiments with both simulated and real data are reported; we compare our methodology to four state-of-the-art techniques. This comparison clearly shows that the proposed method combined with the branch-and-bound algorithm outperforms existing methods. These in-depth geometric understanding, practical algorithms, and encouraging results, open several opportunities for future work.Comment: 13 pages, 2 figures, 3 table, journa

arXiv.org e-Print Archive

CiteSeerX

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL-Rennes 1