
    Real-time multiple sound source localization using a circular microphone array based on single-source confidence measures

    We propose a novel real-time adaptive localization approach for multiple sources using a circular microphone array, in order to suppress the localization ambiguities faced with linear arrays, and assuming a weak sound-source sparsity derived from blind source separation methods. The proposed method performs very well both in simulations and in real conditions, running at 50% of real time.
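    The core idea of exploiting single-source time-frequency zones can be sketched for one microphone pair: estimate a per-bin direction of arrival (DOA) from the inter-channel phase, keep only bins that plausibly carry a single dominant source, and accumulate a DOA histogram. This is an illustrative sketch under assumed parameters (`level_thresh`, Hann framing, a simple level-based confidence), not the paper's exact confidence measure, and it uses a two-microphone pair rather than the full circular array.

    ```python
    import numpy as np

    def doa_histogram(x1, x2, fs, d, c=343.0, nfft=256, hop=128, level_thresh=0.9):
        """Per-bin DOA estimates from a two-mic pair, accumulated into a
        histogram over azimuth. Only time-frequency bins whose energy is close
        to the frame maximum are kept, as a crude stand-in for a single-source
        confidence measure (assumption, not the paper's criterion)."""
        win = np.hanning(nfft)
        freqs = np.fft.rfftfreq(nfft, 1 / fs)
        n_frames = (len(x1) - nfft) // hop
        angles = []
        for i in range(n_frames):
            s = slice(i * hop, i * hop + nfft)
            X1 = np.fft.rfft(win * x1[s])
            X2 = np.fft.rfft(win * x2[s])
            cross = X1 * np.conj(X2)                 # inter-channel cross-spectrum
            mag = np.abs(X1) * np.abs(X2)
            strong = mag > level_thresh * mag.max()  # keep near-dominant bins only
            for k in np.nonzero(strong)[0]:
                f = freqs[k]
                if f <= 0:
                    continue
                tau = np.angle(cross[k]) / (2 * np.pi * f)  # inter-channel delay
                sin_t = np.clip(c * tau / d, -1.0, 1.0)     # sin(DOA) for spacing d
                angles.append(np.degrees(np.arcsin(sin_t)))
        hist, edges = np.histogram(angles, bins=np.arange(-90, 91, 5))
        return hist, edges
    ```

    With a single narrowband source, essentially every retained bin votes for the true azimuth, so the histogram peak sits at the source direction.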

    Source counting in real-time sound source localization using a circular microphone array

    Recently, we proposed an approach inspired by Sparse Component Analysis for real-time localization of multiple sound sources using a circular microphone array. The method was based on identifying time-frequency zones where only one source is active, reducing the problem to single-source localization in these zones. A histogram of estimated Directions of Arrival (DOAs) was formed and then processed to obtain improved DOA estimates, assuming that the number of sources was known. In this paper, we extend our previous work by proposing three different methods for counting the number of sources by looking for prominent peaks in the derived histogram, based on: (a) performing a peak search; (b) processing an LPC-smoothed version of the histogram; (c) employing a matching pursuit-based approach. The third approach is shown to perform very accurately under simulated reverberation and additive noise, and its computational requirements are very small.
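    A matching-pursuit-style source counter on a DOA histogram can be sketched as follows: repeatedly pick the largest remaining peak, subtract a smooth atom centred on it, and stop when the residual peak falls below a fraction of the original maximum. The Gaussian atom shape, its width, and the stopping ratio are illustrative assumptions, not the paper's parameters.

    ```python
    import numpy as np

    def count_sources_mp(hist, kernel_width=3.0, stop_ratio=0.2, max_sources=8):
        """Count prominent peaks in a DOA histogram by greedy atom subtraction
        (matching-pursuit style). Each iteration removes a Gaussian-shaped
        contribution at the current maximum; counting stops once the residual
        maximum drops below stop_ratio times the initial maximum."""
        h = hist.astype(float).copy()
        bins = np.arange(len(h))
        ref = h.max()
        count = 0
        while count < max_sources:
            k = int(np.argmax(h))
            if h[k] < stop_ratio * ref:
                break
            atom = h[k] * np.exp(-0.5 * ((bins - k) / kernel_width) ** 2)
            h = np.maximum(h - atom, 0.0)  # remove this peak's contribution
            count += 1
        return count
    ```

    The atom width should be at least as wide as the DOA spread a single source produces, so one source is not counted twice.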

    Online Localization and Tracking of Multiple Moving Speakers in Reverberant Environments

    We address the problem of online localization and tracking of multiple moving speakers in reverberant environments. The paper makes the following contributions. We use the direct-path relative transfer function (DP-RTF), an inter-channel feature that encodes acoustic information robust against reverberation, and we propose an online algorithm well suited for estimating DP-RTFs associated with moving audio sources. Another crucial ingredient of the proposed method is its ability to properly assign DP-RTFs to audio-source directions. Towards this goal, we adopt a maximum-likelihood formulation and we propose to use exponentiated gradient (EG) updates to efficiently refine source-direction estimates starting from their currently available values. The problem of multiple-speaker tracking is computationally intractable because the number of possible associations between observed source directions and physical speakers grows exponentially with time. We adopt a Bayesian framework and propose a variational approximation of the posterior filtering distribution associated with multiple-speaker tracking, as well as an efficient variational expectation-maximization (VEM) solver. The proposed online localization and tracking method is thoroughly evaluated using two datasets that contain recordings performed in real environments. Comment: IEEE Journal of Selected Topics in Signal Processing, 201
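    The exponentiated gradient step referred to above has a simple generic form when the unknowns live on the probability simplex (e.g. assignment weights over candidate directions): a multiplicative update followed by renormalisation, which keeps every weight positive and the weights summing to one. This is the textbook EG update, not the paper's specific likelihood or step size.

    ```python
    import numpy as np

    def eg_update(w, grad, eta=0.5):
        """One exponentiated-gradient step on a probability vector w.
        Multiplying by exp(-eta * grad) and renormalising keeps w on the
        simplex, unlike an additive gradient step which can leave it."""
        w_new = w * np.exp(-eta * grad)
        return w_new / w_new.sum()
    ```

    Repeated calls with the gradient of a negative log-likelihood progressively concentrate the weights on the best-supported direction.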

    Microphone array for speaker localization and identification in shared autonomous vehicles

    With the current technological transformation in the automotive industry, autonomous vehicles are getting closer to Society of Automotive Engineers (SAE) automation level 5. This level corresponds to full vehicle automation, where the driving system autonomously monitors and navigates the environment. With SAE level 5, the concept of a Shared Autonomous Vehicle (SAV) will soon become a reality and go mainstream. The main purpose of an SAV is to allow unrelated passengers to share an autonomous vehicle without a driver/moderator inside the shared space. However, to ensure their safety and well-being until they reach their final destination, active monitoring of all passengers is required. In this context, this article presents a microphone-based sensor system that is able to localize sound events inside an SAV. The solution is composed of a Micro-Electro-Mechanical System (MEMS) microphone array with a circular geometry, connected to an embedded processing platform that uses Field-Programmable Gate Array (FPGA) technology to run the sound localization algorithms in hardware. This work is supported by: European Structural and Investment Funds in the FEDER component, through the Operational Competitiveness and Internationalization Programme (COMPETE 2020) [Project nº 039334; Funding Reference: POCI-01-0247-FEDER-039334]
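    The abstract does not name the localization algorithm implemented on the FPGA; a standard building block for microphone-array localization of this kind is GCC-PHAT, which estimates the time difference of arrival (TDOA) between two channels by whitening the cross-spectrum so that only phase (i.e. delay) information remains. The sketch below is that standard technique, offered only as context; it is an assumption that anything like it runs on the described hardware.

    ```python
    import numpy as np

    def gcc_phat(x, y, fs, max_tau=None):
        """TDOA between two channels via GCC-PHAT. Returns the delay of y
        relative to x in seconds (positive when y lags x)."""
        n = len(x) + len(y)                      # zero-pad to avoid circular wrap
        X = np.fft.rfft(x, n=n)
        Y = np.fft.rfft(y, n=n)
        R = np.conj(X) * Y
        R /= np.abs(R) + 1e-12                   # PHAT weighting: keep phase only
        cc = np.fft.irfft(R, n=n)
        max_shift = n // 2 if max_tau is None else min(int(fs * max_tau), n // 2)
        cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
        shift = int(np.argmax(np.abs(cc))) - max_shift
        return shift / fs
    ```

    TDOAs from several microphone pairs of a circular array can then be intersected to obtain a source direction.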

    Deep learning assisted sound source localization from a flying drone


    Ambisonics

    This open access book provides a concise explanation of the fundamentals and background of the surround-sound recording and playback technology Ambisonics. It equips readers with the psychoacoustical, signal processing, acoustical, and mathematical knowledge needed to understand the inner workings of modern processing utilities and of special equipment for recording, manipulation, and reproduction in the higher-order Ambisonic format. The book comes with various practical examples based on free software tools and open scientific data for reproducible research. The book’s introductory section offers a perspective on Ambisonics spanning from the origins of coincident recordings in the 1930s to the Ambisonic concepts of the 1970s, as well as classical ways of applying Ambisonics in first-order coincident sound-scene recording and reproduction that have been practiced since the 1980s. As the underlying mathematics can become quite involved from time to time, and the treatment aims to be comprehensive without sacrificing readability, the book includes an extensive mathematical appendix. The book offers readers a deeper understanding of Ambisonic technologies and will especially benefit scientists, audio-system engineers, and audio-recording engineers. The advanced sections of the book explain fundamentals and modern techniques such as higher-order Ambisonic decoding, 3D audio effects, and higher-order recording. These techniques are shown to be suitable for supplying audience areas ranging from studio-sized rooms to venues with hundreds of listeners, as well as headphone-based playback, regardless of whether the 3D audio material is live, interactive, or studio-produced.
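    The first-order encoding at the root of Ambisonics is compact enough to show directly: a mono signal is panned to azimuth and elevation by weighting it with the four B-format gains (here in the traditional convention where the omnidirectional W channel carries a 1/sqrt(2) gain). This is only a minimal sketch of the panning equations; the book itself covers higher orders, normalisation schemes (SN3D/N3D), and channel ordering (ACN).

    ```python
    import numpy as np

    def encode_fo_ambisonics(s, azimuth, elevation):
        """Encode a mono signal s into first-order B-format (W, X, Y, Z).
        Angles in radians; azimuth 0 points to the front, elevation 0 is
        the horizontal plane. Traditional convention with W = s / sqrt(2)."""
        w = s / np.sqrt(2.0)                          # omnidirectional component
        x = s * np.cos(azimuth) * np.cos(elevation)   # front-back figure-of-eight
        y = s * np.sin(azimuth) * np.cos(elevation)   # left-right figure-of-eight
        z = s * np.sin(elevation)                     # up-down figure-of-eight
        return np.stack([w, x, y, z])
    ```

    Decoding for a given loudspeaker layout then amounts to a matrix applied to these four channels, which is what makes the format playback-agnostic.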

    Backward Compatible Spatialized Teleconferencing based on Squeezed Recordings

    Get PDF
    Commercial teleconferencing systems currently available, although offering sophisticated video stimulus of the remote participants, commonly employ only mono or stereo audio playback for the user. However, in teleconferencing applications where there are multiple participants at multiple sites, spatializing the audio reproduced at each site (using headphones or loudspeakers) to help listeners distinguish between participating speakers can significantly improve the meeting experience (Baldis, 2001; Evans et al., 2000; Ward & Elko, 1999; Kilgore et al., 2003; Wrigley et al., 2009; James & Hawksford, 2008). An example is Vocal Village (Kilgore et al., 2003), which uses online avatars to co-locate remote participants over the Internet in virtual space, with audio spatialized over headphones. This system adds speaker location cues to monaural speech to create a user-manipulable soundfield that matches each avatar’s position in the virtual space. Giving participants the freedom to manipulate the acoustic location of other participants in the rendered sound scene has been shown to improve multitasking performance (Wrigley et al., 2009). A system for multiparty teleconferencing first requires a stage for recording speech from multiple participants at each site. These signals then need to be compressed to allow for efficient transmission of the spatial speech. One approach is to use close-talking microphones to record each participant (e.g. lapel microphones) and then encode each speech signal separately prior to transmission (James & Hawksford, 2008).
Alternatively, for increased flexibility, a microphone array located at a central point on, say, a meeting table can be used to generate a multichannel recording of the meeting speech. A microphone array approach is adopted in this work; it allows the recordings to be processed to identify the relative spatial locations of the sources, and it permits multichannel speech-enhancement techniques to improve recording quality in noisy environments. For efficient transmission of the recorded signals, the approach also requires a multichannel compression technique suited to spatially recorded speech signals.
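    The spatialized-playback stage described above can be illustrated with a deliberately crude binaural renderer: place a mono talker at an azimuth using only an interaural time difference (Woodworth's spherical-head approximation) and a fixed interaural level difference. The head radius, the -6 dB level difference, and the left/right convention are illustrative assumptions; a real system such as those cited would use measured HRTFs.

    ```python
    import numpy as np

    def spatialize_mono(s, fs, azimuth_deg, head_radius=0.0875, c=343.0):
        """Toy binaural spatialisation of a mono signal: delay and attenuate
        the far-ear channel. Positive azimuth places the talker to the right,
        so the left ear is the far ear. Returns a (2, N) stereo array."""
        az = np.radians(azimuth_deg)
        itd = (head_radius / c) * (abs(az) + np.sin(abs(az)))  # Woodworth ITD
        delay = int(round(itd * fs))                           # far-ear lag (samples)
        gain_far = 10 ** (-6 / 20)                             # crude -6 dB ILD
        far = np.concatenate((np.zeros(delay), s)) * gain_far
        near = np.concatenate((s, np.zeros(delay)))            # pad to equal length
        if azimuth_deg >= 0:
            left, right = far, near
        else:
            left, right = near, far
        return np.stack([left, right])
    ```

    Summing the stereo renderings of several talkers placed at distinct azimuths yields the kind of separable multi-talker scene the cited studies evaluate.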