
    Multi-Channel Masking with Learnable Filterbank for Sound Source Separation

    This work proposes a learnable filterbank based on a multi-channel masking framework for multi-channel source separation. The learnable filterbank is a 1D convolutional layer that transforms the raw waveform into a 2D representation. In contrast to the conventional single-channel masking method, we estimate a mask for each individual microphone channel. The estimated masks are then applied to the transformed waveform representations, as in the traditional filter-and-sum beamforming operation: each mask multiplies the corresponding channel's 2D representation, and the masked outputs of all channels are summed. Finally, a 1D transposed convolutional layer converts the summed masked signal back into the waveform domain. The experimental results show that our method outperforms single-channel masking with a learnable filterbank and can outperform multi-channel complex masking on the STFT complex spectrum in the STGCSEN model if the learnable filterbank maps to a higher feature dimension. The spatial response analysis also verifies that multi-channel masking in the learnable filterbank domain has spatial selectivity.
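
    The pipeline described above (learnable analysis filterbank, per-channel masks, filter-and-sum in the feature domain, transposed-convolution synthesis) can be sketched roughly as follows. This is a minimal illustration assuming PyTorch; the number of filters, kernel size, stride, and the mask network are placeholder assumptions, not the paper's settings.

        import torch
        import torch.nn as nn

        class MultiChannelMasking(nn.Module):
            def __init__(self, n_mics=4, n_filters=256, kernel_size=16, stride=8):
                super().__init__()
                # Learnable filterbank: a 1D conv turns each raw waveform into a 2D (feature x frame) map.
                self.encoder = nn.Conv1d(1, n_filters, kernel_size, stride=stride, bias=False)
                # Placeholder mask estimator: one sigmoid mask per microphone channel.
                self.mask_net = nn.Sequential(
                    nn.Conv1d(n_mics * n_filters, n_mics * n_filters, kernel_size=1),
                    nn.Sigmoid(),
                )
                # 1D transposed conv maps the summed masked representation back to a waveform.
                self.decoder = nn.ConvTranspose1d(n_filters, 1, kernel_size, stride=stride, bias=False)

            def forward(self, x):                              # x: (batch, n_mics, samples)
                B, M, T = x.shape
                feats = self.encoder(x.reshape(B * M, 1, T))   # (B*M, F, frames)
                F_, frames = feats.shape[1], feats.shape[2]
                feats = feats.reshape(B, M, F_, frames)
                masks = self.mask_net(feats.reshape(B, M * F_, frames)).reshape(B, M, F_, frames)
                # Filter-and-sum analogue: mask each channel's representation, then sum over channels.
                summed = (feats * masks).sum(dim=1)            # (B, F, frames)
                return self.decoder(summed)                    # (B, 1, samples)

        waveforms = torch.randn(2, 4, 16000)                   # two 4-channel, 1 s examples at 16 kHz
        separated = MultiChannelMasking()(waveforms)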

    Feasibility of discriminating UAV propellers noise from distress signals to locate people in enclosed environments using MEMS microphone arrays

    Detecting and finding people are complex tasks when visibility is reduced, for example when a fire occurs. In these situations, heat sources and large amounts of smoke are generated. Under these circumstances, locating survivors with thermal or conventional cameras is not possible, and alternative techniques are needed. The challenge of this work was to analyze whether it is feasible to integrate an acoustic camera, developed at the University of Valladolid, on an unmanned aerial vehicle (UAV) to locate, by sound, people calling for help in enclosed environments with reduced visibility. The acoustic array, based on MEMS (micro-electro-mechanical system) microphones, locates acoustic sources in space, and the UAV navigates autonomously through enclosed spaces. This paper presents the first experimental results locating the angles of arrival of multiple sound sources, including the cries for help of a person, in an enclosed environment. The results are promising, as the system proves able to discriminate the noise generated by the UAV propellers while identifying the angles of arrival of the direct sound signal and of its first echoes from reflective surfaces. Junta de Castilla y León (project VA082G18).
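
    A core building block for locating a call for help with a microphone array is estimating an angle of arrival from inter-channel time delays. The sketch below is purely illustrative and is not the authors' processing chain: it applies GCC-PHAT to a two-microphone pair, with the 5 cm spacing, sampling rate, and synthetic test signal all being assumptions.

        import numpy as np

        def gcc_phat(sig, ref, fs):
            """Return the time delay (s) of `sig` relative to `ref` via GCC-PHAT."""
            n = len(sig) + len(ref)
            SIG = np.fft.rfft(sig, n=n)
            REF = np.fft.rfft(ref, n=n)
            cross = SIG * np.conj(REF)
            cross /= np.abs(cross) + 1e-12            # PHAT weighting: keep phase only
            cc = np.fft.irfft(cross, n=n)
            max_shift = n // 2
            cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
            return (np.argmax(np.abs(cc)) - max_shift) / fs

        def angle_of_arrival(delay, mic_distance, c=343.0):
            """Convert a TDOA (s) into a broadside angle of arrival (degrees)."""
            return np.degrees(np.arcsin(np.clip(delay * c / mic_distance, -1.0, 1.0)))

        fs, d = 48000, 0.05                           # 48 kHz sampling, 5 cm mic spacing (assumed)
        t = np.arange(fs) / fs
        source = np.sin(2 * np.pi * 1000 * t)         # stand-in for a distress call
        mic1, mic2 = source, np.roll(source, 4)       # second channel delayed by 4 samples
        tau = gcc_phat(mic2, mic1, fs)
        print(f"estimated AoA: {angle_of_arrival(tau, d):.1f} deg")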

    A spherical microphone array based system for immersive audio scene rendering

    For many applications it is necessary to capture an acoustic field and present it to human listeners, creating the same acoustic perception for them as if they were actually present in the scene. Possible applications of this technique include entertainment, education, military training, remote telepresence, surveillance, and others. Recently, there has been much interest in the use of spherical microphone arrays for acoustic scene capture and reproduction. We describe a 32-microphone spherical-array-based system implemented for spatial audio capture and reproduction. The array embeds hardware that is traditionally external, such as preamplifiers, filters, digital-to-analog converters, and a USB interface adapter, resulting in a portable, lightweight solution that requires no hardware on the PC side other than a high-speed USB port. We provide a capability analysis of the array and describe the software suite developed for the application.

    Online Localization and Tracking of Multiple Moving Speakers in Reverberant Environments

    We address the problem of online localization and tracking of multiple moving speakers in reverberant environments. The paper makes the following contributions. We use the direct-path relative transfer function (DP-RTF), an inter-channel feature that encodes acoustic information robust against reverberation, and we propose an online algorithm well suited for estimating DP-RTFs associated with moving audio sources. Another crucial ingredient of the proposed method is its ability to properly assign DP-RTFs to audio-source directions. Towards this goal, we adopt a maximum-likelihood formulation and propose exponentiated-gradient (EG) updates to efficiently refine source-direction estimates starting from their currently available values. The problem of multiple-speaker tracking is computationally intractable because the number of possible associations between observed source directions and physical speakers grows exponentially with time. We adopt a Bayesian framework and propose a variational approximation of the posterior filtering distribution associated with multiple-speaker tracking, as well as an efficient variational expectation-maximization (VEM) solver. The proposed online localization and tracking method is thoroughly evaluated using two datasets that contain recordings made in real environments. Comment: IEEE Journal of Selected Topics in Signal Processing, 201
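
    As a rough illustration of the exponentiated-gradient idea mentioned above, the toy example below maintains a probability vector over candidate source directions and updates it multiplicatively from a per-direction fit cost. The feature templates and the cost function are placeholder assumptions and do not reproduce the paper's DP-RTF likelihood model.

        import numpy as np

        def eg_update(weights, grad, eta=0.05):
            """One EG step: multiplicative update followed by renormalization to the simplex."""
            w = weights * np.exp(-eta * grad)
            return w / w.sum()

        rng = np.random.default_rng(0)
        n_directions = 36                                    # candidate directions on a 10-degree grid
        templates = rng.normal(size=(n_directions, 8))       # per-direction feature templates (toy)
        weights = np.full(n_directions, 1.0 / n_directions)  # uniform prior over directions

        true_dir = 12
        for _ in range(50):                                  # stream of observed feature vectors
            obs = templates[true_dir] + 0.1 * rng.normal(size=8)
            residuals = np.sum((templates - obs) ** 2, axis=1)   # fit cost per candidate direction
            weights = eg_update(weights, residuals)

        print("most likely direction:", weights.argmax())    # converges near true_dir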

    Distributed Microphone Array System for Two-way Audio Communication

    In this work, a distributed microphone array system for two-way audio communication is presented. The goal of the system is to locate the dominant speaker and capture the speech signal with the highest possible quality. In the presented system, each microphone array works as a Polynomial Beamformer (PBF), enabling continuous beam steering. The output power of each PBF beam is used to determine the direction of the dominant speech source. Finally, a Spatial Likelihood Function (SLF) is formed by combining the output beam powers of each microphone array, and the speaker is determined to be at the point with the highest SLF value. The audio signal is captured by steering the beam of the closest microphone array towards the speaker. The presented audio capture front-end was evaluated with simulated and measured data. The evaluation shows that the implemented system gives approximately 40 cm localization accuracy and 15 dB attenuation of interference sources. Finally, the system was implemented to run in real time in the Pure Data signal processing environment.
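
    A toy sketch of the localization step described above: each distributed array reports the output powers of its steered beams, the powers are projected onto a common grid to form a spatial likelihood function, and the speaker is placed at the grid point with the highest combined value. The array positions, beam-power model, and grid are assumptions for illustration only, not the system's actual geometry or beamformer.

        import numpy as np

        rng = np.random.default_rng(1)
        arrays = np.array([[0.0, 0.0], [4.0, 0.0], [2.0, 3.0]])   # array positions in metres (assumed)
        speaker = np.array([2.5, 1.5])                            # true speaker position (m)
        beam_angles = np.radians(np.arange(0, 360, 10))           # steering directions of each array

        def beam_powers(array_pos):
            """Simulated beam output powers: highest for the beam pointing at the speaker."""
            doa = np.arctan2(*(speaker - array_pos)[::-1])
            return np.maximum(np.cos(beam_angles - doa), 0.0) ** 2 + 0.05 * rng.random(len(beam_angles))

        # Spatial grid over the room; each point accumulates the power of the beam that covers it.
        xs, ys = np.meshgrid(np.linspace(0, 4, 81), np.linspace(0, 3, 61))
        slf = np.zeros_like(xs)
        for pos in arrays:
            powers = beam_powers(pos)
            angles_to_grid = np.arctan2(ys - pos[1], xs - pos[0])
            diff = np.angle(np.exp(1j * (angles_to_grid[..., None] - beam_angles)))
            slf += powers[np.argmin(np.abs(diff), axis=-1)]       # sum contributions over arrays

        iy, ix = np.unravel_index(slf.argmax(), slf.shape)
        print(f"estimated speaker position: ({xs[iy, ix]:.2f}, {ys[iy, ix]:.2f}) m")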

    Implementation and evaluation of a low complexity microphone array for speaker recognition

    This thesis discusses the application of a microphone array employing a noise-cancelling beamforming technique to improve the robustness of speaker recognition systems in a diffuse noise field.
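
    The abstract does not spell out the beamformer used; as a generic point of reference, a minimal delay-and-sum front-end of the kind often placed before speaker recognition might look like the sketch below. The uniform linear geometry, spacing, and steering angle are assumptions, not the thesis design.

        import numpy as np

        def delay_and_sum(mics, fs, spacing, angle_deg, c=343.0):
            """Steer a uniform linear array towards angle_deg and average the aligned channels."""
            n_mics, n_samples = mics.shape
            delays = spacing * np.arange(n_mics) * np.sin(np.radians(angle_deg)) / c
            freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
            spectra = np.fft.rfft(mics, axis=1)
            # Compensate each channel's propagation delay in the frequency domain, then average.
            aligned = spectra * np.exp(2j * np.pi * freqs * delays[:, None])
            return np.fft.irfft(aligned.mean(axis=0), n=n_samples)

        fs = 16000
        mics = np.random.randn(4, fs)                 # stand-in for four noisy microphone channels
        enhanced = delay_and_sum(mics, fs, spacing=0.04, angle_deg=20.0)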

    Development and testing of a dual accelerometer vector sensor for AUV acoustic surveys

    This paper presents the design, manufacturing and testing of a Dual Accelerometer Vector Sensor (DAVS). The device was built within the activities of the WiMUST project, supported under the Horizon 2020 Framework Programme, which aims to improve the efficiency of the methodologies used to perform geophysical acoustic surveys at sea with Autonomous Underwater Vehicles (AUVs). The DAVS has the potential to contribute to this aim in various ways; for example, owing to its spatial filtering capability, it may reduce the amount of post-processing by discriminating bottom reflections from surface reflections. Additionally, its compact size allows easier integration with AUVs and hence improves vehicle manoeuvrability compared with classical towed arrays. The present paper focuses on results related to acoustic wave azimuth estimation as an example of the device's spatial filtering capabilities. The DAVS consists of two tri-axial accelerometers and one hydrophone moulded into one unit. The sensitivity and directionality of these three sensors were measured in a tank, whilst the direction-estimation capabilities of the accelerometers paired with the hydrophone, forming a vector sensor, were evaluated on a Medusa Class AUV sailing around a deployed sound source. Results of these measurements are presented in this paper. European Union [645141].
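
    As a simple, purely illustrative sketch (not the authors' processing chain), the azimuth of an incoming wave can be estimated from a pressure channel plus two orthogonal particle-motion channels via the time-averaged active intensity components. The synthetic signals and the chosen azimuth below are assumptions.

        import numpy as np

        fs = 8000
        t = np.arange(fs) / fs
        true_azimuth = np.radians(40.0)

        pressure = np.sin(2 * np.pi * 500 * t)                   # hydrophone channel
        vx = np.cos(true_azimuth) * pressure                     # particle-motion x (from accelerometer)
        vy = np.sin(true_azimuth) * pressure                     # particle-motion y (from accelerometer)

        # Active intensity: time-averaged product of pressure and each velocity component.
        ix = np.mean(pressure * vx)
        iy = np.mean(pressure * vy)
        print(f"estimated azimuth: {np.degrees(np.arctan2(iy, ix)):.1f} deg")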