49 research outputs found

    Acoustic Echo Estimation using the model-based approach with Application to Spatial Map Construction in Robotics

    Estimation of acoustic echoes using expectation-maximization methods

    Sound Source Separation

    This is the author's accepted pre-print of the article, first published as G. Evangelista, S. Marchand, M. D. Plumbley and E. Vincent. Sound source separation. In U. Zölzer (ed.), DAFX: Digital Audio Effects, 2nd edition, Chapter 14, pp. 551-588. John Wiley & Sons, March 2011. ISBN 9781119991298. DOI: 10.1002/9781119991298.ch14

    Application of sound source separation methods to advanced spatial audio systems

    This thesis belongs to the field of Sound Source Separation (SSS). It addresses the development and evaluation of SSS techniques for the resynthesis of high-realism sound scenes by means of Wave Field Synthesis (WFS). Because the vast majority of audio recordings are preserved in two-channel stereo format, special up-converters are required to use advanced spatial audio reproduction formats such as WFS. The reason is that WFS needs the original source signals in order to accurately synthesize the acoustic field inside an extended listening area, so object-based mixing is required. Source separation problems in digital signal processing are those in which several signals have been mixed together and the objective is to find out what the original signals were. SSS algorithms can therefore be applied to existing two-channel mixtures to extract the different objects that compose the stereo scene. Unfortunately, most stereo mixtures are underdetermined, i.e., there are more sound sources than audio channels. This condition makes the SSS problem especially difficult, and stronger assumptions have to be made, often related to the sparsity of the sources under some signal transformation.
    The thesis focuses on the application of SSS techniques to spatial sound reproduction, and its contributions fall within these two areas. First, two underdetermined SSS methods are proposed to deal efficiently with the separation of stereo sound mixtures. These techniques are based on a multi-level thresholding segmentation approach, which enables fast, unsupervised separation of sound sources in the time-frequency domain. Although both techniques rely on the same type of clustering, the features each of them considers are related to different localization cues, which allow the separation of either instantaneous or real mixtures. Additionally, two post-processing techniques aimed at improving the isolation of the separated sources are proposed. The performance achieved by several SSS methods in the resynthesis of WFS sound scenes is then evaluated by means of listening tests, paying special attention to the change observed in the perceived spatial attributes. Although the estimated sources are distorted versions of the original ones, the masking effects involved in their spatial remixing make artifacts less perceptible, which improves the overall assessed quality. Finally, some novel developments related to the application of time-frequency processing to source localization and enhanced sound reproduction are presented.
    Cobos Serrano, M. (2009). Application of sound source separation methods to advanced spatial audio systems [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8969
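As a rough illustration of the idea described in the abstract above (assigning time-frequency bins to sources by thresholding a localization cue), the following Python sketch separates an instantaneous stereo mixture with binary masks. It is not the thesis's method: the function names, the pan index `|R| / (|L| + |R|)`, the hand-picked thresholds, and the use of non-overlapping rectangular frames in place of a proper STFT are all assumptions made to keep the example short.

```python
import numpy as np

def to_frames(x, n=256):
    """Split x into non-overlapping frames of n samples and take a one-sided FFT of each."""
    usable = len(x) - len(x) % n
    return np.fft.rfft(x[:usable].reshape(-1, n), axis=1)

def from_frames(X, n=256):
    """Inverse of to_frames: inverse FFT per frame, then concatenate the frames."""
    return np.fft.irfft(X, n=n, axis=1).reshape(-1)

def separate_by_pan(left, right, thresholds, n=256):
    """Hypothetical helper: binary-mask separation of a stereo mixture.

    Each time-frequency bin gets a pan index in [0, 1] (0 = hard left,
    1 = hard right). The ascending `thresholds` split that range into
    len(thresholds) + 1 classes, and every class is resynthesized as
    one stereo source estimate.
    """
    L, R = to_frames(left, n), to_frames(right, n)
    eps = 1e-12  # avoids division by zero in silent bins
    pan = np.abs(R) / (np.abs(L) + np.abs(R) + eps)
    labels = np.digitize(pan, thresholds)  # class index for every bin
    sources = []
    for k in range(len(thresholds) + 1):
        mask = labels == k
        sources.append((from_frames(L * mask, n), from_frames(R * mask, n)))
    return sources

# Usage: three bin-centred tones panned left, centre and right.
n = 256
t = np.arange(n * 8)
s1, s2, s3 = (np.sin(2 * np.pi * k * t / n) for k in (10, 40, 90))
left = 1.0 * s1 + 0.7 * s2 + 0.1 * s3
right = 0.1 * s1 + 0.7 * s2 + 1.0 * s3
estimates = separate_by_pan(left, right, thresholds=[0.33, 0.66], n=n)
```

The toy signals are chosen so that no two sources share a time-frequency bin, which is the sparsity assumption the abstract mentions. With real recordings the sources overlap, and fixed thresholds would have to be replaced by thresholds estimated from the data.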

    Proceedings of the EAA Spatial Audio Signal Processing symposium: SASP 2019


    Improvement of detection and tracking techniques in multistatic passive radar systems. (Mejora de técnicas de detección y seguimiento en sistemas radar pasivos multiestáticos)

    This doctoral thesis is the result of intensive research on passive radar sensors, aimed at improving detection and tracking capabilities in complex scenarios with ground targets and small drones. The research was carried out in the research group coordinated by Dr. María Pilar Jarabo Amores, within the framework of several projects: IDEPAR ("Improved DEtection techniques for PAssive Radars"), MASTERSAT ("MultichAnnel paSsive radar receiver exploiting TERrestrial and SATellite Illuminators") and KRIPTON ("A Knowledge based appRoach to passIve radar detection using wideband sPace adapTive prOcessiNg"), funded by the Spanish Ministry of Economy and Competitiveness; and MAPIS (Multichannel passive ISAR imaging for military applications) and JAMPAR ("JAMmer-based PAssive Radar"), funded by the European Defence Agency (EDA). The main objective is to improve detection and tracking techniques in passive radars with bistatic and multistatic configurations. The thesis develops algorithms that exploit signals from different illuminators of opportunity (DVB-T transmitters, DVB-S satellites and GPS signals). The proposed solutions were integrated into the IDEPAR technology demonstrator, developed and updated under the projects mentioned above, and validated in real scenarios declared of interest by potential end users (the Directorate General of Armament and Materiel, the National Institute of Aerospace Technology and the Spanish Navy). For the development and evaluation of the processing chains, two case studies are considered: ground targets in semi-urban scenarios with buildings, and small aerial targets in rural and coastal scenarios.
    The main contributions can be summarized as follows:
    • Design of 2D tracking techniques in the bistatic range and Doppler frequency working space: tracking techniques are developed for both case studies, the localization of ground targets and of small drones. For the latter, the techniques follow both the motion of the drone and its Doppler signature, which makes target classification possible.
    • Design of target tracking techniques that integrate information in 3D space (range, Doppler and azimuth): the techniques use two-stage processing, a first stage with 2D tracking to filter out false alarms and a second stage for 3D tracking and coordinate conversion to a local Cartesian plane. Solutions based on Kalman filters for both linear and nonlinear systems are compared.
    • Design of processing chains for multistatic systems: the target information estimated over multiple bistatic geometries is used to improve target localization in the local Cartesian plane. Solutions based on Kalman filters for nonlinear systems are presented, exploiting different bistatic measurements in the coordinate transformation process, and the resulting improvements in localization accuracy are analyzed.
    • Design of processing stages for passive radars based on satellite signals from the GPS and DVB-S constellations: the characteristics of the satellite signals are studied, their drawbacks are identified, and processing chains are proposed that allow them to be used for the detection and tracking of ground targets.
    • Study of the use of multichannel DVB-T signals with transmission gaps between channels in passive radar systems, which increases the resolution of the system and its detection, tracking and localization capabilities. The multichannel signal model and its effects on coherent processing are studied, and processing chains are proposed to mitigate the adverse effects of this type of signal.

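    Acoustic signal processing techniques applicable to indoor multi-source environments and their applications (실내 다중 음원 환경에 적용 가능한 음향 신호 처리 기법과 그 응용)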

    Doctoral thesis, Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, August 2022. 김성철. Research on acoustic signal processing has been increasing recently, because such processing can extract meaningful, usable information from sound. This thesis therefore addresses acoustic signal processing techniques for sound recorded in indoor environments. 
First, we introduce a method for estimating the location of a sound source in indoor environments with high reverberation and heavy noise. Existing methods such as interaural level difference (ILD) based localization, time difference of arrival (TDoA) based localization, and steered response power phase transformation (SRP-PHAT) based localization lose accuracy when applied to recordings from highly reverberant indoor environments. In this thesis, we define a new cost function that identifies the combination of microphones in a microphone array that yields the best performance. The microphone pair with the lowest cost function value was chosen as the optimal pair, and the source location was estimated with that pair. The distance error was confirmed to be lower than with existing methods. Next, a technique for recovering lost sample values from a recorded signal, called sketching and stacking with random fork (SSRF), is introduced. In this technique, the target sound source is a superposition of several sinusoidal signals. It is assumed that there are multiple sound sources in an anechoic chamber, but only one microphone. By Euler's formula, a sinusoidal wave can be rewritten in exponential form. If some of the terms of the exponential function follow a geometric sequence, their values can be obtained using SSRF. To solve this problem, the concept of a random fork is newly introduced. Compared with existing methods such as compressive sensing based and deep neural network (DNN) based recovery, SSRF based signal recovery was more accurate. Finally, this thesis introduces a blind source separation (BSS) technique based on the previously introduced SSRF technique. As before, it is assumed that the sinusoidal waves are superposed. 
In addition, while the previous technique assumed that all sinusoidal waves were emitted simultaneously, this technique assumes that the sound sources are located at different distances from the microphone and arrive at the microphone with different time delays. Under these assumptions, a new BSS method for separating the individual signals from the mixture based on SSRF is introduced. SSRF BSS consists of three main steps: estimation of the number of sound sources, estimation of the time delays, and signal separation. Whereas existing BSS methods require the number of sources to be known a priori, SSRF BSS does not. Whereas existing BSS methods can only be applied to signals without time delay, SSRF BSS can be applied to mixtures of signals with different time delays. SSRF BSS was confirmed to produce more accurate separation results than the existing independent component analysis (ICA) BSS and Yu Gang (YG) BSS.

    Contents:
    1 INTRODUCTION
    2 IMPROVING ACOUSTIC LOCALIZATION PERFORMANCE BY FINDING OPTIMAL PAIR OF MICROPHONES BASED ON COST FUNCTION
      2.1 Motivation
      2.2 Conventional Acoustic Localization Methods
        2.2.1 Interaural Level Difference
        2.2.2 Time Difference of Arrival
        2.2.3 Steered Response Power Phase Transformation
      2.3 System Model
        2.3.1 Experimental Scenarios
        2.3.2 Definition of Cost Function
      2.4 Results and Discussion
      2.5 Summary
    3 ACOUSTIC SIGNAL RECOVERY BASED ON SKETCHING AND STACKING WITH RANDOM FORK
      3.1 Motivation
      3.2 SSRF Signal Model
        3.2.1 Source Signal Model
        3.2.2 Sampled Signal Model
        3.2.3 Corrupted Signal Model
      3.3 SSRF Problem Statement
      3.4 SSRF Methodology
        3.4.1 Geometric Sequential Representation
        3.4.2 Definition of Random Fork
        3.4.3 Informative Matrix
        3.4.4 Data Augmentation
        3.4.5 Solution of SSRF Problem
        3.4.6 Reconstruction of Corrupted Samples
      3.5 Performance Analysis
        3.5.1 Simulation Set-up
        3.5.2 Reconstruction Error According to Bernoulli Parameter and Number of Signals
        3.5.3 Detailed Comparison between SSRF and DNN
        3.5.4 SSRF Result for Signal with Additive White Gaussian Noise
      3.6 Summary
    4 SINGLE CHANNEL ACOUSTIC SOURCE NUMBER ESTIMATION AND BLIND SOURCE SEPARATION BASED ON SKETCHING AND STACKING WITH RANDOM FORK
      4.1 Motivation
      4.2 SSRF based BSS System Model
        4.2.1 Simulation Scenarios
      4.3 SSRF based BSS Methodology
        4.3.1 Source Number and ToA Estimation based on SSRF
        4.3.2 Signal Separation
      4.4 Results and Discussion
        4.4.1 Source Number and ToA Estimation Results
        4.4.2 Separation of the Signal
      4.5 Summary
    5 CONCLUSION
    Abstract (In Korean)
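
The abstract of this thesis notes that a sinusoid can be rewritten with Euler's formula so that its terms follow a geometric sequence. A short worked form of that step is given below. The symbols (amplitude $A$, angular frequency $\omega$, phase $\phi$, sampling period $T_s$, coefficient $c$, ratio $r$) are notation assumed here for illustration and are not taken from the thesis.

```latex
\begin{align}
A\cos(\omega t + \phi)
  &= \tfrac{A}{2}\,e^{j\phi}\,e^{j\omega t}
   + \tfrac{A}{2}\,e^{-j\phi}\,e^{-j\omega t}, \\
x[n] = A\cos(\omega n T_s + \phi)
  &= c\,r^{n} + \bar{c}\,\bar{r}^{\,n},
  \qquad c = \tfrac{A}{2}\,e^{j\phi}, \quad r = e^{j\omega T_s}.
\end{align}
```

Each of the two terms is a geometric sequence in the sample index $n$, with common ratio $r$ or its complex conjugate $\bar{r}$. A superposition of $K$ uniformly sampled sinusoids is therefore a sum of $2K$ geometric sequences.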