659 research outputs found

    An embedded multichannel sound acquisition system for drone audition

    Microphone array techniques can improve the acoustic sensing performance on drones, compared to the use of a single microphone. However, multichannel sound acquisition systems are not available in current commercial drone platforms. We present an embedded multichannel sound acquisition and recording system with eight microphones mounted on a quadcopter. The system is developed based on Bela, an embedded computing system for audio processing. The system can record the sound from multiple microphones simultaneously; can store the data locally for on-device processing; and can transmit the multichannel audio via wireless communication to a ground terminal for remote processing. We disclose the technical details of the hardware, software design and development of the system. We implement two setups that place the microphone array at different locations on the drone body. We present experimental results obtained by state-of-the-art drone audition algorithms applied to the sound recorded by the embedded system flying with a drone. It is shown that the ego-noise reduction performance achieved by the microphone array varies depending on the array placement and the location of the target sound. This observation provides valuable insights for hardware development for drone audition.

    Software Defined Media: Virtualization of Audio-Visual Services

    Internet-native audio-visual services are witnessing rapid development. Among these services, object-based audio-visual services are gaining importance. In 2014, we established the Software Defined Media (SDM) consortium to target new research areas and markets involving object-based digital media and Internet-by-design audio-visual environments. In this paper, we introduce the SDM architecture that virtualizes networked audio-visual services along with the development of smart buildings and smart cities using Internet of Things (IoT) devices and smart building facilities. Moreover, we design the SDM architecture as a layered architecture to promote the development of innovative applications on the basis of rapid advancements in software-defined networking (SDN). Then, we implement a prototype system based on the architecture, present the system at an exhibition, and provide it as an SDM API to application developers at hackathons. Various types of applications are developed using the API at these events. An evaluation of SDM API access shows that the prototype SDM platform effectively provides 3D audio reproducibility and interactiveness for SDM applications. Comment: IEEE International Conference on Communications (ICC2017), Paris, France, 21-25 May 201

    Relative Transfer Function Vector Estimation for Acoustic Sensor Networks Exploiting Covariance Matrix Structure

    In many multi-microphone algorithms for noise reduction, an estimate of the relative transfer function (RTF) vector of the target speaker is required. The state-of-the-art covariance whitening (CW) method estimates the RTF vector as the principal eigenvector of the whitened noisy covariance matrix, where whitening is performed using an estimate of the noise covariance matrix. In this paper, we consider an acoustic sensor network consisting of multiple microphone nodes. Assuming uncorrelated noise between the nodes but not within the nodes, we propose two RTF vector estimation methods that leverage the block-diagonal structure of the noise covariance matrix. The first method modifies the CW method by considering only the diagonal blocks of the estimated noise covariance matrix. In contrast, the second method only considers the off-diagonal blocks of the noisy covariance matrix, but cannot be solved using a simple eigenvalue decomposition. When applying the estimated RTF vector in a minimum variance distortionless response beamformer, simulation results for real-world recordings in a reverberant environment with multiple noise sources show that the modified CW method performs slightly better than the CW method in terms of SNR improvement, while the off-diagonal selection method outperforms a biased RTF vector estimate obtained as the principal eigenvector of the noisy covariance matrix. Comment: Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz NY, USA, Oct. 202

    Speech enhancement using ego-noise references with a microphone array embedded in an unmanned aerial vehicle

    A method is proposed for performing speech enhancement using ego-noise references with a microphone array embedded in an unmanned aerial vehicle (UAV). The ego-noise reference signals are captured with microphones located near the UAV's propellers and used in the prior knowledge multichannel Wiener filter (PK-MWF) to obtain the speech correlation matrix estimate. Speech presence probability (SPP) can be estimated for detecting speech activity from an external microphone near the speech source, providing a performance benchmark, or from one of the embedded microphones, assuming a more realistic scenario. Experimental measurements are performed in a semi-anechoic chamber, with a UAV mounted on a stand and a loudspeaker playing a speech signal, while setting three distinct and fixed propeller rotation speeds, resulting in three different signal-to-noise ratios (SNRs). The recordings obtained and made available online are used to compare the proposed method to the standard multichannel Wiener filter (MWF) estimated with and without the propellers' microphones being used in its formulation. Results show that, compared to both MWF variants, the PK-MWF achieves larger improvements in speech intelligibility and quality, measured by STOI and PESQ, while the SNR improvement is similar.

    RTF-Based Binaural MVDR Beamformer Exploiting an External Microphone in a Diffuse Noise Field

    Besides suppressing all undesired sound sources, an important objective of a binaural noise reduction algorithm for hearing devices is the preservation of the binaural cues, aiming at preserving the spatial perception of the acoustic scene. A well-known binaural noise reduction algorithm is the binaural minimum variance distortionless response beamformer, which can be steered using the relative transfer function (RTF) vector of the desired source, relating the acoustic transfer functions between the desired source and all microphones to a reference microphone. In this paper, we propose a computationally efficient method to estimate the RTF vector in a diffuse noise field, requiring an additional microphone that is spatially separated from the head-mounted microphones. Assuming that the spatial coherence between the noise components in the head-mounted microphone signals and the additional microphone signal is zero, we show that an unbiased estimate of the RTF vector can be obtained. Based on real-world recordings, experimental results for several reverberation times show that the proposed RTF estimator outperforms the widely used RTF estimator based on covariance whitening and a simple biased RTF estimator in terms of noise reduction and binaural cue preservation performance. Comment: Accepted at ITG Conference on Speech Communication 201
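
The zero-coherence assumption in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the single-frequency-bin simulation, the gains, and the noise levels are all hypothetical. The idea shown is that, if the noise at the external microphone is uncorrelated with the noise at the head-mounted microphones, the cross-correlation between the head-mounted signals and the external signal is proportional to the acoustic transfer function vector, so dividing by its reference entry gives an RTF estimate.

```python
import random


def estimate_rtf_external(head_frames, ext_frames, ref=0):
    """Estimate the RTF vector for one frequency bin.

    head_frames: list of frames, each a list of complex coefficients
                 (one per head-mounted microphone).
    ext_frames:  list of complex coefficients of the external microphone.
    """
    num_mics = len(head_frames[0])
    cross = [0j] * num_mics
    # Accumulate the cross-correlation E[y_m * conj(y_ext)] for every microphone.
    for y, y_ext in zip(head_frames, ext_frames):
        for m in range(num_mics):
            cross[m] += y[m] * y_ext.conjugate()
    # Normalize by the reference microphone to obtain relative transfer functions.
    return [c / cross[ref] for c in cross]


def complex_gauss(rng, scale=1.0):
    """Draw a complex Gaussian sample (hypothetical signal model)."""
    return complex(rng.gauss(0.0, scale), rng.gauss(0.0, scale))


def simulate(num_frames=20000, seed=0):
    """Simulate one target source plus mutually uncorrelated noise (assumption)."""
    rng = random.Random(seed)
    true_rtf = [1 + 0j, 0.8 - 0.3j, 0.5 + 0.6j]  # hypothetical RTF vector
    ext_gain = 0.7 + 0.2j                        # hypothetical source-to-external gain
    head, ext = [], []
    for _ in range(num_frames):
        s = complex_gauss(rng)
        head.append([a * s + complex_gauss(rng, 0.5) for a in true_rtf])
        ext.append(ext_gain * s + complex_gauss(rng, 0.5))
    return true_rtf, head, ext


true_rtf, head, ext = simulate()
rtf_hat = estimate_rtf_external(head, ext)
max_error = max(abs(e - t) for e, t in zip(rtf_hat, true_rtf))
print("largest deviation from the simulated RTF:", max_error)
```

In this toy simulation the noise is also independent between the head-mounted microphones, which is simpler than a diffuse field; only the independence from the external microphone matters for the estimate.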

    A multimode SoC FPGA-based acoustic camera for wireless sensor networks

    Acoustic cameras allow the visualization of sound sources using microphone arrays and beamforming techniques. The required computational power increases with the number of microphones in the array, the resolution of the acoustic images, and in particular, when targeting real-time operation. Such computational demand leads to a prohibitive power consumption for Wireless Sensor Networks (WSNs). In this paper, we present a SoC FPGA based architecture to perform low-power, real-time and accurate acoustic imaging for WSNs. The high computational demand is satisfied by performing the acoustic acquisition and the beamforming technique on the FPGA side. The hard-core processor enhances and compresses the acoustic images before transmitting to the WSN. As a result, the WSN manages the supported configuration modes of the acoustic camera. For instance, the resolution of the acoustic images can be adapted on demand to fit the available network bandwidth while performing real-time acoustic imaging. Our performance measurements show that acoustic images are generated on the FPGA in real time with resolutions of 160x120 pixels operating at 32 frames per second. Nevertheless, higher resolutions are achievable thanks to the exploitation of the hard-core processor available in SoC FPGAs such as Zynq.

    A Low-cost and Portable Active Noise Control Unit

    The objective of this research is to employ cutting-edge active noise control methodologies in order to mitigate the noise emissions produced by electrical appliances, such as a coffee machine. The algorithm utilized in this study is the modified Filtered-X Least Mean Square (FXLMS) algorithm. This algorithm aims to generate an anti-noise waveform by utilizing measurements from both the reference microphone and the error microphone. The desired outcome of this approach is to achieve a residual noise level of zero. The primary difficulty lies in conducting the experiment in an open space setting, as conventional active noise control systems are designed to function within enclosed environments, such as closed rooms or relatively confined spaces like the volume inside headphones. A validation test bench is established, employing the Sigma Studio software to oversee the entire system, with the ADAU1452 digital signal processor being chosen. This study presents an introduction to different Active Noise Control systems and algorithms, followed by the execution of simulations for representative techniques. Subsequently, this section provides a comprehensive account of the procedures involved in executing the experiments, followed by an exploration of potential avenues for further research. Comment: A final year project report presented to the Nanyang Technological Universit
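
As a rough illustration of how a reference microphone and an error microphone drive the anti-noise filter, here is a minimal simulation of the standard FXLMS update. It is not the modified FXLMS variant the report uses, and it is unrelated to the ADAU1452 implementation. The primary and secondary path coefficients, the two-tone noise signal, the step size, and the assumption of a perfectly known secondary path are all hypothetical.

```python
import math


def fir(coeffs, history):
    """Apply FIR coefficients to a newest-first sample history."""
    return sum(c * h for c, h in zip(coeffs, history))


def run_fxlms(num_samples=4000, num_taps=16, mu=0.01):
    primary = [0.0, 0.9, 0.4, 0.1]   # hypothetical path: noise source -> error mic
    secondary = [0.0, 0.6, 0.3]      # hypothetical path: loudspeaker -> error mic
    secondary_est = list(secondary)  # assumption: secondary path is known exactly
    w = [0.0] * num_taps             # adaptive control filter
    x_hist = [0.0] * max(num_taps, len(primary), len(secondary_est))
    fx_hist = [0.0] * num_taps       # history of the filtered reference
    y_hist = [0.0] * len(secondary)  # history of the anti-noise output
    errors = []
    for n in range(num_samples):
        # Reference microphone signal: two tones standing in for appliance noise.
        x = math.sin(2 * math.pi * 0.06 * n) + 0.5 * math.sin(2 * math.pi * 0.19 * n)
        x_hist = [x] + x_hist[:-1]
        d = fir(primary, x_hist)            # noise arriving at the error microphone
        y = fir(w, x_hist)                  # anti-noise sent to the loudspeaker
        y_hist = [y] + y_hist[:-1]
        e = d + fir(secondary, y_hist)      # residual picked up by the error microphone
        fx = fir(secondary_est, x_hist)     # reference filtered by the path estimate
        fx_hist = [fx] + fx_hist[:-1]
        # FXLMS update: step against the gradient of the squared residual.
        w = [wi - mu * e * fxi for wi, fxi in zip(w, fx_hist)]
        errors.append(e)
    return errors


errors = run_fxlms()
early_power = sum(e * e for e in errors[:200]) / 200
late_power = sum(e * e for e in errors[-200:]) / 200
print("residual power, first 200 samples:", early_power)
print("residual power, last 200 samples:", late_power)
```

In this noise-free simulation the residual decays toward zero. A real open-space setup of the kind the report describes would additionally have to cope with secondary-path estimation error and uncorrelated disturbances, which this sketch leaves out.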