An embedded multichannel sound acquisition system for drone audition
Microphone array techniques can improve the acoustic sensing performance on drones, compared to the use of a single microphone. However, multichannel sound acquisition systems are not available in current commercial drone platforms. We present an embedded multichannel sound acquisition and recording system with eight microphones mounted on a quadcopter. The system is developed based on Bela, an embedded computing system for audio processing. The system can record the sound from multiple microphones simultaneously; can store the data locally for on-device processing; and can transmit the multichannel audio via wireless communication to a ground terminal for remote processing. We disclose the technical details of the hardware, software design and development of the system. We implement two setups that place the microphone array at different locations on the drone body. We present experimental results obtained by state-of-the-art drone audition algorithms applied to the sound recorded by the embedded system flying with a drone. It is shown that the ego-noise reduction performance achieved by the microphone array varies depending on the array placement and the location of the target sound. This observation provides valuable insights to hardware development for drone audition.
Software Defined Media: Virtualization of Audio-Visual Services
Internet-native audio-visual services are witnessing rapid development. Among
these services, object-based audio-visual services are gaining importance. In
2014, we established the Software Defined Media (SDM) consortium to target new
research areas and markets involving object-based digital media and
Internet-by-design audio-visual environments. In this paper, we introduce the
SDM architecture that virtualizes networked audio-visual services along with
the development of smart buildings and smart cities using Internet of Things
(IoT) devices and smart building facilities. Moreover, we design the SDM
architecture as a layered architecture to promote the development of innovative
applications on the basis of rapid advancements in software-defined networking
(SDN). Then, we implement a prototype system based on the architecture, present
the system at an exhibition, and provide it as an SDM API to application
developers at hackathons. Various types of applications are developed using the
API at these events. An evaluation of SDM API access shows that the prototype
SDM platform effectively provides 3D audio reproducibility and interactiveness
for SDM applications.
Comment: IEEE International Conference on Communications (ICC2017), Paris,
France, 21-25 May 201
Relative Transfer Function Vector Estimation for Acoustic Sensor Networks Exploiting Covariance Matrix Structure
In many multi-microphone algorithms for noise reduction, an estimate of the
relative transfer function (RTF) vector of the target speaker is required. The
state-of-the-art covariance whitening (CW) method estimates the RTF vector as
the principal eigenvector of the whitened noisy covariance matrix, where
whitening is performed using an estimate of the noise covariance matrix. In
this paper, we consider an acoustic sensor network consisting of multiple
microphone nodes. Assuming uncorrelated noise between the nodes but not within
the nodes, we propose two RTF vector estimation methods that leverage the
block-diagonal structure of the noise covariance matrix. The first method
modifies the CW method by considering only the diagonal blocks of the estimated
noise covariance matrix. In contrast, the second method only considers the
off-diagonal blocks of the noisy covariance matrix, but cannot be solved using
a simple eigenvalue decomposition. When applying the estimated RTF vector in a
minimum variance distortionless response beamformer, simulation results for
real-world recordings in a reverberant environment with multiple noise sources
show that the modified CW method performs slightly better than the CW method in
terms of SNR improvement, while the off-diagonal selection method outperforms a
biased RTF vector estimate obtained as the principal eigenvector of the noisy
covariance matrix.
Comment: Proc. IEEE Workshop on Applications of Signal Processing to Audio and
Acoustics (WASPAA), New Paltz NY, USA, Oct. 202
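The covariance whitening (CW) step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes exact (not estimated) covariance matrices, a toy three-microphone network with two nodes, and hypothetical transfer-function values. All function names are chosen for this example. The noisy covariance is whitened with the Cholesky factor of the noise covariance, the principal eigenvector is found by power iteration, and the result is de-whitened and normalized to the reference microphone.

```python
import math

def cholesky(A):
    """Lower-triangular L with A = L L^H for a Hermitian positive-definite A."""
    n = len(A)
    L = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k].conjugate() for k in range(j))
            L[i][j] = complex(math.sqrt(s.real)) if i == j else s / L[j][j]
    return L

def forward_solve(L, B):
    """Solve L X = B for the matrix X, with L lower triangular."""
    n, m = len(L), len(B[0])
    X = [[0j] * m for _ in range(n)]
    for i in range(n):
        for c in range(m):
            acc = B[i][c] - sum(L[i][k] * X[k][c] for k in range(i))
            X[i][c] = acc / L[i][i]
    return X

def hermitian(A):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def principal_eigenvector(A, iterations=200):
    """Power iteration; adequate here because one eigenvalue clearly dominates."""
    v = [1 + 0j] * len(A)
    for _ in range(iterations):
        v = [sum(a * x for a, x in zip(row, v)) for row in A]
        norm = math.sqrt(sum(abs(x) ** 2 for x in v))
        v = [x / norm for x in v]
    return v

def rtf_covariance_whitening(R_y, R_n, ref=0):
    """CW estimate of the RTF vector, normalized to microphone `ref`."""
    L = cholesky(R_n)
    # Whitened noisy covariance: L^{-1} R_y L^{-H}
    R_w = forward_solve(L, hermitian(forward_solve(L, R_y)))
    v = principal_eigenvector(R_w)
    a = [sum(l * x for l, x in zip(row, v)) for row in L]  # de-whiten: L v
    return [x / a[ref] for x in a]

# Toy model (all values hypothetical): node A holds mics 0 and 1, node B holds mic 2.
atf = [0.9 + 0.2j, 0.6 - 0.5j, 0.3 + 0.7j]   # acoustic transfer functions
speech_psd = 2.0
R_n = [[1.0, 0.4, 0.0],                      # block-diagonal noise covariance:
       [0.4, 1.0, 0.0],                      # correlated within node A,
       [0.0, 0.0, 0.5]]                      # uncorrelated between nodes
R_y = [[speech_psd * atf[i] * atf[j].conjugate() + R_n[i][j] for j in range(3)]
       for i in range(3)]

rtf_true = [x / atf[0] for x in atf]
rtf_cw = rtf_covariance_whitening(R_y, R_n)
```

With exact covariances the CW estimate coincides with the true RTF vector up to rounding. In practice both matrices are estimated from data, which is where structural assumptions such as the block-diagonal noise covariance come into play.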
Speech enhancement using ego-noise references with a microphone array embedded in an unmanned aerial vehicle
A method is proposed for performing speech enhancement using ego-noise
references with a microphone array embedded in an unmanned aerial vehicle
(UAV). The ego-noise reference signals are captured with microphones located
near the UAV's propellers and used in the prior knowledge multichannel Wiener
filter (PK-MWF) to obtain the speech correlation matrix estimate. Speech
presence probability (SPP) can be estimated for detecting speech activity from
an external microphone near the speech source, providing a performance
benchmark, or from one of the embedded microphones, assuming a more realistic
scenario. Experimental measurements are performed in a semi-anechoic chamber,
with a UAV mounted on a stand and a loudspeaker playing a speech signal, while
setting three distinct and fixed propeller rotation speeds, resulting in three
different signal-to-noise ratios (SNRs). The recordings obtained and made
available online are used to compare the proposed method to the use of the
standard multichannel Wiener filter (MWF) estimated with and without the
propellers' microphones being used in its formulation. Results show that,
compared with both MWF variants, PK-MWF achieves greater improvement in
speech intelligibility and quality, measured by STOI and PESQ, while the SNR
improvement is similar.
RTF-Based Binaural MVDR Beamformer Exploiting an External Microphone in a Diffuse Noise Field
Besides suppressing all undesired sound sources, an important objective of a
binaural noise reduction algorithm for hearing devices is the preservation of
the binaural cues, aiming at preserving the spatial perception of the acoustic
scene. A well-known binaural noise reduction algorithm is the binaural minimum
variance distortionless response beamformer, which can be steered using the
relative transfer function (RTF) vector of the desired source, relating the
acoustic transfer functions between the desired source and all microphones to a
reference microphone. In this paper, we propose a computationally efficient
method to estimate the RTF vector in a diffuse noise field, requiring an
additional microphone that is spatially separated from the head-mounted
microphones. Assuming that the spatial coherence between the noise components
in the head-mounted microphone signals and the additional microphone signal is
zero, we show that an unbiased estimate of the RTF vector can be obtained.
Based on real-world recordings, experimental results for several reverberation
times show that the proposed RTF estimator outperforms the widely used RTF
estimator based on covariance whitening and a simple biased RTF estimator in
terms of noise reduction and binaural cue preservation performance.
Comment: Accepted at ITG Conference on Speech Communication 201
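The zero-coherence assumption in this abstract suggests one simple way such an estimator could be formed; the abstract does not give the exact formulation, so the sketch below is an assumption rather than the authors' method. If the noise in the external microphone is uncorrelated with the noise in the head-mounted microphones, the cross-covariances with the external channel contain only the desired source, and their ratio to the reference entry yields the RTF vector. The function names, the channel ordering (external microphone last), and all numeric values are hypothetical.

```python
def rtf_from_external_mic(R_y, ref=0):
    """RTF estimate of the head-mounted mics from the cross-covariances with
    the external microphone, assumed to be the LAST channel of R_y."""
    ext = len(R_y) - 1
    return [R_y[m][ext] / R_y[ref][ext] for m in range(ext)]

def rtf_biased(R_y, ref=0):
    """Simple biased estimate: the reference column of the noisy covariance."""
    n_head = len(R_y) - 1
    return [R_y[m][ref] / R_y[ref][ref] for m in range(n_head)]

# Toy model (all values hypothetical): three head-mounted mics plus one external mic.
atf = [0.9 + 0.2j, 0.6 - 0.5j, 0.3 + 0.7j, 1.1 - 0.4j]
speech_psd = 2.0
R_n = [[1.0, 0.5, 0.2, 0.0],   # coherent noise among the head-mounted mics,
       [0.5, 1.0, 0.5, 0.0],
       [0.2, 0.5, 1.0, 0.0],
       [0.0, 0.0, 0.0, 0.7]]   # zero coherence with the external mic
R_y = [[speech_psd * atf[i] * atf[j].conjugate() + R_n[i][j] for j in range(4)]
       for i in range(4)]

rtf_true = [atf[m] / atf[0] for m in range(3)]
err_external = max(abs(e - t) for e, t in zip(rtf_from_external_mic(R_y), rtf_true))
err_biased = max(abs(e - t) for e, t in zip(rtf_biased(R_y), rtf_true))
```

In this idealized model the external-microphone estimate matches the true RTF vector, while the biased estimate is pulled away by the noise terms in the reference column.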
A multimode SoC FPGA-based acoustic camera for wireless sensor networks
Acoustic cameras allow the visualization of sound sources using microphone arrays and beamforming techniques. The required computational power increases with the number of microphones in the array, the resolution of the acoustic images, and in particular, when targeting real-time operation. Such computational demand leads to a prohibitive power consumption for Wireless Sensor Networks (WSNs). In this paper, we present a SoC FPGA-based architecture to perform low-power, accurate, real-time acoustic imaging for WSNs. The high computational demand is satisfied by performing the acoustic acquisition and the beamforming technique on the FPGA side. The hard-core processor enhances and compresses the acoustic images before transmitting them to the WSN. As a result, the WSN manages the supported configuration modes of the acoustic camera. For instance, the resolution of the acoustic images can be adapted on demand to satisfy the available network bandwidth while performing real-time acoustic imaging. Our performance measurements show that acoustic images are generated on the FPGA in real time with resolutions of 160x120 pixels operating at 32 frames per second. Nevertheless, higher resolutions are achievable thanks to the exploitation of the hard-core processor available in SoC FPGAs such as Zynq.
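To illustrate how beamforming turns microphone signals into an acoustic image, here is a minimal narrowband delay-and-sum sketch. The abstract does not specify which beamformer the FPGA implements, so delay-and-sum is an assumption, as are the uniform linear array, the one-dimensional scan (a real acoustic camera scans a 2-D grid), and every parameter value and function name below.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def steering_phase(mic_index, angle_deg, freq_hz, spacing_m):
    """Phase lag of a plane wave from `angle_deg` at a given mic of a linear array."""
    delay = mic_index * spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND
    return 2 * math.pi * freq_hz * delay

def simulate_plane_wave(n_mics, angle_deg, freq_hz, spacing_m):
    """One narrowband snapshot (complex phasors) of a unit-amplitude source."""
    return [cmath.exp(-1j * steering_phase(m, angle_deg, freq_hz, spacing_m))
            for m in range(n_mics)]

def steered_power(snapshot, angle_deg, freq_hz, spacing_m):
    """Delay-and-sum output power when the array is steered to `angle_deg`."""
    total = sum(x * cmath.exp(1j * steering_phase(m, angle_deg, freq_hz, spacing_m))
                for m, x in enumerate(snapshot))
    return abs(total / len(snapshot)) ** 2

def acoustic_scan(snapshot, angles_deg, freq_hz, spacing_m):
    """One row of an 'acoustic image': steered power per look direction."""
    return [steered_power(snapshot, a, freq_hz, spacing_m) for a in angles_deg]

# Hypothetical setup: 8 mics, 4 cm spacing, 2 kHz source located at +20 degrees.
angles = list(range(-90, 91, 5))
snapshot = simulate_plane_wave(8, 20, 2000.0, 0.04)
scan = acoustic_scan(snapshot, angles, 2000.0, 0.04)
loudest_angle = angles[scan.index(max(scan))]
```

The cost of a scan grows with both the number of microphones and the number of look directions, which is consistent with the abstract's point that resolution and array size drive the computational demand.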
A Low-cost and Portable Active Noise Control Unit
The objective of this research is to employ cutting-edge active noise control
methodologies in order to mitigate the noise emissions produced by electrical
appliances, such as a coffee machine. The algorithm utilized in this study is
the modified Filtered-X Least Mean Square (FXLMS) algorithm. This algorithm
aims to generate an anti-noise waveform by utilizing measurements from both the
reference microphone and the error microphone. The desired outcome of this
approach is to achieve a residual noise level of zero. The primary difficulty
lies in conducting the experiment in an open space setting, as conventional
active noise control systems are designed to function within enclosed
environments, such as closed rooms or relatively confined spaces like the
volume inside headphones. A validation test bench is established, employing the
Sigma Studio software to oversee the entire system, with the ADAU1452 digital
signal processor being chosen. This study presents an introduction to different
Active Noise Control systems and algorithms, followed by the execution of
simulations for representative techniques. Subsequently, this section provides
a comprehensive account of the procedures involved in executing the
experiments, followed by an exploration of potential avenues for further
research.
Comment: A final year project report presented to the Nanyang Technological
Universit
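The adaptation loop behind FXLMS can be sketched in a few lines. This is the standard FXLMS algorithm, not the modified variant the report uses, and it is a pure simulation unrelated to SigmaStudio or the ADAU1452. The primary and secondary path coefficients, the tonal reference signal, the step size, and the assumption that the secondary path is perfectly identified are all hypothetical choices made for this example.

```python
import math

def fir(coeffs, history):
    """FIR output: coefficients times the most-recent-first sample history."""
    return sum(c * h for c, h in zip(coeffs, history))

def mean_square(values):
    return sum(v * v for v in values) / len(values)

def simulate_fxlms(n_samples=4000, mu=0.01, n_taps=8):
    """Return the error-microphone signal while the control filter adapts."""
    primary = [0.0, 0.0, 0.9, 0.4]    # hypothetical noise source -> error mic path
    secondary = [0.0, 0.7, 0.2]       # hypothetical loudspeaker -> error mic path
    secondary_est = list(secondary)   # assumed perfectly identified beforehand

    w = [0.0] * n_taps                # adaptive control filter
    x_hist = [0.0] * n_taps           # reference microphone history
    y_hist = [0.0] * len(secondary)   # anti-noise (loudspeaker) history
    fx_hist = [0.0] * n_taps          # filtered-reference history
    errors = []
    for n in range(n_samples):
        x = math.sin(2 * math.pi * 0.05 * n)      # tonal noise at the reference mic
        x_hist = [x] + x_hist[:-1]
        d = fir(primary, x_hist)                  # noise arriving at the error mic
        y = fir(w, x_hist)                        # anti-noise sent to the loudspeaker
        y_hist = [y] + y_hist[:-1]
        e = d - fir(secondary, y_hist)            # residual at the error mic
        fx = fir(secondary_est, x_hist)           # reference filtered by path estimate
        fx_hist = [fx] + fx_hist[:-1]
        w = [wi + mu * e * fxi for wi, fxi in zip(w, fx_hist)]  # LMS update
        errors.append(e)
    return errors

errors = simulate_fxlms()
early_power = mean_square(errors[:200])
late_power = mean_square(errors[-200:])
```

Comparing `early_power` with `late_power` shows the residual shrinking as the filter converges. An open-space setup like the one the report targets would add path variations and modeling errors that this idealized simulation ignores.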