25 research outputs found
Bayesian framework for multiple acoustic source tracking
Acoustic source (speaker) tracking in the room environment plays an important role in many
speech and audio applications such as multimedia, hearing aids and hands-free speech communication
and teleconferencing systems; the position information can be fed into a higher
processing stage for high-quality speech acquisition, enhancement of a specific speech signal
in the presence of other competing talkers, or keeping a camera focused on the speaker in
a video-conferencing scenario. Most existing systems focus on the single-source tracking
problem, which assumes that exactly one source is active at all times, and the state to be estimated
is simply the source position. However, in practical scenarios, multiple speakers may
be simultaneously active, and the tracking algorithm should be able to localise each individual
source and estimate the number of sources. This thesis makes three contributions towards
solutions to multiple acoustic source tracking in a moderately noisy and reverberant environment.
The first contribution of this thesis is proposing a time-delay of arrival (TDOA) estimation
approach for multiple sources. Although the phase transform (PHAT) weighted generalised
cross-correlation (GCC) method has been employed to extract the TDOAs of multiple sources,
it is primarily used for a single source scenario and its performance for multiple TDOA estimation
has not been comprehensively studied. The proposed approach combines the degenerate
unmixing estimation technique (DUET) and the GCC method. Since the speech mixtures are assumed
to be window-disjoint orthogonal (WDO) in the time-frequency domain, the spectrograms can
be separated by employing DUET, and the GCC method can then be applied to the spectrogram
of each individual source. The probabilities of detection and false alarm are also proposed to
evaluate the TDOA estimation performance under a series of experimental parameters.
Next, considering that multiple acoustic sources may appear nonconcurrently, an extended Kalman
particle filter (EKPF) is developed for a special multiple acoustic source tracking problem,
namely "nonconcurrent multiple acoustic tracking (NMAT)". The extended Kalman filter
(EKF) is used to approximate the optimum weights, and the subsequent particle filtering (PF)
naturally takes the previous position estimates as well as the current TDOA measurements into
account. The proposed approach is thus able to lock onto a sharp change in the source position
quickly, and avoid the tracking lag of the general sequential importance resampling (SIR) PF.
Finally, these investigations are extended into an approach to track an unknown and
time-varying number of acoustic sources. The DUET-GCC method is used to obtain the TDOA
measurements for multiple sources, and a random finite set (RFS) based Rao-Blackwellised PF
is employed and modified to track the sources. Each particle has an RFS form encapsulating
the states of all sources and is capable of addressing source dynamics: source survival, new
source appearance and source deactivation. A data association variable is defined to describe the
source dynamics and their relation to the measurements. The Rao-Blackwellisation step is used
to decompose the state: the source positions are marginalised by using an EKF, and only the
data association variable needs to be handled by a PF. The performance of all the proposed
approaches is extensively studied under different noisy and reverberant environments, and
compares favorably with existing tracking techniques.
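As a rough illustration of the PHAT-weighted GCC step mentioned in the abstract above, the following sketch estimates a single TDOA between two microphone signals. It is a generic textbook formulation, not the thesis's DUET-GCC method: the function name `gcc_phat`, its parameters, and the zero-padding choice are assumptions made for this example, and the DUET separation stage is not shown.

```python
import numpy as np

def gcc_phat(x, y, fs, max_tau=None):
    """Return the estimated delay of y relative to x, in seconds.

    Hypothetical helper for illustration: x and y are 1-D arrays sampled
    at fs Hz; max_tau optionally bounds the search range in seconds.
    """
    n = len(x) + len(y)                 # zero-pad so the correlation is linear, not circular
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)              # cross-power spectrum
    cross /= np.abs(cross) + 1e-12      # PHAT weighting: keep the phase, discard the magnitude
    cc = np.fft.irfft(cross, n=n)       # generalised cross-correlation
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))
    # Reorder so that index max_shift corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(cc)) - max_shift
    return shift / fs
```

A positive return value means that `y` lags `x`. In a multi-source setting, one would presumably apply such a function to each separated spectrogram, which is the role the abstract assigns to DUET.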
Distribution shift mitigation at test time with performance guarantees
Due to inappropriate sample selection and limited training data, a
distribution shift often exists between the training and test sets. This shift
can adversely affect the test performance of Graph Neural Networks (GNNs).
Existing approaches mitigate this issue by either enhancing the robustness of
GNNs to distribution shift or reducing the shift itself. However, both
approaches necessitate retraining the model, which becomes infeasible when the
model structure and parameters are inaccessible. To address this challenge, we
propose FRGNN, a general framework for GNNs to conduct feature reconstruction.
FRGNN constructs a mapping relationship between the output and input of a
well-trained GNN to obtain class representative embeddings and then uses these
embeddings to reconstruct the features of labeled nodes. These reconstructed
features are then incorporated into the message passing mechanism of GNNs to
influence the predictions of unlabeled nodes at test time. Notably, the
reconstructed node features can be directly utilized for testing the
well-trained model, effectively reducing the distribution shift and leading to
improved test performance. This improvement is achieved without any
modifications to the model structure or parameters. We provide theoretical
guarantees for the effectiveness of our framework. Furthermore, we conduct
comprehensive experiments on various public datasets. The experimental results
demonstrate the superior performance of FRGNN in comparison to mainstream
methods.
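The test-time idea described above (substituting reconstructed features for labeled nodes and letting message passing carry them to unlabeled neighbours) can be illustrated with a toy sketch. Everything here is an assumption for illustration: the class-representative embeddings are taken as given rather than derived from a trained GNN as in the paper, a single mean-aggregation step stands in for the GNN's message passing, and the function name is hypothetical.

```python
import numpy as np

def reconstruct_and_propagate(features, adjacency, labels, labeled_mask, class_embeddings):
    """Toy illustration: overwrite labeled-node features, then mean-aggregate once.

    features:         (n_nodes, d) array of original node features
    adjacency:        (n_nodes, n_nodes) symmetric 0/1 array without self-loops
    labels:           (n_nodes,) integer class ids (only read where labeled_mask is True)
    labeled_mask:     (n_nodes,) boolean array marking labeled nodes
    class_embeddings: (n_classes, d) array, assumed to be provided
    """
    x = features.astype(float).copy()
    # Feature reconstruction: each labeled node takes its class's embedding.
    x[labeled_mask] = class_embeddings[labels[labeled_mask]]
    # One round of mean aggregation over each node's neighbourhood plus itself.
    a_hat = adjacency + np.eye(adjacency.shape[0])
    degree = a_hat.sum(axis=1, keepdims=True)
    return (a_hat @ x) / degree
```

In this sketch the model itself is untouched; only the inputs of labeled nodes change, which mirrors the abstract's claim that no model parameters or structure are modified.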
Virtual Simulation Platform for Training Semi-Autonomous Robotic Vehicles' Operators
This chapter covers the development of a virtual simulation platform for training operators of semi-autonomous robotic vehicles (SARVs) using an open-source game engine called Unity3D. SARVs such as remotely operated vehicles (ROVs) are becoming increasingly popular in the maritime industry for risky jobs in inhospitable environments. Carrying out underwater missions in a hostile environment depends primarily on the skills and experience of the ROV pilot. Training ROV pilots is essential to prevent damage to expensive field equipment during real operations. The proposed simulator differs from existing simulators on the market in its use of modern game engine software to develop a "serious game" for ROV pilot trainees at much lower cost and with a shorter time-to-market. The results show that the proposed virtual simulator can provide high-fidelity virtual reality training for underwater operations as guided by a classification society.
Particle filtering approaches for multiple acoustic source detection and 2-D direction of arrival estimation using a single acoustic vector sensor
This paper considers the problem of tracking multiple acoustic sources using a single acoustic vector sensor (AVS). Firstly, a particle filtering (PF) approach is developed to track the directions of arrival of a fixed and known number of sources. Secondly, a more realistic tracking scenario, in which the number of acoustic sources is unknown and time-varying, is considered. A random finite set (RFS) framework is employed to characterize the randomness of the state process, i.e., the dynamics of source motion and the number of active sources, as well as the measurement process. As deriving a closed-form solution for the multi-source probability density is difficult, a particle filtering approach is employed to arrive at a computationally tractable approximation of the RFS densities. The proposed RFS-PF algorithm is able to simultaneously detect and track multiple sources. Simulations under different tracking scenarios demonstrate the ability of the proposed approaches to track multiple acoustic sources.
A random finite set approach for joint detection and tracking of multiple wideband sources using a distributed acoustic vector sensor array
This paper considers the problem of tracking multiple wideband acoustic sources in three-dimensional (3-D) space using a distributed acoustic vector sensor (AVS) array. Least-squares approaches have been proposed to fuse the direction of arrival (DOA) measurements and estimate the 3-D position. However, the performance of position estimation can be seriously degraded by inaccurate DOA estimates, and multiple source localization is impossible unless the DOA estimates can be associated with each source correctly. In this paper, a random finite set (RFS) approach is developed to jointly detect and track multiple wideband acoustic sources. An RFS is able to characterize the randomness of the state process (i.e., the dynamics of source motion and the number of active sources) as well as the measurement process (i.e., source detections, false alarms and missed detections). Since a closed-form solution does not exist for the multi-source probability density, a particle filtering approach is employed to arrive at a computationally tractable approximation of the RFS densities. Simulations in different tracking scenarios demonstrate the ability of the proposed approaches in multiple acoustic source detection and tracking. Published version
A dynamic Bayesian nonparametric model for blind calibration of sensor networks
We consider the problem of blind calibration of a sensor network, where the sensor gains and offsets are estimated from noisy observations of unknown signals. This is in general a nonidentifiable problem, unless restrictive assumptions on the signal subspace or sensor observations are imposed. We show that if each signal observed by the sensors follows a known dynamic model with additive noise, then the sensor gains and offsets are identifiable. We propose a dynamic Bayesian nonparametric model to infer the sensors' gains and offsets. Our model allows different sensor clusters to observe different unknown signals, without knowing the sensor clusters a priori. We develop an offline algorithm using block Gibbs sampling and a linearized forward filtering backward sampling method that estimates the sensor clusters, gains, and offsets jointly. Furthermore, for practical implementation, we also propose an online inference algorithm based on particle filtering and local Markov chain Monte Carlo. Simulations using a synthetic dataset and experiments on two real datasets suggest that our proposed methods perform better than several other blind calibration methods, including a sparse Bayesian learning approach and methods that first cluster the sensor observations and then estimate the gains and offsets. NRF (Natl Research Foundation, S'pore). MOE (Min. of Education, S'pore). Accepted version
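The nonidentifiability noted above can be made concrete with a short worked equation. The affine observation model and the symbols $a_i$, $b_i$, $x_t$, $c$, $d$ below are assumptions chosen for illustration and are not taken from the paper.

```latex
% Assumed observation model: sensor i has gain a_i and offset b_i
y_{i,t} = a_i x_t + b_i + \varepsilon_{i,t}

% For any c \neq 0 and any d, the transformed triple
a_i' = \frac{a_i}{c}, \qquad b_i' = b_i - \frac{a_i d}{c}, \qquad x_t' = c\,x_t + d

% reproduces exactly the same noiseless observations:
a_i' x_t' + b_i' = \frac{a_i}{c}\,(c\,x_t + d) + b_i - \frac{a_i d}{c} = a_i x_t + b_i
```

Informally, requiring $x_t$ to follow a known dynamic model restricts which rescaled and shifted signals $c\,x_t + d$ remain admissible, which gives some intuition for the identifiability result stated in the abstract; the paper's formal conditions are not reproduced here.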
Performance study of vertical acoustic vector sensor array based 3-D position tracking in a shallow ocean environment
This paper considers the problem of tracking an acoustic source in three-dimensional (3-D) space using a vertical acoustic vector sensor (AVS) array in a shallow ocean environment. The contributions of this work are twofold: 1) a particle filtering (PF) approach is developed to track the position of an acoustic source; and 2) based on the source motion and wave propagation models, the posterior Cramér-Rao bound (PCRB) is derived to provide a lower performance bound for 3-D position tracking in a shallow ocean. The PF approach uses a number of samples to approximate the posterior distribution of the parameters of interest, which avoids a complex 3-D search for position estimation. Also, because it incorporates both the source dynamics and the measurement information, the tracking approach is able to provide a lower performance bound than the traditional localization approach. The tracking performance is further demonstrated by numerical experiments.
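The sample-based posterior approximation described above can be sketched with a generic bootstrap (SIR) particle filter. This is not the paper's algorithm: the one-dimensional state, the constant-velocity motion model, the Gaussian likelihood on a direct position measurement, and all names and parameters are assumptions chosen to keep the example small, whereas the paper works with AVS array signals and a 3-D position.

```python
import numpy as np

def sir_step(particles, weights, z, dt, q_std, r_std, rng):
    """One predict/update/resample cycle of a bootstrap particle filter.

    particles: (n, 2) array of [position, velocity] samples
    weights:   (n,) normalised importance weights
    z:         scalar noisy position measurement (assumed measurement model)
    q_std:     std of the random acceleration driving the motion model
    r_std:     std of the measurement noise
    """
    n = particles.shape[0]
    particles = particles.copy()
    # Predict: constant-velocity motion perturbed by random acceleration.
    accel = rng.normal(0.0, q_std, size=n)
    particles[:, 0] += particles[:, 1] * dt + 0.5 * accel * dt**2
    particles[:, 1] += accel * dt
    # Update: reweight by the Gaussian measurement likelihood (in the log domain).
    log_w = np.log(weights + 1e-300) - 0.5 * ((z - particles[:, 0]) / r_std) ** 2
    log_w -= log_w.max()
    weights = np.exp(log_w)
    weights /= weights.sum()
    # Posterior-mean estimate of [position, velocity].
    estimate = weights @ particles
    # Resample only when the effective sample size falls below n / 2.
    if 1.0 / np.sum(weights**2) < n / 2:
        idx = rng.choice(n, size=n, p=weights)
        particles = particles[idx]
        weights = np.full(n, 1.0 / n)
    return particles, weights, estimate
```

Calling `sir_step` once per measurement yields a running state estimate without any grid search over the state space, which is the property the abstract highlights.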
Particle filtering and posterior Cramér-Rao bound for 2-D direction of arrival tracking using an acoustic vector sensor
An acoustic vector sensor (AVS) measures acoustic pressure as well as particle velocity, and therefore the AVS signal contains 2-D (azimuth and elevation) direction of arrival (DOA) information for an acoustic source. Existing DOA estimation techniques assume that the source is static and rely extensively on localization methods. In this paper, a particle filtering (PF) tracking approach is developed to estimate the 2-D DOA from signals collected by an AVS. A constant velocity model is employed to model the source dynamics, and the likelihood function is derived based on a maximum likelihood estimation of the source amplitude and the noise variance. The posterior Cramér-Rao bound (PCRB) is also derived to provide a lower performance bound for the AVS signal based tracking problem. Since the PCRB incorporates information from both the source dynamics and the measurement models, it is usually lower than the traditional Cramér-Rao bound, which only employs measurement model information. Experiments show that the proposed PF tracking algorithm significantly outperforms the Capon beamforming based localization method and is much closer to the PCRB, even in a challenging environment (e.g., SNR = -10 dB).
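For reference, the way a PCRB combines dynamics and measurement information can be seen in the standard recursion for the posterior Fisher information matrix. The notation below ($J_k$, $D_k^{ij}$, $F$, $Q$) is the usual textbook form and is an assumption here; the paper's own derivation for the AVS model is not reproduced.

```latex
% Bound on the error covariance of any estimator \hat{x}_k of the state x_k
\mathbb{E}\!\left[(\hat{x}_k - x_k)(\hat{x}_k - x_k)^{\mathsf T}\right] \succeq J_k^{-1}

% Recursion for the posterior information matrix
J_{k+1} = D_k^{22} - D_k^{21}\left(J_k + D_k^{11}\right)^{-1} D_k^{12}

% with
D_k^{11} = \mathbb{E}\!\left[-\Delta_{x_k}^{x_k} \log p(x_{k+1} \mid x_k)\right], \qquad
D_k^{12} = \mathbb{E}\!\left[-\Delta_{x_k}^{x_{k+1}} \log p(x_{k+1} \mid x_k)\right] = \left(D_k^{21}\right)^{\mathsf T},

D_k^{22} = \mathbb{E}\!\left[-\Delta_{x_{k+1}}^{x_{k+1}} \log p(x_{k+1} \mid x_k)\right]
         + \mathbb{E}\!\left[-\Delta_{x_{k+1}}^{x_{k+1}} \log p(z_{k+1} \mid x_{k+1})\right]

% Special case: linear Gaussian dynamics x_{k+1} = F x_k + w_k, \; w_k \sim \mathcal{N}(0, Q)
D_k^{11} = F^{\mathsf T} Q^{-1} F, \qquad D_k^{12} = -F^{\mathsf T} Q^{-1}, \qquad
D_k^{22} = Q^{-1} + \mathbb{E}\!\left[-\Delta_{x_{k+1}}^{x_{k+1}} \log p(z_{k+1} \mid x_{k+1})\right]
```

Only the last term of $D_k^{22}$ depends on the measurement model; the remaining terms carry information from the source dynamics, which is the extra information that a bound built from the measurement model alone does not use.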