25 research outputs found

    Bayesian framework for multiple acoustic source tracking

    Acoustic source (speaker) tracking in a room environment plays an important role in many speech and audio applications such as multimedia, hearing aids, hands-free speech communication and teleconferencing systems; the position information can be fed into a higher processing stage for high-quality speech acquisition, enhancement of a specific speech signal in the presence of other competing talkers, or keeping a camera focused on the speaker in a video-conferencing scenario. Most existing systems focus on the single source tracking problem, which assumes one and only one source is active all the time, and the state to be estimated is simply the source position. However, in practical scenarios, multiple speakers may be simultaneously active, and the tracking algorithm should be able to localise each individual source and estimate the number of sources. This thesis contains three contributions towards solutions to multiple acoustic source tracking in a moderately noisy and reverberant environment. The first contribution of this thesis is a time-delay of arrival (TDOA) estimation approach for multiple sources. Although the phase transform (PHAT) weighted generalised cross-correlation (GCC) method has been employed to extract the TDOAs of multiple sources, it is primarily used for a single source scenario and its performance for multiple TDOA estimation has not been comprehensively studied. The proposed approach combines the degenerate unmixing estimation technique (DUET) and the GCC method. Since the speech mixtures are assumed window-disjoint orthogonal (WDO) in the time-frequency domain, the spectrograms can be separated by employing DUET, and the GCC method can then be applied to the spectrogram of each individual source. The probabilities of detection and false alarm are also proposed to evaluate the TDOA estimation performance under a series of experimental parameters.
Next, considering that multiple acoustic sources may appear nonconcurrently, an extended Kalman particle filtering (EKPF) approach is developed for a special multiple acoustic source tracking problem, namely “nonconcurrent multiple acoustic tracking (NMAT)”. The extended Kalman filter (EKF) is used to approximate the optimum weights, and the subsequent particle filtering (PF) naturally takes the previous position estimates as well as the current TDOA measurements into account. The proposed approach is thus able to lock on to a sharp change of the source position quickly, and avoid the tracking lag of the general sequential importance resampling (SIR) PF. Finally, these investigations are extended into an approach to track an unknown and time-varying number of acoustic sources. The DUET-GCC method is used to obtain the TDOA measurements for multiple sources and a random finite set (RFS) based Rao-Blackwellised PF is employed and modified to track the sources. Each particle has an RFS form encapsulating the states of all sources and is capable of addressing source dynamics: source survival, new source appearance and source deactivation. A data association variable is defined to depict the source dynamics and its relation to the measurements. The Rao-Blackwellisation step is used to decompose the state: the source positions are marginalised by using an EKF, and only the data association variable needs to be handled by a PF. The performance of all the proposed approaches is extensively studied under different noisy and reverberant environments, and compares favourably with existing tracking techniques.
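
To make the TDOA step in the abstract above more concrete, the following is a minimal sketch of PHAT-weighted generalised cross-correlation for a single source and one microphone pair. It is not the thesis's DUET-GCC method: the DUET separation stage is omitted, and the function names (`dft`, `gcc_phat_delay`), the naive O(N^2) transform and the `max_lag` parameter are assumptions chosen only to keep the example self-contained.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform; adequate for short illustrative signals."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def gcc_phat_delay(x1, x2, max_lag):
    """Estimate the lag (in samples) by which x2 is delayed relative to x1.

    Hypothetical helper: a positive result means x2 lags x1.
    """
    n = len(x1) + len(x2)  # zero-pad so the circular correlation behaves like a linear one
    X1 = dft(list(x1) + [0.0] * (n - len(x1)))
    X2 = dft(list(x2) + [0.0] * (n - len(x2)))
    cross = []
    for a, b in zip(X1, X2):
        c = b * a.conjugate()          # cross-power spectrum
        mag = abs(c)
        # PHAT weighting: keep only the phase of each frequency bin
        cross.append(c / mag if mag > 1e-12 else 0j)
    r = dft(cross, inverse=True)       # generalised cross-correlation
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        val = r[lag % n].real          # negative lags wrap to the end of the array
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag
```

Dividing the estimated lag by the sampling rate gives the TDOA in seconds. In a multi-source setting, a routine like this would be applied to each separated spectrogram rather than to the raw mixture.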

    Distribution shift mitigation at test time with performance guarantees

    Due to inappropriate sample selection and limited training data, a distribution shift often exists between the training and test sets. This shift can adversely affect the test performance of Graph Neural Networks (GNNs). Existing approaches mitigate this issue by either enhancing the robustness of GNNs to distribution shift or reducing the shift itself. However, both approaches necessitate retraining the model, which becomes infeasible when the model structure and parameters are inaccessible. To address this challenge, we propose FRGNN, a general framework for GNNs to conduct feature reconstruction. FRGNN constructs a mapping relationship between the output and input of a well-trained GNN to obtain class representative embeddings and then uses these embeddings to reconstruct the features of labeled nodes. These reconstructed features are then incorporated into the message passing mechanism of GNNs to influence the predictions of unlabeled nodes at test time. Notably, the reconstructed node features can be directly utilized for testing the well-trained model, effectively reducing the distribution shift and leading to improved test performance. This is achieved without any modifications to the model structure or parameters. We provide theoretical guarantees for the effectiveness of our framework. Furthermore, we conduct comprehensive experiments on various public datasets. The experimental results demonstrate the superior performance of FRGNN in comparison to mainstream methods.

    Virtual Simulation Platform for Training Semi-Autonomous Robotic Vehicles’ Operators

    This chapter covers the development of a virtual simulation platform for training a semiautonomous robotic vehicle (SARV) operator via an open-source game engine called Unity3D. SARVs such as remotely operated vehicles (ROVs) are becoming increasingly popular in the maritime industry for risky jobs in inhospitable environments. The primary element in carrying out underwater missions in a hostile environment lies in the skills and experience of an ROV pilot. Training for ROV pilots is essential to prevent damage to expensive field equipment during real operations. The proposed simulator differs from the existing simulators on the market in its use of modern game engine software to develop a “serious game” for ROV pilot trainees at much lower cost and with a shorter time-to-market. The results revealed that the proposed virtual simulator can deliver high-fidelity virtual reality training for underwater operations guided by a classification society.

    Particle filtering approaches for multiple acoustic source detection and 2-D direction of arrival estimation using a single acoustic vector sensor

    This paper considers the problem of tracking multiple acoustic sources using a single acoustic vector sensor (AVS). Firstly, a particle filtering (PF) approach is developed to track the directions of arrival of a fixed and known number of sources. Secondly, a more realistic tracking scenario, which assumes that the number of acoustic sources is unknown and time-varying, is considered. A random finite set (RFS) framework is employed to characterize the randomness of the state process, i.e., the dynamics of source motion and the number of active sources, as well as the measurement process. As deriving a closed-form solution for the multi-source probability density is difficult, a particle filtering approach is employed to arrive at a computationally tractable approximation of the RFS densities. The proposed RFS-PF algorithm is able to simultaneously detect and track multiple sources. Simulations under different tracking scenarios demonstrate the ability of the proposed approaches to track multiple acoustic sources.

    A random finite set approach for joint detection and tracking of multiple wideband sources using a distributed acoustic vector sensor array

    This paper considers the problem of tracking multiple wideband acoustic sources in three-dimensional (3-D) space using a distributed acoustic vector sensor (AVS) array. Least-squares approaches have been proposed to fuse the DOA measurements and estimate the 3-D position. However, the performance of position estimation can be seriously degraded by inaccurate DOA estimates, and multiple source localization is impossible unless the DOA estimates can be associated with each source correctly. In this paper, a random finite set (RFS) approach is developed to jointly detect and track multiple wideband acoustic sources. An RFS is able to characterize the randomness of the state process (i.e., the dynamics of source motion and the number of active sources) as well as the measurement process (i.e., source detections, false alarms and missed detections). Since a closed-form solution does not exist for the multi-source probability density, a particle filtering approach is employed to arrive at a computationally tractable approximation of the RFS densities. Simulations in different tracking scenarios demonstrate the ability of the proposed approaches in multiple acoustic source detection and tracking. Published version.

    A dynamic Bayesian nonparametric model for blind calibration of sensor networks

    We consider the problem of blind calibration of a sensor network, where the sensor gains and offsets are estimated from noisy observations of unknown signals. This is in general a nonidentifiable problem, unless restrictive assumptions on the signal subspace or sensor observations are imposed. We show that if each signal observed by the sensors follows a known dynamic model with additive noise, then the sensor gains and offsets are identifiable. We propose a dynamic Bayesian nonparametric model to infer the sensors’ gains and offsets. Our model allows different sensor clusters to observe different unknown signals, without knowing the sensor clusters a priori. We develop an offline algorithm using block Gibbs sampling and a linearized forward filtering backward sampling method that estimates the sensor clusters, gains, and offsets jointly. Furthermore, for practical implementation, we also propose an online inference algorithm based on particle filtering and local Markov chain Monte Carlo. Simulations using a synthetic dataset and experiments on two real datasets suggest that our proposed methods perform better than several other blind calibration methods, including a sparse Bayesian learning approach, and methods that first cluster the sensor observations and then estimate the gains and offsets. NRF (Natl Research Foundation, S’pore); MOE (Min. of Education, S’pore). Accepted version.

    Performance study of vertical acoustic vector sensor array based 3-D position tracking in a shallow ocean environment

    This paper considers the problem of tracking an acoustic source in three-dimensional (3-D) space by using a vertical acoustic vector sensor (AVS) array in a shallow ocean environment. The innovations of this work are twofold: 1) a particle filtering (PF) approach is developed to track the position of an acoustic source; and 2) based on the source motion and wave propagation models, the posterior Cramér-Rao bound (PCRB) is derived to provide a lower performance bound of 3-D position tracking in shallow ocean. The PF approach uses a number of samples to approximate the posterior distribution of the parameters of interest, by which a complex 3-D search can be avoided for 3-D position estimation. Also, due to incorporating both the source dynamic and measurement information, the tracking approach is able to provide a lower performance bound than the traditional localization approach. The tracking performance is further demonstrated by numerical experiments.

    Particle filtering and posterior Cramér-Rao bound for 2-D direction of arrival tracking using an acoustic vector sensor

    An acoustic vector sensor (AVS) measures acoustic pressure as well as particle velocity, and therefore the AVS signal contains 2-D (azimuth and elevation) DOA information of an acoustic source. Existing DOA estimation techniques assume that the source is static and rely extensively on localization methods. In this paper, a particle filtering (PF) tracking approach is developed to estimate the 2-D DOA from signals collected by an AVS. A constant velocity model is employed to model the source dynamics and the likelihood function is derived based on a maximum likelihood estimation of the source amplitude and the noise variance. The posterior Cramér-Rao bound (PCRB) is also derived to provide a lower performance bound for the AVS-signal-based tracking problem. Since the PCRB incorporates the information from the source dynamics and measurement models, it is usually lower than the traditional Cramér-Rao bound, which only employs measurement model information. Experiments show that the proposed PF tracking algorithm significantly outperforms the Capon-beamforming-based localization method and is much closer to the PCRB even in a challenging environment (e.g., SNR = -10 dB).
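
As an illustration of the constant velocity model and particle filtering idea mentioned in the abstract above, here is a minimal sketch of a bootstrap (SIR) particle filter tracking a single angle. It is a simplification under stated assumptions, not the paper's algorithm: the measurement is taken to be a noisy reading of the angle itself with a Gaussian likelihood, rather than the AVS-signal likelihood the paper derives, and the function name `sir_particle_filter` and all noise parameters are hypothetical.

```python
import math
import random

def sir_particle_filter(measurements, n_particles=500, dt=1.0,
                        sigma_pos=0.005, sigma_vel=0.005, sigma_meas=0.05,
                        rng=None):
    """Track one angle (radians) with a constant velocity model.

    Each particle is a pair (angle, angular_velocity). Returns the
    weighted-mean angle estimate for every measurement.
    """
    rng = rng or random.Random(0)
    # Assumed prior: angle uniform in [-1, 1] rad, velocity close to zero
    particles = [(rng.uniform(-1.0, 1.0), rng.gauss(0.0, 0.05))
                 for _ in range(n_particles)]
    estimates = []
    for z in measurements:
        # Predict: propagate every particle through the constant velocity model
        particles = [(th + dt * v + rng.gauss(0.0, sigma_pos),
                      v + rng.gauss(0.0, sigma_vel))
                     for th, v in particles]
        # Update: weight each particle by the (assumed Gaussian) likelihood of z
        weights = [math.exp(-0.5 * ((z - th) / sigma_meas) ** 2)
                   for th, _ in particles]
        total = sum(weights)
        if total == 0.0:
            # All likelihoods underflowed; fall back to uniform weights
            weights = [1.0] * n_particles
            total = float(n_particles)
        weights = [w / total for w in weights]
        # Estimate: posterior mean of the angle
        estimates.append(sum(w * th for w, (th, _) in zip(weights, particles)))
        # Resample: multinomial resampling in proportion to the weights
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates
```

Resampling at every step keeps the sketch short; a practical tracker would more likely resample only when the effective sample size drops, and would extend the state to both azimuth and elevation.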