Inferring Room Geometries
Determining the geometry of an acoustic enclosure using microphone arrays
has become an active area of research. Knowledge gained about the acoustic
environment, such as the location of reflectors, can be advantageous for
applications such as sound source localization, dereverberation and adaptive
echo cancellation by assisting in tracking environment changes and helping
the initialization of such algorithms.
A methodology to blindly infer the geometry of an acoustic enclosure by estimating
the location of reflective surfaces based on acoustic measurements
using an arbitrary array geometry is developed and analyzed. The starting
point of this work is a geometric constraint, valid in both two
and three dimensions, that converts time-of-arrival and time-difference-of-arrival information into elliptical constraints about the location of reflectors.
Multiple constraints are combined to yield the line or plane parameters of
the reflectors by minimizing a specific cost function in the least-squares
sense. An iterative constrained least-squares estimator, along with a closed-form estimator, that performs optimally in a noise-free scenario, solve the
associated common tangent estimation problem that arises from the geometric
constraint. Additionally, a Hough transform based data fusion and
estimation technique, that considers acquisitions from multiple source positions,
refines the reflector localization even in adverse conditions.
An extension to the geometric inference framework is also presented. It estimates
the actual speed of sound to improve accuracy under temperature
variations and reduces the prior information required, so that only the
relative microphone positions in the array are needed to localize acoustic
reflectors. Simulated and real-world experiments demonstrate the feasibility
of the proposed method.
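The elliptical constraint described in the abstract above can be illustrated with a minimal 2D sketch. It is not the thesis's estimator. It assumes a first-order reflection with known source position, microphone position and time of arrival, so the reflection point lies on an ellipse whose foci are the source and the microphone, and the reflecting wall is tangent to that ellipse. The function names, the default speed of sound of 343 m/s, and the wall parameterization normal . x = offset are assumptions made for this example.

```python
import numpy as np

def reflection_ellipse(src, mic, toa, c=343.0):
    """Ellipse of candidate reflection points for one reflected path.

    Assumes src != mic and a first-order reflection with time of arrival `toa`.
    Returns (center, a, b, u, v): semi-axes a, b and unit axis directions u, v.
    """
    src, mic = np.asarray(src, float), np.asarray(mic, float)
    path = c * toa                          # total source-wall-microphone path length
    focal = np.linalg.norm(mic - src) / 2   # half the distance between the foci
    a = path / 2                            # semi-major axis
    if a <= focal:
        raise ValueError("reflected path must be longer than the direct path")
    b = np.sqrt(a**2 - focal**2)            # semi-minor axis
    u = (mic - src) / (2 * focal)           # major-axis direction
    v = np.array([-u[1], u[0]])             # minor-axis direction
    return (src + mic) / 2, a, b, u, v

def tangency_residual(normal, offset, ellipse):
    """Zero when the line normal . x = offset is tangent to the ellipse."""
    center, a, b, u, v = ellipse
    normal = np.asarray(normal, float)
    scale = np.linalg.norm(normal)
    n, rho = normal / scale, offset / scale  # normalize the line equation
    support_sq = (a * (n @ u))**2 + (b * (n @ v))**2
    return (n @ center - rho)**2 - support_sq
```

With several microphones, each reflected path gives one such ellipse, and a wall estimate is a line that makes all residuals small at once, which is the common tangent problem the abstract refers to.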
Sparse Modeling of Grouped Line Spectra
This licentiate thesis focuses on clustered parametric models for estimation of line spectra, when the spectral content of a signal source is assumed to exhibit some form of grouping. Unlike previous parametric approaches, which generally require explicit knowledge of the model orders, this thesis exploits sparse modeling, where the orders are implicitly chosen. For line spectra, the non-linear parametric model is approximated by a linear system, containing an overcomplete basis of candidate frequencies, called a dictionary, and a large set of linear response variables that select and weight the components in the dictionary. Frequency estimates are obtained by solving a convex optimization program, where the sum of squared residuals is minimized. To discourage overfitting and to infer certain structure in the solution, different convex penalty functions are introduced into the optimization. The cost trade-off between fit and penalty is set by user parameters so as to approximate the true number of spectral lines in the signal, which implies that the response variable will be sparse, i.e., have few non-zero elements. Thus, instead of explicit model orders, the orders are implicitly set by this trade-off. For grouped variables, the dictionary is customized, and appropriate convex penalties selected, so that the solution becomes group sparse, i.e., has few groups with non-zero variables. In an array of sensors, the specific time-delays and attenuations will depend on the source and sensor positions. By modeling this, one may estimate the location of a source. In this thesis, a novel joint location and grouped frequency estimator is proposed, which exploits sparse modeling for both spectral and spatial estimates, showing robustness against sources with overlapping frequency content. For audio signals, this thesis uses two different features for clustering.
Pitch is a perceptual property of sound that may be described by the harmonic model, i.e., by a group of spectral lines at integer multiples of a fundamental frequency, which we estimate by exploiting a novel adaptive total variation penalty. The other feature, chroma, is a concept in music theory that collects pitches whose frequencies differ by powers of 2 into groups. Using a chroma dictionary, together with appropriate group sparse penalties, we propose an automatic transcription of the chroma content of a signal.
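The dictionary-based formulation in the abstract above can be sketched in its simplest, ungrouped form: a grid of candidate frequencies, a squared-residual fit term, and a plain l1 penalty, solved here with iterative soft thresholding (ISTA). This is only an illustration under those assumptions. The group-sparse and adaptive total variation penalties proposed in the thesis are not implemented, and the function names and the choice of solver are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Complex soft thresholding: shrink each magnitude by t, keep the phase."""
    mag = np.maximum(np.abs(z), 1e-12)       # guard against division by zero
    return z * np.maximum(1.0 - t / mag, 0.0)

def sparse_line_spectrum(y, freqs, lam, n_iter=500):
    """Minimize 0.5 * ||y - A x||^2 + lam * ||x||_1 with ISTA.

    A is the dictionary of complex sinusoids at the candidate
    (normalized) frequencies in `freqs`; x holds their amplitudes.
    """
    n = np.arange(len(y))
    A = np.exp(2j * np.pi * np.outer(n, freqs))   # dictionary, one column per frequency
    step = 1.0 / np.linalg.norm(A, 2) ** 2        # 1 / Lipschitz constant of the gradient
    x = np.zeros(len(freqs), dtype=complex)
    for _ in range(n_iter):
        grad = A.conj().T @ (A @ x - y)           # gradient of the fit term
        x = soft_threshold(x - step * grad, step * lam)
    return x
```

The penalty weight lam plays the role of the user parameter mentioned in the abstract: larger values leave fewer non-zero amplitudes, so the number of spectral lines is set implicitly rather than through an explicit model order.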
Recent Advances in Indoor Localization Systems and Technologies
Despite the enormous technical progress seen in the past few years, the maturity of indoor localization technologies has not yet reached the level of GNSS solutions. The 23 selected papers in this book present the recent advances and new developments in indoor localization systems and technologies, propose novel or improved methods with increased performance, provide insight into various aspects of quality control, and also introduce some unorthodox positioning methods.
Acoustic Source Localisation in constrained environments
Acoustic Source Localisation (ASL) is a problem with real-world applications
across multiple domains, from smart assistants to acoustic detection and tracking.
And yet, despite the level of attention in recent years, a technique for rapid and
robust ASL remains elusive, not least in the constrained environments in which
such techniques are most likely to be deployed.
In this work, we seek to address some of these current limitations by presenting
improvements to the ASL method for three commonly encountered constraints: the
number and configuration of sensors; the limited signal sampling potentially available;
and the nature and volume of training data required to accurately estimate Direction
of Arrival (DOA) when deploying a particular supervised machine learning technique.
In regard to the number and configuration of sensors, we find that accuracy can be
maintained at state-of-the-art levels, Steered Response Power (SRP), while reducing
computation sixfold, based on direct optimisation of well known ASL formulations.
Moreover, we find that the circular microphone configuration is the least desirable
as it yields the highest localisation error.
In regard to signal sampling, we demonstrate that the computer vision inspired
algorithm presented in this work, which extracts selected keypoints from the signal spectrogram, and uses them to select signal samples, outperforms an audio
fingerprinting baseline while maintaining a compression ratio of 40:1.
In regard to the training data employed in machine learning ASL techniques,
we show that the use of music training data yields an improvement of 19% against
a noise data baseline while maintaining accuracy using only 25% of the training
data, while training with speech as opposed to noise improves DOA estimation by
an average of 17%, outperforming the Generalised Cross-Correlation technique by
125% in scenarios in which the test and training acoustic environments are matched.
Heriot-Watt University James Watt Scholarship (JSW) in the School of Engineering & Physical Sciences
Computer Vision without Vision : Methods and Applications of Radio and Audio Based SLAM
The central problem of this thesis is estimating receiver-sender node positions from measured receiver-sender distances or equivalent measurements. This problem arises in many applications such as microphone array calibration, radio antenna array calibration, mapping and positioning using ultra-wideband, and mapping and positioning using round-trip-time measurements between mobile phones and Wi-Fi units. Previous research has explored some of these problems, creating minimal solvers for instance, but these solutions lack real-world implementation. Because different media are used, finding reliable receiver-sender distances is difficult, with many of the measurements being erroneous or, worse, missing. Therefore, in this thesis we explore using minimal solvers to create robust solutions that tolerate small measurement errors and work around missing and grossly erroneous measurements. This thesis focuses mainly on Time-of-Arrival measurements using radio technologies such as Two-Way Ranging in Ultra-Wideband and a new IEEE standard, 802.11mc, found on many WiFi modules. The methods investigated are also related to Computer Vision problems such as Structure-from-Motion. As part of this thesis, a range of new commercial radio technologies are characterised in terms of ranging in real-world environments. In doing so, we have shown how these technologies can be used as a more accurate alternative to the Global Positioning System in indoor environments. In addition to these solutions, further methods are proposed for large-scale problems in which multiple users collect the data, commonly known as Big Data. For these cases, more data is not always better, so a method is proposed to find the relevant data for calibrating large systems.
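To make the distance-based positioning problem concrete, here is a minimal sketch of a much simpler special case than the one the thesis addresses: the sender positions (anchors) are assumed known, and a single receiver is located from its measured distances by linearized least squares. The thesis's minimal solvers, which estimate both sender and receiver positions and handle missing or grossly erroneous measurements, are not reproduced here. The function name and the test geometry are hypothetical.

```python
import numpy as np

def trilaterate(anchors, dists):
    """Locate one node from distances to known anchor positions.

    Subtracting the first range equation |x - a_0|^2 = d_0^2 from the
    others removes the quadratic term |x|^2 and leaves a linear system
    2 (a_i - a_0) . x = |a_i|^2 - |a_0|^2 - d_i^2 + d_0^2.
    """
    anchors = np.asarray(anchors, float)
    dists = np.asarray(dists, float)
    a0, d0 = anchors[0], dists[0]
    A = 2.0 * (anchors[1:] - a0)
    b = (np.sum(anchors[1:] ** 2, axis=1) - np.sum(a0 ** 2)
         - dists[1:] ** 2 + d0 ** 2)
    pos, *_ = np.linalg.lstsq(A, b, rcond=None)  # least-squares solution
    return pos
```

With exact distances the solution is exact. With noisy distances it is only a least-squares fit, and a single grossly wrong measurement can bias it, which is the kind of failure the robust approach in the abstract is meant to work around.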