54 research outputs found

    Inferring Room Geometries

    No full text
    Determining the geometry of an acoustic enclosure using microphone arrays has become an active area of research. Knowledge gained about the acoustic environment, such as the location of reflectors, can be advantageous for applications such as sound source localization, dereverberation and adaptive echo cancellation by assisting in tracking environment changes and helping the initialization of such algorithms. A methodology to blindly infer the geometry of an acoustic enclosure by estimating the location of reflective surfaces based on acoustic measurements using an arbitrary array geometry is developed and analyzed. The starting point of this work considers a geometric constraint, valid both in two and three-dimensions, that converts time-of-arrival and time-difference-pf-arrival information into elliptical constraints about the location of reflectors. Multiple constraints are combined to yield the line or plane parameters of the reflectors by minimizing a specific cost function in the least-squares sense. An iterative constrained least-squares estimator, along with a closed-form estimator, that performs optimally in a noise-free scenario, solve the associated common tangent estimation problem that arises from the geometric constraint. Additionally, a Hough transform based data fusion and estimation technique, that considers acquisitions from multiple source positions, refines the reflector localization even in adverse conditions. An extension to the geometric inference framework, that includes the estimation of the actual speed of sound to improve the accuracy under temperature variations, is presented that also reduces the required prior information needed such that only relative microphone positions in the array are required for the localization of acoustic reflectors. Simulated and real-world experiments demonstrate the feasibility of the proposed method.Open Acces

    Sparse Modeling of Grouped Line Spectra

    Get PDF
    This licentiate thesis focuses on clustered parametric models for estimation of line spectra, when the spectral content of a signal source is assumed to exhibit some form of grouping. Different from previous parametric approaches, which generally require explicit knowledge of the model orders, this thesis exploits sparse modeling, where the orders are implicitly chosen. For line spectra, the non-linear parametric model is approximated by a linear system, containing an overcomplete basis of candidate frequencies, called a dictionary, and a large set of linear response variables that selects and weights the components in the dictionary. Frequency estimates are obtained by solving a convex optimization program, where the sum of squared residuals is minimized. To discourage overfitting and to infer certain structure in the solution, different convex penalty functions are introduced into the optimization. The cost trade-off between fit and penalty is set by some user parameters, as to approximate the true number of spectral lines in the signal, which implies that the response variable will be sparse, i.e., have few non-zero elements. Thus, instead of explicit model orders, the orders are implicitly set by this trade-off. For grouped variables, the dictionary is customized, and appropriate convex penalties selected, so that the solution becomes group sparse, i.e., has few groups with non-zero variables. In an array of sensors, the specific time-delays and attenuations will depend on the source and sensor positions. By modeling this, one may estimate the location of a source. In this thesis, a novel joint location and grouped frequency estimator is proposed, which exploits sparse modeling for both spectral and spatial estimates, showing robustness against sources with overlapping frequency content. For audio signals, this thesis uses two different features for clustering. Pitch is a perceptual property of sound that may be described by the harmonic model, i.e., by a group of spectral lines at integer multiples of a fundamental frequency, which we estimate by exploiting a novel adaptive total variation penalty. The other feature, chroma, is a concept in musical theory, collecting pitches at powers of 2 from each other into groups. Using a chroma dictionary, together with appropriate group sparse penalties, we propose an automatic transcription of the chroma content of a signal

    Recent Advances in Indoor Localization Systems and Technologies

    Get PDF
    Despite the enormous technical progress seen in the past few years, the maturity of indoor localization technologies has not yet reached the level of GNSS solutions. The 23 selected papers in this book present the recent advances and new developments in indoor localization systems and technologies, propose novel or improved methods with increased performance, provide insight into various aspects of quality control, and also introduce some unorthodox positioning methods

    Acoustic Source Localisation in constrained environments

    Get PDF
    Acoustic Source Localisation (ASL) is a problem with real-world applications across multiple domains, from smart assistants to acoustic detection and tracking. And yet, despite the level of attention in recent years, a technique for rapid and robust ASL remains elusive – not least in the constrained environments in which such techniques are most likely to be deployed. In this work, we seek to address some of these current limitations by presenting improvements to the ASL method for three commonly encountered constraints: the number and configuration of sensors; the limited signal sampling potentially available; and the nature and volume of training data required to accurately estimate Direction of Arrival (DOA) when deploying a particular supervised machine learning technique. In regard to the number and configuration of sensors, we find that accuracy can be maintained at state-of-the-art levels, Steered Response Power (SRP), while reducing computation sixfold, based on direct optimisation of well known ASL formulations. Moreover, we find that the circular microphone configuration is the least desirable as it yields the highest localisation error. In regard to signal sampling, we demonstrate that the computer vision inspired algorithm presented in this work, which extracts selected keypoints from the signal spectrogram, and uses them to select signal samples, outperforms an audio fingerprinting baseline while maintaining a compression ratio of 40:1. In regard to the training data employed in machine learning ASL techniques, we show that the use of music training data yields an improvement of 19% against a noise data baseline while maintaining accuracy using only 25% of the training data, while training with speech as opposed to noise improves DOA estimation by an average of 17%, outperforming the Generalised Cross-Correlation technique by 125% in scenarios in which the test and training acoustic environments are matched.Heriot-Watt University James Watt Scholarship (JSW) in the School of Engineering & Physical Sciences

    Computer Vision without Vision : Methods and Applications of Radio and Audio Based SLAM

    Get PDF
    The central problem of this thesis is estimating receiver-sender node positions from measured receiver-sender distances or equivalent measurements. This problem arises in many applications such as microphone array calibration, radio antenna array calibration, mapping and positioning using ultra-wideband and mapping and positioning using round-trip-time measurements between mobile phones and Wi-Fi-units. Previous research has explored some of these problems, creating minimal solvers for instance, but these solutions lack real world implementation. Due to the nature of using different media, finding reliable receiver-sender distances is tough, with many of the measurements being erroneous or to a worse extent missing. Therefore in this thesis, we explore using minimal solvers to create robust solutions, that encompass small erroneous measurements and work around missing and grossly erroneous measurements.This thesis focuses mainly on Time-of-Arrival measurements using radio technologies such as Two-way-Ranging in Ultra-Wideband and a new IEEE standard 802.11mc found on many WiFi modules. The methods investigated, also related to Computer Vision problems such as Stucture-from-Motion. As part of this thesis, a range of new commercial radio technologies are characterised in terms of ranging in real world enviroments. In doing so, we have shown how these technologies can be used as a more accurate alternative to the Global Positioning System in indoor enviroments. Further to these solutions, more methods are proposed for large scale problems when multiple users will collect the data, commonly known as Big Data. For these cases, more data is not always better, so a method is proposed to try find the relevant data to calibrate large systems

    Fundamental Frequency and Direction-of-Arrival Estimation for Multichannel Speech Enhancement

    Get PDF
    • …
    corecore