
    Spatial Sound Localization via Multipath Euclidean Distance Matrix Recovery

    A novel localization approach is proposed to find the position of an individual source using recordings from a single microphone in a reverberant enclosure. The multipath propagation is modeled by multiple virtual microphones, which are images of the actual single microphone, and a multipath distance matrix is constructed whose entries are the squared distances between pairs of microphones (real or virtual) or between the microphones and the source. The distances between the actual and virtual microphones are computed from the geometry of the enclosure. The microphone-source distances correspond to the support of the early reflections in the room impulse response associated with the source signal acquisition. The low-rank property of the Euclidean distance matrix is exploited to identify this correspondence. Source localization is achieved by optimizing the source location to match those measurements. The recording time of the microphone and the generation of the source signal are asynchronous, and the offset between them is estimated via the proposed procedure. Furthermore, a theoretically optimal joint localization and synchronization algorithm is derived by formulating the source localization as the minimization of a quartic cost function. It is shown that the global minimum of the proposed cost function can be efficiently computed by converting it to a generalized trust region subproblem. Numerical simulations on synthetic data and real data recordings obtained in practical tests show the effectiveness of the proposed approach.
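
    The low-rank property mentioned in this abstract can be illustrated with a minimal sketch. It assumes a shoebox room and first-order images only; the room size, the positions, and the helper names `first_order_images` and `squared_edm` are hypothetical and not taken from the paper. A squared-distance matrix of points in 3-D has rank at most 5, however many points it contains.

```python
import numpy as np

def first_order_images(p, room):
    """Mirror point p across the six walls of a shoebox room [0,Lx]x[0,Ly]x[0,Lz]."""
    images = []
    for axis in range(3):
        for wall in (0.0, room[axis]):
            q = p.copy()
            q[axis] = 2.0 * wall - p[axis]   # reflection across the plane coord = wall
            images.append(q)
    return np.array(images)

def squared_edm(points):
    """Matrix of squared pairwise Euclidean distances."""
    gram = points @ points.T
    sq = np.diag(gram)
    return sq[:, None] - 2.0 * gram + sq[None, :]

room = np.array([5.0, 4.0, 3.0])      # hypothetical room dimensions in metres
mic = np.array([1.2, 2.5, 1.1])       # hypothetical real microphone
source = np.array([3.4, 1.0, 1.7])    # hypothetical source

# Real microphone, its six virtual (image) microphones, and the source: 8 points.
points = np.vstack([mic, first_order_images(mic, room), source])
D = squared_edm(points)

# For points in d dimensions the squared EDM has rank at most d + 2.
edm_rank = np.linalg.matrix_rank(D, tol=1e-8)
```

    The rank bound is what makes a wrong assignment of echoes to virtual microphones detectable: inconsistent distances generally push the matrix above rank 5.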

    Compressive Matched-Field Processing

    Source localization by matched-field processing (MFP) generally involves solving a number of computationally intensive partial differential equations. This paper introduces a technique that mitigates this computational workload by "compressing" these computations. Drawing on key concepts from the recently developed field of compressed sensing, it shows how a low-dimensional proxy for the Green's function can be constructed by backpropagating a small set of random receiver vectors. Then, the source can be located by performing a number of "short" correlations between this proxy and the projection of the recorded acoustic data in the compressed space. Numerical experiments in a Pekeris ocean waveguide are presented which demonstrate that this compressed version of MFP is as effective as traditional MFP even when the compression is significant. The results are particularly promising in the broadband regime where using as few as two random backpropagations per frequency performs almost as well as the traditional broadband MFP, but with the added benefit of generic applicability. That is, the computationally intensive backpropagations may be computed offline independently from the received signals, and may be reused to locate any source within the search grid area
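
    The compressed correlation described in this abstract can be sketched on a toy problem. Everything below is an assumption made for illustration: random vectors stand in for the Green's function replicas that a propagation model would supply, the data are noiseless, and the proxy is formed by a direct matrix product rather than by backpropagating the random vectors through the medium.

```python
import numpy as np

rng = np.random.default_rng(0)
n_receivers, n_grid, n_compressed = 64, 200, 8

# Stand-in for the Green's function: one replica vector per candidate location.
# A real MFP code would compute these from a propagation model (assumption).
G = (rng.standard_normal((n_receivers, n_grid))
     + 1j * rng.standard_normal((n_receivers, n_grid)))

true_index = 137
y = 0.5 * np.exp(1j * 0.3) * G[:, true_index]   # noiseless data, unknown complex amplitude

# Random compression matrix, drawn once and reusable for any received signal.
Phi = (rng.standard_normal((n_compressed, n_receivers))
       + 1j * rng.standard_normal((n_compressed, n_receivers))) / np.sqrt(n_compressed)

G_c = Phi @ G     # low-dimensional proxy, can be precomputed offline
y_c = Phi @ y     # projection of the recorded data into the compressed space

def normalized_correlation(replicas, data):
    """Magnitude of the normalized inner product between each replica and the data."""
    num = np.abs(replicas.conj().T @ data)
    return num / (np.linalg.norm(replicas, axis=0) * np.linalg.norm(data))

estimate_full = int(np.argmax(normalized_correlation(G, y)))
estimate_compressed = int(np.argmax(normalized_correlation(G_c, y_c)))
```

    In this noiseless setting the compressed correlations are "short" (length 8 instead of 64) and still peak at the true grid point; with noise, the amount of compression that can be tolerated depends on the problem.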

    Raking the Cocktail Party

    We present the concept of an acoustic rake receiver: a microphone beamformer that uses echoes to improve noise and interference suppression. The rake idea is well known in wireless communications; it involves constructively combining different multipath components that arrive at the receiver antennas. Unlike the spread-spectrum signals used in wireless communications, speech signals are not orthogonal to their shifts. Therefore, we focus on the spatial structure rather than the temporal one. Instead of explicitly estimating the channel, we create correspondences between early echoes in time and image sources in space. These multiple sources of the desired and the interfering signal offer additional spatial diversity that we can exploit in the beamformer design. We present several "intuitive" and optimal formulations of acoustic rake receivers, and show theoretically and numerically that the rake formulation of the maximum signal-to-interference-and-noise beamformer offers significant performance boosts in terms of noise and interference suppression. Beyond signal-to-noise ratio, we observe gains in terms of the perceptual evaluation of speech quality (PESQ) metric. We accompany the paper with the complete simulation and processing chain written in Python. The code and the sound samples are available online at http://lcav.github.io/AcousticRakeReceiver/. Comment: 12 pages, 11 figures; accepted for publication in IEEE Journal on Selected Topics in Signal Processing (Special Issue on Spatial Audio).
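
    The idea of constructively combining echoes can be sketched with a narrowband delay-and-sum beamformer. This is not the authors' published code (that is at the URL above); the 2-D geometry, the analysis frequency, and the wall reflection gain of 0.8 are assumptions chosen for illustration.

```python
import numpy as np

C = 343.0          # speed of sound in m/s
FREQ = 1000.0      # narrowband analysis frequency in Hz (assumed)

def steering_vector(mics, src, freq=FREQ, c=C):
    """Free-field response from a point source to each microphone."""
    dist = np.linalg.norm(mics - src, axis=1)
    return np.exp(-2j * np.pi * freq * dist / c) / (4.0 * np.pi * dist)

# Hypothetical 2-D setup: a small linear array, walls along x = 0 and y = 0.
mics = np.array([[2.0 + 0.08 * m, 1.5] for m in range(6)])
source = np.array([3.5, 3.0])
images = [source * np.array([-1.0, 1.0]),    # mirror across the wall x = 0
          source * np.array([1.0, -1.0])]    # mirror across the wall y = 0

a_direct = steering_vector(mics, source)
# Early echoes modelled as image sources with an assumed reflection gain of 0.8.
h = a_direct + sum(0.8 * steering_vector(mics, s) for s in images)

w_direct = a_direct / np.linalg.norm(a_direct)   # delay-and-sum on the direct path only
w_rake = h / np.linalg.norm(h)                   # delay-and-sum on direct path plus echoes

def output_snr(w, channel, noise_var=1.0):
    """Output SNR for unit-power source and spatially white noise."""
    return np.abs(np.vdot(w, channel)) ** 2 / (noise_var * np.linalg.norm(w) ** 2)

snr_direct = output_snr(w_direct, h)
snr_rake = output_snr(w_rake, h)
```

    By the Cauchy-Schwarz inequality `snr_rake` can never be lower than `snr_direct` in this white-noise model, which is the sense in which the echoes add usable spatial diversity.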

    Identifying High-Traffic Patterns in the Workplace With Radio Tomographic Imaging in 3D Wireless Sensor Networks

    The rapid progress of wireless communication and embedded micro-sensing electro-mechanical systems (MEMS) technologies has resulted in growing confidence in the use of wireless sensor networks (WSNs) composed of low-cost, low-power devices performing various monitoring tasks. Radio Tomographic Imaging (RTI) is a technology for localizing, tracking, and imaging device-free objects in a WSN using the change in received signal strength (RSS) of the radio links that the object obstructs. This thesis employs an experimental indoor three-dimensional (3-D) RTI network constructed of 80 wireless radios in a 100 square foot area. Experimental results are presented from a series of stationary target localization and target tracking experiments using one and two targets. Preliminary results demonstrate that a 3-D RTI network can be effectively used to generate 3-D RSS-based images to extract target features such as size and height, and to identify high-traffic patterns in the workplace by tracking asset movement.

    Greedy routing and virtual coordinates for future networks

    At the core of the Internet, routers are continuously struggling with ever-growing routing and forwarding tables. Although hardware advances do accommodate such growth, we anticipate new requirements, e.g. in data-oriented networking, where each content piece has to be referenced instead of hosts, such that current approaches relying on global information will no longer be viable, no matter the hardware progress. In this thesis, we investigate greedy routing methods that can achieve routing performance similar to today's while using far fewer resources and relying on local information only. To this end, we add specially crafted name spaces to the network in which virtual coordinates represent the addressable entities. Our scheme enables participating routers to make forwarding decisions using only neighbourhood information, as the overarching pseudo-geometric name space structure already organizes and incorporates "vicinity" at a global level. A first challenge to the application of greedy routing on virtual coordinates to future networks is that of "routing dead-ends", which are local minima caused by the difficulty of attributing coordinates consistently. In this context, we propose a routing recovery scheme based on a multi-resolution embedding of the network in low-dimensional Euclidean spaces. The recovery is performed by routing greedily on a blurrier view of the network. The different network detail levels are obtained through the embedding of clustering levels of the graph. When compared with higher-dimensional embeddings of a given network, our method shows a significant reduction of routing failures for similar header and control-state sizes. A second challenge to the application of virtual coordinates and greedy routing to future networks is the support of "customer-provider" as well as "peering" relationships between participants, resulting in a differentiated services environment.
Although an application of greedy routing within such a setting would combine two very common fields of today's networking literature, such a scenario has, surprisingly, not been studied so far. In this context we propose two approaches to address this scenario. In a first approach we implement a path-vector protocol similar to that of BGP on top of a greedy embedding of the network. This allows each node to build a spatial map associated with each of its neighbours indicating the accessible regions. Routing is then performed through the use of a decision-tree classifier taking the destination coordinates as input. When applied on a real-world dataset (the CAIDA 2004 AS graph) we demonstrate an up to 40% compression ratio of the routing control information at the network's core as well as a computationally efficient decision process comparable to methods such as binary trees and tries. In a second approach, we take inspiration from consensus-finding in social sciences and transform the three-dimensional distance data structure (where the third dimension encodes the service differentiation) into a two-dimensional matrix on which classical embedding tools can be used. This transformation is achieved by agreeing on a set of constraints on the inter-node distances guaranteeing an administratively-correct greedy routing. The computed distances are also enhanced to encode multipath support. We demonstrate a good greedy routing performance as well as an above 90% satisfaction of multipath constraints when relying on the non-embedded obtained distances on synthetic datasets. As various embeddings of the consensus distances do not fully exploit their multipath potential, the use of compression techniques such as transform coding to approximate the obtained distance allows for better routing performances
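
    Greedy forwarding on virtual coordinates, and the "routing dead-end" it can run into, can be sketched as follows. The toy graph, its coordinates, and the function name `greedy_route` are hypothetical; the thesis's recovery scheme and policy-aware variants are not reproduced here.

```python
import math

def greedy_route(coords, neighbours, source, destination):
    """Forward hop by hop to the neighbour closest to the destination.

    Returns (path, delivered). delivered is False when the packet reaches a
    local minimum (a routing dead-end) before the destination.
    """
    def dist(a, b):
        return math.dist(coords[a], coords[b])

    path = [source]
    current = source
    while current != destination:
        candidates = neighbours[current]
        if not candidates:
            return path, False
        best = min(candidates, key=lambda n: dist(n, destination))
        if dist(best, destination) >= dist(current, destination):
            return path, False          # no neighbour is closer: dead-end
        current = best
        path.append(current)
    return path, True

# Hypothetical virtual coordinates and adjacency lists (local information only).
coords = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (2.0, 0.0),
          "D": (3.0, 0.0), "X": (2.5, 1.0), "Y": (2.0, 3.0)}
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D", "Y"],
              "D": ["C"], "X": ["Y"], "Y": ["X", "C"]}

ok_path, ok = greedy_route(coords, neighbours, "A", "D")
stuck_path, stuck_ok = greedy_route(coords, neighbours, "X", "D")
```

    Node X is geometrically close to D but its only neighbour Y is farther away, so greedy forwarding stops at X even though the path X, Y, C, D exists. This is the kind of failure a recovery scheme has to handle.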

    A parallel hypothesis method of autonomous underwater vehicle navigation

    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, June 2009. This research presents a parallel hypothesis method for autonomous underwater vehicle navigation that enables a vehicle to expand the operating envelope of existing long baseline acoustic navigation systems by incorporating information that is not normally used. The parallel hypothesis method allows the in-situ identification of acoustic multipath time-of-flight measurements between a vehicle and an external transponder and uses them in real-time to augment the navigation algorithm during periods when direct-path time-of-flight measurements are not available. A proof of concept was conducted using real-world data obtained by the Woods Hole Oceanographic Institution Deep Submergence Lab's Autonomous Benthic Explorer (ABE) and Sentry autonomous underwater vehicles during operations on the Juan de Fuca Ridge. This algorithm uses a nested architecture to break the navigation solution down into basic building blocks for each type of available external information. The algorithm classifies external information as either line of position or gridded observations. For any line of position observation, the algorithm generates a multi-modal block of parallel position estimate hypotheses. The multimodal hypotheses are input into an arbiter which produces a single unimodal output. If a priori maps of gridded information are available, they are used within the arbiter structure to aid in the elimination of false hypotheses. For the proof of concept, this research uses ranges from a single external acoustic transponder in the hypothesis generation process and grids of low-resolution bathymetric data from a ship-based multibeam sonar in the arbitration process.
The major contributions of this research include the in-situ identification of acoustic multipath time-of-flight measurements, the multiscale utilization of a priori low-resolution bathymetric data in a high-resolution navigation algorithm, and the design of a navigation algorithm with a flexible architecture. This flexible architecture allows the incorporation of multimodal beliefs without requiring a complex mechanism for real-time hypothesis generation and culling, and it allows the real-time incorporation of multiple types of external information as they become available in situ into the overall navigation solution.

    Structured Sparsity Models for Reverberant Speech Separation

    We tackle the multi-party speech recovery problem by modeling the acoustics of reverberant chambers. Our approach exploits structured sparsity models to perform room modeling and speech recovery. We propose a scheme for characterizing the room acoustics from the unknown competing speech sources, relying on localization of the early images of the speakers by sparse approximation of the spatial spectra of the virtual sources in a free-space model. The images are then clustered by exploiting the low-rank structure of the spectro-temporal components belonging to each source. This enables us to identify the early support of the room impulse response function and its unique map to the room geometry. To further tackle the ambiguity of the reflection ratios, we propose a novel formulation of the reverberation model and estimate the absorption coefficients through a convex optimization that exploits a joint sparsity model formulated upon the spatio-spectral sparsity of the concurrent speech representation. The acoustic parameters are then incorporated for separating individual speech signals through either structured sparse recovery or inverse filtering of the acoustic channels. The experiments conducted on real data recordings demonstrate the effectiveness of the proposed approach for multi-party speech recovery and recognition.

    Listening to Distances and Hearing Shapes: Inverse Problems in Room Acoustics and Beyond

    A central theme of this thesis is using echoes to achieve useful, interesting, and sometimes surprising results. One should have no doubts about the echoes' constructive potential; it is, after all, demonstrated masterfully by Nature. Just think about the bat's intriguing ability to navigate in unknown spaces and hunt for insects by listening to echoes of its calls, or about similar (albeit less well-known) abilities of toothed whales, some birds, shrews, and ultimately people. We show that, perhaps contrary to conventional wisdom, multipath propagation resulting from echoes is our friend. When we think about it the right way, it reveals essential geometric information about the source-channel-receiver system. The key idea is to think of echoes as being more than just delayed and attenuated peaks in 1D impulse responses; they are actually additional sources with their corresponding 3D locations. This transformation allows us to forget about the abstract room, and to replace it by more familiar point sets. We can then engage the powerful machinery of Euclidean distance geometry. A problem that always arises is that we do not know a priori the matching between the peaks and the points in space, and solving the inverse problem is achieved by echo sorting, a tool we developed for learning correct labelings of echoes. This has applications beyond acoustics, whenever one deals with waves and reflections, or more generally, time-of-flight measurements. Equipped with this perspective, we first address the "Can one hear the shape of a room?" question, and we answer it with a qualified "yes". Even a single impulse response uniquely describes a convex polyhedral room, whereas a more practical algorithm to reconstruct the room's geometry uses only first-order echoes and a few microphones. Next, we show how different problems of localization benefit from echoes. The first one is multiple indoor sound source localization.
Assuming the room is known, we show that discretizing the Helmholtz equation yields a system of sparse reconstruction problems linked by the common sparsity pattern. By exploiting the full bandwidth of the sources, we show that it is possible to localize multiple unknown sound sources using only a single microphone. We then look at indoor localization with known pulses from the geometric echo perspective introduced previously. Echo sorting enables localization in non-convex rooms without a line-of-sight path, and localization with a single omni-directional sensor, which is impossible without echoes. A closely related problem is microphone position calibration; we show that echoes can help even without assuming that the room is known. Using echoes, we can localize arbitrary numbers of microphones at unknown locations in an unknown room using only one source at an unknown location---for example a finger snap---and get the room's geometry as a byproduct. Our study of source localization outgrew the initial form factor when we looked at source localization with spherical microphone arrays. Spherical signals appear well beyond spherical microphone arrays; for example, any signal defined on Earth's surface lives on a sphere. This resulted in the first slight departure from the main theme: We develop the theory and algorithms for sampling sparse signals on the sphere using finite rate-of-innovation principles and apply it to various signal processing problems on the sphere
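
    The echo labeling problem described in this abstract can be sketched with a rank test from Euclidean distance geometry. This is a simplified illustration rather than the thesis's algorithm: the microphone positions and image sources are hypothetical, the echoes are noiseless, and consistency is scored by the smallest singular value of an augmented squared-distance matrix.

```python
import itertools
import numpy as np

def squared_edm(points):
    """Matrix of squared pairwise Euclidean distances."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sum(diff ** 2, axis=-1)

rng = np.random.default_rng(1)
mics = rng.uniform(0.0, 5.0, size=(5, 3))           # hypothetical known microphone positions
image_sources = np.array([[-2.0, 1.5, 1.0],          # two hypothetical image sources
                          [3.0, 9.0, 2.0]])

D_mics = squared_edm(mics)
# echoes[i, k]: squared distance from mic i to image source k, as it would be
# obtained from an echo delay multiplied by the speed of sound.
echoes = np.sum((mics[:, None, :] - image_sources[None, :, :]) ** 2, axis=-1)

def edm_score(candidate_sq_dists):
    """Smallest singular value of the 6x6 augmented matrix.

    Six points in 3-D give a squared EDM of rank at most 5, so a consistent
    choice of echoes yields a score near zero.
    """
    n = len(candidate_sq_dists)
    D_aug = np.zeros((n + 1, n + 1))
    D_aug[:n, :n] = D_mics
    D_aug[:n, n] = candidate_sq_dists
    D_aug[n, :n] = candidate_sq_dists
    return np.linalg.svd(D_aug, compute_uv=False)[-1]

scores = {}
for combo in itertools.product(range(2), repeat=5):   # pick one echo per microphone
    picked = echoes[np.arange(5), list(combo)]
    scores[combo] = edm_score(picked)

best_two = sorted(scores, key=scores.get)[:2]
```

    Of the 32 possible labelings, the two that take all five echoes from the same image source stand out with scores at numerical zero. With noisy delays a threshold or a best-fit embedding would be needed instead of an exact rank test.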

    Source localization via time difference of arrival

    Accurate localization of a signal source, based on the signals collected by a number of receiving sensors deployed in the area surrounding the source, is a problem of interest in various fields. This dissertation aims at exploring different techniques to improve the localization accuracy of non-cooperative sources, i.e., sources for which the specific transmitted symbols and the time of the transmitted signal are unknown to the receiving sensors. For the localization of non-cooperative sources, the time difference of arrival (TDOA) of the signals received at pairs of sensors is typically employed. A two-stage localization method in multipath environments is proposed. During the first stage, the TDOA of the signals received at pairs of sensors is estimated. In the second stage, the actual location is computed from the TDOA estimates. This latter stage is referred to as hyperbolic localization and it generally involves a non-convex optimization. For the first stage, a TDOA estimation method that exploits the sparsity of multipath channels is proposed. This is formulated as an ℓ1-regularization problem, where the ℓ1-norm is used as a channel sparsity constraint. For the second stage, three methods are proposed to offer high accuracy at different computational costs. The first method takes a semi-definite relaxation (SDR) approach to relax the hyperbolic localization to a convex optimization. The second method follows a linearized formulation of the problem and seeks a biased estimate of improved accuracy. A third method is proposed to exploit the source sparsity. With this, the hyperbolic localization is formulated as an ℓ1-regularization problem, where the ℓ1-norm is used as a source sparsity constraint. The proposed methods compare favorably to other existing methods, each of them having its own advantages. The SDR method has the advantage of simplicity and low computational cost.
The second method may perform better than the SDR approach in some situations, but at the price of higher computational cost. The ℓ1-regularization method may outperform the first two methods, but is sensitive to the choice of a regularization parameter. The proposed two-stage localization approach is shown to deliver higher accuracy and robustness to noise, compared to existing TDOA localization methods. A single-stage source localization method is also explored. The approach is coherent in the sense that, in addition to the TDOA information, it utilizes the relative carrier phases of the received signals among pairs of sensors. A location estimator is constructed based on a maximum likelihood metric. The potential for accuracy improvement by the coherent approach is shown through the Cramér-Rao lower bound (CRB). However, the technique has to contend with high peak sidelobes in the localization metric, especially at low signal-to-noise ratio (SNR). Employing a small antenna array at each sensor is shown to lower the sidelobe level in the localization metric. Finally, the performance of time delay and amplitude estimation from samples of the received signal taken at rates lower than the conventional Nyquist rate is evaluated. To this end, a CRB is developed and its variation with system parameters is analyzed. It is shown that while with noiseless low rate sampling there is no estimation accuracy loss compared to Nyquist sampling, in the presence of additive noise the performance degrades significantly. However, increasing the low sampling rate by a small factor leads to significant performance improvement, especially for time delay estimation.
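
    The second stage, hyperbolic localization, can be illustrated with a textbook linearization. This is a generic sketch and not one of the three methods proposed in the dissertation; the sensor layout, the source position, and the function name `tdoa_linear_ls` are assumptions. Treating the range to the reference sensor as an extra unknown turns each range-difference equation into a linear one.

```python
import numpy as np

def tdoa_linear_ls(sensors, range_diffs):
    """Estimate a source position from range differences relative to sensor 0.

    range_diffs[i-1] = ||x - s_i|| - ||x - s_0|| (TDOA times propagation speed).
    Unknowns are the position x and the reference range r0 = ||x - s_0||, linked by
    2 (s_i - s_0) . x + 2 d_i r0 = ||s_i||^2 - ||s_0||^2 - d_i^2.
    """
    s0, rest = sensors[0], sensors[1:]
    A = np.hstack([2.0 * (rest - s0), 2.0 * range_diffs[:, None]])
    b = np.sum(rest ** 2, axis=1) - np.sum(s0 ** 2) - range_diffs ** 2
    solution, *_ = np.linalg.lstsq(A, b, rcond=None)
    return solution[:-1], solution[-1]

# Hypothetical 2-D scenario with noiseless measurements.
sensors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0], [5.0, -3.0]])
true_source = np.array([3.0, 7.0])

ranges = np.linalg.norm(sensors - true_source, axis=1)
range_diffs = ranges[1:] - ranges[0]

estimate, r0_estimate = tdoa_linear_ls(sensors, range_diffs)
```

    With noiseless range differences the least-squares solution reproduces the true position. With noisy TDOA estimates this unconstrained linearization ignores the relation between x and r0, which is one reason refinements such as relaxations or constrained formulations are studied.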

    Mathematical modelling and optimization strategies for acoustic source localization in reverberant environments

    This thesis focuses on the use of modern optimization and audio processing techniques for the accurate and robust localization of people in a reverberant environment equipped with microphone arrays. Several aspects of sound localization have been studied, including modelling, algorithms, and the prior calibration that allows localization algorithms to be used even when the geometry of the sensors (microphones) is unknown a priori. Existing techniques required a large number of microphones to achieve high localization accuracy. In this thesis, however, a new method has been developed that improves localization accuracy by more than 30% with a reduced number of microphones. Reducing the number of microphones is important because it translates directly into a drastic decrease in cost and an increase in the versatility of the final system. In addition, an exhaustive study has been carried out of the phenomena that affect the signal acquisition and processing system, with the aim of improving the previously proposed model. This study deepens the understanding and modelling of PHAT filtering (widely used in acoustic localization) and of the aspects that make it especially suitable for localization. As a result of this study, and in collaboration with researchers from the IDIAP institute (Switzerland), a system has been developed for self-calibrating the microphone positions from the diffuse noise present in a silent room. This contribution is related to previous coherence-based methods. However, it is able to reduce the noise by taking into account previously known physical parameters (the maximum distance between the microphones). As a result, better accuracy is achieved with less computation time.
Knowledge of the effects of the PHAT filter has made it possible to create a new model that allows a sparse representation of the typical localization scenario. This type of representation has proven to be very convenient for localization, allowing a simple approach to the case in which there are multiple simultaneous sources. The final contribution of this thesis is the characterization of TDOA (time difference of arrival) matrices. Such matrices are especially useful in audio but are not limited to it. Moreover, this study goes beyond sound localization, since it proposes noise reduction methods for TDOA measurements based on a low-rank matrix representation, which is useful not only in localization but also in techniques such as beamforming or self-calibration.
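
    The low-rank structure of TDOA matrices can be illustrated with a short sketch. A noiseless TDOA matrix has entries t_i - t_j, so it is skew-symmetric with rank 2. The denoising step below is a plain least-squares projection onto that structure and is an assumption made for illustration, not necessarily the method of the thesis; the arrival times and noise level are hypothetical.

```python
import numpy as np

def project_to_tdoa_matrix(M):
    """Least-squares projection of a skew-symmetric matrix onto {t 1^T - 1 t^T}."""
    n = M.shape[0]
    t = M.sum(axis=1) / n            # arrival times, up to a common offset
    return t[:, None] - t[None, :]

rng = np.random.default_rng(2)
arrival_times = rng.uniform(0.0, 0.02, size=6)            # hypothetical arrival times in seconds
M_clean = arrival_times[:, None] - arrival_times[None, :]

noise = 1e-3 * rng.standard_normal((6, 6))
M_noisy = M_clean + (noise - noise.T)                     # keep the noisy matrix skew-symmetric
M_denoised = project_to_tdoa_matrix(M_noisy)

rank_clean = np.linalg.matrix_rank(M_clean, tol=1e-10)
err_noisy = np.linalg.norm(M_noisy - M_clean)
err_denoised = np.linalg.norm(M_denoised - M_clean)
```

    Because the valid TDOA matrices form a linear subspace that contains the clean matrix, the projection cannot increase the Frobenius error, and it typically removes most of the noise when the number of sensors is larger than two.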