    Self-localization in Ad Hoc Indoor Acoustic Networks

    The increasing use of mobile technology in everyday life has aroused interest in developing new ways of utilizing the data collected by devices such as mobile phones and wearables. Acoustic sensors can be used to localize sound sources if the positions of spatially separate sensors are known or can be determined. However, determining the 3D coordinates by manual measurement is tedious, especially as the number of sensors increases. Therefore, the localization process has to be automated. Satellite-based positioning is imprecise for many applications and requires line-of-sight to the sky. This thesis studies localization methods for wireless acoustic sensor networks, a process called self-localization. The thesis focuses on self-localization from sound, and therefore the term acoustic is used. Furthermore, the methods are developed for ad hoc sensor networks, which means that the sensors are not necessarily installed in premises such as meeting rooms and other purpose-built spaces, which often have dedicated audio hardware for spatial audio applications. Instead of relying on such spaces and equipment, mobile devices are combined to form sensor networks. For instance, a few mobile phones laid on a table can form a sensor network built for an event, and the network is inherently dismantled once the event is over, which explains the use of the term ad hoc. Once the positions of the devices are estimated, the network can be used for spatial applications such as sound source localization and audio enhancement via spatial filtering. The main purpose of this thesis is to present methods for self-localization of such an ad hoc acoustic sensor network. Using off-the-shelf devices to establish ad hoc sensor networks enables the implementation of many spatial algorithms in practically any environment. Several acoustic self-localization methods have been introduced over the years. However, they often rely on specialized hardware and calibration signals. This thesis presents methods that are passive and utilize environmental sounds such as speech, from which the spatial information of the sensor network can be determined by using time delay estimation. Many previous self-localization methods assume that the audio captured by the sensors is synchronized. This assumption cannot be made in an ad hoc sensor network, since the different sensors are unaware of each other unless specific signaling is set up by special arrangement. The methods developed in this thesis are evaluated with simulations and real data recordings. Scenarios in which the targets of positioning are stationary and in motion are studied. The real-world recordings are made in closed spaces such as meeting rooms. The targets are approximately 1 to 5 meters apart. On average, the positioning accuracy is approximately five centimeters in a stationary scenario and ten centimeters in a moving-target scenario. The most important result of this thesis is the first self-localization method that uses environmental sounds and off-the-shelf unsynchronized devices, and allows the targets of self-localization to move.
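The time delay estimation step mentioned in this abstract can be illustrated with a plain cross-correlation search. This is only a minimal sketch and not the method of the thesis: the helper name `estimate_delay`, the brute-force search over integer lags, the toy signals, and the sample rate and speed of sound are all assumptions chosen for illustration. Real recordings from unsynchronized devices would likely need a more robust estimator.

```python
def estimate_delay(x, y, max_lag):
    """Return the lag (in samples) at which y best matches x.

    A positive result means y is a delayed copy of x.
    Hypothetical helper: brute-force cross-correlation over integer lags.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, xn in enumerate(x):
            m = n + lag
            if 0 <= m < len(y):
                score += xn * y[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Toy example: y is x delayed by 3 samples.
x = [0.0, 1.0, -2.0, 3.0, 0.5, -1.0, 0.0, 0.0, 0.0, 0.0]
y = [0.0, 0.0, 0.0] + x[:-3]

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value
SAMPLE_RATE = 48000     # Hz, assumed

lag = estimate_delay(x, y, max_lag=5)
# Convert the sample delay into a difference in acoustic path length.
path_difference_m = lag / SAMPLE_RATE * SPEED_OF_SOUND
```

The path-length difference between two sensors is the kind of spatial information that a self-localization method can build on. With unsynchronized devices, the measured lag would also contain an unknown clock offset, which this sketch ignores.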

    Combining Range and Direction for Improved Localization

    Self-localization of nodes in a sensor network is typically achieved using either range or direction measurements; in this paper, we show that a constructive combination of both improves the estimation. We propose two localization algorithms that make use of the differences between the sensors’ coordinates, or edge vectors; these can be calculated from measured distances and angles. Our first method improves the existing edge-multidimensional scaling algorithm (E-MDS) by introducing additional constraints that enforce geometric consistency between the edge vectors. On the other hand, our second method decomposes the edge vectors onto 1-dimensional spaces and introduces the concept of coordinate difference matrices (CDMs) to independently regularize each projection. This solution is optimal when Gaussian noise is added to the edge vectors. We demonstrate in numerical simulations that both algorithms outperform state-of-the-art solutions.
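The statement that edge vectors can be calculated from measured distances and angles can be made concrete in 2D. The sketch below is an illustration under stated assumptions, not the paper's algorithm: the function name `edge_vector`, the three-node layout, and the noise-free measurements are hypothetical. It also checks one simple consistency property, namely that edge vectors around a closed loop sum to zero. Whether this matches the paper's constraints exactly is not stated in the abstract.

```python
import math


def edge_vector(distance, angle_rad):
    """Edge vector (x_j - x_i) in 2D from a range and a bearing measurement."""
    return (distance * math.cos(angle_rad), distance * math.sin(angle_rad))


# Hypothetical noise-free measurements between three nodes
# located at A=(0,0), B=(3,0), C=(3,4).
v_ab = edge_vector(3.0, 0.0)                     # A -> B
v_bc = edge_vector(4.0, math.pi / 2)             # B -> C
v_ca = edge_vector(5.0, math.atan2(-4.0, -3.0))  # C -> A

# Edge vectors around a closed loop should sum to (approximately) zero.
loop_sum = (v_ab[0] + v_bc[0] + v_ca[0], v_ab[1] + v_bc[1] + v_ca[1])
```

With noisy ranges and angles the loop sum would no longer vanish, which is one reason to enforce consistency between edge vectors during estimation.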

    Euclidean Distance Matrices: Properties, Algorithms and Applications

    Euclidean distance matrices (EDMs) are central players in many diverse fields including psychometrics, NMR spectroscopy, machine learning and sensor networks. However, they are not often exploited in signal processing. In this thesis, we analyze attributes of EDMs and derive new key properties of them. These analyses allow us to propose algorithms to approximate EDMs and provide analytic bounds on the performance of our methods. We use these techniques to suggest new solutions for several practical problems in signal processing. Together with these properties, algorithms and applications, EDMs can thus be considered as a fundamental toolbox to be used in signal processing. In more detail, we start by introducing the structure and properties of EDMs. In particular, we focus on their rank property; the rank of an EDM is at most the dimension of the set of points generating it plus 2. Using this property, we introduce the use of low rank matrix completion methods for approximating and completing noisy and partially revealed EDMs. We apply this algorithm to the problem of sensor position calibration in ultrasound tomography devices. By adapting the matrix completion framework, in addition to proposing a self calibration process for these devices, we also provide analytic bounds for the calibration error. We then study the problem of sensor localization using distance information by minimizing a non-linear cost function known as the s-stress function in the multidimensional scaling (MDS) community. We derive key properties of this cost function that can be used to reduce the search domain for finding its global minimum. We provide an efficient, low cost and distributed algorithm for minimizing this cost function for incomplete networks and noisy measurements. In randomized experiments, the proposed method converges to the global minimum of the s-stress in more than 99% of the cases. 
We also address the open problem of the existence of non-global minimizers of the s-stress and reduce this problem to a hypothesis. If the hypothesis is true, then the cost function has only global minimizers; otherwise, it has non-global minimizers. Using the rank property of EDMs and the proposed minimization algorithm for approximating them, we address an interesting and practical problem in acoustics. We show that using five microphones and one loudspeaker, we can hear the shape of a room. We reformulate this problem as finding the locations of the image sources of the loudspeaker with respect to the walls. We propose an algorithm to find these positions using only first-order echoes. We prove that the reconstruction of the room is almost surely unique. We further introduce a new algorithm for locating a microphone inside a known room using only one loudspeaker. Our experimental evaluations, conducted on the EPFL campus and in the Lausanne cathedral, confirm the robustness and accuracy of the proposed methods. By integrating further properties of EDMs into the matrix completion framework, we propose a new method for calibrating microphone arrays in a diffuse noise field. We use a specific characterization of diffuse noise fields to relate the coherence of the signals recorded by two microphones to their mutual distance. As this model is not reliable for large distances between microphones, we use matrix completion coupled with other properties of EDMs to estimate these distances and calibrate the microphone array. Evaluation of our algorithm using real data measurements demonstrates, for the first time, the possibility of accurately calibrating large ad hoc microphone arrays in a diffuse noise field. The last part of the thesis addresses a central problem in signal processing: the design of discrete-time filters (equivalently, window functions) that are compact both in time and frequency. By properly adapting the definitions of compactness in continuous time to discrete time, we formulate the search for maximally compact sequences as solving a semi-definite program. We show that the spectra of maximally compact sequences are a special class of Mathieu’s cosine functions. Using the asymptotic behavior of these functions, we provide a tight bound for the time-frequency spread of discrete-time sequences. Our analysis shows that the Heisenberg uncertainty bound on the time-frequency spread of sequences is not tight and that the lower bound depends on the frequency spread, unlike in the continuous-time case.
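The rank property quoted in this abstract (the rank of an EDM is at most the dimension of the generating point set plus 2) can be checked numerically. The following is a small sketch under stated assumptions: the point set is arbitrary, the helper names are hypothetical, and the entries are taken as squared distances, which is the usual EDM convention. Exact rational arithmetic is used so that the rank is not affected by rounding.

```python
from fractions import Fraction


def squared_distance_matrix(points):
    """EDM with entries ||p_i - p_j||^2 (exact for integer coordinates)."""
    return [[Fraction(sum((a - b) ** 2 for a, b in zip(p, q))) for q in points]
            for p in points]


def matrix_rank(matrix):
    """Exact rank via Gaussian elimination over the rationals."""
    rows = [row[:] for row in matrix]
    rank = 0
    for col in range(len(rows[0])):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


# Seven arbitrary points in the plane (dimension 2).
points_2d = [(0, 0), (1, 0), (0, 1), (2, 3), (5, 1), (3, 4), (7, 2)]
dimension = 2
edm = squared_distance_matrix(points_2d)
rank = matrix_rank(edm)  # bounded by dimension + 2, however many points are used
```

The bound holds regardless of the number of points, which is what makes low-rank matrix completion applicable to noisy or partially observed EDMs.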

    Listening to Distances and Hearing Shapes: Inverse Problems in Room Acoustics and Beyond

    A central theme of this thesis is using echoes to achieve useful, interesting, and sometimes surprising results. One should have no doubts about the echoes' constructive potential; it is, after all, demonstrated masterfully by Nature. Just think about the bat's intriguing ability to navigate in unknown spaces and hunt for insects by listening to echoes of its calls, or about similar (albeit less well-known) abilities of toothed whales, some birds, shrews, and ultimately people. We show that, perhaps contrary to conventional wisdom, multipath propagation resulting from echoes is our friend. When we think about it the right way, it reveals essential geometric information about the sources-channel-receivers system. The key idea is to think of echoes as being more than just delayed and attenuated peaks in 1D impulse responses; they are actually additional sources with their corresponding 3D locations. This transformation allows us to forget about the abstract room, and to replace it by more familiar point sets. We can then engage the powerful machinery of Euclidean distance geometry. A problem that always arises is that we do not know a priori the matching between the peaks and the points in space, and solving the inverse problem is achieved by echo sorting, a tool we developed for learning correct labelings of echoes. This has applications beyond acoustics, whenever one deals with waves and reflections, or more generally, time-of-flight measurements. Equipped with this perspective, we first address the "Can one hear the shape of a room?" question, and we answer it with a qualified "yes". Even a single impulse response uniquely describes a convex polyhedral room, whereas a more practical algorithm to reconstruct the room's geometry uses only first-order echoes and a few microphones. Next, we show how different problems of localization benefit from echoes. The first one is multiple indoor sound source localization. Assuming the room is known, we show that discretizing the Helmholtz equation yields a system of sparse reconstruction problems linked by the common sparsity pattern. By exploiting the full bandwidth of the sources, we show that it is possible to localize multiple unknown sound sources using only a single microphone. We then look at indoor localization with known pulses from the geometric echo perspective introduced previously. Echo sorting enables localization in non-convex rooms without a line-of-sight path, and localization with a single omni-directional sensor, which is impossible without echoes. A closely related problem is microphone position calibration; we show that echoes can help even without assuming that the room is known. Using echoes, we can localize arbitrary numbers of microphones at unknown locations in an unknown room using only one source at an unknown location (for example, a finger snap) and get the room's geometry as a byproduct. Our study of source localization outgrew the initial form factor when we looked at source localization with spherical microphone arrays. Spherical signals appear well beyond spherical microphone arrays; for example, any signal defined on Earth's surface lives on a sphere. This resulted in the first slight departure from the main theme: we develop the theory and algorithms for sampling sparse signals on the sphere using finite rate-of-innovation principles and apply them to various signal processing problems on the sphere.
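The idea of treating an echo as an additional source with its own 3D location can be sketched with a single wall. The example below is an illustrative assumption, not code from the thesis: the helper names, the wall position, the source and microphone coordinates, and the speed of sound are all hypothetical. It mirrors the source across the wall to obtain a first-order image source, then uses the distance from that image to the microphone as the length of the reflected path.

```python
import math


def mirror_across_plane(point, plane_point, unit_normal):
    """Reflect a 3D point across a plane (the wall) to get its image source."""
    offset = sum((p - q) * n for p, q, n in zip(point, plane_point, unit_normal))
    return tuple(p - 2.0 * offset * n for p, n in zip(point, unit_normal))


def distance(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical setup: a wall in the plane x = 0, one source and one microphone.
source = (1.0, 2.0, 1.0)
microphone = (3.0, 2.0, 1.0)
image = mirror_across_plane(source, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))

SPEED_OF_SOUND = 343.0  # m/s, assumed
echo_path_m = distance(image, microphone)    # length of the reflected path
echo_delay_s = echo_path_m / SPEED_OF_SOUND  # arrival time of the first-order echo
```

In this setup the direct path is 2 m and the reflected path (1 m to the wall, then 3 m to the microphone) is 4 m, which equals the distance from the image source to the microphone. Distances of this kind are what a Euclidean distance geometry formulation operates on once the echoes have been labeled.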

    Localization using Distance Geometry: Minimal Solvers and Robust Methods for Sensor Network Self-Calibration

    In this thesis, we focus on the problem of estimating receiver and sender node positions given some form of distance measurements between them. This kind of localization problem has several applications, e.g., global and indoor positioning, sensor network calibration, molecular conformations, data visualization, graph embedding, and robot kinematics. More concretely, this thesis makes contributions in three different areas. First, we present a method for simultaneously registering and merging maps. The merging problem occurs when multiple maps of an area have been constructed and need to be combined into a single representation. If there are no absolute references and the maps are in different coordinate systems, they also need to be registered. In the second part, we construct robust methods for sensor network self-calibration using both Time of Arrival (TOA) and Time Difference of Arrival (TDOA) measurements. One of the difficulties is that corrupt measurements, so-called outliers, are present and should be excluded from the model fitting. To achieve this, we use hypothesis-and-test frameworks together with minimal solvers, resulting in methods that are robust to noise, outliers, and missing data. Several new minimal solvers are introduced to accommodate a range of receiver and sender configurations in 2D and 3D space. These solvers are formulated as polynomial equation systems which are solved using methods from algebraic geometry. In the third part, we focus specifically on the problems of trilateration and multilateration, and we present a method that approximates the Maximum Likelihood (ML) estimator for different noise distributions. The proposed approach reduces to an eigendecomposition problem for which there are good solvers. This results in a method that is faster and more numerically stable than the state-of-the-art, while still being easy to implement. Furthermore, we present a robust trilateration method that incorporates a motion model. This enables the removal of outliers in the distance measurements at the same time as drift in the motion model is canceled.
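For readers unfamiliar with multilateration, the basic problem can be sketched with a textbook linearized least-squares solver. This is explicitly not the eigendecomposition-based ML approximation described in the abstract; it is a simpler, swapped-in baseline. The function name `multilaterate_2d`, the anchor layout, and the noise-free distances are assumptions made for illustration.

```python
import math


def multilaterate_2d(anchors, distances):
    """Linearized least-squares position estimate from >= 3 anchors in 2D.

    Subtracting the first range equation from the others removes the
    quadratic term in the unknown position and leaves a linear system A x = b.
    """
    (x0, y0), d0 = anchors[0], distances[0]
    rows, rhs = [], []
    for (xi, yi), di in zip(anchors[1:], distances[1:]):
        rows.append((2.0 * (xi - x0), 2.0 * (yi - y0)))
        rhs.append(xi**2 + yi**2 - x0**2 - y0**2 - di**2 + d0**2)
    # Normal equations (A^T A) x = A^T b, solved in closed form for 2 unknowns.
    s11 = sum(a * a for a, _ in rows)
    s12 = sum(a * b for a, b in rows)
    s22 = sum(b * b for _, b in rows)
    t1 = sum(a * r for (a, _), r in zip(rows, rhs))
    t2 = sum(b * r for (_, b), r in zip(rows, rhs))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear or too few")
    return ((t1 * s22 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)


# Hypothetical anchors at the corners of a 4 m x 3 m room, noise-free ranges.
anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (4.0, 3.0)]
true_position = (1.0, 2.0)
distances = [math.hypot(true_position[0] - ax, true_position[1] - ay)
             for ax, ay in anchors]
estimate = multilaterate_2d(anchors, distances)
```

This baseline has no outlier handling and squares the measurement noise, which is part of the motivation for the robust and ML-approximating methods the thesis describes.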

    CALIBRATION OF AN ULTRASONIC TRANSMISSIVE COMPUTED TOMOGRAPHY SYSTEM

    This dissertation is centered on a medical imaging modality, ultrasonic computed tomography (USCT), and on algorithms that improve the resulting image quality, namely the calibration of a USCT device. USCT is a novel imaging modality which combines ultrasound transmission with image reconstruction principles developed for other tomographic systems. It is capable of producing quantitative 3D image volumes with high resolution and tissue contrast and is primarily aimed at breast cancer diagnosis. 
The author was involved in a joint research project at the Institute of Data Processing and Electronics, Forschungszentrum Karlsruhe (German National Research Center), where a USCT system is being developed. One of the main problems in the Karlsruhe USCT prototype was the absence of any calibration. The thousands of transducers used in the system have deviations in sensitivity, directivity, and frequency response. These parameters change over time as the transducers age. Also the mechanical positioning of the transducer elements is not precise. All these aspects greatly affect the overall quality of the reconstructed images. The problem of calibration of a USCT system was chosen as the main topic for this dissertation. The dissertation thesis presents novel methods in the area of reconstruction of attenuation images, sensitivity calibration, and mainly geometrical calibration. The methods were implemented and tested on real data generated by the Karlsruhe USCT device.

    Enhancing the measurement of clinical outcomes using Microsoft Kinect

    There is a growing body of applications leveraging Microsoft Kinect and the associated Windows Software Development Kit in health and wellness. In particular, this platform has been valuable in developing interactive solutions for rehabilitation, including creating more engaging exercise regimens and ensuring that exercises are performed correctly for optimal outcomes. Clinical trials rely upon robust and validated methodologies to measure health status and to detect treatment-related changes over time, so that the efficacy and safety of new drug treatments can be assessed and measured. In many therapeutic areas, traditional outcome measures rely on subjective investigator and patient ratings. Subjective ratings are not always sensitive to small improvements, are subject to inter- and intra-rater variability, and are limited in their ability to record detailed or subtle aspects of movement and mobility. For these reasons, objective measurements may provide greater sensitivity to detect treatment-related changes where they exist. In this review paper, we explore the use of the Kinect platform to develop low-cost approaches to objectively measure aspects of movement. We consider published applications that measure aspects of gait and balance, upper extremity movement, chest wall motion and facial analysis. In each case, we explore the utility of the approach for clinical trials, and the precision and accuracy of estimates derived from the Kinect output. We conclude that the use of games platforms such as Microsoft Kinect to measure clinical outcomes offers a versatile, easy-to-use and low-cost approach that may add significant value and utility to clinical drug development, in particular in replacing conventional subjective measures and providing richer information about movement than previously possible in large-scale clinical trials, especially in the measurement of gross spatial movements. Regulatory acceptance of clinical outcomes collected in this way will be subject to comprehensive assessment of validity and clinical relevance, and this will require good-quality peer-reviewed publications of scientific evidence.

    Suivi Multi-Locuteurs avec des Informations Audio-Visuelles pour la Perception des Robots

    Robot perception plays a crucial role in human-robot interaction (HRI). The perception system provides the robot with information about its surroundings and enables it to give feedback. In a conversational scenario, a group of people may chat in front of the robot and move freely. In such situations, robots are expected to understand where the people are, who is speaking, and what they are talking about. This thesis concentrates on answering the first two questions, namely speaker tracking and diarization. We use different modalities of the robot’s perception system to achieve this goal. As seeing and hearing are for a human being, audio and visual information are the critical cues for a robot in a conversational scenario. Advances in computer vision and audio processing over the last decade have revolutionized robot perception abilities. This thesis makes the following contributions. We first develop a variational Bayesian framework for tracking multiple objects. The variational Bayesian framework gives closed-form, tractable solutions, which makes the tracking process efficient. The framework is first applied to visual multiple-person tracking. Birth and death processes are built jointly with the framework to deal with the varying number of people in the scene. Furthermore, we exploit the complementarity of vision and robot motor information. On the one hand, the robot’s active motion can be integrated into the visual tracking system to stabilize the tracking. On the other hand, visual information can be used to perform motor servoing. Audio and visual information are then combined in the variational framework to estimate the smooth trajectories of speaking people and to infer the acoustic status of a person: speaking or silent. In addition, we apply the model to acoustic-only speaker localization and tracking. Online dereverberation techniques are applied first, and their output is passed to the tracking system. Finally, a variant of the acoustic speaker tracking model based on the von Mises distribution is proposed, which is specifically adapted to directional data. All the proposed methods are validated on application-specific datasets.
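Why directional data call for a distribution such as the von Mises can be seen from a two-angle example. The sketch below is an illustration under stated assumptions and is not the tracking model of the thesis: the function name `circular_mean` and the two directions of arrival are hypothetical. It uses the standard result that the maximum-likelihood estimate of the von Mises mean direction is the angle of the summed unit vectors.

```python
import math


def circular_mean(angles_rad):
    """Mean direction of a set of angles (ML estimate of the von Mises mean)."""
    s = sum(math.sin(a) for a in angles_rad)
    c = sum(math.cos(a) for a in angles_rad)
    return math.atan2(s, c)


# Two hypothetical directions of arrival on either side of 0 degrees.
angles = [math.radians(350.0), math.radians(10.0)]

# The arithmetic mean ignores the wrap-around at 360 degrees.
naive_mean_deg = math.degrees(sum(angles) / len(angles))
# The circular mean treats the angles as points on the unit circle.
circular_mean_deg = math.degrees(circular_mean(angles))
```

The arithmetic mean points in the opposite direction (180 degrees), while the circular mean lands at approximately 0 degrees, which is the sensible answer for bearings.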

    Singing in Space(s): Singing performance in real and virtual acoustic environments - Singers' evaluation, performance analysis and listeners' perception

    The Virtual Singing Studio (VSS), a loudspeaker-based room acoustic simulation, was developed in order to facilitate investigations into the correlations and interactions between room acoustic characteristics and vocal performance parameters. To this end, the VSS provides a virtual performance space with real-time interactivity for an active sound source, meaning that singers can hear themselves sing as if in a real performance space. An objective evaluation of the simulation was carried out through measurement and comparison of room acoustic parameters of the simulation and the real performance space. Furthermore, a subjective evaluation involved a number of professional singers who sang in the virtual and real performance spaces and reported their impressions of the experience. Singing performances recorded in the real and virtual spaces were compared via the analysis of tempo, vibrato rate, vibrato extent and measures of intonation accuracy and precision. A stimuli sorting task evaluated listeners' perception of the similarity between singing performances recorded in the real and simulated spaces. A multi-dimensional scaling analysis was undertaken on the data obtained, and dimensions of the common perceptual space were identified using property fitting techniques in order to assess the relationship between performance attributes and the perceived similarities. In general, significant proportions of the perceived similarity between recordings could be explained by differences in global tempo, vibrato extent and intonation precision. Although there were few statistically significant effects of room acoustic condition, all singers self-reported changes to their singing according to the different room acoustic configurations, and listeners perceived these differences, especially in vibrato extent and global tempo. The present VSS has been shown to be not yet "realistic" enough to elicit variations in singing performance according to room acoustic conditions. Therefore, further improvements are suggested, including the incorporation of a visual aspect into the simulation. Nonetheless, the VSS is already able to provide a "plausible" interactive room acoustic simulation for singers to hear themselves in real time as if in a real performance venue.