
    Locating and extracting acoustic and neural signals

    This dissertation presents innovative methodologies for locating, extracting, and separating multiple incoherent sound sources in three-dimensional (3D) space, and applications of the time reversal (TR) algorithm to pinpoint the hyperactive neural activity inside the brain's auditory structure that is correlated with tinnitus pathology. Specifically, an acoustic-modeling-based method is developed for locating arbitrary, incoherent sound sources in 3D space in real time using a minimal number of microphones, and the Point Source Separation (PSS) method is developed for extracting target signals from directly measured mixed signals. Combining these two approaches leads to a novel technology known as Blind Sources Localization and Separation (BSLS), which enables one to locate multiple incoherent sound signals in 3D space and separate the original individual sources simultaneously, based on the directly measured mixed signals. These technologies have been validated through numerical simulations and experiments conducted in various non-ideal environments with non-negligible, unspecified sound reflections and reverberation as well as interference from random background noise. Another innovation presented in this dissertation concerns applications of the TR algorithm to pinpoint the exact locations of hyperactive neurons in the brain's auditory structure that are directly correlated with tinnitus perception. Benchmark tests conducted on normal rats have confirmed the localization results provided by the TR algorithm. Results demonstrate that the spatial resolution of this source localization can be as fine as the micrometer level. This high-precision localization may lead to a paradigm shift in tinnitus diagnosis, which may in turn produce a more cost-effective treatment for tinnitus than any of the existing ones.

    Acoustic sensor network geometry calibration and applications

    In the modern world, we are increasingly surrounded by computation devices with communication links and one or more microphones. Examples include smartphones, tablets, laptops, and hearing aids. These devices can work together as nodes in an acoustic sensor network (ASN). Such networks are a growing platform that opens the possibility for many practical applications. ASN-based speech enhancement, source localization, and event detection can be applied to teleconferencing, camera control, automation, or assisted living. For these kinds of applications, awareness of auditory objects and their spatial positioning are key properties. In order to provide these two kinds of information, novel methods have been developed in this thesis. Information on the type of auditory objects is provided by a novel real-time sound classification method. Information on the position of human speakers is provided by a novel localization and tracking method. In order to localize with respect to the ASN, the relative arrangement of the sensor nodes has to be known. Therefore, different novel geometry calibration methods were developed. Sound classification. The first method addresses the task of identifying auditory objects. A novel application of the bag-of-features (BoF) paradigm to acoustic event classification and detection was introduced. It can be used for event and speech detection as well as for speaker identification. The use of both mel frequency cepstral coefficient (MFCC) and Gammatone frequency cepstral coefficient (GFCC) features improves the classification accuracy. By using soft quantization and introducing supervised training for the BoF model, superior accuracy is achieved. The method generalizes well from limited training data. It works online and can be computed in a fraction of real time. By a dedicated training strategy based on a hierarchy of stationarity, the detection of speech in mixtures with noise was realized. 
This makes the method robust against severe noise levels corrupting the speech signal. Thus it is possible to provide control information to a beamformer in order to realize blind speech enhancement. A reliable improvement is achieved in the presence of one or more stationary noise sources. Speaker localization. The localization method enables each node to determine the direction of arrival (DoA) of concurrent sound sources. The author's neuro-biologically inspired speaker localization method for microphone arrays was refined for use in ASNs. By implementing a dedicated cochlear and midbrain model, it is robust against the reverberation found in indoor rooms. In order to better model the unknown number of concurrent speakers, an application of the EM algorithm that realizes probabilistic clustering according to auditory scene analysis (ASA) principles was introduced. Based on this approach, a system for Euclidean tracking in ASNs was designed. Each node applies the node-wise localization method and shares probabilistic DoA estimates, together with an estimate of the spectral distribution, with the network. As this information is relatively sparse, it can be transmitted with low bandwidth. The system is robust against jitter and transmission errors. The information from all nodes is integrated according to spectral similarity to correctly associate concurrent speakers. By incorporating the intersection angle in the triangulation, the precision of the Euclidean localization is improved. Tracks of concurrent speakers are computed over time, as is shown with recordings in a reverberant room. Geometry calibration. The central task of geometry calibration has been solved with special focus on sensor nodes equipped with multiple microphones. Novel methods were developed for different scenarios. An audio-visual method was introduced for the calibration of ASNs in video conferencing scenarios. 
The DoA estimates are fused with visual speaker tracking in order to provide sensor positions in a common coordinate system. A novel acoustic calibration method determines the relative positioning of the nodes from ambient sounds alone. Unlike previous methods that only infer the positioning of distributed microphones, the DoA is incorporated, and thus it becomes possible to calibrate the orientation of the nodes with high accuracy. This is very important for all applications using the spatial information, as the triangulation error increases dramatically with bad orientation estimates. As speech events can be used, the calibration becomes possible without the requirement of playing dedicated calibration sounds. Based on this, an online method employing a genetic algorithm with incremental measurements was introduced. By using the robust speech localization method, the calibration is computed in parallel with the tracking. The online method is able to calibrate ASNs in real time, as is shown with recordings of natural speakers in a reverberant room. The informed acoustic sensor network. All new methods are important building blocks for the use of ASNs. The online methods for localization and calibration both make use of the neuro-biologically inspired processing in the nodes, which leads to state-of-the-art results, even in reverberant enclosures. The high robustness and reliability can be improved even more by including the event detection method in order to exclude non-speech events. When all methods are combined, both semantic information on what is happening in the acoustic scene and spatial information on the positioning of the speakers and sensor nodes are automatically acquired in real time. This realizes truly informed audio processing in ASNs. Practical applicability is shown by application to recordings in reverberant rooms. 
The contribution of this thesis is thus not only to advance the state of the art in automatically acquiring information on the acoustic scene, but also to push the practical applicability of such methods.
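The Euclidean localization step described in this abstract combines DoA estimates from several nodes by triangulation. The following is a minimal sketch of that general idea only, assuming a 2D setting, known node positions, and bearings given in a common global frame. The function name `triangulate` is hypothetical, and the plain least-squares intersection shown here does not include the thesis's weighting by intersection angle or its probabilistic association of speakers.

```python
import numpy as np

def triangulate(positions, bearings):
    """Least-squares intersection of 2D bearing lines.

    positions: iterable of (x, y) node positions (assumed known).
    bearings:  iterable of DoA angles in radians, in a shared global frame.
    Returns the point minimizing the summed squared distance to all lines.
    """
    A = np.zeros((2, 2))
    b = np.zeros(2)
    for p, theta in zip(positions, bearings):
        d = np.array([np.cos(theta), np.sin(theta)])  # unit direction of the bearing
        P = np.eye(2) - np.outer(d, d)                # projector onto the line's normal
        A += P
        b += P @ np.asarray(p, dtype=float)
    # A is singular when all bearings are parallel; solve() raises LinAlgError then.
    return np.linalg.solve(A, b)

# Two nodes observing a source at (2, 2).
estimate = triangulate([(0.0, 0.0), (4.0, 0.0)], [np.pi / 4, 3 * np.pi / 4])
```

Nearly parallel bearings make the system ill-conditioned, which is consistent with the abstract's remark that the intersection angle matters for precision.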

    Spatial features of reverberant speech: estimation and application to recognition and diarization

    Distant-talking scenarios, such as hands-free calling or teleconference meetings, are essential for natural and comfortable human-machine interaction, and they are increasingly used in multiple contexts. The speech signal acquired in such scenarios is reverberant and affected by additive noise. This signal distortion degrades the performance of speech recognition and diarization systems, creating troublesome human-machine interactions. This thesis proposes a method to non-intrusively estimate room acoustic parameters, paying special attention to a room acoustic parameter highly correlated with speech recognition degradation: the clarity index. In addition, a method to provide information regarding the estimation accuracy is proposed. An analysis of the phoneme recognition performance for multiple reverberant environments is presented, from which a confusability metric for each phoneme is derived. This confusability metric is then employed to improve reverberant speech recognition performance. Additionally, room acoustic parameters can also be used in speech recognition to provide robustness against reverberation. A method to exploit clarity index estimates in order to perform reverberant speech recognition is introduced. Finally, room acoustic parameters can also be used to diarize reverberant speech. A room acoustic parameter is proposed as an additional source of information for single-channel diarization in reverberant environments. In multi-channel environments, the time delay of arrival is a feature commonly used to diarize the input speech; however, the computation of this feature is affected by reverberation. A method is presented to model the time delay of arrival in a robust manner so that speaker diarization is performed more accurately. Open Access
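For readers unfamiliar with the clarity index, the sketch below shows how it is conventionally computed when a room impulse response is available: the ratio, in dB, of the energy in the first 50 ms after the direct-path arrival to the energy arriving later. This is an illustration of the standard definition only. The thesis estimates the parameter non-intrusively from reverberant speech, which this code does not attempt. The function name `clarity_index` and the choice of the largest absolute sample as the direct-path onset are assumptions.

```python
import numpy as np

def clarity_index(h, fs, early_ms=50.0):
    """Clarity index in dB from a measured impulse response h sampled at fs Hz.

    Assumption: the direct path is the sample with the largest magnitude.
    """
    h = np.asarray(h, dtype=float)
    onset = int(np.argmax(np.abs(h)))                 # assumed direct-path arrival
    split = onset + int(round(early_ms * 1e-3 * fs))  # boundary between early and late parts
    early = np.sum(h[onset:split] ** 2)
    late = np.sum(h[split:] ** 2)                     # must be nonzero for a finite result
    return 10.0 * np.log10(early / late)

# Toy response: unit direct path plus one reflection of amplitude 0.5 at 100 ms.
fs = 16000
h = np.zeros(fs)
h[0] = 1.0
h[int(0.1 * fs)] = 0.5
c50 = clarity_index(h, fs)
```

In the toy case the early energy is 1 and the late energy is 0.25, so the result is 10·log10(4), about 6 dB.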

    Ambient acoustics as indicator of environmental change in the Beaufort Sea: experiments & methods for analysis

    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, June 2021. The Arctic Ocean is a vital component of Earth’s climate system experiencing dramatic environmental changes. The changes are reflected in its underwater ambient soundscape, as its origin and propagation are primarily dependent on properties of the ice cover and water column. The first component of this work examines the effect on ambient noise characteristics of changes to the Beaufort Sea sound speed profile (SSP) and ice cover. Specifically, the emergence of a warm water intrusion near 70 m depth has altered the historical Arctic SSP, while the ice cover has become thinner and younger due to the rise in average global temperature. Hypothesized shifts to the ambient soundscape and surface noise generation due to these changes are verified by comparing the noise data measured during two experiments with modeled results. These changes include a broadside notch in noise vertical directionality as well as a shift from uniform surface noise generation to discrete generation at specific ranges. Motivated by our data analyses, the second component presents several tools to facilitate ambient noise characterization and generation monitoring. One is a convolutional neural network (CNN) approach to noise range estimation. Its robustness to SSP and bottom depth mismatch is compared with conventional matched field processing. We further explore how the CNN approach achieves its performance by examining its intermediate outputs. Another tool is a frequency-domain transient event detection algorithm that leverages image processing and hierarchical clustering to identify and categorize noise transients in data spectrograms. The spectral content retained by this method enables insight into the generation mechanism of the detected events by the ice cover. 
Lastly, we present the deployment of a seismo-acoustic system to localize transient events. Two forward approaches that utilize time-difference-of-arrival are described and compared with a more conventional, inverse technique. The examination of this system’s performance prompts recommendations for future deployments. With our ambient noise analysis and algorithm development, we hope these contributions provide a stronger foundation for continued study of the Arctic ambient soundscape as the region continues to grow in significance. Office of Naval Research (ONR) via the University of California - San Diego (UCSD) under award number N00014-16-1-2129. Defense Advanced Research Projects Agency (DARPA) via Applied Physical Sciences Corp. (APS) under award number HR0011-18-C-0008. Office of Naval Research (ONR) under award number N00014-17-1-2474. Office of Naval Research (ONR) under award number N00014-19-1-2741. National Science Foundation (NSF) under grant number 2389237.

    Underwater Source Localization based on Modal Propagation and Acoustic Signal Processing

    Acoustic localization plays a pivotal role in underwater vehicle systems and marine mammal detection. Previous efforts adopt synchronized arrays of sensors to extract features such as direction of arrival (DOA) or time of flight (TOF) from the received signal. However, installing and synchronizing several hydrophones over a large area is costly and challenging. To tackle this problem, we use a single-hydrophone localization system which relies on acoustic signal processing methods rather than multiple hydrophones. This system takes modal dispersion into consideration and estimates the distance between sound source and receiver (range) based on dispersion curves. It is shown that the larger the range is, the more separable the modes are. To make the modes more distinguishable, a non-linear signal processing technique called warping is utilized. The propagation model of low-frequency signals, such as dolphin sounds, is well studied in shallow-water environments (depth D < 200 m), and it has been demonstrated that at large ranges (range r > 1 km), modal dispersion is clearly visible in the time-frequency (TF) domain. We used the Pekeris model for the aforementioned situation to localize both synthetic and real underwater acoustic signals. The accuracy of the localization system is examined with various sounds, including impulsive signals, sounds with known Fourier transform, and signals with estimated source phase. Experimental results show that the warping technique can considerably lessen the localization error, especially when prior knowledge about the source signal and waveguide is available.

    Nonlinear filtering for narrow-band time delay estimation

    Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Includes bibliographical references (p. 101-103). This thesis presents a method for improving passive acoustic tracking. A large family of acoustic tracking systems combine estimates of the time difference of arrival (TDoA) between pairs of spatially separated sensors; this work improves those estimates by independently tracking each TDoA using a Bayesian filter. This tracking is particularly useful for overcoming spatial aliasing, which results from tracking narrowband, high-frequency sources. I develop a theoretical model for the evolution of each TDoA from a bound placed on the velocity of the target being tracked. This model enables an efficient form of exact marginalization. I then present simulation and experimental results demonstrating improved performance over a simpler nonlinear preprocessor and Kalman filtering, so long as this bound is chosen appropriately. By Mark M. Tobenkin. M.Eng.
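As background for the TDoA estimates this abstract refers to, here is a minimal sketch of the most basic estimator: taking the lag at which the cross-correlation of two sensor signals peaks. It is a generic baseline, assuming a broadband source and a single dominant path, not the Bayesian filter developed in the thesis. The function name `estimate_tdoa` is hypothetical.

```python
import numpy as np

def estimate_tdoa(x, y, fs):
    """Delay of y relative to x, in seconds, from the cross-correlation peak.

    A positive value means the signal reaches sensor y later than sensor x.
    """
    corr = np.correlate(y, x, mode="full")    # corr[i] pairs y[n + k] with x[n]
    lags = np.arange(-(len(x) - 1), len(y))   # lag k, in samples, for each corr[i]
    return lags[np.argmax(corr)] / fs

# Broadband test signal delayed by 25 samples at the second sensor.
rng = np.random.default_rng(0)
fs = 8000
x = rng.standard_normal(1000)
y = np.concatenate([np.zeros(25), x[:-25]])
tdoa = estimate_tdoa(x, y, fs)
```

For a narrowband source the correlation has near-equal peaks spaced one signal period apart, so this peak picking becomes ambiguous. That ambiguity is the spatial aliasing the abstract says the per-TDoA tracking helps to overcome.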

    Realization Limits of Impulse-Radio UWB Indoor Localization Systems

    In this work, the realization limits of an impulse-based Ultra-Wideband (UWB) localization system for indoor applications have been thoroughly investigated and verified by measurements. The analysis spans from the position calculation algorithms, through hardware realization and modeling, up to the localization experiments conducted in realistic scenarios. The main focus was on identifying and characterizing limiting factors and on developing methods to overcome them.

    The Future of the Operating Room: Surgical Preplanning and Navigation using High Accuracy Ultra-Wideband Positioning and Advanced Bone Measurement

    This dissertation embodies the diversity and creativity of my research, much of which has been peer-reviewed, published in archival-quality journals, and presented nationally and internationally. Portions of the work described herein have been published in the fields of image processing, forensic anthropology, physical anthropology, biomedical engineering, clinical orthopedics, and microwave engineering. The problem studied is primarily that of developing the tools and technologies for a next-generation surgical navigation system. The discussion focuses on the underlying technologies of a novel microwave positioning subsystem and a bone analysis subsystem. The methodologies behind each of these technologies are presented in the context of the overall system, with the salient results helping to elucidate the difficult facets of the problem. The microwave positioning system is currently the highest-accuracy wireless ultra-wideband positioning system that can be found in the literature. The challenges in producing a system with these capabilities are many, and the research and development in solving these problems should further the art of high-accuracy pulse-based positioning.

    Radiotherapy dosimetry with ultrasound contrast agents

