609 research outputs found

    Speaker Localization with Moving Microphone Arrays

    Speaker localization algorithms often assume a static location for all sensors. This assumption simplifies the models used, since all acoustic transfer functions are then linear and time-invariant. In many applications this assumption is not valid. In this paper we address the localization challenge with moving microphone arrays. We propose two algorithms to find the speaker position. The first approach is a batch algorithm based on the maximum likelihood criterion, optimized via expectation-maximization iterations. The second approach is a particle filter for sequential Bayesian estimation. The performance of both approaches is evaluated and compared for simulated reverberant audio data from a microphone array with two sensors.
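    The sequential approach mentioned in this abstract can be illustrated with a generic bootstrap particle filter. The sketch below is not the paper's algorithm: it assumes a 1-D static source, a sensor whose position is known at every step, and a noisy measurement of the offset between source and sensor. The function name, the search interval, and all noise parameters are hypothetical.

```python
import math
import random


def particle_filter(observations, sensor_positions, n_particles=500,
                    obs_std=0.5, jitter_std=0.05, seed=0):
    """Bootstrap particle filter for a static 1-D source seen from a moving sensor.

    Assumed measurement model (illustrative only): z_t = x - s_t + Gaussian noise,
    where x is the unknown source position and s_t the known sensor position.
    """
    rng = random.Random(seed)
    # Initialise particles uniformly over an assumed search interval [-10, 10].
    particles = [rng.uniform(-10.0, 10.0) for _ in range(n_particles)]
    estimates = []
    for z, s in zip(observations, sensor_positions):
        # Predict: a small random-walk jitter keeps the particle set diverse.
        particles = [p + rng.gauss(0.0, jitter_std) for p in particles]
        # Update: Gaussian likelihood of the observed offset given each particle.
        weights = [math.exp(-0.5 * ((z - (p - s)) / obs_std) ** 2)
                   for p in particles]
        total = sum(weights)
        if total == 0.0:
            # All likelihoods underflowed; fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        # Estimate: weighted posterior mean of the source position.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Resample (multinomial) in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


# Synthetic demo: source at 3.0, sensor drifting away from the origin.
demo_rng = random.Random(1)
sensors = [0.1 * t for t in range(50)]
obs = [3.0 - s + demo_rng.gauss(0.0, 0.5) for s in sensors]
estimates = particle_filter(obs, sensors)
```

    Because the sensor position enters the likelihood at every step, the filter needs no time-invariance assumption; a realistic version would replace the offset measurement with an acoustic observation model.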

    Distributed Affine Projection Algorithm Over Acoustically Coupled Sensor Networks

    In this paper, we present a distributed affine projection (AP) algorithm for an acoustic sensor network where the nodes are acoustically coupled. Every acoustic node is composed of a microphone, a processor, and an actuator to control the sound field. This type of network can use distributed adaptive algorithms to deal with the active noise control (ANC) problem in a cooperative manner, providing more flexible and scalable ANC systems. In this regard, we introduce a distributed version of the multichannel filtered-x AP algorithm over an acoustic sensor network, called the distributed filtered-x AP (DFxAP) algorithm. The analysis of the mean and the mean-square deviation performance of the algorithm at each node is given for a network with a ring topology and without constraints in the communication layer. The theoretical results are validated through several simulations. Moreover, simulations show that the proposed DFxAP outperforms the previously reported distributed multiple error filtered-x least mean square algorithm. This work was supported in part by EU together with Spanish Government under Grant TEC2015-67387-C4-1-R (MINECO/FEDER), and in part by Generalitat Valenciana under PROMETEOII/2014/003. Ferrer Contreras, M.; Gonzalez, A.; Diego Antón, MD.; Piñero, G. (2017). Distributed Affine Projection Algorithm Over Acoustically Coupled Sensor Networks. IEEE Transactions on Signal Processing. 65(24):6423-6434. https://doi.org/10.1109/TSP.2017.2742987

    Sequential estimation techniques and application to multiple speaker tracking and language modeling

    For many real-world applications, the data under consideration is a time sequence that becomes available in order, and that order carries important information about the entities of interest. The work presented in this thesis deals with two such cases by introducing new sequential estimation solutions. I. A sequential Bayesian estimation framework to solve the multiple speaker localization, detection and tracking problem. This framework is a complete pipeline that includes 1) new observation estimators, which extract a fixed number of potential locations per time frame; 2) new unsupervised Bayesian detectors, which classify these estimates into noise/speaker classes; and 3) new Bayesian filters, which use the speaker class estimates to track multiple speakers. This framework was developed to tackle the low overlap detection rate of multiple speakers and to reduce the number of constraints generally imposed in standard solutions. II. A sequential neural estimation framework for language modeling, which overcomes some of the shortcomings of standard approaches by merging different models in a hybrid architecture. That is, we introduce two solutions that tightly merge particular models and then show how a generalization can be achieved through a new mixture model. In order to speed up the training of large vocabulary language models, we introduce a new extension of the noise contrastive estimation approach to batch training.

    Studies on noise robust automatic speech recognition

    Noise in everyday acoustic environments such as cars, traffic environments, and cafeterias remains one of the main challenges in automatic speech recognition (ASR). As a research theme, it has received wide attention in conferences and scientific journals focused on speech technology. This article collection reviews both the classic and the novel approaches suggested for noise robust ASR. The articles are literature reviews written for the spring 2009 seminar course on noise robust automatic speech recognition (course code T-61.6060) held at TKK.

    Data-Efficient Learning via Minimizing Hyperspherical Energy

    Deep learning on large-scale data is dominant nowadays. The unprecedented scale of data has arguably been one of the most important driving forces behind the success of deep learning. However, there still exist scenarios where collecting data or labels can be extremely expensive, e.g., medical imaging and robotics. To fill this gap, this paper considers the problem of data-efficient learning from scratch using a small amount of representative data. First, we characterize this problem by active learning on homeomorphic tubes of spherical manifolds. This naturally generates a feasible hypothesis class. With homologous topological properties, we identify an important connection: finding tube manifolds is equivalent to minimizing hyperspherical energy (MHE) in physical geometry. Inspired by this connection, we propose an MHE-based active learning (MHEAL) algorithm, and provide comprehensive theoretical guarantees for MHEAL, covering convergence and generalization analysis. Finally, we demonstrate the empirical performance of MHEAL in a wide range of applications on data-efficient learning, including deep clustering, distribution matching, version space sampling and deep active learning.
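    The abstract does not state the energy formula. A minimal sketch, assuming the commonly used Riesz s-energy form of hyperspherical energy (the sum of inverse pairwise distances between points projected onto the unit sphere, with a logarithmic kernel for s = 0), is given below. The function names and the choice to sum over unordered pairs are assumptions for illustration, not the MHEAL algorithm itself.

```python
import math
from itertools import combinations


def normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot project the zero vector onto the sphere")
    return [x / norm for x in v]


def hyperspherical_energy(vectors, s=1.0):
    """Riesz s-energy of the normalized vectors, summed over unordered pairs.

    For s > 0 the pair kernel is d ** (-s); for s == 0 it is log(1 / d),
    where d is the Euclidean distance between two points on the sphere.
    """
    points = [normalize(v) for v in vectors]
    energy = 0.0
    for p, q in combinations(points, 2):
        d = math.dist(p, q)
        if d == 0.0:
            # Coincident points repel infinitely strongly.
            return math.inf
        energy += -math.log(d) if s == 0 else d ** (-s)
    return energy


# Two antipodal points are as far apart as possible on the circle.
antipodal = hyperspherical_energy([[1.0, 0.0], [-1.0, 0.0]])  # 0.5
# Orthogonal points are closer together, so their energy is higher.
orthogonal = hyperspherical_energy([[1.0, 0.0], [0.0, 1.0]])
```

    Lower energy corresponds to points that are spread more evenly over the sphere, which is the intuition behind selecting a small but representative subset of data.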

    Underwater simulation and mapping using imaging sonar through ray theory and Hilbert maps

    Mapping, sometimes as part of a SLAM system, is an active topic of research and has remarkable solutions using laser scanners, but most underwater mapping is focused on 2D maps, which treat the environment as a floor plan, or on 2.5D maps of the seafloor. The difficulty of underwater mapping originates in its sensors, i.e., sonars. In contrast to lasers (LIDARs), sonars are imprecise, high-noise sensors. Besides their noise, imaging sonars have a wide sound beam, which results in a volumetric measurement. The first part of this dissertation develops an underwater simulator for high-frequency single-beam imaging sonars capable of replicating multipath, directional gain and typical noise effects in arbitrary environments. The simulation relies on a ray theory based method, and explanations of how this theory follows from first principles under the short-wavelength assumption are provided. In the second part of this dissertation, the simulator is combined with a continuous map algorithm based on Hilbert maps. Hilbert maps arise as a machine learning technique over Hilbert spaces, using feature maps, applied to the mapping context. The embedding of a sonar response in such a map is a contribution of this work. A qualitative comparison between the simulator ground truth and the reconstructed map reveals Hilbert maps as a promising technique for mapping with noisy sensors and also indicates some characteristics of the surroundings that are hard to distinguish, e.g., corners and non-smooth features.

    Towards Cognizant Hearing Aids: Modeling of Content, Affect and Attention

    Recent Advances in Signal Processing

    Signal processing is a critical component of most new technological inventions and poses challenges in a wide variety of applications in both science and engineering. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian. They have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five different areas depending on the application at hand. These five categories address, in order, image processing, speech processing, communication systems, time-series analysis, and educational packages. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity.