149 research outputs found

    Joint model-based recognition and localization of overlapped acoustic events using a set of distributed small microphone arrays

    In the analysis of acoustic scenes, the occurring sounds often have to be detected in time, recognized, and localized in space. Usually, each of these tasks is done separately. In this paper, a model-based approach that carries them out jointly for the case of multiple simultaneous sources is presented and tested. The recognized event classes and their respective room positions are obtained with a single system that maximizes the combination of a large set of scores, each one resulting from a different acoustic event model and a different beamformer output signal, which comes from one of several arbitrarily located small microphone arrays. Experimental work using a two-step method is reported for a specific scenario consisting of meeting-room acoustic events, either isolated or overlapped with speech. Tests carried out with two datasets show the advantage of the proposed approach over some usual techniques, and show that the inclusion of estimated priors brings a further performance improvement. Comment: Computational acoustic scene analysis, microphone array signal processing, acoustic event detection

    Speech Enhancement Usable in Practice across Diverse Environments with Varying Noise Characteristics

    筑波大学 (University of Tsukuba)201

    Linear Transmit-Receive Strategies for Multi-user MIMO Wireless Communications

    To combat interference and exploit the large multiplexing gains of multi-antenna systems, spatial division multiple access (SDMA) techniques have attracted particular interest. Linear precoding, one of the SDMA strategies, has received growing attention because the increasing number of users and antennas in existing and future mobile communication systems requires a simplified precoding design. This thesis therefore contributes to the design of linear transmit and receive strategies for multi-user MIMO broadcast channels in a single cell and in clustered multiple cells. First, we present a throughput approximation framework for multi-user MIMO broadcast channels employing regularized block diagonalization (RBD) linear precoding. Comparing dirty paper coding (DPC) with linear precoding algorithms (e.g., zero forcing (ZF) and block diagonalization (BD)), we further quantify lower and upper bounds on the rate and power offset between them as a function of system parameters such as the number of users and antennas. Next, we develop a novel closed-form coordinated beamforming (CBF) algorithm (SeDJoCo-based closed-form CBF) to solve the existing open problem of CBF. The new algorithm supports a MIMO system with an arbitrary number of users and transmit antennas. Moreover, it applies not only to CBF but also to blind source separation (BSS), since BSS uses the same mathematical model. We then propose a new iterative CBF algorithm, flexible coordinated beamforming (FlexCoBF), for multi-user MIMO broadcast channels. Compared with existing iterative CBF algorithms, its most promising advantage is the freedom it provides in choosing the linear transmit and receive beamforming strategies: any existing linear precoding method can serve as the transmit strategy, and the receive beamforming strategy can be chosen flexibly from MRC or MMSE receivers. For clustered multiple-cell scenarios, we extend the FlexCoBF algorithm further and introduce the concept of coordinated multipoint (CoMP) transmission. Finally, we present three strategies for channel state information (CSI) acquisition under various channel conditions and channel estimation strategies. CSI knowledge is required at the base station in order to implement SDMA techniques, and the quality of the obtained CSI heavily affects system performance. The performance enhancement achieved by our new strategies is demonstrated by numerical simulation results in terms of the system sum rate and the bit error rate.

    Source Localization for Dual Speech Enhancement Technology

    Assessment of Measurement Distortions in GNSS Antenna Array Space-Time Processing

    Antenna array processing techniques are studied in GNSS as effective tools to mitigate interference in the spatial and spatiotemporal domains. However, without specific considerations, array processing results in biases and distortions in the cross-ambiguity function (CAF) of the ranging codes. In space-time processing (STP), the CAF can be misshapen by the combined effect of space-time processing and unintentional signal attenuation by filtering. This paper focuses on characterizing these degradations for different controlled signal scenarios and for live data from an antenna array. The antenna array simulation method introduced in this paper enables accurate analyses in the field of STP. The effects of the relative placement of the interference source with respect to the desired signal direction are shown using overall measurement errors and the profile of the signal strength. The contributions from each source of distortion are analyzed individually and collectively. The effects of distortions on GNSS pseudorange errors and position errors are compared for blind, semi-distortionless, and distortionless beamforming methods. The characterization results can be useful for designing low-distortion filters, which are especially important for high-accuracy GNSS applications in challenging environments.

    Studies on noise robust automatic speech recognition

    Noise in everyday acoustic environments such as cars, traffic environments, and cafeterias remains one of the main challenges in automatic speech recognition (ASR). As a research theme, it has received wide attention in conferences and scientific journals focused on speech technology. This article collection reviews both the classic and novel approaches suggested for noise-robust ASR. The articles are literature reviews written for the spring 2009 seminar course on noise robust automatic speech recognition (course code T-61.6060) held at TKK.

    Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation

    Recently, frequency-domain all-neural beamforming methods have achieved remarkable progress in multichannel speech separation. In parallel, the integration of time-domain network structures with beamforming has also gained significant attention. This study proposes a novel all-neural beamforming method in the time domain and attempts to unify the all-neural beamforming pipelines for time-domain and frequency-domain multichannel speech separation. The proposed model consists of two modules: separation and beamforming. Both modules perform temporal-spectral-spatial modeling and are trained end-to-end using a joint loss function. The novelty of this study is twofold. First, a time-domain directional feature conditioned on the direction of the target speaker is proposed, which can be jointly optimized within the time-domain architecture to enhance target signal estimation. Second, an all-neural beamforming network in the time domain is designed to refine the pre-separated results. This module features parametric time-variant beamforming coefficient estimation, without explicitly following the derivation of optimal filters, which may lead to an upper bound. The proposed method is evaluated on simulated reverberant overlapped speech data derived from the AISHELL-1 corpus. Experimental results demonstrate significant performance improvements over frequency-domain state-of-the-art methods, ideal magnitude masks, and existing time-domain neural beamforming methods.