7 research outputs found
An investigation of the utility of monaural sound source separation via nonnegative matrix factorization applied to acoustic echo and reverberation mitigation for hands-free telephony
In this thesis we investigate the applicability and utility of Monaural Sound Source Separation (MSSS) via Nonnegative Matrix Factorization (NMF) for various problems related to audio for hands-free telephony. We first investigate MSSS via NMF as an alternative to existing acoustic echo reduction approaches such as Acoustic Echo Cancellation (AEC). To this end, we present the single-channel acoustic echo problem as an MSSS problem, in which the objective is to extract the near-end user's signal from a mixture that also contains acoustic echo and noise. To perform separation, NMF is used to decompose the near-end microphone signal onto the union of two nonnegative bases in the magnitude Short Time Fourier Transform (STFT) domain. One of these bases represents the spectral energy of the acoustic echo signal and is formed from the incoming far-end user’s speech, while the other represents the spectral energy of the near-end speaker and is trained a priori with speech data. In comparison to AEC, the speaker extraction approach obviates Double-Talk Detection (DTD), and is demonstrated to attain its maximal echo mitigation performance immediately upon initiation and to maintain that performance during and after room changes, for similar computational requirements. Speaker extraction is also shown to introduce distortion of the near-end speech signal during double-talk, which is quantified by means of a speech distortion measure and compared to that of AEC. Subsequently, we address DTD for block-based AEC algorithms. We propose a novel block-based DTD algorithm that uses the available signals and the estimate of the echo signal produced by NMF-based speaker extraction to compute a suitably normalized correlation-based decision variable, which is compared to a fixed threshold to decide on double-talk.
Using a standard evaluation technique, the proposed algorithm is shown to have detection performance comparable to that of an existing conventional block-based DTD algorithm. It is also demonstrated to inherit the room change insensitivity of speaker extraction: compared to the existing conventional DTD, the proposed DTD algorithm generates minimal false double-talk indications upon initiation and in response to room changes. We also show that this property allows its paired AEC to converge at a rate close to the optimum. Another focus of this thesis is the problem of inverting a single measurement of a non-minimum phase Room Impulse Response (RIR). We describe the process by which perceptually detrimental all-pass phase distortion arises in reverberant speech filtered by the inverse of the minimum phase component of the RIR; in short, such distortion arises from inverting the magnitude response of the high-Q maximum phase zeros of the RIR. We then propose two novel partial inversion schemes that precisely mitigate this distortion. One of these schemes employs NMF-based MSSS to separate the all-pass phase distortion from the target speech in the magnitude STFT domain, while the other modifies the inverse minimum phase filter such that the magnitude response of the maximum phase zeros of the RIR is not fully compensated. Subjective listening tests reveal that the proposed schemes generally produce better quality output speech than a comparable inversion technique.
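The decomposition onto the union of two nonnegative bases described in this abstract can be illustrated with a small sketch. This is not the thesis implementation: the function name `extract_near_end`, the Euclidean cost with multiplicative updates, the fixed (non-adapted) bases, and the soft mask used to form the near-end estimate are all assumptions made for illustration.

```python
import numpy as np

def extract_near_end(V, W_echo, W_speech, n_iter=200, eps=1e-12):
    """Split a magnitude spectrogram V (freq x frames) into echo and speech parts.

    Assumptions (not from the abstract): both bases are held fixed, only the
    activations H are updated, and the cost is the Euclidean distance.
    """
    W = np.hstack([W_echo, W_speech])            # union of the two bases
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    WtV = W.T @ V
    WtW = W.T @ W
    for _ in range(n_iter):
        # Multiplicative update for H; keeps every activation nonnegative.
        H *= WtV / (WtW @ H + eps)
    k = W_echo.shape[1]
    V_echo = W_echo @ H[:k]                      # echo part of the model
    V_speech = W_speech @ H[k:]                  # near-end speech part
    # Soft mask in [0, 1] applied to the mixture magnitudes.
    mask = V_speech / (V_echo + V_speech + eps)
    return mask * V, V_echo
```

In a real system `V` would be the magnitude STFT of the microphone signal, and the masked magnitudes would be recombined with the mixture phase before the inverse STFT; those steps are omitted here.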
Advanced Algebraic Concepts for Efficient Multi-Channel Signal Processing
Modern society is undergoing a fundamental change in the way we interact with technology.
More and more devices are becoming "smart" by gaining advanced computation capabilities
and communication interfaces, from household appliances through transportation systems to large-scale
networks like the power grid. Recording, processing, and exchanging digital information
is thus becoming increasingly important. As a growing share of devices is nowadays mobile
and hence battery-powered, efficient digital signal processing techniques are of particular
interest. Efficiency also serves the demand for real-time processing of the large volumes
of data that arise.
This thesis contributes to this goal by demonstrating methods for finding efficient algebraic
solutions to various applications of multi-channel digital signal processing. These may not
always result in the best possible system performance. However, they often come close while
being significantly simpler to describe and to implement. The simpler description facilitates a
thorough analysis of their performance which is crucial to design robust and reliable systems.
Because they rely only on standard algebraic methods, they can be implemented rapidly
and tested under real-world conditions.
We demonstrate this concept in three different application areas. First, we present a semi-algebraic
framework to compute the Canonical Polyadic (CP) decomposition of multidimensional
signals, a fundamental tool in multilinear algebra with applications ranging from
chemistry through communications to image compression. Compared to state-of-the-art iterative
solutions, our framework offers a flexible control of the complexity-accuracy trade-off and
is less sensitive to badly conditioned data. The second application area is multidimensional
subspace-based high-resolution parameter estimation with applications in RADAR, wave propagation
modeling, or biomedical imaging. We demonstrate that multidimensional signals can
be represented by tensors, which provides a convenient description and exploits the
multidimensional structure better than matrices alone. Based on this idea,
we introduce the tensor-based subspace estimate which can be applied to enhance existing
matrix-based parameter estimation schemes significantly. We demonstrate the enhancements
by choosing the family of ESPRIT-type algorithms as an example and introducing enhanced
versions that exploit the multidimensional structure (Tensor-ESPRIT), non-circular source
amplitudes (NC ESPRIT), and both jointly (NC Tensor-ESPRIT). To objectively judge the
resulting estimation accuracy, we derive a framework for the analytical performance assessment
of arbitrary ESPRIT-type algorithms by means of an asymptotic first-order perturbation
expansion. Our results are more general than existing analytical results since we do not need
any assumptions about the distribution of the desired signal and the noise and we do not
require the number of samples to be large. Finally, we obtain simplified expressions for the
mean square estimation error that provide insights into the efficiency of the methods under various
conditions. The third application area is bidirectional relay-assisted communications. Due to
its particularly low complexity and its efficient use of the radio resources, we choose two-way
relaying with a MIMO amplify-and-forward relay. We demonstrate that the required channel
knowledge can be obtained by a simple algebraic tensor-based channel estimation scheme. We
also discuss the design of the relay amplification matrix in such a setting. Existing approaches
are either based on complicated numerical optimization procedures or on ad-hoc solutions
that do not perform well in terms of the bit error rate or the sum-rate. Therefore, we propose
algebraic solutions that are inspired by these performance metrics and therefore perform well
while being easy to compute. For the MIMO case, we introduce the algebraic norm maximizing
(ANOMAX) scheme, which achieves a very low bit error rate, and its extension Rank-Restored
ANOMAX (RR-ANOMAX) that achieves a sum-rate close to an upper bound. Moreover, for
the special case of single antenna terminals we derive the semi-algebraic RAGES scheme which
finds the sum-rate optimal relay amplification matrix based on generalized eigenvectors. Numerical
simulations evaluate the resulting system performance in terms of bit error rate and
system sum-rate, demonstrating the effectiveness of the proposed algebraic solutions.
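For orientation, the matrix-based baseline that the ESPRIT-type extensions in this abstract build on can be sketched in a few lines. This is standard 1-D least-squares ESPRIT for a uniform linear array, not the tensor-based or non-circular variants proposed in the thesis; the function name `esprit_1d` and the array model are assumptions made for illustration.

```python
import numpy as np

def esprit_1d(X, d):
    """Estimate d spatial frequencies from snapshots X (M sensors x N snapshots).

    Assumes a uniform linear array, so the signal subspace is shift-invariant:
    its first M-1 rows map onto its last M-1 rows through a d x d matrix Psi
    whose eigenvalues are exp(1j * mu_i).
    """
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    Us = U[:, :d]                                   # signal subspace estimate
    # Least-squares solution of the shift-invariance equation Us[:-1] @ Psi = Us[1:].
    Psi = np.linalg.lstsq(Us[:-1], Us[1:], rcond=None)[0]
    return np.sort(np.angle(np.linalg.eigvals(Psi)))
```

The tensor-based subspace estimate described above would replace the plain SVD step with a subspace estimate that exploits the multidimensional structure; that refinement is not shown here.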
Algorithms and Systems for IoT and Edge Computing
The idea of distributing the signal processing along the path that starts with the acquisition and ends with the final application has given rise to the Internet of Things and Edge Computing, which have demonstrated several advantages in terms of scalability, costs, and reliability. In this dissertation, we focus on designing and implementing algorithms and systems that allow complex tasks to be performed on devices with limited resources.
Firstly, we assess the trade-off between compression and anomaly detection from both a theoretical and a practical point of view. Information theory provides the rate-distortion analysis that is extended to consider how information content is processed for detection purposes. Considering an actual Structural Health Monitoring application, two corner cases are analysed: detection in high distortion based on a feature extraction method and detection with low distortion based on Principal Component Analysis.
Secondly, we focus on streaming methods for Subspace Analysis. In this context, we revise and study state-of-the-art methods to target devices with limited computational resources. We also consider a real case of deployment of an algorithm for streaming Principal Component Analysis for signal compression in a Structural Health Monitoring application, discussing the trade-off between the possible implementation strategies.
Finally, we focus on Compressed Sensing, an alternative compression framework suited to low-end devices. We propose a different decoding approach that splits the recovery problem into two stages and uses a deep neural network together with basic linear algebra to reconstruct biomedical signals. This novel approach outperforms the state of the art in terms of quality of reconstruction and requires fewer computational resources.
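The abstract does not spell out the two decoding stages. One plausible reading, assumed here purely for illustration, is that a first stage estimates which coefficients of the sparse signal are nonzero and a second stage solves a small least-squares problem restricted to that support. In the sketch below, `oracle_support` is a hypothetical stand-in for the first stage (it is not a neural network), and `recover_on_support` shows only the linear-algebra step.

```python
import numpy as np

def oracle_support(x_true):
    """Hypothetical stand-in for a learned support estimator (assumption)."""
    return np.flatnonzero(x_true)

def recover_on_support(A, y, support):
    """Least-squares recovery of x from y = A @ x, restricted to `support`.

    A is the m x n sensing matrix (m < n); entries of x outside the support
    are set to zero.
    """
    x_hat = np.zeros(A.shape[1])
    coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
    x_hat[support] = coeffs
    return x_hat
```

When the support is correct and has fewer entries than there are measurements, the restricted system is overdetermined, so this step costs far less than an iterative sparse solver over all n unknowns.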
Abstracts on Radio Direction Finding (1899 - 1995)
The files on this record represent the various databases that originally composed the CD-ROM issue of "Abstracts on Radio Direction Finding" database, which is now part of the Dudley Knox Library's Abstracts and Selected Full Text Documents on Radio Direction Finding (1899 - 1995) Collection. (See Calhoun record https://calhoun.nps.edu/handle/10945/57364 for further information on this collection and the bibliography).
Because technological obsolescence prevents current and future audiences from accessing the bibliography, DKL exported the various databases contained in the CD-ROM and converted them into the three files on this record.
The contents of these files are:
1) RDFA_CompleteBibliography_xls.zip [RDFA_CompleteBibliography.xls: Metadata for the complete bibliography, in Excel 97-2003 Workbook format; RDFA_Glossary.xls: Glossary of terms, in Excel 97-2003 Workbook format; RDFA_Biographies.xls: Biographies of leading figures, in Excel 97-2003 Workbook format];
2) RDFA_CompleteBibliography_csv.zip [RDFA_CompleteBibliography.TXT: Metadata for the complete bibliography, in CSV format; RDFA_Glossary.TXT: Glossary of terms, in CSV format; RDFA_Biographies.TXT: Biographies of leading figures, in CSV format];
3) RDFA_CompleteBibliography.pdf: A human-readable display of the bibliographic data, as a means of double-checking any possible deviations due to conversion.