
    Time-Varying Discrete-Time Wavelet Transforms

    Systematic hybrid analog/digital signal coding

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2000. Includes bibliographical references (p. 201-206). This thesis develops low-latency, low-complexity signal processing solutions for systematic source coding, or source coding with side information at the decoder. We consider an analog source signal transmitted through a hybrid channel that is the composition of two channels: a noisy analog channel through which the source is sent unprocessed and a secondary rate-constrained digital channel; the source is processed prior to transmission through the digital channel. The challenge is to design a digital encoder and decoder that provide a minimum-distortion reconstruction of the source at the decoder, which has observations of analog and digital channel outputs. The methods described in this thesis have importance to a wide array of applications. For example, in the case of in-band on-channel (IBOC) digital audio broadcast (DAB), an existing noisy analog communications infrastructure may be augmented by a low-bandwidth digital side channel for improved fidelity, while compatibility with existing analog receivers is preserved. Another application is a source coding scheme which devotes a fraction of available bandwidth to the analog source and the rest of the bandwidth to a digital representation. This scheme is applicable in a wireless communications environment (or any environment with unknown SNR), where analog transmission has the advantage of a gentle roll-off of fidelity with SNR. A very general paradigm for low-latency, low-complexity source coding is composed of three basic cascaded elements: 1) a space rotation, or transformation, 2) quantization, and 3) lossless bitstream coding. The paradigm has been applied with great success to conventional source coding, and it applies equally well to systematic source coding.
Focusing on the case involving a Gaussian source, Gaussian channel and mean-squared distortion, we determine optimal or near-optimal components for each of the three elements, each of which has analogous components in conventional source coding. The space rotation can take many forms such as linear block transforms, lapped transforms, or subband decomposition, for all of which we derive conditions of optimality. For a very general case we develop algorithms for the design of locally optimal quantizers. For the Gaussian case, we describe a low-complexity scalar quantizer, the nested lattice scalar quantizer, that has performance very near that of the optimal systematic scalar quantizer. Analogous to entropy coding for conventional source coding, Slepian-Wolf coding is shown to be an effective lossless bitstream coding stage for systematic source coding. By Richard J. Barron. Ph.D.
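
The idea of quantizing with side information at the decoder can be illustrated with a small sketch. This is not the thesis's quantizer design: it is a generic nested uniform scalar quantizer, and the function names, the step size, the number of cosets and the bounded-noise model are all assumptions chosen for illustration. The encoder sends only the coset index of a fine-grid point; the decoder uses the noisy analog observation to pick the right member of that coset.

```python
import random

def nsq_encode(x, step, n_cosets):
    """Quantize x on a fine uniform grid and keep only the coset index."""
    q = round(x / step)          # index of the nearest fine-grid point
    return q % n_cosets          # only this index is sent over the digital channel

def nsq_decode(coset, y, step, n_cosets):
    """Return the fine-grid point in the signalled coset that lies closest to y."""
    k = round((y / step - coset) / n_cosets)
    return (coset + n_cosets * k) * step

if __name__ == "__main__":
    random.seed(0)
    step, n_cosets = 0.1, 8              # illustrative values, not from the thesis
    x = random.gauss(0.0, 1.0)           # source sample
    y = x + random.uniform(-0.3, 0.3)    # analog observation with bounded noise (assumption)
    x_hat = nsq_decode(nsq_encode(x, step, n_cosets), y, step, n_cosets)
    print(x, y, x_hat)
```

In this sketch the decoder recovers the encoder's fine-grid point whenever the observation lies within `n_cosets * step / 2` of it; larger noise selects the wrong coset member. Reconstruction here is the grid point itself rather than a minimum mean-squared error estimate.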

    A turbo-coded burst-by-burst adaptive wide-band speech transceiver

    Orthogonal transmultiplexers : extensions to digital subscriber line (DSL) communications

    This dissertation investigates an orthogonal transmultiplexer that unifies multirate filter bank theory and communications theory. Various extensions of orthogonal transmultiplexer techniques are made for digital subscriber line communication applications. It is shown that the theoretical performance bounds of single carrier modulation based transceivers and multicarrier modulation based transceivers are the same under the same operational conditions. The investigation considers single carrier based transceiver systems such as Quadrature Amplitude Modulation (QAM) and Carrierless Amplitude and Phase (CAP) modulation, and multicarrier based transceiver systems such as Orthogonal Frequency Division Multiplexing (OFDM) or Discrete Multi Tone (DMT) and Discrete Subband (Wavelet) Multicarrier based transceiver (DSBMT) techniques. The performance and robustness of DMT and DSBMT based transceiver systems under narrow band interference are also investigated. It is shown that the performance of a DMT based transceiver system is quite sensitive to the location and strength of a single tone (narrow band) interference, and that an adaptive interference exciser can alleviate this sensitivity. The improved spectral properties of the DSBMT technique reduce the performance sensitivity to variations in narrow band interference. It is shown that the DSBMT technique outperforms DMT and is more robust. Optimal orthogonal basis design using a cosine modulated multirate filter bank is discussed. An adaptive linear combiner at the output of the analysis filter bank is implemented to eliminate intersymbol and interchannel interference. It is shown that DSBMT is the most suitable of the techniques considered for a narrow band interference environment.
A blind channel identification scheme and an optimal MMSE based equalizer employing a nonmaximally decimated filter bank precoder / postequalizer structure are proposed. The performance of the blind channel identification scheme is shown not to be sensitive to the characteristics of the unknown channel. The performance of the proposed optimal MMSE based equalizer is shown to be superior to that of the zero-forcing equalizer.

    Separation and Count Estimation of Audio Signal Sources with Time and Frequency Overlap (Trennung und Schätzung der Anzahl von Audiosignalquellen mit Zeit- und Frequenzüberlappung)

    Everyday audio recordings involve mixture signals: music contains a mixture of instruments; in a meeting or conference, there is a mixture of human voices. For these mixtures, automatically separating or estimating the number of sources is a challenging task. A common assumption when processing mixtures in the time-frequency domain is that sources are not fully overlapped. However, in this work we consider some cases where the overlap is severe, for instance when instruments play the same note (unison) or when many people speak concurrently ("cocktail party"), which highlights the need for new representations and more powerful models. To address the problems of source separation and count estimation, we use conventional signal processing techniques as well as deep neural networks (DNN). We first address the source separation problem for unison instrument mixtures, studying the distinct spectro-temporal modulations caused by vibrato. To exploit these modulations, we developed a method based on time warping, informed by an estimate of the fundamental frequency. For cases where such estimates are not available, we present an unsupervised model, inspired by the way humans group time-varying sources (common fate). This contribution comes with a novel representation that improves separation for overlapped and modulated sources on unison mixtures but also improves vocal and accompaniment separation when used as an input for a DNN model. Then, we focus on estimating the number of sources in a mixture, which is important for real-world scenarios. Our work on count estimation was motivated by a study on how humans can address this task, which led us to conduct listening experiments, confirming that humans are only able to estimate the number of up to four sources correctly. To answer the question of whether machines can perform similarly, we present a DNN architecture, trained to estimate the number of concurrent speakers.
Our results show improvements compared to other methods, and the model even outperformed humans on the same task. In both the source separation and source count estimation tasks, the key contribution of this thesis is the concept of "modulation", which is important to computationally mimic human performance. Our proposed Common Fate Transform is an adequate representation to disentangle overlapping signals for separation, and an inspection of our DNN count estimation model revealed that it proceeds to find modulation-like intermediate features.

    Frequency-warped autoregressive modeling and filtering

    This thesis consists of an introduction and nine articles. The articles are related to the application of frequency-warping techniques to audio signal processing, and in particular, predictive coding of wideband audio signals. The introduction reviews the literature and summarizes the results of the articles. Frequency-warping, or simply warping, techniques are based on a modification of a conventional signal processing system so that the inherent frequency representation in the system is changed. It is demonstrated that this may be done for basically all traditional signal processing algorithms. In audio applications it is beneficial to modify the system so that the new frequency representation is close to that of human hearing. One of the articles is a tutorial paper on the use of warping techniques in audio applications. The majority of the articles study warped linear prediction (WLP) and its use in wideband audio coding. It is proposed that warped linear prediction would be a particularly attractive method for low-delay wideband audio coding. Warping techniques are also applied to various modifications of classical linear predictive coding techniques. This was made possible partly by the introduction of a class of new implementation techniques for recursive filters in one of the articles. The proposed implementation algorithm for recursive filters having delay-free loops is a generic technique. This inspired an article which introduces a generalized warped linear predictive coding scheme. One example of the generalized approach is a linear predictive algorithm using an almost logarithmic frequency representation.
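
A common way to change the frequency representation of a system, and the one assumed in this sketch, is to replace each unit delay with a first-order all-pass section D(z) = (z^-1 - lam) / (1 - lam z^-1). The abstract does not state which construction the thesis uses, so the all-pass form, the function names and the value lam = 0.75 are illustrative assumptions. The code compares the closed-form warped frequency with the negated phase of the all-pass section.

```python
import cmath
import math

def allpass_response(omega, lam):
    """Frequency response of D(z) = (z^-1 - lam) / (1 - lam z^-1) at z = exp(j*omega)."""
    z_inv = cmath.exp(-1j * omega)
    return (z_inv - lam) / (1 - lam * z_inv)

def warped_frequency(omega, lam):
    """Closed-form frequency map induced by the all-pass substitution."""
    return omega + 2 * math.atan(lam * math.sin(omega) / (1 - lam * math.cos(omega)))

if __name__ == "__main__":
    lam = 0.75  # illustrative warping coefficient (assumption)
    for omega in (0.1, 0.5, 1.0, 2.0, 3.0):
        from_phase = -cmath.phase(allpass_response(omega, lam))
        print(omega, warped_frequency(omega, lam), from_phase)
```

For a positive coefficient the map stretches low frequencies and compresses high ones, which is the direction needed to move a uniform frequency axis toward an auditory one; lam = 0 leaves the axis unchanged.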

    Channel estimation techniques for filter bank multicarrier based transceivers for next generation of wireless networks

    A dissertation submitted to the Faculty of Engineering and the Built Environment, University of the Witwatersrand, Johannesburg, in fulfillment of the requirements for the degree of Master of Science in Engineering (Electrical and Information Engineering), August 2017. The fourth generation (4G) of wireless communication systems is designed based on the principles of cyclic prefix orthogonal frequency division multiplexing (CP-OFDM), where the cyclic prefix (CP) is used to combat inter-symbol interference (ISI) and inter-carrier interference (ICI) in order to achieve higher data rates than the previous generations of wireless networks. Various filter bank multicarrier (FBMC) systems have been considered as potential waveforms for the fast emerging next generation (xG) of wireless networks, especially the fifth generation (5G) networks. Some examples of the considered waveforms are orthogonal frequency division multiplexing with offset quadrature amplitude modulation (OFDM/OQAM) based filter banks, universal filtered multicarrier (UFMC), bi-orthogonal frequency division multiplexing (BFDM) and generalized frequency division multiplexing (GFDM). In perfect reconstruction (PR) or near perfect reconstruction (NPR) filter bank designs, these FBMC waveforms use well-designed prototype filters (from which the synthesis and analysis filter banks are designed) to either replace or minimize the CP usage of 4G networks, providing higher spectral efficiency and therefore higher overall data rates. Accurate design of the FIR low-pass prototype filter in NPR filter banks results in minimal signal distortion, making the analysis filter bank a time-reversed version of the corresponding synthesis filter bank.
However, in non-perfect reconstruction (Non-PR) the analysis filter bank is not directly a time-reversed version of the corresponding synthesis filter bank, as the prototype filter impulse response for this system is formulated (in this dissertation) by the introduction of randomly generated errors. Hence, aliasing and amplitude distortions are more prominent for Non-PR. Channel estimation (CE) is used to predict the behaviour of the frequency selective channel and is usually adopted to ensure excellent reconstruction of the transmitted symbols. These techniques can be broadly classified as pilot based, semi-blind and blind channel estimation schemes. In this dissertation, two linear pilot based CE techniques, namely least squares (LS) and linear minimum mean square error (LMMSE), and three adaptive channel estimation schemes, namely least mean square (LMS), normalized least mean square (NLMS) and recursive least squares (RLS), are presented, analyzed and documented. These are implemented while exploiting the near orthogonality properties of offset quadrature amplitude modulation (OQAM) to mitigate the effects of interference for two filter bank waveforms (i.e. OFDM/OQAM and GFDM/OQAM) for the next generation of wireless networks, assuming conditions of both NPR and Non-PR in slow and fast frequency selective Rayleigh fading channels. Computer simulation results showed that the channel estimation schemes performed better in an NPR filter bank system than in Non-PR filter banks. The low performance of the Non-PR system is due to the amplitude distortion and aliasing introduced by the random errors used in designing its prototype filters. It can be concluded that the RLS, NLMS, LMS, LMMSE and LS channel estimation schemes offered the best normalized mean square error (NMSE) and bit error rate (BER) performances (in decreasing order) for both waveforms assuming both NPR and Non-PR filter banks.
Keywords: Channel estimation, Filter bank, OFDM/OQAM, GFDM/OQAM, NPR, Non-PR, 5G, Frequency selective channel.
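
Two of the estimator families named in the abstract can be sketched in a few lines. This is a simplified illustration, not the dissertation's simulation setup: it ignores the filter bank and the OQAM interference structure, uses a real-valued noise-free FIR channel for the adaptive part, and all function names, tap values and step sizes are hypothetical. The least squares estimate divides each received pilot by the known transmitted pilot; the NLMS estimator adapts FIR taps from a known training sequence.

```python
import random

def ls_estimate(rx_pilots, tx_pilots):
    """Per-subcarrier least squares estimate H[k] = Y[k] / X[k]."""
    return [y / x for y, x in zip(rx_pilots, tx_pilots)]

def nlms_identify(x, d, n_taps, mu=0.5, eps=1e-8):
    """Estimate real FIR channel taps from known input x and observed output d."""
    w = [0.0] * n_taps
    buf = [0.0] * n_taps                      # most recent input sample first
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))
        e = dn - y                            # a priori estimation error
        norm = eps + sum(b * b for b in buf)  # input energy used for normalization
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
    return w

def fir_filter(h, x):
    """Causal FIR convolution, truncated to the length of x."""
    return [sum(h[j] * x[n - j] for j in range(len(h)) if n - j >= 0)
            for n in range(len(x))]

if __name__ == "__main__":
    rng = random.Random(1)
    h_true = [0.9, -0.4, 0.2]                 # hypothetical channel taps
    x = [rng.choice((-1.0, 1.0)) for _ in range(400)]
    d = fir_filter(h_true, x)                 # noise-free observation (assumption)
    print(nlms_identify(x, d, len(h_true)))
    print(ls_estimate([0.5 + 0.5j, 1.0 - 2.0j], [1 + 0j, 0 + 1j]))
```

The normalization by input energy is what distinguishes NLMS from plain LMS: it makes the effective step size independent of the signal level.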

    Localized discrete fourier transform spread OFDM (DFT-SOFDM) systems for 4G wireless communication

    Master's thesis. Master of Engineering.

    Convergence of packet communications over the evolved mobile networks; signal processing and protocol performance

    In this thesis, the convergence of packet communications over the evolved mobile networks is studied. The Long Term Evolution (LTE) process dominates the work of the Third Generation Partnership Project (3GPP), bringing technologies to market in the spirit of continuous innovation. The global markets of mobile information services are growing towards the Mobile Information Society. The thesis begins with the principles and theories of multiple-access transmission schemes, transmitter and receiver techniques, and signal processing algorithms. Next, packet communications and Internet protocols are reviewed with reference to the IETF standards, focusing on the characteristics of mobile communications. The mobile network architecture and protocols bind the evolved packet system of Internet communications to the radio access network technologies. Specifics of the traffic models are briefly reviewed for their statistical significance in radio performance analysis. Radio resource management algorithms, protocols and procedures are covered with regard to their relevance for system performance. Throughout these chapters, the commonalities and differences of WCDMA, WCDMA/HSPA and LTE are covered. The main outcome of the thesis is the performance analysis of the LTE technology, beginning from the early discoveries, continuing to the analysis of various system features, and finally converging to an extensive system analysis campaign. The system performance is analysed for voice over the Internet and best effort Internet traffic. These traffic classes represent the majority of the mobile traffic in converged packet networks, and yet they are simple enough for a fair and generic analysis of technologies. The thesis consists of publications and inventions created by the author that proposed several improvements to the 3G technologies towards LTE.
In the system analysis, LTE showed system performance measures at least 2.5 to 3 times higher than the WCDMA/HSPA reference. WCDMA/HSPA networks are currently available with over 400 million subscribers and show increasing growth, while the first LTE roll-outs are scheduled to begin in 2010. Sophisticated 3G LTE mobile devices are expected to appear steadily for all consumer segments in the following years.