14 research outputs found

    Frequency-warped autoregressive modeling and filtering

    This thesis consists of an introduction and nine articles. The articles are related to the application of frequency-warping techniques to audio signal processing, and in particular, predictive coding of wideband audio signals. The introduction reviews the literature and summarizes the results of the articles. Frequency-warping techniques, or simply warping techniques, are based on a modification of a conventional signal processing system so that the inherent frequency representation in the system is changed. It is demonstrated that this may be done for basically all traditional signal processing algorithms. In audio applications it is beneficial to modify the system so that the new frequency representation is close to that of human hearing. One of the articles is a tutorial paper on the use of warping techniques in audio applications. The majority of the articles study warped linear prediction (WLP) and its use in wideband audio coding. It is proposed that warped linear prediction would be a particularly attractive method for low-delay wideband audio coding. Warping techniques are also applied to various modifications of classical linear predictive coding techniques. This was made possible partly by the introduction of a class of new implementation techniques for recursive filters in one of the articles. The proposed implementation algorithm for recursive filters having delay-free loops is a generic technique. This inspired an article that introduces a generalized warped linear predictive coding scheme. One example of the generalized approach is a linear predictive algorithm using an almost logarithmic frequency representation.

    Stereo linear predictive coding of audio

    Estimation and Modeling Problems in Parametric Audio Coding

    MIMO designs for filter bank multicarrier and multiantenna systems based on OQAM

    Given the increasing data rate requirements in mobile communications, further research is deemed necessary so that future goals can be reached. To that end, radio-based communications are resorting to multicarrier modulations and spatial diversity. To date, the orthogonal frequency division multiplexing (OFDM) modulation is regarded as the dominant technology. On one hand, the OFDM modulation is able to accommodate multiantenna configurations in a very straightforward manner. On the other hand, the poor stopband attenuation exhibited by the OFDM modulation means that tight synchronization is required. In addition, the cyclic prefix (CP) has to be sufficiently long to avoid inter-block interference, which may substantially reduce the spectral efficiency. In order to overcome the OFDM drawbacks, the filter bank multicarrier modulation based on OQAM (FBMC/OQAM) is introduced. This modulation does not need any CP and benefits from pulse shaping techniques. This aspect becomes crucial in cognitive radio networks and communication systems where nodes are unlikely to be synchronized. In principle, the poor frequency confinement exhibited by OFDM should tip the balance towards FBMC/OQAM. However, the perfect reconstruction property of FBMC/OQAM systems does not hold in the presence of multipath fading. This means that the FBMC/OQAM modulation is affected by inter-symbol and inter-carrier interference, unless the channel is equalized to some extent. This observation highlights that the FBMC/OQAM extension to MIMO architectures becomes a big challenge due to the need to cope with both modulation- and multiantenna-induced interference. The goal of this thesis is to study how the FBMC/OQAM modulation scheme can benefit from the degrees of freedom provided by the spatial dimension. In this regard, the first step is to design signal processing techniques at reception. In this case the emphasis is on single-input-multiple-output (SIMO) architectures. Next, the possibility of pre-equalizing the channel at transmission is investigated. It is considered that multiple antennas are placed at the transmit side, giving rise to a multiple-input-single-output (MISO) configuration. In this scenario, the research is focused not only on counteracting the channel but also on distributing the power among subcarriers. Finally, the joint transmitter and receiver design in multiple-input-multiple-output (MIMO) communication systems is covered. From the theory developed in this thesis, it is possible to conclude that the techniques originally devised in the OFDM context can be easily adapted to FBMC/OQAM systems if the channel frequency response is flat within the subchannels. However, metrics such as the peak-to-average power ratio or the sensitivity to the carrier frequency offset constrain the number of subcarriers, so that the frequency selectivity may be appreciable at the subcarrier level. Then, the flat fading assumption is not satisfied and the specificities of FBMC/OQAM systems have to be considered. In this situation, the proposed techniques allow FBMC/OQAM to remain competitive with OFDM. In addition, for some multiantenna configurations and propagation conditions FBMC/OQAM turns out to be the best choice. The simulation-based results together with the theoretical analysis conducted in this thesis contribute to progress towards the application of FBMC/OQAM to MIMO channels. The signal processing techniques that are described in this dissertation allow designers to exploit the potential of FBMC/OQAM and MIMO to improve the link reliability as well as the spectral efficiency.

    Scalable and perceptual audio compression

    This thesis deals with scalable perceptual audio compression. Two scalable perceptual solutions as well as a scalable-to-lossless solution are proposed and investigated. One of the scalable perceptual solutions is built around sinusoidal modelling of the audio signal whilst the other is built on a transform coding paradigm. The scalable coders are shown to scale both in a waveform matching manner and in a psychoacoustic manner. In order to measure the psychoacoustic scalability of the systems investigated in this thesis, the original signal's psychoacoustic parameters are compared with those of the synthesized signal. The psychoacoustic parameters used are loudness, sharpness, tonality and roughness. This analysis technique is a novel method used in this thesis and it allows an insight into the perceptual distortion that has been introduced by any coder analyzed in this manner.

    Models and analysis of vocal emissions for biomedical applications

    This book of Proceedings collects the papers presented at the 3rd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2003, held 10-12 December 2003, Firenze, Italy. The workshop is organised every two years, and aims to stimulate contacts between specialists active in research and industrial developments in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies.

    Studies on noise robust automatic speech recognition

    Noise in everyday acoustic environments such as cars, traffic environments, and cafeterias remains one of the main challenges in automatic speech recognition (ASR). As a research theme, it has received wide attention in conferences and scientific journals focused on speech technology. This article collection reviews both the classic and novel approaches suggested for noise robust ASR. The articles are literature reviews written for the spring 2009 seminar course on noise robust automatic speech recognition (course code T-61.6060) held at TKK.

    Generalized linear-in-parameter models : theory and audio signal processing applications

    This thesis presents a mathematically oriented perspective on some basic concepts of digital signal processing. A general framework for the development of alternative signal and system representations is attained by defining a generalized linear-in-parameter model (GLM) configuration. The GLM provides a direct view into the origins of many familiar methods in signal processing, implying a variety of generalizations, and it serves as a natural introduction to rational orthonormal model structures. In particular, the conventional division between finite impulse response (FIR) and infinite impulse response (IIR) filtering methods is reconsidered. The latter part of the thesis consists of audio-oriented case studies, including loudspeaker equalization, musical instrument body modeling, and room response modeling. The proposed collection of IIR filter design techniques is submitted to challenging modeling tasks. The most important practical contribution of this thesis is the introduction of a procedure for the optimization of rational orthonormal filter structures, called the BU-method. More generally, the BU-method and its variants, including the (complex) warped extension, the (C)WBU-method, can be considered entirely new IIR filter design strategies.

    High quality audio coding using a novel hybrid WLP-subband coding algorithm
