59 research outputs found

Nonstationary Signal Processing with Application to Reverberation Cancellation in Acoustic Environments

Author: Hopgood James
Publication venue
Publication date: 01/04/2001
Field of study

Convolutive Blind Source Separation Methods

Author: Kjems Ulrik
Larsen Jan
Parra Lucas C.
Pedersen Michael Syskind
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2008
Field of study

In this chapter, we provide an overview of existing algorithms for blind source separation of convolutive audio mixtures. We provide a taxonomy in which many of the existing algorithms can be organized, and we present published results from those algorithms that have been applied to real-world audio separation tasks.
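
As an illustration of what a convolutive mixture means here, the sketch below builds sensor signals in which every source reaches every sensor through its own FIR filter before the contributions are summed. It is a minimal sketch and not taken from the chapter: the function names `convolve` and `convolutive_mix`, the filter taps, and the toy signals are all assumptions chosen for illustration.

```python
def convolve(signal, taps):
    """Full linear convolution of a signal with FIR filter taps."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for n, s in enumerate(signal):
        for k, a in enumerate(taps):
            out[n + k] += a * s
    return out


def convolutive_mix(sources, filters):
    """Mix sources so that filters[i][j] is the (hypothetical) impulse
    response from source j to sensor i; each sensor sums the filtered sources."""
    mixtures = []
    for sensor_filters in filters:
        filtered = [convolve(s, h) for s, h in zip(sources, sensor_filters)]
        mix = [0.0] * max(len(f) for f in filtered)
        for f in filtered:
            for n, v in enumerate(f):
                mix[n] += v
        mixtures.append(mix)
    return mixtures


# Two toy sources and a 2x2 bank of short filters (illustrative values only).
sources = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
filters = [
    [[1.0, 0.5], [0.0, 0.25]],  # paths into sensor 0
    [[0.0, 0.3], [1.0]],        # paths into sensor 1
]
mixtures = convolutive_mix(sources, filters)
```

A blind separation algorithm would see only `mixtures` and would have to recover the sources without knowing `filters`; an instantaneous mixture is the special case where every filter has a single tap.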

CiteSeerX

Online Research Database In Technology

Blind dereverberation of speech from moving and stationary speakers using sequential Monte Carlo methods

Author: Evers Christine
Publication venue: The University of Edinburgh
Publication date: 01/01/2010
Field of study

Speech signals radiated in confined spaces are subject to reverberation due to reflections from surrounding walls and obstacles. Reverberation leads to severe degradation of speech intelligibility and can be prohibitive for applications where speech is digitally recorded, such as audio conferencing or hearing aids. Dereverberation of speech is therefore an important field in speech enhancement. Driven by consumer demand, blind speech dereverberation has become a popular field in the research community and has led to many interesting approaches in the literature. However, most existing methods are dictated by their underlying models and hence suffer from assumptions that constrain the approaches to specific subproblems of blind speech dereverberation. For example, many approaches limit the dereverberation to voiced speech sounds, leading to poor results for unvoiced speech. Few approaches tackle single-sensor blind speech dereverberation, and only a very limited subset allows for dereverberation of speech from moving speakers. Therefore, the aim of this dissertation is the development of a flexible and extendible framework for blind speech dereverberation accommodating different speech sound types, single- or multiple-sensor processing, and stationary as well as moving speakers. Bayesian methods benefit from, rather than being dictated by, appropriate model choices. Therefore, the problem of blind speech dereverberation is considered from a Bayesian perspective in this thesis. A generic sequential Monte Carlo approach accommodating a multitude of models for the speech production mechanism and room transfer function is consequently derived. In this approach both the anechoic source signal and reverberant channel are estimated using their optimal estimators by means of Rao-Blackwellisation of the state-space of unknown variables. The remaining model parameters are estimated using sequential importance resampling.
The proposed approach is implemented for two different speech production models for stationary speakers, demonstrating substantial reduction in reverberation for both unvoiced and voiced speech sounds. Furthermore, the channel model is extended to facilitate blind dereverberation of speech from moving speakers. Due to the structure of the measurement model, single- as well as multi-microphone processing is facilitated, accommodating physically constrained scenarios where only a single sensor can be used as well as allowing for the exploitation of spatial diversity in scenarios where the physical size of microphone arrays is of no concern. This dissertation is concluded with a survey of possible directions for future research, including the use of switching Markov source models, joint target tracking and enhancement, as well as an extension to subband processing for improved computational efficiency.
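
The sequential importance resampling step mentioned in the abstract can be illustrated with a generic bootstrap particle filter. The sketch below is not the thesis's dereverberation model and does not include Rao-Blackwellisation; it assumes a toy scalar state-space model (first-order autoregressive state, Gaussian observation noise), and the function names and parameter values are hypothetical.

```python
import math
import random


def simulate(n_steps, a=0.9, q=0.5, r=1.0, seed=1):
    """Toy model (an assumption): x_t = a*x_{t-1} + N(0, q^2), y_t = x_t + N(0, r^2)."""
    rng = random.Random(seed)
    x, states, obs = 0.0, [], []
    for _ in range(n_steps):
        x = a * x + rng.gauss(0.0, q)
        states.append(x)
        obs.append(x + rng.gauss(0.0, r))
    return states, obs


def bootstrap_filter(observations, n_particles=500, a=0.9, q=0.5, r=1.0, seed=0):
    """Sequential importance resampling with the state transition as proposal."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # 1. Propagate each particle through the state transition.
        particles = [a * x + rng.gauss(0.0, q) for x in particles]
        # 2. Weight each particle by the observation likelihood.
        weights = [math.exp(-0.5 * ((y - x) / r) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:  # guard against numerical underflow
            weights, total = [1.0] * n_particles, float(n_particles)
        # 3. Posterior-mean estimate from the weighted particles.
        estimates.append(sum(w * x for w, x in zip(weights, particles)) / total)
        # 4. Resample in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


def rmse(xs, ys):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


states, observations = simulate(200)
estimates = bootstrap_filter(observations)
filter_rmse = rmse(estimates, states)
obs_rmse = rmse(observations, states)
```

On this toy model the filtered estimate should track the hidden state more closely than the raw observations do, which is what comparing `filter_rmse` with `obs_rmse` checks. Resampling at every step keeps the code short; practical filters often resample only when the effective sample size drops.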

Edinburgh Research Archive

Dynamic texture synthesis in image and video processing.

Author: Xu Leilei
Publication venue
Publication date: 01/01/2008
Field of study

Xu, Leilei. Thesis submitted in October 2007. Thesis (M.Phil.), Chinese University of Hong Kong, 2008. Includes bibliographical references (leaves 78-84). Abstracts in English and Chinese.

Contents:
Abstract
Acknowledgement
Chapter 1. Introduction
  1.1 Texture and Dynamic Textures
  1.2 Related work
  1.3 Thesis Outline
Chapter 2. Image/Video Processing
  2.1 Bayesian Analysis
  2.2 Markov Property
  2.3 Graph Cut
  2.4 Belief Propagation
  2.5 Expectation-Maximization
  2.6 Principle Component Analysis
Chapter 3. Linear Dynamic System
  3.1 System Model
  3.2 Degeneracy and Canonical Model Realization
  3.3 Learning of Dynamic Textures
  3.4 Synthesizing Dynamic Textures
  3.5 Summary
Chapter 4. Dynamic Color Texture Synthesis
  4.1 Related Work
  4.2 System Model
    4.2.1 Laplacian Pyramid-based DCTS Model
    4.2.2 RBF-based DCTS Model
  4.3 Experimental Results
  4.4 Summary
Chapter 5. Dynamic Textures using Multi-resolution Analysis
  5.1 System Model
  5.2 Multi-resolution Descriptors
    5.2.1 Laplacian Pyramids
    5.2.2 Haar Wavelets
    5.2.3 Steerable Pyramid
  5.3 Experimental Results
  5.4 Summary
Chapter 6. Motion Transfer
  6.1 Problem formulation
    6.1.1 Similarity on Appearance
    6.1.2 Similarity on Dynamic Behavior
    6.1.3 The Objective Function
  6.2 Further Work
Chapter 7. Conclusions
Chapter A. List of Publications
Chapter B. Degeneracy in LDS Model
  B.1 Equivalence Class
  B.2 The Choice of the Matrix Q
  B.3 Swapping the Column of C and A
Chapter C. Probability Density Functions
  C.1 Probability Distribution
  C.2 Joint Probability Distributions
Bibliography

CUHK Digital Repository

Time series forecasting using wavelet and support vector machine

Author: FONG KENG MUN
Publication venue
Publication date: 23/02/2005
Field of study

Master's, Master of Engineering

ScholarBank@NUS

Blind image deconvolution: nonstationary Bayesian approaches to restoring blurred photos

Author: Bishop Tom E.
Publication venue: The University of Edinburgh
Publication date: 01/01/2009
Field of study

High-quality digital images have become pervasive in modern scientific and everyday life, in areas from photography to astronomy, CCTV, microscopy, and medical imaging. However, there are always limits to the quality of these images due to uncertainty and imprecision in the measurement systems. Modern signal processing methods offer the promise of overcoming some of these problems by postprocessing these blurred and noisy images. In this thesis, novel methods using nonstationary statistical models are developed for the removal of blurs from out-of-focus and other types of degraded photographic images. The work tackles the fundamental problem of blind image deconvolution (BID); its goal is to restore a sharp image from a blurred observation when the blur itself is completely unknown. This is a “doubly ill-posed” problem: an extreme lack of information must be countered by strong prior constraints about sensible types of solution. In this work, the hierarchical Bayesian methodology is used as a robust and versatile framework to impart the required prior knowledge. The thesis is arranged in two parts. In the first part, the BID problem is reviewed, along with techniques and models for its solution. Observation models are developed, with an emphasis on photographic restoration, concluding with a discussion of how these are reduced to the common linear spatially-invariant (LSI) convolutional model. Classical methods for the solution of ill-posed problems are summarised to provide a foundation for the main theoretical ideas that will be used under the Bayesian framework. This is followed by an in-depth review and discussion of the various prior image and blur models appearing in the literature, and then their applications to solving the problem with both Bayesian and non-Bayesian techniques. The second part covers novel restoration methods, making use of the theory presented in Part I. Firstly, two new nonstationary image models are presented.
The first models local variance in the image, and the second extends this with locally adaptive noncausal autoregressive (AR) texture estimation and local mean components. These models allow for recovery of image details including edges and texture, whilst preserving smooth regions. Most existing methods do not model the boundary conditions correctly for deblurring of natural photographs, and a chapter is devoted to exploring Bayesian solutions to this topic. Due to the complexity of the models used and the problem itself, there are many challenges which must be overcome for tractable inference. Using the new models, three different inference strategies are investigated: firstly using the Bayesian maximum marginalised a posteriori (MMAP) method with deterministic optimisation; proceeding with the stochastic methods of variational Bayesian (VB) distribution approximation, and simulation of the posterior distribution using the Gibbs sampler. Of these, we find the Gibbs sampler to be the most effective way to deal with a variety of different types of unknown blurs. Along the way, details are given of the numerical strategies developed to give accurate results and to accelerate performance. Finally, the thesis demonstrates state-of-the-art results in blind restoration of synthetic and real degraded images, such as recovering details in out-of-focus photographs.
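
To show the mechanics of a Gibbs sampler in the simplest setting, the sketch below alternates draws from two full conditional distributions for the mean and precision of Gaussian data. This toy conjugate model is an assumption chosen for brevity and is far simpler than the hierarchical image and blur models of the thesis; the function name, priors, and hyperparameter values are hypothetical.

```python
import random


def gibbs_normal(data, n_iter=2000, burn_in=500, a0=1.0, b0=1.0, tau0=1e-3, seed=0):
    """Gibbs sampler for data ~ Normal(mu, 1/tau) with assumed priors
    mu ~ Normal(0, 1/tau0) and tau ~ Gamma(shape=a0, rate=b0)."""
    rng = random.Random(seed)
    n, total = len(data), sum(data)
    mu, tau = 0.0, 1.0
    mu_draws, tau_draws = [], []
    for it in range(n_iter):
        # Draw mu from its full conditional: a Normal distribution.
        precision = tau0 + n * tau
        mu = rng.gauss(tau * total / precision, precision ** -0.5)
        # Draw tau from its full conditional: a Gamma distribution.
        shape = a0 + 0.5 * n
        rate = b0 + 0.5 * sum((x - mu) ** 2 for x in data)
        tau = rng.gammavariate(shape, 1.0 / rate)  # second argument is a scale
        if it >= burn_in:
            mu_draws.append(mu)
            tau_draws.append(tau)
    return sum(mu_draws) / len(mu_draws), sum(tau_draws) / len(tau_draws)


# Synthetic data with true mean 3.0 and standard deviation 0.5 (precision 4.0).
data_rng = random.Random(42)
data = [data_rng.gauss(3.0, 0.5) for _ in range(200)]
mu_hat, tau_hat = gibbs_normal(data)
```

The same pattern, cycling through the unknowns and sampling each from its conditional given the current values of the others, is what makes Gibbs sampling attractive when the joint posterior is intractable but the conditionals are not.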

Edinburgh Research Archive

Source Separation for Hearing Aid Applications

Author: Pedersen Michael Syskind
Publication venue: Technical University of Denmark
Publication date: 01/11/2006
Field of study

Online Research Database In Technology

Multiscale Methods in Image Modelling and Image Processing

Author: Alexander Simon
Publication venue: 'University of Waterloo'
Publication date: 01/01/2005
Field of study

The field of modelling and processing of 'images' has fairly recently become important, even crucial, to areas of science, medicine, and engineering. The inevitable explosion of imaging modalities and approaches stemming from this fact has become a rich source of mathematical applications. 'Imaging' is quite broad, and suffers somewhat from this broadness. The general question of 'what is an image?' or perhaps 'what is a natural image?' turns out to be difficult to address. To make real headway one may need to strongly constrain the class of images being considered, as will be done in part of this thesis. On the other hand, there are general principles that can guide research in many areas. One such principle considered is the assertion that (classes of) images have multiscale relationships, whether at a pixel level, between features, or other variants. There are both practical (in terms of computational complexity) and more philosophical reasons (mimicking the human visual system, for example) that suggest looking at such methods. Looking at scaling relationships may also have the advantage of opening a problem up to many mathematical tools. This thesis will detail two investigations into multiscale relationships, in quite different areas. One will involve Iterated Function Systems (IFS), and the other a stochastic approach to reconstruction of binary images (binary phase descriptions of porous media). The use of IFS in this context, which has often been called 'fractal image coding', has been primarily viewed as an image compression technique. We will re-visit this approach, proposing it as a more general tool. Some study of the implications of that idea will be presented, along with applications inferred by the results. In the area of reconstruction of binary porous media, a novel, multiscale, hierarchical annealing approach is proposed and investigated.

CiteSeerX

University of Waterloo's Institutional Repository

Modelling and extraction of fundamental frequency in speech signals

Author: Pawi Alipah
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2014
Field of study

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. One of the most important parameters of speech is the fundamental frequency of vibration of voiced sounds. The audio sensation of the fundamental frequency is known as the pitch. Depending on the tonal/non-tonal category of language, the fundamental frequency conveys intonation, pragmatics and meaning. In addition, the fundamental frequency and intonation carry speaker gender, age, identity, speaking style and emotional state. Accurate estimation of the fundamental frequency is critically important for functioning of speech processing applications such as speech coding, speech recognition, speech synthesis and voice morphing. This thesis makes contributions to the development of accurate pitch estimation research in three distinct ways: (1) an investigation of the impact of the window length on pitch estimation error, (2) an investigation of the use of the higher order moments and (3) an investigation of an analysis-synthesis method for selection of the best pitch value among N proposed candidates. Experimental evaluations show that the length of the speech window has a major impact on the accuracy of pitch estimation. Depending on the similarity criteria and the order of the statistical moment, a window length of 37 to 80 ms gives the least error. In order to avoid excessive delay as a consequence of using a longer window, a method is proposed where the current short window is concatenated with the previous frames to form a longer signal window for pitch extraction. Second-order and higher-order moments, and the magnitude difference function, were explored and compared as similarity criteria. A novel method of calculation of moments is introduced where the signal is split, i.e. rectified, into positive and negative valued samples. The moments for the positive and negative parts of the signal are computed separately and combined.
The new method of calculation of moments from positive and negative parts and the higher order criteria provide competitive results. A challenging issue in pitch estimation is the determination of the best candidate from N extrema of the similarity criteria. The analysis-synthesis method proposed in this thesis selects the pitch candidate that provides the best reproduction (synthesis) of the harmonic spectrum of the original speech. The synthesis method must be such that the distortion increases with the increasing error in the estimate of the fundamental frequency. To this end a new method of spectral synthesis is proposed using an estimate of the spectral envelope and harmonically spaced asymmetric Gaussian pulses as excitation. The N-best method provides consistent reduction in pitch estimation error. The methods described in this thesis result in a significant improvement in the pitch accuracy and outperform the benchmark YIN method.
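
The idea of concatenating short frames into a longer analysis window and then searching a similarity criterion for the pitch lag can be sketched with the average magnitude difference function. This is a minimal sketch under stated assumptions, not the thesis's method: the 10 ms frame length, the 60 to 400 Hz search range, the relative threshold, and the function names are all hypothetical, and a pure test tone stands in for voiced speech.

```python
import math


def amdf(window, lag):
    """Average magnitude difference between the window and its lagged copy."""
    n = len(window) - lag
    return sum(abs(window[i] - window[i + lag]) for i in range(n)) / n


def estimate_f0(window, fs, f_min=60.0, f_max=400.0, rel_threshold=0.1):
    """Pick the shortest lag with a deep AMDF dip, then refine to the local minimum."""
    lags = list(range(int(fs / f_max), int(fs / f_min) + 1))
    values = [amdf(window, lag) for lag in lags]
    threshold = rel_threshold * max(values)
    # Preferring the first dip below the threshold avoids picking a multiple
    # of the true period, where the AMDF is equally small.
    idx = next((i for i, v in enumerate(values) if v < threshold), None)
    if idx is None:
        idx = values.index(min(values))
    while idx + 1 < len(values) and values[idx + 1] < values[idx]:
        idx += 1
    return fs / lags[idx]


fs = 8000          # sampling rate in Hz (assumed)
frame_len = 80     # 10 ms frames (assumed)
# Four consecutive short frames of a 200 Hz test tone.
frames = [
    [math.sin(2 * math.pi * 200 * (f * frame_len + i) / fs) for i in range(frame_len)]
    for f in range(4)
]
# Concatenate the current frame with the three previous ones: a 40 ms window.
window = [sample for frame in frames for sample in frame]
f0 = estimate_f0(window, fs)
```

The 40 ms window sits inside the 37 to 80 ms range the abstract reports as giving the least error, while only the newest 10 ms frame has to be waited for at each update.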

Brunel University Research Archive

Bayesian methods in music modelling

Author: Peeling Paul
Publication venue: University of Cambridge
Publication date: 15/03/2011
Field of study

This thesis presents several hierarchical generative Bayesian models of musical signals designed to improve the accuracy of existing multiple pitch detection systems and other musical signal processing applications whilst remaining feasible for real-time computation. At the lowest level the signal is modelled as a set of overlapping sinusoidal basis functions. The parameters of these basis functions are built into a prior framework based on principles known from musical theory and the physics of musical instruments. The model of a musical note optionally includes phenomena such as frequency and amplitude modulations, damping, volume, timbre and inharmonicity. The occurrence of note onsets in a performance of a piece of music is controlled by an underlying tempo process and the alignment of the timings to the underlying score of the music. A variety of applications are presented for these models under differing inference constraints. Where full Bayesian inference is possible, reversible-jump Markov Chain Monte Carlo is employed to estimate the number of notes and partial frequency components in each frame of music. We also use approximate techniques such as model selection criteria and variational Bayes methods for inference in situations where computation time is limited or the amount of data to be processed is large. For the higher level score parameters, greedy search and conditional modes algorithms are found to be sufficiently accurate. We emphasize the links between the models and inference algorithms developed in this thesis and those in existing and parallel work, and demonstrate the effects of making modifications to these models both theoretically and by means of experimental results.

Apollo (Cambridge)