56 research outputs found

    System Identification with Applications in Speech Enhancement

    No full text
    With the increasing popularity of hands-free telephony on mobile portable devices and the rapid development of voice over internet protocol, identification of acoustic systems has become desirable for compensating distortions introduced to speech signals during transmission, and hence for enhancing speech quality. The objective of this research is to develop system identification algorithms for speech enhancement applications including network echo cancellation and speech dereverberation. A supervised adaptive algorithm for sparse system identification is developed for network echo cancellation. Building on the selective-tap updating framework for the normalized least mean squares algorithm, the MMax and sparse partial update tap-selection strategies are exploited in the frequency domain to achieve fast convergence with low computational complexity. By demonstrating how the sparseness of the network impulse response varies in the transformed domain, the multidelay filtering structure is incorporated to reduce the algorithmic delay. Blind identification of SIMO acoustic systems for speech dereverberation in the presence of common zeros is then investigated. First, the problem of common zeros is defined and extended to include the presence of near-common zeros. Two clustering algorithms are developed to quantify the number of these zeros so as to facilitate the study of their effect on blind system identification and speech dereverberation. To mitigate this effect, two algorithms are developed: a two-stage algorithm based on channel decomposition, which identifies common and non-common zeros sequentially, and a forced spectral diversity approach, which combines spectral shaping filters and channel undermodelling to derive a modified system that leads to improved dereverberation performance. Additionally, a solution to the scale factor ambiguity problem in subband-based blind system identification is developed, which motivates further research on subband-based dereverberation techniques. Comprehensive simulations and discussions demonstrate the effectiveness of the aforementioned algorithms. A discussion of possible directions for future research on system identification techniques concludes this thesis.
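
The MMax tap-selection idea mentioned in the abstract above can be illustrated with a minimal sketch. This is a simplified time-domain MMax-NLMS, not the frequency-domain multidelay algorithm developed in the thesis. The function name, the step size `mu`, the filter length, and the synthetic sparse echo path are all assumptions made for illustration.

```python
import random


def mmax_nlms(x, d, num_taps, num_update, mu=0.5, eps=1e-8):
    """Time-domain MMax-NLMS sketch (illustrative, hypothetical helper).

    At each sample only the `num_update` taps whose input samples have the
    largest magnitude are adapted; the remaining taps are left unchanged.
    """
    w = [0.0] * num_taps
    buf = [0.0] * num_taps  # buf[k] holds x[n - k]
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wk * xk for wk, xk in zip(w, buf))
        e = dn - y
        energy = sum(xk * xk for xk in buf) + eps
        # MMax selection: indices of the largest-magnitude input samples.
        selected = sorted(range(num_taps), key=lambda k: abs(buf[k]),
                          reverse=True)[:num_update]
        for k in selected:
            w[k] += mu * e * buf[k] / energy
        errors.append(e)
    return w, errors


# Synthetic, noiseless sparse echo path (assumed values, for illustration).
random.seed(0)
h = [0.0] * 16
h[3], h[9] = 1.0, -0.5
x = [random.gauss(0.0, 1.0) for _ in range(4000)]
d = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
     for n in range(len(x))]

w_hat, err = mmax_nlms(x, d, num_taps=16, num_update=8)
misalignment = (sum((a - b) ** 2 for a, b in zip(w_hat, h))
                / sum(b * b for b in h))
print(misalignment)
```

Updating half of the taps per sample roughly halves the cost of the coefficient update. The price is the sort used for tap selection, which practical implementations typically replace with a cheaper running-maximum structure.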

    Multichannel Speech Enhancement

    Get PDF

    A Primal-Dual Proximal Algorithm for Sparse Template-Based Adaptive Filtering: Application to Seismic Multiple Removal

    Get PDF
    Unveiling meaningful geophysical information from seismic data requires dealing with both random and structured "noises". As their amplitude may be greater than that of the signals of interest (primaries), additional prior information is especially important for efficient signal separation. We address here the problem of multiple reflections, caused by wave-field bouncing between layers. Since only approximate models of these phenomena are available, we propose a flexible framework for time-varying adaptive filtering of seismic signals, using sparse representations, based on inaccurate templates. We recast the joint estimation of adaptive filters and primaries in a new convex variational formulation. This approach allows us to incorporate plausible knowledge about noise statistics, data sparsity and slow filter variation in parsimony-promoting wavelet frames. The designed primal-dual algorithm solves a constrained minimization problem that alleviates standard regularization issues in finding hyperparameters. The approach demonstrates good performance in low signal-to-noise ratio conditions, both for simulated and real field seismic data.

    Doctor of Philosophy

    Get PDF
    Hearing aids suffer from the problem of acoustic feedback, which limits the gain they can provide. Moreover, the output sound quality of hearing aids may be compromised in the presence of background acoustic noise. Digital hearing aids use advanced signal processing to reduce acoustic feedback and background noise to improve the output sound quality. However, it is known that the output sound quality of digital hearing aids deteriorates as the hearing aid gain is increased. Furthermore, popular subband or transform domain digital signal processing in modern hearing aids introduces analysis-synthesis delays in the forward path. Long forward-path delays are not desirable because the processed sound combines with the unprocessed sound that arrives at the cochlea through the vent and changes the sound quality. In this dissertation, we employ a variable, frequency-dependent gain function that is lower at frequencies of the incoming signal where the information is perceptually insignificant. In addition, the method of this dissertation automatically identifies and suppresses residual acoustical feedback components at frequencies that have the potential to drive the system to instability. The suppressed frequency components are monitored and the suppression is removed when such frequencies no longer threaten to drive the hearing aid system into instability. Together, the method of this dissertation provides more stable gain than traditional methods by reducing acoustical coupling between the microphone and the loudspeaker of a hearing aid. In addition, the method of this dissertation performs the necessary hearing aid signal processing with low-delay characteristics. The central idea for the low-delay hearing aid signal processing is a spectral gain shaping method (SGSM) that employs parallel parametric equalization (EQ) filters. Parameters of the parametric EQ filters and associated gain values are selected using a least-squares approach to obtain the desired spectral response. Finally, the method of this dissertation switches to a least-squares adaptation scheme with linear complexity at the onset of howling. The method adapts to the altered feedback path quickly, so that the patient does not lose perceivable information. The complexity of the least-squares estimate is reduced by reformulating the least-squares estimate into a Toeplitz system and solving it with a direct Toeplitz solver. The increase in stable gain over traditional methods and the output sound quality were evaluated with psychoacoustic experiments on normal-hearing listeners with speech and music signals. The results indicate that the method of this dissertation provides 8 to 12 dB more hearing aid gain than feedback cancelers with traditional fixed gain functions. Furthermore, experimental results obtained with real-world hearing aid gain profiles indicate that the method of this dissertation introduces less distortion in the output sound quality than classical feedback cancelers, enabling the use of more comfortable styles of hearing aid for patients with moderate to profound hearing loss. Extensive MATLAB simulations and subjective evaluations of the results indicate that the method of this dissertation exhibits much smaller forward-path delays with superior howling suppression capability.
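
The abstract above does not say which direct Toeplitz solver is used. As one classical possibility, the sketch below shows a Levinson recursion for a symmetric Toeplitz system, which exploits the structure to solve in O(n^2) operations instead of the O(n^3) of general elimination. The function name and the test matrix are assumptions for illustration only.

```python
def levinson_solve(t, y):
    """Solve T x = y where T is symmetric Toeplitz with first column `t`.

    Levinson recursion sketch (hypothetical helper). It assumes every leading
    principal submatrix of T is nonsingular, e.g. T is positive definite.
    """
    n = len(t)
    f = [1.0 / t[0]]   # forward vector: T_m f = first unit vector
    x = [y[0] / t[0]]  # solution of the leading 1x1 system
    for m in range(1, n):
        # Residuals produced by zero-padding the previous solutions.
        eps_f = sum(t[m - i] * f[i] for i in range(m))
        eps_x = sum(t[m - i] * x[i] for i in range(m))
        denom = 1.0 - eps_f * eps_f
        f_ext = f + [0.0]
        b_ext = [0.0] + f[::-1]  # backward vector is the reversed forward one
        f = [(fe - eps_f * be) / denom for fe, be in zip(f_ext, b_ext)]
        b = f[::-1]
        # Correct the padded solution so that the new last equation holds.
        x = [xe + (y[m] - eps_x) * bk for xe, bk in zip(x + [0.0], b)]
    return x


# Small diagonally dominant example (assumed values, for illustration).
t = [4.0, 1.0, 0.5, 0.25]
x_true = [1.0, -2.0, 3.0, 0.5]
y = [sum(t[abs(i - j)] * x_true[j] for j in range(4)) for i in range(4)]

x_hat = levinson_solve(t, y)
print(x_hat)
```

In a least-squares filter estimate, `t` would play the role of the input autocorrelation sequence and `y` the cross-correlation vector of the normal equations.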

    Efficient Multiband Algorithms for Blind Source Separation

    Get PDF
    The problem of blind separation refers to recovering original signals, called source signals, from the mixed signals, called observation signals, in a reverberant environment. The mixture is a function of a sequence of original speech signals mixed in a reverberant room. The objective is to separate mixed signals to obtain the original signals without degradation and without prior information about the features of the sources. The strategy used to achieve this objective is to use multiple bands that work at a lower rate, have lower computational cost, and converge more quickly than the conventional scheme. Our motivation is the competitive results of unequal-passbands scheme applications in terms of convergence speed. The objective of this research is to improve unequal-passbands schemes by improving the speed of convergence and reducing the computational cost. The first proposed work is a novel maximally decimated unequal-passbands scheme. This scheme uses multiple bands, which allow it to work at a reduced sampling rate and with low computational cost. An adaptation approach is derived with an adaptation step that improves the convergence speed. The performance of the proposed scheme was measured in different ways. First, the mean square errors of various bands are measured and the results are compared to a maximally decimated equal-passbands scheme, which is currently the best performing method. The results show that the proposed scheme has a faster convergence rate than the maximally decimated equal-passbands scheme. Second, when the scheme is tested for white and coloured inputs using a low number of bands, it does not yield good results; but when the number of bands is increased, the speed of convergence is enhanced. Third, the scheme is tested for quick changes. It is shown that the performance of the proposed scheme is similar to that of the equal-passbands scheme. Fourth, the scheme is also tested in a stationary state. The experimental results confirm the theoretical work. For more challenging scenarios, an unequal-passbands scheme with over-sampled decimation is proposed; the greater the number of bands, the more efficient the separation. The results are compared to the currently best performing method. Second, an experimental comparison is made between the proposed multiband scheme and the conventional scheme. The results show that the convergence speed and the signal-to-interference ratio of the proposed scheme are higher than those of the conventional scheme, and that its computational cost is lower.

    Control of feedback for assistive listening devices

    Get PDF
    Acoustic feedback refers to the undesired acoustic coupling between the loudspeaker and microphone in hearing aids. This feedback channel limits the normal operation of hearing aids under varying acoustic scenarios. This work makes contributions that improve the performance of adaptive feedback cancellation techniques and speech quality in hearing aids. For this purpose, a two-microphone approach is proposed and analysed, and probe signal injection methods are also investigated and improved upon.

    MVDR broadband beamforming using polynomial matrix techniques

    Get PDF
    This thesis addresses the formulation of and solution to broadband minimum variance distortionless response (MVDR) beamforming. Two approaches to this problem are considered, namely, generalised sidelobe canceller (GSC) and Capon beamformers. These are examined based on a novel technique which relies on polynomial matrix formulations. The new scheme is based on the second order statistics of the array sensor measurements in order to estimate a space-time covariance matrix. The beamforming problem can be formulated based on this space-time covariance matrix. Akin to the narrowband problem, where an optimum solution can be derived from the eigenvalue decomposition (EVD) of a constant covariance matrix, this utility is here extended to the broadband case. The decoupling of the space-time covariance matrix in this case is provided by means of a polynomial matrix EVD. The proposed approach is initially exploited to design a GSC beamformer for a uniform linear array, and then extended to the constrained MVDR, or Capon, beamformer and also the GSC with an arbitrary array structure. The uniqueness of the designed GSC comes from utilising the polynomial matrix technique, and its ability to steer the array beam towards an off-broadside direction without the pre-steering stage that is associated with conventional approaches to broadband beamformers. To solve the broadband beamforming problem, this thesis addresses a number of additional tools. The first is the accurate construction, based on fractional delay filters, of the steering vectors, which are required both for the broadband constraint formulation of a beamformer and for the construction of the quiescent beamformer. In the GSC case, we also discuss how a blocking matrix can be obtained, and introduce a novel paraunitary matrix completion algorithm. 
For the Capon beamformer, the polynomial extension requires the inversion of a polynomial matrix, for which a residue-based method is proposed that offers better accuracy compared to previously utilised approaches. These proposed polynomial matrix techniques are evaluated in a number of simulations. The results show that the polynomial broadband beamformer (PBBF) steers the main beam towards the direction of the signal of interest (SoI) and protects the signal over the specified bandwidth, and at the same time suppresses unwanted signals by placing nulls in their directions. In addition to that, the PBBF is compared to the standard time domain broadband beamformer in terms of their mean square error performance, beam-pattern, and computation complexity. This comparison shows that the PBBF can offer a significant reduction in computation complexity compared to its standard counterpart. Overall, the main benefits of this approach include beam steering towards an arbitrary look direction with no need for pre-steering step, and a potentially significant reduction in computational complexity due to the decoupling of dependencies of the quiescent beamformer, blocking matrix, and the adaptive filter compared to a standard broadband beamformer implementation.
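
The narrowband problem that the abstract above takes as its starting point can be sketched compactly. The example below is not the thesis's polynomial-matrix method. It only illustrates the narrowband Capon (MVDR) weights w = R^{-1} a / (a^H R^{-1} a). It assumes a 4-sensor uniform linear array with half-wavelength spacing and a known covariance made of one interferer plus white noise. All function names, angles, and power levels are hypothetical.

```python
import cmath
import math


def steering_vector(num_sensors, angle_deg):
    """Narrowband steering vector of a half-wavelength-spaced linear array."""
    phase = math.pi * math.sin(math.radians(angle_deg))
    return [cmath.exp(-1j * phase * m) for m in range(num_sensors)]


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def mvdr_weights(R, a):
    """Capon weights: minimise w^H R w subject to w^H a = 1."""
    z = solve_linear(R, a)                                    # R^{-1} a
    denom = sum(ai.conjugate() * zi for ai, zi in zip(a, z))  # a^H R^{-1} a
    return [zi / denom for zi in z]


def response(w, a):
    """Beamformer response w^H a towards the direction encoded by a."""
    return sum(wi.conjugate() * ai for wi, ai in zip(w, a))


num_sensors = 4
a_look = steering_vector(num_sensors, 0.0)   # assumed look direction
a_int = steering_vector(num_sensors, 40.0)   # assumed interferer direction
power_int, power_noise = 100.0, 1.0          # assumed power levels

# Interference-plus-noise covariance: P * a_int a_int^H + sigma^2 * I.
R = [[power_int * a_int[i] * a_int[j].conjugate()
      + (power_noise if i == j else 0.0)
      for j in range(num_sensors)] for i in range(num_sensors)]

w = mvdr_weights(R, a_look)
print(abs(response(w, a_look)), abs(response(w, a_int)))
```

The look-direction response is held at unity by the constraint, while the response towards the interferer is driven close to zero, which is the narrowband counterpart of the null placement described in the abstract.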

    Array signal processing algorithms for localization and equalization in complex acoustic channels

    No full text
    The reproduction of realistic soundscapes in consumer electronic applications has been a driving force behind the development of spatial audio signal processing techniques. In order to accurately reproduce or decompose a particular spatial sound field, being able to exploit or estimate the effects of the acoustic environment becomes essential. This requires both an understanding of the source of the complexity in the acoustic channel (the acoustic path between a source and a receiver) and the ability to characterize its spatial attributes. In this thesis, we explore how to exploit or overcome the effects of the acoustic channel for sound source localization and sound field reproduction. The behaviour of a typical acoustic channel can be visualized as a transformation of its free field behaviour, due to scattering and reflections off the measurement apparatus and the surfaces in a room. These spatial effects can be modelled using the solutions to the acoustic wave equation, yet the physical nature of these scatterers typically results in complex behaviour with frequency. The first half of this thesis explores how to exploit this diversity in the frequency domain for sound source localization, a concept that has not been considered previously. We first extract down-converted subband signals from the broadband audio signal, and collate these signals, such that the spatial diversity is retained. A signal model is then developed to exploit the channel's spatial information using a signal subspace approach. We show that this concept can be applied to multi-sensor arrays on complex-shaped rigid bodies as well as the special case of binaural localization. In both cases, an improvement in the closely spaced source resolution is demonstrated over traditional techniques, through simulations and experiments using a KEMAR manikin. The binaural analysis further indicates that the human localization performance in certain spatial regions is limited by the lack of spatial diversity, as suggested in perceptual experiments in the literature. Finally, the possibility of exploiting known inter-subband correlated sources (e.g., speech) for localization in under-determined systems is demonstrated. The second half of this thesis considers reverberation control, where reverberation is modelled as a superposition of sound fields created by a number of spatially distributed sources. We consider the mode/wave-domain description of the sound field, and propose modelling the reverberant modes as linear transformations of the desired sound field modes. This is a novel concept, as we consider each mode transformation to be independent of other modes. This model is then extended to sound field control, and used to derive the compensation signals required at the loudspeakers to equalize the reverberation. We show that estimating the reverberant channel and controlling the sound field now becomes a single adaptive filtering problem in the mode-domain, where the modes can be adapted independently. The performance of the proposed method is compared with existing adaptive and non-adaptive sound field control techniques through simulations. Finally, it is shown that an order of magnitude reduction in the computational complexity can be achieved, while maintaining comparable performance to existing adaptive control techniques.

    Design of large polyphase filters in the Quadratic Residue Number System

    Full text link