666 research outputs found

    Speech enhancement with frequency domain auto-regressive modeling

    Speech applications in far-field, real-world settings often deal with signals that are corrupted by reverberation. Dereverberation is an important step for improving audible quality and reducing error rates in applications such as automatic speech recognition (ASR). We propose a unified framework for speech dereverberation that improves both speech quality and ASR performance, using the envelope-carrier decomposition provided by an autoregressive (AR) model. The AR model is applied in the frequency domain of the sub-band speech signals to separate the envelope and carrier parts. A novel neural architecture based on a dual path long short-term memory (DPLSTM) model is proposed, which jointly enhances the sub-band envelope and carrier components. The dereverberated envelope-carrier signals are modulated and the sub-band signals are synthesized to reconstruct the audio signal. The DPLSTM model for dereverberation of envelope and carrier components also allows joint learning of the network weights for the downstream ASR task. In ASR tasks on the REVERB challenge dataset as well as on the VOiCES dataset, we illustrate that joint learning of the speech dereverberation network and the end-to-end (E2E) ASR model yields significant performance improvements over the baseline ASR system trained on log-mel spectrograms as well as over other dereverberation benchmarks (average relative improvements of 10-24% over the baseline system). The speech quality improvements, evaluated using subjective listening tests, further highlight the improved quality of the reconstructed audio. Comment: 10 pages
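The envelope extraction step described above can be illustrated with a generic frequency domain linear prediction sketch. This is only an assumption about the general technique, not the authors' implementation: it omits the sub-band decomposition, the carrier, and the DPLSTM network, and the function names (`dct2`, `levinson`, `fdlp_envelope`) and the single-click test signal are hypothetical. The idea is that an AR model fitted to the DCT of a signal has a "spectrum" that, read along the time axis, approximates the temporal envelope.

```python
import cmath
import math

def dct2(x):
    """Unnormalised DCT-II of a real sequence."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi * (2 * t + 1) * k / (2 * n)) for t in range(n))
            for k in range(n)]

def autocorr(x, max_lag):
    """Biased autocorrelation for lags 0..max_lag."""
    return [sum(x[i] * x[i + lag] for i in range(len(x) - lag))
            for lag in range(max_lag + 1)]

def levinson(r, order):
    """Levinson-Durbin recursion; returns A(z) coefficients [1, a1, ...] and error power."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err

def fdlp_envelope(x, order):
    """AR model fitted to the DCT coefficients, evaluated along the time axis."""
    n = len(x)
    a, err = levinson(autocorr(dct2(x), order), order)
    env = []
    for t in range(n):
        w = math.pi * t / n  # time index plays the role of frequency here
        denom = abs(sum(a[j] * cmath.exp(-1j * w * j) for j in range(order + 1))) ** 2
        env.append(err / denom)
    return env

# Hypothetical test signal: a single click at sample 20 of a 64-sample frame.
N = 64
click = [0.0] * N
click[20] = 1.0
envelope = fdlp_envelope(click, order=2)
peak = max(range(N), key=lambda t: envelope[t])
print("envelope peaks near sample", peak)
```

In this toy case the AR envelope peaks close to the click position. A real system would presumably use a higher model order and apply the model per sub-band, as the abstract describes.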

    Subband vector quantization of images using hexagonal filter banks

    Journal Article. Abstract: Results of psychophysical experiments on human vision conducted in the last three decades indicate that the eye performs a multichannel decomposition of incident images. This paper presents a subband vector quantization algorithm that employs hexagonal filter banks. The hexagonal filter bank provides an image decomposition similar to the one the eye is believed to perform. Consequently, the image coder is able to exploit the properties of the human visual system and produce compressed images of high quality at low bit rates. We present a systematic approach for optimal allocation of the available bits among the subbands and for selecting the size of the vectors in each subband.
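As a rough illustration of what bit allocation among subbands involves, the sketch below applies the textbook high-resolution rule, in which each subband receives the average rate plus half the base-2 logarithm of its variance relative to the geometric mean of all variances. This is an assumption chosen for illustration and is not necessarily the allocation procedure of the paper; the variances and the function name `allocate_bits` are hypothetical.

```python
import math

def allocate_bits(variances, avg_bits):
    """Textbook rule: b_k = avg_bits + 0.5 * log2(var_k / geometric_mean(var))."""
    log_gm = sum(math.log2(v) for v in variances) / len(variances)
    return [avg_bits + 0.5 * (math.log2(v) - log_gm) for v in variances]

# Hypothetical subband variances, from the most to the least energetic band.
subband_variances = [16.0, 4.0, 1.0, 0.25]
bits = allocate_bits(subband_variances, avg_bits=2.0)
print(bits)  # [3.5, 2.5, 1.5, 0.5]
```

The allocations always sum to the total budget (here 4 bands at 2 bits on average). For widely spread variances the rule can return negative values, which a practical coder would need to clip and redistribute.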

    Efficient Multiband Algorithms for Blind Source Separation

    The problem of blind separation refers to recovering original signals, called source signals, from mixed signals, called observation signals, in a reverberant environment. The mixture is a function of a sequence of original speech signals mixed in a reverberant room. The objective is to separate the mixed signals and obtain the original signals without degradation and without prior information about the features of the sources. The strategy used to achieve this objective is to use multiple bands that work at a lower rate, with lower computational cost and quicker convergence than the conventional scheme. Our motivation is the competitive convergence speed reported for applications of unequal-passbands schemes. The objective of this research is to improve unequal-passbands schemes by increasing the speed of convergence and reducing the computational cost. The first proposed work is a novel maximally decimated unequal-passbands scheme. This scheme uses multiple bands, which lets it work at a reduced sampling rate and low computational cost. An adaptation approach is derived with an adaptation step that improves the convergence speed. The performance of the proposed scheme was measured in different ways. First, the mean square errors of various bands are measured and the results are compared to a maximally decimated equal-passbands scheme, which is currently the best performing method. The results show that the proposed scheme has a faster convergence rate than the maximally decimated equal-passbands scheme. Second, when the scheme is tested for white and coloured inputs using a low number of bands, it does not yield good results; but when the number of bands is increased, the speed of convergence is enhanced. Third, the scheme is tested for quick changes. It is shown that the performance of the proposed scheme is similar to that of the equal-passbands scheme. Fourth, the scheme is also tested in a stationary state.
The experimental results confirm the theoretical work. For more challenging scenarios, an unequal-passbands scheme with over-sampled decimation is proposed; the greater the number of bands, the more efficient the separation. The results are compared to the currently best performing method. Second, an experimental comparison is made between the proposed multiband scheme and the conventional scheme. The results show that the convergence speed and the signal-to-interference ratio of the proposed scheme are higher than those of the conventional scheme, and that its computational cost is lower.

    Doubly Orthogonal Wavelet Packets for Multi-Users Indoor Visible Light Communication Systems

    Visible Light Communication (VLC) is a data communication technology that modulates the intensity of light to transmit information, mostly by means of Light Emitting Diodes (LEDs). The data rate is mainly throttled by the limited bandwidth of the LEDs. To combat this limitation, Multi-carrier Code Division Multiple Access (MC-CDMA) is a favorable technique for achieving higher data rates along with reduced Inter-Symbol Interference (ISI) and easy access for multiple users, at the cost of slightly compromised spectral efficiency and Multiple Access Interference (MAI). In this article, a multi-user VLC system is designed using a Discrete Wavelet Transform (DWT), which eliminates the need for a cyclic prefix thanks to the good orthogonality and time-frequency localization properties of wavelets. Moreover, the design also comprises suitable signature codes, which are generated by employing double orthogonality based on Walsh codes and Wavelet Packets. The proposed multi-user system is simulated in MATLAB and its overall performance is assessed using line-of-sight (LoS) and non-line-of-sight (NLoS) configurations. Furthermore, two sub-optimum multi-user detection schemes, zero forcing (ZF) and minimum mean square error (MMSE), are used at the receiver. The simulated results illustrate that the doubly orthogonal signature waveform-based DWT-MC-CDMA with the MMSE detection scheme outperforms the Walsh code-based multi-user system.
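The role of orthogonal signature codes can be illustrated with plain Walsh codes. The sketch below is a simplified assumption, not the doubly orthogonal design of the article: it builds a Sylvester-type Hadamard matrix (whose rows are Walsh functions in natural order), spreads two hypothetical BPSK symbols over an ideal noiseless channel, and recovers them by correlation. The wavelet packet stage, the optical channel, and the ZF/MMSE detectors are all omitted.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

codes = hadamard(4)

# Every pair of distinct rows is orthogonal.
orthogonal = all(dot(codes[i], codes[j]) == 0
                 for i in range(4) for j in range(4) if i != j)

# Two hypothetical users send one BPSK symbol each on different codes.
symbols = [+1, -1]
user_codes = [codes[1], codes[2]]
chips = [sum(s * c[i] for s, c in zip(symbols, user_codes)) for i in range(4)]

# Despreading: correlate the summed chips with each user's code.
recovered = [dot(chips, c) // 4 for c in user_codes]
print(orthogonal, recovered)  # True [1, -1]
```

On a dispersive channel the codes lose their orthogonality at the receiver, which is the multiple access interference that detectors such as ZF and MMSE are meant to mitigate.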

    Sparse representation for audio noise removal using zero-zone quantizers

    In zero-zone quantization, bins around zero are quantized to a zero value. This kind of quantization can be applied to the coefficients of orthogonal transforms to remove the unwanted or redundant part of a signal. Transforms reveal structures and properties of a signal, and hence careful application of a zero zone over the transform coefficients leads to noise removal. In this thesis, such quantizers are applied over Discrete Fourier Transform and Karhunen-Loève Transform coefficients separately, and the outputs are compared. Further, the localization of the zero zones to certain frequencies leads to better performance in terms of noise removal. PEAQ (Perceptual Evaluation of Audio Quality) scores have been used to measure the objective quality of the denoised signal.
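A minimal sketch of the zero-zone idea on DFT coefficients follows. It assumes a toy setup that is not taken from the thesis: a 16-sample cosine, a small deterministic perturbation standing in for noise, and a hand-picked threshold. Coefficients outside the zero zone are left unquantized here, so only the zeroing step is shown.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse DFT, returning the real part."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def zero_zone(coeffs, threshold):
    """Set every coefficient whose magnitude is within the zero zone to zero."""
    return [c if abs(c) > threshold else 0j for c in coeffs]

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

N = 16
clean = [math.cos(2 * math.pi * 2 * t / N) for t in range(N)]
noise = [0.02 * ((7 * t) % 5 - 2) for t in range(N)]  # hypothetical small disturbance
noisy = [c + w for c, w in zip(clean, noise)]

kept = zero_zone(dft(noisy), threshold=1.0)
denoised = idft(kept)

surviving_bins = [k for k, c in enumerate(kept) if c != 0]
print(surviving_bins)  # [2, 14]
print(mse(denoised, clean) < mse(noisy, clean))  # True
```

Only the two bins carrying the cosine survive, so the disturbance spread over the other bins is discarded. Choosing the threshold per frequency region corresponds to the localization of zero zones mentioned in the abstract.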

    An Investigation of Orthogonal Wavelet Division Multiplexing Techniques as an Alternative to Orthogonal Frequency Division Multiplex Transmissions and Comparison of Wavelet Families and Their Children

    Recently, issues surrounding wireless communications have risen to prominence because of the increase in the popularity of wireless applications. Bandwidth problems, and the difficulty of modulating signals across carriers, represent significant challenges. Every modulation scheme used to date has had limitations, and the use of the Discrete Fourier Transform in OFDM (Orthogonal Frequency Division Multiplex) is no exception. The restriction on further development of OFDM lies primarily in the type of transform at the heart of the system, the Fourier transform. OFDM suffers from sensitivity to Peak-to-Average Power Ratio and carrier frequency offset, and it wastes some bandwidth to guard successive OFDM symbols. The discovery of the wavelet transform has opened up a number of potential applications, from image compression to watermarking and encryption. Very recently, work has been done to investigate the potential of using wavelet transforms within the communication space. This research further investigates a recently proposed, innovative modulation technique, Orthogonal Wavelet Division Multiplex, which utilises the wavelet transform and opens a new avenue for an alternative modulation scheme with some interesting potential characteristics. The wavelet transform has many families, and each of those families has children that differ in filter length. This research comprehensively investigates the new modulation scheme and proposes multi-level dynamic sub-banding as a tool to adapt to variable signal bandwidths. Furthermore, all compactly supported wavelet families and their associated children are investigated, evaluated against each other and compared with OFDM. The linear, O(N), computational complexity of the wavelet transform is lower than the O(N log N) complexity of the fast Fourier transform in OFDM.
More important is the operational complexity, which determines cost effectiveness: the time response of the system, the memory consumption and the number of iterative operations required for data processing. These complexities are investigated for all available compactly supported wavelet families and their children and compared with OFDM. The evaluation reveals which wavelet families perform more effectively than OFDM and, for each wavelet family, identifies which children perform best. Based on these results, it is concluded that the wavelet modulation scheme has some interesting advantages over OFDM, such as lower complexity and bandwidth conservation of up to 25%, due to the elimination of guard intervals and dynamic bandwidth allocation, which result in better cost effectiveness.
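The basic mechanism of wavelet-based multiplexing can be sketched with the simplest compactly supported wavelet, the Haar wavelet. This is an illustrative assumption and not the scheme evaluated in the thesis: data symbols are treated as wavelet coefficients, the inverse transform produces the transmitted samples, and the forward transform at the receiver recovers the symbols over an ideal channel. The function names and the symbol values are hypothetical.

```python
import math

S = math.sqrt(2.0)

def haar_forward(x):
    """Multi-level orthonormal Haar DWT; length must be a power of two."""
    out = list(x)
    n = len(out)
    while n > 1:
        half = n // 2
        approx = [(out[2 * i] + out[2 * i + 1]) / S for i in range(half)]
        detail = [(out[2 * i] - out[2 * i + 1]) / S for i in range(half)]
        out[:n] = approx + detail
        n = half
    return out

def haar_inverse(c):
    """Inverse of haar_forward, rebuilding from the coarsest level upwards."""
    out = list(c)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged += [(a + d) / S, (a - d) / S]
        out[:2 * n] = merged
        n *= 2
    return out

# Hypothetical block of eight BPSK symbols placed on the wavelet coefficients.
symbols = [1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0, -1.0]
tx = haar_inverse(symbols)   # transmitter: inverse transform
rx = haar_forward(tx)        # receiver: forward transform, ideal channel

print(all(abs(a - b) < 1e-9 for a, b in zip(rx, symbols)))  # True
```

Each level halves the amount of data still to be processed, so the total work is proportional to the block length, which is the linear complexity referred to above. Longer filters from other wavelet families change the constant factor but not this scaling.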