8 research outputs found

    Enhanced IVA for audio separation in highly reverberant environments

    Blind Audio Source Separation (BASS), inspired by the "cocktail-party problem", has been a leading research application for blind source separation (BSS). This thesis concerns the enhancement of frequency domain convolutive blind source separation (FDCBSS) techniques for audio separation in highly reverberant room environments. Independent component analysis (ICA) is a higher order statistics (HOS) approach commonly used in the BSS framework. When applied to audio FDCBSS, ICA based methods suffer from the permutation problem across the frequency bins of each source. Independent vector analysis (IVA) is an FD-BSS algorithm that theoretically solves the permutation problem by using a multivariate source prior, where the sources are considered to be random vectors. The algorithm allows independence between multivariate source signals, and retains dependency between the source signals within each source vector. The source prior adopted to model the nonlinear dependency structure within the source vectors is crucial to the separation performance of the IVA algorithm. The focus of this thesis is on improving the separation performance of the IVA algorithm in the application of BASS. An alternative multivariate Student's t distribution is proposed as the source prior for the batch IVA algorithm. A Student's t probability density function can better model certain frequency domain speech signals due to its tail dependency property. Then, the nonlinear score function, for the IVA, is derived from the proposed source prior. A novel energy driven mixed super Gaussian and Student's t source prior is proposed for the IVA and FastIVA algorithms. The Student's t distribution, in the mixed source prior, can model the high amplitude data points whereas the super Gaussian distribution can model the lower amplitude information in the speech signals. 
The ratio of the two distributions can be adjusted according to the energy of the observed mixtures to adapt to different types of speech signals. A particular multivariate generalized Gaussian distribution is adopted as the source prior for the online IVA algorithm. The nonlinear score function derived from this proposed source prior contains fourth order relationships between different frequency bins, which provides a more informative and stronger dependency structure and thereby improves the separation performance. An adaptive learning scheme is developed to improve the performance of the online IVA algorithm. The scheme adjusts the learning rate as a function of proximity to the target solutions. The scheme is also accompanied by a novel switched source prior technique that takes the best performance properties of the super Gaussian source prior and the generalized Gaussian source prior as the algorithm converges. The methods and techniques proposed in this thesis are evaluated with real speech source signals in different simulated and real reverberant acoustic environments. A variety of measures are used within the evaluation criteria of the various algorithms. The experimental results demonstrate improved performance of the proposed methods and their robustness in a wide range of situations.
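
The abstract above contrasts a super Gaussian source prior with a multivariate Student's t prior. As a rough illustration only, the sketch below computes the score functions that follow from two textbook densities: a spherical Laplacian, p(y) ∝ exp(-||y||), and a multivariate Student's t, p(y) ∝ (1 + ||y||²/ν)^(-(ν+K)/2). The function names, the array layout (K frequency bins by T frames) and the default ν are assumptions made for this example; the exact forms derived in the thesis may differ, for instance by constant factors.

```python
import numpy as np

def score_super_gaussian(y):
    """Score y_k / ||y|| of the spherical Laplacian prior p(y) ∝ exp(-||y||).

    y has shape (K, T): K frequency bins of one source, T time frames.
    """
    norm = np.sqrt(np.sum(np.abs(y) ** 2, axis=0, keepdims=True))
    return y / np.maximum(norm, 1e-12)  # guard against all-zero frames

def score_student_t(y, nu=4.0):
    """Score of p(y) ∝ (1 + ||y||^2 / nu)^(-(nu + K) / 2), up to a constant.

    nu (degrees of freedom) is an illustrative default, not a value
    taken from the thesis.
    """
    K = y.shape[0]
    sq_norm = np.sum(np.abs(y) ** 2, axis=0, keepdims=True)
    return (nu + K) / nu * y / (1.0 + sq_norm / nu)

# Example: one source with 8 frequency bins and 5 frames of complex data.
rng = np.random.default_rng(0)
y = rng.standard_normal((8, 5)) + 1j * rng.standard_normal((8, 5))
phi_sg = score_super_gaussian(y)
phi_t = score_student_t(y)
```

Both scores couple every frequency bin of a source through the shared norm ||y||, which is how IVA keeps the bins of one source together. The Student's t score shrinks for high-amplitude frames, while the super Gaussian score always has unit norm.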

    Single channel blind source separation

    Single channel blind source separation (SCBSS) is an intensively researched field with numerous important applications. This research sets out to investigate the separation of monaural mixed audio recordings without relying on training knowledge. This research proposes a novel method based on variable regularised sparse nonnegative matrix factorization, which decomposes an information-bearing matrix into a two-dimensional convolution of factor matrices that represent the spectral basis and temporal code of the sources. In this work, a variational Bayesian approach has been developed for computing the sparsity parameters of the matrix factorization. To further improve on the previous work, this research proposes a new method based on decomposing the mixture into a series of oscillatory components termed intrinsic mode functions (IMFs). It is shown that IMFs have several desirable properties unique to the SCBSS problem, and how these properties can be exploited to relax the constraints posed by the problem. In addition, this research develops a novel method for feature extraction using a psycho-acoustic model. The monaural mixed signal is transformed to a cochleagram using the gammatone filterbank, whose bandwidths increase incrementally as the center frequency increases, thus resulting in non-uniform time-frequency (TF) resolution in the analysis of the audio signal. Within this domain, a family of novel two-dimensional matrix factorizations based on the Itakura-Saito (IS) divergence has been developed. The proposed matrix factorizations have the property of scale invariance, which enables lower energy components in the cochleagram to be treated with equal importance as the high energy ones. Results show that all the developed algorithms presented in this thesis have outperformed conventional methods.
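
The scale invariance mentioned above can be checked numerically. The sketch below uses the standard definition of the Itakura-Saito divergence, d_IS(x | y) = x/y - log(x/y) - 1 summed over elements, and compares it with the squared Euclidean distance. The function names and the toy vectors are assumptions made for this illustration, not code or data from the thesis.

```python
import numpy as np

def is_divergence(x, y):
    """Itakura-Saito divergence between positive arrays x and y."""
    ratio = x / y
    return np.sum(ratio - np.log(ratio) - 1.0)

def squared_euclidean(x, y):
    """Squared Euclidean distance, shown for contrast."""
    return np.sum((x - y) ** 2)

# Toy spectra (hypothetical values) and a low-energy copy scaled by 0.01.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 1.0, 4.0])
scale = 0.01

# IS depends only on the ratio x / y, so scaling both arguments leaves it unchanged.
is_full = is_divergence(x, y)
is_scaled = is_divergence(scale * x, scale * y)

# The Euclidean cost shrinks by scale**2, so low-energy components barely count.
eu_full = squared_euclidean(x, y)
eu_scaled = squared_euclidean(scale * x, scale * y)
```

Because the IS cost depends only on the ratio of its arguments, a quiet time-frequency region contributes as much to the factorization objective as a loud one with the same relative error.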

    Réduction d'interférence dans les systèmes de transmission sans fil (Interference reduction in wireless transmission systems)

    Wireless communications have seen exponential growth and rapid progress over the past few decades. Wireless mobile communications have evolved over time, starting with the first generation, primarily developed for voice communications, and reaching the fourth generation, referred to as long term evolution (LTE), which offers increased capacity and speed using a different radio interface together with core network improvements. Overall throughput and transmission reliability are among the essential measures of service quality in a wireless system. Such measures are mainly subject to interference management constraints in a multi-user network. Interference management is at the heart of wireless regulation and is essential for maintaining a desirable throughput while avoiding the detrimental impact of interference at the undesired receivers. Our work falls within the framework of interference networks, where each user is equipped with single or multiple antennas. The goal is to resolve the challenges that the communications face, taking into account the achievable rate and the complexity cost. We propose several solutions for the precoding and decoding designs when transmitters have limited cooperation, based on a technique called Interference Alignment. We also address the detection scheme in the absence of any precoding design, and we introduce a low complexity detection scheme based on sparse decomposition.
    Wireless mobile communications have grown tremendously over the past few decades. It all began with the voice services offered by first-generation systems in 1980, and has led to today's fourth-generation systems, with high-speed internet services and a growing number of users. The essential characteristics that define the services and their quality in wireless communication systems are throughput, transmission reliability, and the number of users. These characteristics are closely linked to one another and depend on how interference between the different users is managed. Inter-user interference occurs when several transmitters in the same area transmit simultaneously using the same frequency band. In this thesis, we focus on managing inter-user interference through the interference alignment approach, where cooperation between users is limited. We also consider the design of a receiver where interference alignment is not used and where interference is managed by decoding techniques based on sparse decompositions of the communication signals. These approaches have led to efficient, low-cost methods that can be used on the uplink or the downlink.

    A new demixer scheme for blind source separation using general neural network model

    There has been a surge of interest in blind source separation (BSS) because of its potential applications in several areas of engineering and science, such as wireless systems. We propose a new neural network demixing scheme using a general neural network structure for the BSS problem with instantaneous mixtures. It is shown that the existing feedforward (FF) and feedback (FB) neural network schemes can be obtained as reductions of the new general model. The results demonstrate that the new scheme is more robust and offers superior convergence properties.
