
    Blind identification of mixtures of quasi-stationary sources.

    Get PDF
    Blind identification of linear instantaneous mixtures of quasi-stationary sources (BI-QSS) has received great research interest over the past few decades, motivated by its application in blind speech separation. In this problem, we identify the unknown mixing system coefficients by exploiting the time-varying characteristics of quasi-stationary sources. Traditional BI-QSS methods fall into two main categories: i) Parallel Factor Analysis (PARAFAC), which is based on tensor decomposition; and ii) Joint Diagonalization (JD), which is based on approximate joint diagonalization of multiple matrices. In both PARAFAC and JD, a joint-source formulation is generally used; i.e., the algorithms are designed to identify the whole mixing system simultaneously. In this thesis, I devise a novel blind identification framework using a Khatri-Rao (KR) subspace formulation. The proposed formulation differs from the traditional formulations in that it decomposes the blind identification problem into a number of per-source, structurally less complex subproblems. For overdetermined mixing models, a specialized alternating projections algorithm is proposed for the KR subspace formulation. The resulting algorithm is not only empirically found to be very competitive, but also has a theoretically neat convergence guarantee. Even better, the proposed algorithm can be applied to underdetermined mixing models in a straightforward manner. Rank minimization heuristics are proposed to speed up the algorithm for the underdetermined mixing model. The advantages of employing the rank minimization heuristics are demonstrated by simulations.
    Lee, Ka Kit. Thesis (M.Phil.)--Chinese University of Hong Kong, 2012. Includes bibliographical references (leaves 72-76). Abstracts also in Chinese.
    Contents: Abstract; Acknowledgement.
    1. Introduction.
    2. Settings of Quasi-Stationary Signals based Blind Identification: Signal Model; Assumptions; Local Covariance Model; Noise Covariance Removal; Prewhitening; Summary.
    3. Review on Some Existing BI-QSS Algorithms: Joint Diagonalization (Fast Frobenius Diagonalization [4]; Pham's JD [5, 6]); Parallel Factor Analysis (Tensor Decomposition [37]; Alternating-Columns Diagonal-Centers [12]; Trilinear Alternating Least-Squares [10, 11]); Summary.
    4. Proposed Algorithms: KR Subspace Criterion; Blind Identification using Alternating Projections (All-Columns Identification); Overdetermined Mixing Models (N > K): Prewhitened Alternating Projection Algorithm (PAPA); Underdetermined Mixing Models (N < K): Rank Minimization Heuristic, Alternating Projections Algorithm with Huber Function Regularization; Robust KR Subspace Extraction; Summary.
    5. Simulation Results: General Settings; Overdetermined Mixing Models (performance w.r.t. SNR, the number of available frames M, and the number of sources K); Underdetermined Mixing Models (success rate of KR Huber; performance w.r.t. SNR, M, and N); Summary.
    6. Conclusion and Future Works.
    Appendices: A. Convolutive Mixing Model; B. Proofs (Theorems 4.1 and 4.2; Observation 4.1; Proposition 4.1); C. Singular Value Thresholding; D. Categories of Speech Sounds and Their Impact on SOSs-based BI-QSS Algorithms (Vowels; Consonants; Silent Pauses).
    Bibliography.
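
    The local covariance model that BI-QSS methods build on, and the Khatri-Rao structure that the KR subspace criterion exploits, can be illustrated in a few lines: within a short frame m the sources are roughly stationary, so the local covariance is R_m ≈ A diag(d_m) A^H + σ²I, and vec(R_m) therefore lies in the span of the columns of the Khatri-Rao product of the (conjugate) mixing matrix with itself, plus the noise direction vec(I). The sketch below is a minimal illustration of that structure, not the thesis's algorithm; the real-valued mixing matrix, frame sizes, frame count and noise level are assumptions chosen for the toy example.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, M, T = 4, 3, 50, 500        # sensors, sources, frames, samples per frame

A = rng.standard_normal((N, K))   # illustrative (real-valued) mixing matrix

# Quasi-stationary sources: zero-mean Gaussian with a different power per frame.
powers = rng.uniform(0.2, 2.0, size=(M, K))
local_covs = []
for m in range(M):
    S = rng.standard_normal((K, T)) * np.sqrt(powers[m])[:, None]
    X = A @ S + 0.05 * rng.standard_normal((N, T))
    local_covs.append(X @ X.T / T)                     # local covariance estimate R_m

# Khatri-Rao structure: vec(R_m) ~ (A (.) A) d_m + sigma^2 vec(I), where (.) is
# the column-wise Kronecker (Khatri-Rao) product of the mixing matrix with itself.
KR = np.column_stack([np.kron(A[:, k], A[:, k]) for k in range(K)])
basis = np.column_stack([KR, np.eye(N).reshape(-1)])   # append the noise direction

r = np.stack([R.reshape(-1) for R in local_covs], axis=1)    # N^2 x M data matrix
coef, *_ = np.linalg.lstsq(basis, r, rcond=None)
residual = np.linalg.norm(basis @ coef - r) / np.linalg.norm(r)
print(f"relative part of vec(R_m) outside the KR subspace: {residual:.3f}")  # small
```

    The least-squares check above fits all columns of the basis jointly; the per-source subproblems of the KR subspace formulation come from instead treating each column a_k of A (equivalently each column a_k ⊗ a_k of the Khatri-Rao product) separately.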

    Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)

    Get PDF
    The implicit objective of the biennial "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST) is to foster collaboration between international scientific teams by disseminating ideas through both specific oral/poster presentations and free discussions. For its second edition, the iTWIST workshop took place in the medieval and picturesque town of Namur in Belgium, from Wednesday August 27th till Friday August 29th, 2014. The workshop was conveniently located in "The Arsenal" building, within walking distance of both hotels and the town center. iTWIST'14 gathered about 70 international participants and featured 9 invited talks, 10 oral presentations, and 14 posters on the following themes, all related to the theory, application and generalization of the "sparsity paradigm": Sparsity-driven data sensing and processing; Union of low-dimensional subspaces; Beyond linear and convex inverse problems; Matrix/manifold/graph sensing/processing; Blind inverse problems and dictionary learning; Sparsity and computational neuroscience; Information theory, geometry and randomness; Complexity/accuracy tradeoffs in numerical methods; Sparsity? What's next?; Sparse machine learning and inference. Comment: 69 pages, 24 extended abstracts, iTWIST'14 website: http://sites.google.com/site/itwist1

    Hybrid solutions to instantaneous MIMO blind separation and decoding: narrowband, QAM and square cases

    Get PDF
    Future wireless communication systems are expected to support high data rates and high-quality transmission, given the growth of multimedia applications. The drive to increase channel throughput has led to multiple-input multiple-output (MIMO) and blind equalization techniques in recent years, and blind MIMO equalization has therefore attracted great interest. Both system performance and computational complexity play important roles in real-time communications; reducing the computational load while providing accurate performance is the main challenge in present systems. In this thesis, a hybrid method is first proposed that offers affordable complexity with good performance for blind equalization in large-constellation MIMO systems. Computational cost is saved in both the signal separation stage and the signal detection stage. First, based on quadrature amplitude modulation (QAM) signal characteristics, an efficient and simple nonlinear function for independent component analysis (ICA) is introduced. Second, using the idea of sphere decoding, we restrict the soft channel information to a sphere, which overcomes the so-called curse of dimensionality of the Expectation-Maximization (EM) algorithm and enhances the final results at the same time. Mathematically, we demonstrate that in digital communication settings the EM algorithm exhibits Newton-like convergence. Despite the widespread use of forward-error coding (FEC), most multiple-input multiple-output (MIMO) blind channel estimation techniques ignore its presence, and instead make the simplifying assumption that the transmitted symbols are uncoded. However, FEC induces code structure in the transmitted sequence that can be exploited to improve blind MIMO channel estimates. In the final part of this work, we exploit iterative channel estimation and decoding for blind MIMO equalization. Experiments show the improvements achievable by exploiting the existence of coding structures, and that the method can approach the performance of a BCJR equalizer with perfect channel information in a reasonable SNR range. All results are confirmed experimentally for the example of blind equalization in block-fading MIMO systems.
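
    The abstract does not give the specific ICA nonlinearity or the EM/sphere-decoding detector, so the sketch below only illustrates the blind separation stage of such a receiver using a standard complex FastICA with a kurtosis-type contrast, a common choice for sub-Gaussian QAM signals; the 16-QAM constellation, the 2x2 channel and the noise level are assumptions for the toy example, not the thesis's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def qam16(n):
    """Random unit-power 16-QAM symbols (illustrative constellation)."""
    levels = np.array([-3.0, -1.0, 1.0, 3.0])
    return (rng.choice(levels, n) + 1j * rng.choice(levels, n)) / np.sqrt(10.0)

K, N, T = 2, 2, 5000                      # sources, receive antennas, symbols
S = np.vstack([qam16(T) for _ in range(K)])
A = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
X = A @ S + 0.01 * (rng.standard_normal((N, T)) + 1j * rng.standard_normal((N, T)))

# Pre-whitening.
Xc = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(Xc @ Xc.conj().T / T)
Z = (E @ np.diag(d ** -0.5) @ E.conj().T) @ Xc

# Complex FastICA (Bingham & Hyvarinen) with the kurtosis-type contrast g(u) = u,
# suited to sub-Gaussian QAM sources; components are found one by one (deflation).
W = np.zeros((K, N), dtype=complex)
for k in range(K):
    w = rng.standard_normal(N) + 1j * rng.standard_normal(N)
    w /= np.linalg.norm(w)
    for _ in range(200):
        y = w.conj() @ Z                  # current component estimate, length T
        u = np.abs(y) ** 2
        w_new = (Z * (np.conj(y) * u)).mean(axis=1) - 2.0 * u.mean() * w
        for j in range(k):                # deflation (Gram-Schmidt against found rows)
            w_new -= (W[j].conj() @ w_new) * W[j]
        w_new /= np.linalg.norm(w_new)
        converged = np.abs(np.abs(w_new.conj() @ w) - 1.0) < 1e-10
        w = w_new
        if converged:
            break
    W[k] = w

Y = W.conj() @ Z   # separated symbol streams, up to permutation and phase rotation
```

    The separated streams are recovered only up to a permutation and a per-stream phase rotation; in practice these ambiguities are resolved using the constellation symmetry, pilots, or the coding structure exploited in the final part of the work.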

    Unsupervised neural spike identification for large-scale, high-density micro-electrode arrays

    Get PDF
    This work deals with the development and evaluation of algorithms that extract sequences of single-neuron action potentials from extracellular recordings of superimposed neural activity, a task commonly referred to as spike sorting. Large (>10^3 electrodes) and dense (subcellular spatial sampling) CMOS-based micro-electrode arrays allow recording from hundreds of neurons simultaneously. State-of-the-art algorithms for up to a few hundred sensors are not directly applicable to this type of data, and promising modern spike sorting algorithms that seek the statistically optimal solution or focus on real-time capabilities need to be initialized with a preceding sorting. Therefore, this work focused on unsupervised solutions, in order to learn the number of neurons and their spike trains, with proper resolution of both temporally and spatiotemporally overlapping activity, from the extracellular data alone. Chapter 1 describes the nature of the data and a model-based view of it, and relates both to spike sorting in order to motivate the design decisions of this thesis. The main materials and methods chapter (Chapter 2) bundles the infrastructural work that is independent of, but mandatory for, the development and evaluation of any spike sorting method. The main problem was split into two parts. Chapter 3 addresses the problem of analyzing data from thousands of densely integrated channels in a divide-and-conquer fashion. Making use of the spatial information of dense 2D arrays, regions of interest (ROIs) with boundaries adapted to the electrical image of single or multiple neurons were automatically constructed. All ROIs could then be processed in parallel. Within each region of interest the maximum number of neurons could be estimated from the local data matrix alone. An independent component analysis (ICA) based sorting was used to identify units within ROIs; this stage can be replaced by another suitable spike sorting algorithm to solve the local problem. Redundantly identified units across different ROIs were automatically fused into a global solution. The framework was evaluated on both real and simulated recordings with ground truth. For the latter it was shown that a major fraction of units could be extracted without any error. The high-dimensional data can be visualized after automatic sorting for convenient verification, and means of rapidly separating well-isolated from poorly isolated neurons were proposed and evaluated. Chapter 4 presents a more sophisticated algorithm that was developed to solve the local problem of densely arranged sensors. ICA assumes the data to be instantaneously mixed, thereby reducing spatial redundancy only and ignoring the temporal structure of extracellular data. The widely accepted generative model describes the intracellular spike trains as being convolved with their extracellular spatiotemporal kernels. To account for this, it was assessed thoroughly whether convolutive ICA (cICA) could increase sorting performance over instantaneous ICA. The high computational complexity of cICA was dealt with by automatically identifying relevant subspaces that can be unmixed in parallel. Although convolutive ICA is suggested by the data model, the sorting results were dominated by the post-processing for realistic scenarios and did not outperform ICA-based sorting. Potential alternatives are discussed thoroughly and bounded from above by a supervised sorting.
    This work provides a completely unsupervised spike sorting solution that enables the extraction of a major fraction of neurons with high accuracy, and thereby helps to overcome current limitations of analyzing the high-dimensional datasets obtained from simultaneously imaging the extracellular activity of hundreds of neurons with thousands of electrodes.
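
    The generative model appealed to above, in which each neuron's spike train is convolved with its extracellular spatiotemporal kernel and the contributions of all neurons are summed along with sensor noise, can be written down directly. The forward-model simulation below is a minimal sketch of that convolutive mixture; the template shapes, firing probability, array size and noise level are illustrative assumptions, not parameters from the thesis.

```python
import numpy as np

rng = np.random.default_rng(1)
n_channels, n_neurons, T, L = 16, 3, 20_000, 40   # electrodes, units, samples, kernel length

# Hypothetical spatiotemporal kernels: one (n_channels x L) waveform per neuron,
# tapered with a Hann window so each template decays to zero at its edges.
templates = rng.standard_normal((n_neurons, n_channels, L)) * np.hanning(L)

# Sparse spike trains: Bernoulli firing with an illustrative probability per sample.
spikes = (rng.random((n_neurons, T)) < 1e-3).astype(float)

# Convolutive generative model: every channel is the sum over neurons of the
# spike train convolved with that neuron's kernel on that channel, plus noise.
X = np.zeros((n_channels, T))
for k in range(n_neurons):
    for c in range(n_channels):
        X[c] += np.convolve(spikes[k], templates[k, c], mode="full")[:T]
X += 0.1 * rng.standard_normal((n_channels, T))
```

    Instantaneous ICA models X as a single mixing matrix applied to the sources and therefore ignores the temporal extent L of the kernels; convolutive ICA matches the convolution above, which is why it is the model-matched, though computationally heavier, choice examined in Chapter 4.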

    Non-acyclicity of coset lattices and generation of finite groups

    Get PDF

    Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives

    Full text link
    Part 2 of this monograph builds on the introduction to tensor networks and their operations presented in Part 1. It focuses on tensor network models for super-compressed higher-order representation of data/parameters and related cost functions, while providing an outline of their applications in machine learning and data analytics. A particular emphasis is on the tensor train (TT) and Hierarchical Tucker (HT) decompositions, and their physically meaningful interpretations which reflect the scalability of the tensor network approach. Through a graphical approach, we also elucidate how, by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks have the ability to perform distributed computations on otherwise prohibitively large volumes of data/parameters, thereby alleviating or even eliminating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification (support tensor machines, canonical correlation analysis, higher order partial least squares), generalized eigenvalue decomposition, Riemannian optimization, and in the optimization of deep neural networks. Part 1 and Part 2 of this work can be used either as stand-alone separate texts, or indeed as a conjoint comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions. Comment: 232 pages
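
    Because the tensor train (TT) format is central to this monograph, a generic TT-SVD routine, which builds the TT cores by sequential truncated SVDs of successive unfoldings, is sketched below to make the format concrete; this is a standard textbook procedure applied to a toy tensor, not code or data from the monograph itself.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Decompose a dense tensor into tensor-train (TT) cores by sequential
    truncated SVDs of its unfoldings (the standard TT-SVD procedure)."""
    dims = tensor.shape
    cores, r_prev = [], 1
    C = tensor.reshape(dims[0], -1)
    for k in range(len(dims) - 1):
        C = C.reshape(r_prev * dims[k], -1)           # unfold: (r_{k-1} n_k) x rest
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = min(max_rank, len(s))                     # cap the TT rank
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        C = s[:r, None] * Vt[:r]                      # carry the remainder forward
        r_prev = r
    cores.append(C.reshape(r_prev, dims[-1], 1))
    return cores

def tt_to_full(cores):
    """Contract TT cores back into the full tensor (for verification only)."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.squeeze(axis=(0, -1))

# Toy check: with a large enough rank cap the decomposition is exact.
A = np.random.default_rng(0).standard_normal((6, 7, 8, 9))
cores = tt_svd(A, max_rank=64)
print([c.shape for c in cores])
print(np.linalg.norm(tt_to_full(cores) - A) / np.linalg.norm(A))  # close to machine precision
```

    Lowering max_rank below the exact TT ranks turns the same routine into a lossy compression scheme: storage drops from the product of all mode sizes to a sum of r_{k-1} n_k r_k terms, which is the super-compressed representation the monograph emphasizes.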

    Audio source separation for music in low-latency and high-latency scenarios

    Get PDF
    This thesis proposes specific methods to address the limitations of current music source separation methods in low-latency and high-latency scenarios. First, we focus on methods with low computational cost and low latency. We propose the use of Tikhonov regularization as a method for spectrum decomposition in the low-latency context, and compare it to existing techniques in pitch estimation and tracking tasks, which are crucial steps in many separation methods. We then use the proposed spectrum decomposition method in low-latency separation tasks targeting singing voice, bass and drums. Second, we propose several high-latency methods that improve the separation of singing voice by modeling components that are often not accounted for, such as breathiness and consonants. Finally, we explore using temporal correlations and human annotations to enhance the separation of drums and complex polyphonic music signals. (Abstract also in Catalan and Spanish.)
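
    The low-latency use of Tikhonov regularization for spectrum decomposition can be made concrete as a ridge regression of each magnitude-spectrum frame onto a dictionary of pitch templates, which has a closed-form per-frame solution and hence a fixed, small latency. The dictionary construction, regularization weight and toy data below are assumptions for illustration, not the thesis's exact formulation.

```python
import numpy as np

def tikhonov_decompose(x, B, lam=0.1):
    """Gains g minimizing ||x - B g||^2 + lam ||g||^2 for one spectrum frame x
    over a template dictionary B; closed form, so it suits frame-by-frame,
    low-latency processing."""
    K = B.shape[1]
    return np.linalg.solve(B.T @ B + lam * np.eye(K), B.T @ x)

# Toy dictionary: two normalized harmonic templates on a 256-bin spectrum.
F = 256
def harmonic_template(f0_bin, n_harm=5):
    t = np.zeros(F)
    for h in range(1, n_harm + 1):
        idx = int(round(h * f0_bin))
        if idx < F:
            t[idx] = 1.0 / h          # decaying harmonic amplitudes (assumed shape)
    return t / np.linalg.norm(t)

B = np.column_stack([harmonic_template(10.5), harmonic_template(14.0)])
rng = np.random.default_rng(2)
x = 0.8 * B[:, 0] + 0.3 * B[:, 1] + 0.01 * np.abs(rng.standard_normal(F))
print(tikhonov_decompose(x, B))   # close to the true gains [0.8, 0.3], slightly shrunk
```

    The appeal of the closed-form solution is its fixed per-frame cost; more elaborate iterative decompositions can improve separation quality at the price of computation and latency, which is the trade-off between the low-latency and high-latency scenarios the thesis targets.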