95 research outputs found

    Determining number of independent sources in undercomplete mixture

    Separation of independent sources using independent component analysis (ICA) requires prior knowledge of the number of independent sources. Performing ICA when the number of recordings is greater than the number of sources can give erroneous results. To improve the quality of separation, the most suitable recordings have to be identified before performing ICA. Techniques employed to select suitable recordings require an estimate of the number of independent sources, or require repeated iterations. However, there has been no objective measure of the number of independent sources in a given mixture. Here, a technique has been developed to determine the number of independent sources in a given mixture. This paper demonstrates that the normalised determinant of the global matrix is a measure of the number of independent sources, N, in a mixture of M recordings. It is also shown that performing ICA on N randomly selected recordings out of the M recordings gives good quality of separation.
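    The global matrix referred to here is the product of the estimated unmixing matrix and the true mixing matrix; for a clean separation it reduces to a scaled permutation, whose row-normalised determinant has magnitude near one. A minimal sketch of this diagnostic on synthetic data (assuming scikit-learn's FastICA; the source count, mixing matrix, and signals are illustrative choices, not the paper's setup):

    ```python
    import numpy as np
    from sklearn.decomposition import FastICA

    rng = np.random.default_rng(0)

    # N = 2 independent sources, M = 4 recordings (undercomplete mixture).
    t = np.linspace(0, 1, 2000)
    S = np.c_[np.sin(2 * np.pi * 5 * t),
              np.sign(np.sin(2 * np.pi * 3 * t))]
    A = rng.standard_normal((4, 2))            # mixing matrix, M x N
    X = S @ A.T                                # recordings, samples x M

    W = FastICA(n_components=2, random_state=0).fit(X).components_  # N x M
    G = W @ A                                  # global matrix, N x N
    G_norm = G / np.abs(G).max(axis=1, keepdims=True)  # row-normalise
    print(abs(np.linalg.det(G_norm)))          # magnitude near 1 for clean separation
    ```

    A separation that leaves the sources mixed would push rows of the global matrix toward linear dependence, driving this determinant toward zero.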

    Iterative issues of ICA, quality of separation and number of sources: a study for biosignal applications

    This thesis has evaluated the use of Independent Component Analysis (ICA) on surface electromyography (sEMG), focusing on biosignal applications. This research has identified and addressed the following four issues related to the use of ICA for biosignals:
    • The iterative nature of ICA
    • The order and magnitude ambiguity problems of ICA
    • Estimation of the number of sources based on the dependence and independence properties of the signals
    • Source separation for non-quadratic ICA (undercomplete and overcomplete)
    This research first establishes the applicability of ICA for sEMG and identifies the shortcomings related to order and magnitude ambiguity. It then develops a mitigation strategy for these issues by using a single unmixing matrix and a neural-network weight matrix corresponding to the specific user. The research reports experimental verification of the technique and investigates the impact of inter-subject and inter-experiment variations. The results demonstrate that while using sEMG without separation gives only 60% accuracy, and sEMG separated using traditional ICA gives 65%, this approach gives 99% accuracy on the same experimental data. Besides the marked improvement in accuracy, the system is suitable for real-time operation and is easy for a lay user to train. The second part of this thesis evaluates the use of ICA for the separation of bioelectric signals when the number of active sources may not be known. The work proposes using the value of the determinant of the global matrix, generated using sparse sub-band ICA, to identify the number of active sources. The results indicate that the technique successfully identifies the number of active muscles for complex hand gestures, supporting applications such as human-computer interfaces.
    This thesis has also developed a method of determining the number of independent sources in a given mixture and has demonstrated that, using this information, it is possible to separate the signals in an undercomplete situation and reduce redundancy in the data using standard ICA methods. Experimental verification has shown that the quality of separation using this method is better than that of other techniques such as Principal Component Analysis (PCA) and selective PCA. This has a number of applications, such as audio separation and sensor networks.
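    The undercomplete reduction described above can be sketched in a few lines: once the number of sources N is known, the M redundant recordings are collapsed to N dimensions before unmixing. The sketch below uses a PCA projection followed by standard ICA (scikit-learn assumed; the signals, dimensions, and use of PCA as the reduction step are illustrative, not the thesis's selective method):

    ```python
    import numpy as np
    from sklearn.decomposition import FastICA, PCA

    rng = np.random.default_rng(1)

    # N = 2 sources observed through M = 6 redundant recordings.
    t = np.linspace(0, 1, 2000)
    S = np.c_[np.sin(2 * np.pi * 7 * t),
              np.sign(np.sin(2 * np.pi * 2 * t))]
    A = rng.standard_normal((6, 2))            # mixing matrix, M x N
    X = S @ A.T                                # recordings, samples x M

    # Collapse the redundancy: project the M recordings onto N
    # principal components, then unmix with standard ICA.
    X_red = PCA(n_components=2).fit_transform(X)
    S_hat = FastICA(random_state=0).fit_transform(X_red)

    # Each estimated source should match one true source
    # up to permutation, sign and scale.
    C = np.abs(np.corrcoef(S_hat.T, S.T)[:2, 2:])
    print(C.max(axis=1))                       # both entries close to 1
    ```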

    When Are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity

    Overcomplete latent representations have been very popular for unsupervised feature learning in recent years. In this paper, we specify which overcomplete models can be identified given observable moments of a certain order. We consider probabilistic admixture or topic models in the overcomplete regime, where the number of latent topics can greatly exceed the size of the observed word vocabulary. While general overcomplete topic models are not identifiable, we establish generic identifiability under a constraint, referred to as topic persistence. Our sufficient conditions for identifiability involve a novel set of "higher order" expansion conditions on the topic-word matrix or the population structure of the model. This set of higher-order expansion conditions allows for overcomplete models, and requires the existence of a perfect matching from latent topics to higher order observed words. We establish that random structured topic models are identifiable with high probability (w.h.p.) in the overcomplete regime. Our identifiability results allow for general (non-degenerate) distributions for modeling the topic proportions, and thus we can handle arbitrarily correlated topics in our framework. Our identifiability results imply uniqueness of a class of tensor decompositions with structured sparsity which is contained in the class of Tucker decompositions, but is more general than the Candecomp/Parafac (CP) decomposition.
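    The containment mentioned in the last sentence can be made concrete numerically: a CP decomposition is exactly a Tucker decomposition whose core tensor is superdiagonal. A small sketch in plain NumPy (the dimensions and rank are arbitrary illustrative choices):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Tucker decomposition: a core tensor G contracted with factor matrices.
    A, B, C = (rng.standard_normal((5, 3)) for _ in range(3))
    G = rng.standard_normal((3, 3, 3))
    T_tucker = np.einsum('ijk,ai,bj,ck->abc', G, A, B, C)

    # CP (Candecomp/Parafac) is the special case of a superdiagonal core.
    G_cp = np.zeros((3, 3, 3))
    for r in range(3):
        G_cp[r, r, r] = 1.0
    T_cp = np.einsum('ijk,ai,bj,ck->abc', G_cp, A, B, C)

    # Equivalent rank-3 CP form: an explicit sum of outer products.
    T_cp2 = sum(np.einsum('a,b,c->abc', A[:, r], B[:, r], C[:, r])
                for r in range(3))
    assert np.allclose(T_cp, T_cp2)
    ```

    A general (dense) core couples every triple of factor columns, which is what makes the Tucker class strictly larger than CP.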

    Deep Learning in Single-Cell Analysis

    Single-cell technologies are revolutionizing the entire field of biology. The large volumes of data generated by single-cell technologies are high-dimensional, sparse, and heterogeneous, and have complicated dependency structures, making analyses using conventional machine learning approaches challenging and impractical. In tackling these challenges, deep learning often demonstrates superior performance compared to traditional machine learning methods. In this work, we give a comprehensive survey of deep learning in single-cell analysis. We first introduce background on single-cell technologies and their development, as well as fundamental concepts of deep learning, including the most popular deep architectures. We present an overview of the single-cell analytic pipeline pursued in research applications, noting divergences due to data sources or specific applications. We then review seven popular tasks spanning different stages of the single-cell analysis pipeline: multimodal integration, imputation, clustering, spatial domain identification, cell-type deconvolution, cell segmentation, and cell-type annotation. Under each task, we describe the most recent developments in classical and deep learning methods and discuss their advantages and disadvantages. Deep learning tools and benchmark datasets are also summarized for each task. Finally, we discuss future directions and the most recent challenges. This survey will serve as a reference for biologists and computer scientists and encourage collaborations.

    Cluster-Based Supervised Classification


    Acta Cybernetica: Volume 19, Number 1.


    Informative sensing : theory and applications

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from the PDF version of the thesis. Includes bibliographical references (p. 145-156).
    Compressed sensing is a recent theory for the sampling and reconstruction of sparse signals. Sparse signals occupy only a tiny fraction of the entire signal space and thus carry a small amount of information relative to their dimension. The theory tells us that this information can be captured faithfully with few random measurement samples, even far below the Nyquist rate. Despite this success, we question how the theory would change if we had a more precise prior than the simple sparsity model. Hence, we consider settings where the prior is encoded as a probability density. From a Bayesian perspective, we see signal recovery as inference, in which we estimate the unmeasured dimensions of the signal given the incomplete measurements. We claim that good sensors should be designed to minimize the uncertainty of this inference. In this thesis, we primarily use Shannon's entropy to measure the uncertainty and in effect pursue the InfoMax principle, rather than the restricted isometry property, in optimizing the sensors. By approximate analysis of sparse signals, we found random projections, typical in the compressed sensing literature, to be InfoMax optimal if the sparse coefficients are independent and identically distributed (i.i.d.). If not, however, we could find a different set of projections which, in signal reconstruction, consistently outperformed random or other types of measurements. In particular, if the coefficients are groupwise i.i.d., groupwise random projections with a nonuniform sampling rate per group prove asymptotically InfoMax optimal. Such a groupwise i.i.d. pattern appears roughly in natural images when the wavelet basis is partitioned into groups according to scale.
    Consequently, we applied the groupwise random projections to the sensing of natural images. We also considered designing an optimal color filter array for single-chip cameras. In this case, the feasible set of projections is highly restricted because multiplexing across pixels is not allowed. Nevertheless, our principle still applies. By minimizing the uncertainty of the unmeasured colors given the measured ones, we could find new color filter arrays which showed better demosaicking performance than Bayer and other existing color filter arrays.
    by Hyun Sung Chang. Ph.D.
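    The groupwise idea can be illustrated on a toy signal: when one coefficient group carries far more energy than another, spending more measurements on the high-variance group reduces reconstruction error under a fixed budget. A rough sketch in plain NumPy (the block-diagonal measurement matrix, group sizes, and minimum-norm decoder are simplifying assumptions, not the thesis's InfoMax-optimized design):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def block_projection(m1, m2, n1, n2, rng):
        """Independent random projections per group (block-diagonal Phi)."""
        Phi = np.zeros((m1 + m2, n1 + n2))
        Phi[:m1, :n1] = rng.standard_normal((m1, n1))
        Phi[m1:, n1:] = rng.standard_normal((m2, n2))
        return Phi

    def rel_error(Phi, x):
        """Relative error of the minimum-norm reconstruction from y = Phi x."""
        x_hat = np.linalg.pinv(Phi) @ (Phi @ x)
        return np.linalg.norm(x - x_hat) / np.linalg.norm(x)

    # Two coefficient groups with very different variances, a rough
    # stand-in for coarse vs fine wavelet scales of a natural image.
    n = 64
    x = np.concatenate([rng.normal(0, 10.0, n),    # coarse: high variance
                        rng.normal(0, 0.1, n)])    # fine: low variance

    # Nonuniform sampling rate (more measurements on the high-variance
    # group) versus uniform rate, same total budget of 64 measurements.
    err_nonuniform = rel_error(block_projection(48, 16, n, n, rng), x)
    err_uniform = rel_error(block_projection(32, 32, n, n, rng), x)
    print(err_nonuniform, err_uniform)   # nonuniform budget loses less energy
    ```

    The minimum-norm decoder simply drops the energy in the null space of each block, so the error comparison isolates the effect of the measurement allocation.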

    Independent Component Analysis in a convoluted world


    Separation of Synchronous Sources

    This thesis studies the Separation of Synchronous Sources (SSS) problem, which deals with the separation of signals resulting from a linear mixing of sources whose phases are synchronous. While the study is made in a form independent of the application, a motivation from a neuroscience perspective is presented. Traditional methods for Blind Source Separation, such as Independent Component Analysis (ICA), cannot address this problem because synchronous sources are highly dependent. We provide sufficient conditions for SSS to be an identifiable problem, and quantify the effect of prewhitening on the difficulty of SSS. We also present two algorithms to solve SSS. Extensive studies on simulated data illustrate that these algorithms yield substantially better results than ICA methods. We conclude that these algorithms can successfully perform SSS in varying configurations (number of sources, number of sensors, level of additive noise, phase lag between sources, among others). Theoretical properties of one of these algorithms are also presented. Future work is discussed extensively, showing that this area of study is far from resolved and still presents interesting challenges.
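    Why ICA's independence assumption fails here can be seen directly: phase-synchronous sources are strongly correlated even when their amplitude envelopes vary independently. A toy sketch in plain NumPy (the oscillation frequency and envelopes are arbitrary illustrative choices, not the thesis's model):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    t = np.linspace(0, 1, 4000)
    phase = 2 * np.pi * 10 * t               # common phase trajectory

    # Two synchronous sources: identical instantaneous phase, different
    # slowly varying amplitude envelopes (a toy stand-in for
    # phase-locked neural oscillations).
    a1 = 1.0 + 0.5 * np.sin(2 * np.pi * 1 * t + rng.uniform(0, 2 * np.pi))
    a2 = 1.0 + 0.5 * np.sin(2 * np.pi * 2 * t + rng.uniform(0, 2 * np.pi))
    s1 = a1 * np.sin(phase)
    s2 = a2 * np.sin(phase)                  # zero phase lag

    # The sources are strongly dependent, so the independence
    # assumption behind ICA is violated from the start.
    corr = np.corrcoef(s1, s2)[0, 1]
    print(corr)                              # far from 0
    ```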