11,948 research outputs found

    Tumor Classification Using High-Order Gene Expression Profiles Based on Multilinear ICA

    Get PDF
    Motivation. Independent Components Analysis (ICA) maximizes the statistical independence of the representational components of a training gene expression profiles (GEP) ensemble, but it cannot distinguish relations between the different factors, or different modes, and it is not available to high-order GEP Data Mining. In order to generalize ICA, we introduce Multilinear-ICA and apply it to tumor classification using high order GEP. Firstly, we introduce the basis conceptions and operations of tensor and recommend Support Vector Machine (SVM) classifier and Multilinear-ICA. Secondly, the higher score genes of original high order GEP are selected by using t-statistics and tabulate tensors. Thirdly, the tensors are performed by Multilinear-ICA. Finally, the SVM is used to classify the tumor subtypes. Results. To show the validity of the proposed method, we apply it to tumor classification using high order GEP. Though we only use three datasets, the experimental results show that the method is effective and feasible. Through this survey, we hope to gain some insight into the problem of high order GEP tumor classification, in aid of further developing more effective tumor classification algorithms

    Tensor Decompositions for Signal Processing Applications From Two-way to Multiway Component Analysis

    Full text link
    The widespread use of multi-sensor technology and the emergence of big datasets has highlighted the limitations of standard flat-view matrix models and the necessity to move towards more versatile data analysis tools. We show that higher-order tensors (i.e., multiway arrays) enable such a fundamental paradigm shift towards models that are essentially polynomial and whose uniqueness, unlike the matrix methods, is guaranteed under verymild and natural conditions. Benefiting fromthe power ofmultilinear algebra as theirmathematical backbone, data analysis techniques using tensor decompositions are shown to have great flexibility in the choice of constraints that match data properties, and to find more general latent components in the data than matrix-based methods. A comprehensive introduction to tensor decompositions is provided from a signal processing perspective, starting from the algebraic foundations, via basic Canonical Polyadic and Tucker models, through to advanced cause-effect and multi-view data analysis schemes. We show that tensor decompositions enable natural generalizations of some commonly used signal processing paradigms, such as canonical correlation and subspace techniques, signal separation, linear regression, feature extraction and classification. We also cover computational aspects, and point out how ideas from compressed sensing and scientific computing may be used for addressing the otherwise unmanageable storage and manipulation problems associated with big datasets. The concepts are supported by illustrative real world case studies illuminating the benefits of the tensor framework, as efficient and promising tools for modern signal processing, data analysis and machine learning applications; these benefits also extend to vector/matrix data through tensorization. Keywords: ICA, NMF, CPD, Tucker decomposition, HOSVD, tensor networks, Tensor Train

    Discrete and Continuous Linearizable Equations

    Full text link
    We study the projective systems in both continuous and discrete settings. These systems are linearizable by construction and thus, obviously, integrable. We show that in the continuous case it is possible to eliminate all variables but one and reduce the system to a single differential equation. This equation is of the form of those singled-out by Painlev\'e in his quest for integrable forms. In the discrete case, we extend previous results of ours showing that, again by elimination of variables, the general projective system can be written as a mapping for a single variable. We show that this mapping is a member of the family of multilinear systems (which is not integrable in general). The continuous limit of multilinear mappings is also discussed.Comment: Plain Tex file, 14 pages, no figur

    Adaptive Higher-order Spectral Estimators

    Full text link
    Many applications involve estimation of a signal matrix from a noisy data matrix. In such cases, it has been observed that estimators that shrink or truncate the singular values of the data matrix perform well when the signal matrix has approximately low rank. In this article, we generalize this approach to the estimation of a tensor of parameters from noisy tensor data. We develop new classes of estimators that shrink or threshold the mode-specific singular values from the higher-order singular value decomposition. These classes of estimators are indexed by tuning parameters, which we adaptively choose from the data by minimizing Stein's unbiased risk estimate. In particular, this procedure provides a way to estimate the multilinear rank of the underlying signal tensor. Using simulation studies under a variety of conditions, we show that our estimators perform well when the mean tensor has approximately low multilinear rank, and perform competitively when the signal tensor does not have approximately low multilinear rank. We illustrate the use of these methods in an application to multivariate relational data.Comment: 29 pages, 3 figure

    Blind Multilinear Identification

    Full text link
    We discuss a technique that allows blind recovery of signals or blind identification of mixtures in instances where such recovery or identification were previously thought to be impossible: (i) closely located or highly correlated sources in antenna array processing, (ii) highly correlated spreading codes in CDMA radio communication, (iii) nearly dependent spectra in fluorescent spectroscopy. This has important implications --- in the case of antenna array processing, it allows for joint localization and extraction of multiple sources from the measurement of a noisy mixture recorded on multiple sensors in an entirely deterministic manner. In the case of CDMA, it allows the possibility of having a number of users larger than the spreading gain. In the case of fluorescent spectroscopy, it allows for detection of nearly identical chemical constituents. The proposed technique involves the solution of a bounded coherence low-rank multilinear approximation problem. We show that bounded coherence allows us to establish existence and uniqueness of the recovered solution. We will provide some statistical motivation for the approximation problem and discuss greedy approximation bounds. To provide the theoretical underpinnings for this technique, we develop a corresponding theory of sparse separable decompositions of functions, including notions of rank and nuclear norm that specialize to the usual ones for matrices and operators but apply to also hypermatrices and tensors.Comment: 20 pages, to appear in IEEE Transactions on Information Theor

    Bayesian Robust Tensor Factorization for Incomplete Multiway Data

    Full text link
    We propose a generative model for robust tensor factorization in the presence of both missing data and outliers. The objective is to explicitly infer the underlying low-CP-rank tensor capturing the global information and a sparse tensor capturing the local information (also considered as outliers), thus providing the robust predictive distribution over missing entries. The low-CP-rank tensor is modeled by multilinear interactions between multiple latent factors on which the column sparsity is enforced by a hierarchical prior, while the sparse tensor is modeled by a hierarchical view of Student-tt distribution that associates an individual hyperparameter with each element independently. For model learning, we develop an efficient closed-form variational inference under a fully Bayesian treatment, which can effectively prevent the overfitting problem and scales linearly with data size. In contrast to existing related works, our method can perform model selection automatically and implicitly without need of tuning parameters. More specifically, it can discover the groundtruth of CP rank and automatically adapt the sparsity inducing priors to various types of outliers. In addition, the tradeoff between the low-rank approximation and the sparse representation can be optimized in the sense of maximum model evidence. The extensive experiments and comparisons with many state-of-the-art algorithms on both synthetic and real-world datasets demonstrate the superiorities of our method from several perspectives.Comment: in IEEE Transactions on Neural Networks and Learning Systems, 201
    corecore