Tumor Classification Using High-Order Gene Expression Profiles Based on Multilinear ICA
Motivation. Independent Component Analysis (ICA) maximizes the statistical independence of the representational components of a training gene expression profile (GEP) ensemble, but it cannot distinguish relations between different factors, or different modes, and it does not extend to mining high-order GEP data. To generalize ICA, we introduce Multilinear-ICA and apply it to tumor classification using high-order GEPs. First, we introduce the basic concepts and operations of tensors, the Support Vector Machine (SVM) classifier, and Multilinear-ICA. Second, the highest-scoring genes of the original high-order GEPs are selected using t-statistics and tabulated as tensors. Third, Multilinear-ICA is applied to these tensors. Finally, an SVM is used to classify the tumor subtypes.
Results. To show the validity of the proposed method, we apply it to tumor classification using high-order GEPs. Although we use only three datasets, the experimental results show that the method is effective and feasible. Through this study, we hope to gain insight into the problem of high-order GEP tumor classification, in aid of developing more effective tumor classification algorithms.
Tensor Decompositions for Signal Processing Applications From Two-way to Multiway Component Analysis
The widespread use of multi-sensor technology and the emergence of big
datasets have highlighted the limitations of standard flat-view matrix models
and the necessity to move towards more versatile data analysis tools. We show
that higher-order tensors (i.e., multiway arrays) enable such a fundamental
paradigm shift towards models that are essentially polynomial and whose
uniqueness, unlike the matrix methods, is guaranteed under very mild and natural
conditions. Benefiting from the power of multilinear algebra as their mathematical
backbone, data analysis techniques using tensor decompositions are shown to
have great flexibility in the choice of constraints that match data properties,
and to find more general latent components in the data than matrix-based
methods. A comprehensive introduction to tensor decompositions is provided from
a signal processing perspective, starting from the algebraic foundations, via
basic Canonical Polyadic and Tucker models, through to advanced cause-effect
and multi-view data analysis schemes. We show that tensor decompositions enable
natural generalizations of some commonly used signal processing paradigms, such
as canonical correlation and subspace techniques, signal separation, linear
regression, feature extraction and classification. We also cover computational
aspects, and point out how ideas from compressed sensing and scientific
computing may be used for addressing the otherwise unmanageable storage and
manipulation problems associated with big datasets. The concepts are supported
by illustrative real world case studies illuminating the benefits of the tensor
framework, as efficient and promising tools for modern signal processing, data
analysis and machine learning applications; these benefits also extend to
vector/matrix data through tensorization.
Keywords: ICA, NMF, CPD, Tucker decomposition, HOSVD, tensor networks, Tensor Train
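As a concrete illustration of the Tucker/HOSVD model this survey covers, here is a minimal NumPy sketch of the truncated higher-order SVD; function names and tensor shapes are illustrative assumptions:

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move axis `mode` to the front, flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_multiply(T, M, mode):
    """Mode-n product T x_n M (contract M with the given axis of T)."""
    return np.moveaxis(np.tensordot(M, np.moveaxis(T, mode, 0), axes=1), 0, mode)

def hosvd(T, ranks):
    """Truncated HOSVD: T ~ core x_1 U1 x_2 U2 x_3 U3 (Tucker model)."""
    factors = [np.linalg.svd(unfold(T, m), full_matrices=False)[0][:, :r]
               for m, r in enumerate(ranks)]
    core = T
    for m, U in enumerate(factors):
        core = mode_multiply(core, U.T, m)  # project onto mode subspaces
    return core, factors

def tucker_to_tensor(core, factors):
    """Reassemble the full tensor from the core and factor matrices."""
    T = core
    for m, U in enumerate(factors):
        T = mode_multiply(T, U, m)
    return T
```

For a tensor whose multilinear rank matches `ranks` exactly, this projection reconstructs the input up to floating-point error.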
Discrete and Continuous Linearizable Equations
We study the projective systems in both continuous and discrete settings.
These systems are linearizable by construction and thus, obviously, integrable.
We show that in the continuous case it is possible to eliminate all variables
but one and reduce the system to a single differential equation. This equation
is of the form of those singled out by Painlevé in his quest for integrable
forms. In the discrete case, we extend previous results of ours showing that,
again by elimination of variables, the general projective system can be written
as a mapping for a single variable. We show that this mapping is a member of
the family of multilinear systems (which is not integrable in general). The
continuous limit of multilinear mappings is also discussed.
Comment: Plain TeX file, 14 pages, no figures
Adaptive Higher-order Spectral Estimators
Many applications involve estimation of a signal matrix from a noisy data
matrix. In such cases, it has been observed that estimators that shrink or
truncate the singular values of the data matrix perform well when the signal
matrix has approximately low rank. In this article, we generalize this approach
to the estimation of a tensor of parameters from noisy tensor data. We develop
new classes of estimators that shrink or threshold the mode-specific singular
values from the higher-order singular value decomposition. These classes of
estimators are indexed by tuning parameters, which we adaptively choose from
the data by minimizing Stein's unbiased risk estimate. In particular, this
procedure provides a way to estimate the multilinear rank of the underlying
signal tensor. Using simulation studies under a variety of conditions, we show
that our estimators perform well when the mean tensor has approximately low
multilinear rank, and perform competitively when the signal tensor does not
have approximately low multilinear rank. We illustrate the use of these methods
in an application to multivariate relational data.
Comment: 29 pages, 3 figures
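One simple member of this family of estimators, soft-thresholding each mode's singular values in turn, can be sketched as below; the adaptive SURE-based choice of the tuning parameters described in the abstract is omitted, and all names are illustrative:

```python
import numpy as np

def soft_threshold_mode(T, mode, lam):
    """Soft-threshold the singular values of one mode unfolding, then fold back."""
    M = np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s = np.maximum(s - lam, 0.0)               # shrink toward zero
    M_hat = (U * s) @ Vt
    shape = (T.shape[mode],) + tuple(np.delete(T.shape, mode))
    return np.moveaxis(M_hat.reshape(shape), 0, mode)

def multiway_shrink(Y, lams):
    """Apply mode-wise soft-thresholding sequentially over all modes."""
    X = Y
    for mode, lam in enumerate(lams):
        X = soft_threshold_mode(X, mode, lam)
    return X
```

With all thresholds at zero the data tensor is returned unchanged; large thresholds drive the estimate to zero, and intermediate values reduce the mode-specific singular values, favoring low multilinear rank.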
Blind Multilinear Identification
We discuss a technique that allows blind recovery of signals or blind
identification of mixtures in instances where such recovery or identification
were previously thought to be impossible: (i) closely located or highly
correlated sources in antenna array processing, (ii) highly correlated
spreading codes in CDMA radio communication, (iii) nearly dependent spectra in
fluorescent spectroscopy. This has important implications: in the case of
antenna array processing, it allows for joint localization and extraction of
multiple sources from the measurement of a noisy mixture recorded on multiple
sensors in an entirely deterministic manner. In the case of CDMA, it allows the
possibility of having a number of users larger than the spreading gain. In the
case of fluorescent spectroscopy, it allows for detection of nearly identical
chemical constituents. The proposed technique involves the solution of a
bounded coherence low-rank multilinear approximation problem. We show that
bounded coherence allows us to establish existence and uniqueness of the
recovered solution. We will provide some statistical motivation for the
approximation problem and discuss greedy approximation bounds. To provide the
theoretical underpinnings for this technique, we develop a corresponding theory
of sparse separable decompositions of functions, including notions of rank and
nuclear norm that specialize to the usual ones for matrices and operators but
apply also to hypermatrices and tensors.
Comment: 20 pages, to appear in IEEE Transactions on Information Theory
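The low-rank multilinear (CP) approximation at the heart of this technique can be illustrated with a plain alternating-least-squares sketch in NumPy; the paper's bounded-coherence constraint is not enforced here, and all names and sizes are illustrative:

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker (Khatri-Rao) product."""
    return np.einsum('ir,jr->ijr', A, B).reshape(-1, A.shape[1])

def unfold(T, mode):
    """Mode-n unfolding of a tensor."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def cp_als(T, rank, iters=500, seed=0):
    """Rank-`rank` CP approximation of a 3-way tensor by alternating LS."""
    rng = np.random.default_rng(seed)
    A, B, C = (rng.normal(size=(d, rank)) for d in T.shape)
    for _ in range(iters):
        # Each factor solves a linear least-squares problem given the others
        A = np.linalg.lstsq(khatri_rao(B, C), unfold(T, 0).T, rcond=None)[0].T
        B = np.linalg.lstsq(khatri_rao(A, C), unfold(T, 1).T, rcond=None)[0].T
        C = np.linalg.lstsq(khatri_rao(A, B), unfold(T, 2).T, rcond=None)[0].T
    return A, B, C
```

For a generic tensor of exact CP rank, the reconstruction `np.einsum('ir,jr,kr->ijk', A, B, C)` typically converges to the input; ALS can stall on harder instances, which is part of what motivates the coherence conditions studied in the paper.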
Bayesian Robust Tensor Factorization for Incomplete Multiway Data
We propose a generative model for robust tensor factorization in the presence
of both missing data and outliers. The objective is to explicitly infer the
underlying low-CP-rank tensor capturing the global information and a sparse
tensor capturing the local information (also considered as outliers), thus
providing the robust predictive distribution over missing entries. The
low-CP-rank tensor is modeled by multilinear interactions between multiple
latent factors on which the column sparsity is enforced by a hierarchical
prior, while the sparse tensor is modeled by a hierarchical view of the
Student-t distribution that associates an individual hyperparameter with each
element
independently. For model learning, we develop an efficient closed-form
variational inference under a fully Bayesian treatment, which can effectively
prevent the overfitting problem and scales linearly with data size. In contrast
to existing related works, our method can perform model selection automatically
and implicitly without need of tuning parameters. More specifically, it can
discover the ground-truth CP rank and automatically adapt the sparsity-inducing
priors to various types of outliers. In addition, the tradeoff between
the low-rank approximation and the sparse representation can be optimized in
the sense of maximum model evidence. The extensive experiments and comparisons
with many state-of-the-art algorithms on both synthetic and real-world datasets
demonstrate the superiority of our method from several perspectives.
Comment: in IEEE Transactions on Neural Networks and Learning Systems, 201
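The generative structure described above (a low-CP-rank signal plus sparse outliers plus dense noise, observed with missing entries) can be simulated in a few lines of NumPy; all sizes, scales, and names below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, R = 8, 9, 10, 3

# Low-CP-rank part: multilinear interactions of R latent factors per mode
A, B, C = (rng.normal(size=(d, R)) for d in (I, J, K))
low_rank = np.einsum('ir,jr,kr->ijk', A, B, C)

# Sparse outlier part: a few large-magnitude entries
sparse = np.zeros((I, J, K))
idx = rng.choice(I * J * K, size=20, replace=False)
sparse.flat[idx] = rng.normal(scale=10.0, size=20)

# Observed tensor: signal + outliers + dense noise, with missing entries
mask = rng.random((I, J, K)) < 0.8       # True = observed
Y = np.where(mask, low_rank + sparse + 0.1 * rng.normal(size=(I, J, K)), np.nan)
```

A robust factorization of the kind proposed here would take `Y` and `mask` as input and infer `low_rank` and `sparse` jointly, along with the predictive distribution over the missing entries.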