Provable Sparse Tensor Decomposition
We propose a novel sparse tensor decomposition method, namely the Tensor
Truncated Power (TTP) method, that incorporates variable selection into the
estimation of decomposition components. The sparsity is achieved via an
efficient truncation step embedded in the tensor power iteration. Our method
applies to a broad family of high dimensional latent variable models, including
high dimensional Gaussian mixture and mixtures of sparse regressions. A
thorough theoretical investigation is further conducted. In particular, we show
that the final decomposition estimator is guaranteed to achieve a local
statistical rate, and further strengthen it to the global statistical rate by
introducing a proper initialization procedure. In high dimensional regimes, the
obtained statistical rate significantly improves on those of existing
non-sparse decomposition methods. The empirical advantages of TTP are confirmed
in extensive simulations and two real applications: click-through rate
prediction and high-dimensional gene clustering. Comment: To appear in JRSS-
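The truncation step inside the tensor power iteration described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single symmetric rank-one component, a fixed sparsity level `s`, and a naive all-ones starting vector instead of the initialization procedure the abstract mentions. The function names, the toy tensor, and all parameter values are hypothetical.

```python
import math
import random


def tensor_apply(T, v):
    """Return the vector T(I, v, v) for a d x d x d tensor stored as nested lists."""
    d = len(v)
    return [
        sum(T[i][j][k] * v[j] * v[k] for j in range(d) for k in range(d))
        for i in range(d)
    ]


def truncate(v, s):
    """Keep the s largest-magnitude entries of v and set the rest to zero."""
    order = sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)
    keep = set(order[:s])
    return [x if i in keep else 0.0 for i, x in enumerate(v)]


def normalize(v):
    """Scale v to unit Euclidean norm (assumes v is not the zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def truncated_power_iteration(T, v0, s, n_iter=50):
    """Alternate a power update, a truncation to s entries, and a normalization."""
    v = normalize(v0)
    for _ in range(n_iter):
        v = normalize(truncate(tensor_apply(T, v), s))
    return v


def make_example(d=8, weight=5.0, noise=0.01, seed=0):
    """Toy data (assumption): weight * u(x)u(x)u plus small uniform noise.

    The noise is not symmetrized, to keep the sketch short.
    """
    rng = random.Random(seed)
    u = [0.6, 0.8] + [0.0] * (d - 2)  # unit-norm component supported on {0, 1}
    T = [
        [
            [weight * u[i] * u[j] * u[k] + rng.uniform(-noise, noise) for k in range(d)]
            for j in range(d)
        ]
        for i in range(d)
    ]
    return T, u


T, u = make_example()
v_hat = truncated_power_iteration(T, [1.0] * len(u), s=2)
support = {i for i, x in enumerate(v_hat) if x != 0.0}
alignment = sum(a * b for a, b in zip(v_hat, u))
print("estimated support:", sorted(support))
print("inner product with the true component:", alignment)
```

Truncating before normalizing keeps every iterate exactly `s`-sparse, which is how variable selection enters the estimate in this sketch. In practice `s` would have to be tuned and the starting vector chosen with more care.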
Inverse Projection Representation and Category Contribution Rate for Robust Tumor Recognition
Sparse representation based classification (SRC) methods have achieved
remarkable results. SRC methods, however, still suffer from the need for many
training samples, insufficient use of test samples, and unstable representations. In
this paper, a stable inverse projection representation based classification
(IPRC) is presented to tackle these problems by effectively using test samples.
An IPR is first proposed, and its feasibility and stability are analyzed. A
classification criterion named category contribution rate is constructed to
match the IPR and complete classification. Moreover, a statistical measure is
introduced to quantify the stability of representation-based classification
methods. Based on the IPRC technique, a robust tumor recognition framework is
presented by interpreting microarray gene expression data, where a two-stage
hybrid gene selection method is introduced to select informative genes.
Finally, a functional analysis of candidate pathogenicity-related genes is
given. Extensive experiments on six public tumor microarray gene expression
datasets demonstrate that the proposed technique is competitive with
state-of-the-art methods. Comment: 14 pages, 19 figures, 10 tables
Posterior Contraction Rates of the Phylogenetic Indian Buffet Processes
By expressing prior distributions as general stochastic processes,
nonparametric Bayesian methods provide a flexible way to incorporate prior
knowledge and constrain the latent structure in statistical inference. The
Indian buffet process (IBP) is one such example: it defines a prior
distribution on infinite binary features under the assumption that subjects are
exchangeable. The phylogenetic Indian buffet process (pIBP), a
derivative of IBP, enables the modeling of non-exchangeability among subjects
through a stochastic process on a rooted tree, which is similar to that used in
phylogenetics, to describe relationships among the subjects. In this paper, we
study the theoretical properties of IBP and pIBP under a binary factor model.
We establish the posterior contraction rates for both IBP and pIBP and
substantiate the theoretical results through simulation studies. This is the
first work addressing the frequentist properties of the posterior behavior of
IBP and pIBP. We also demonstrate its practical usefulness by applying the pIBP
prior to a real data example from cancer genomics in which the
exchangeability among subjects is violated.
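For readers unfamiliar with the prior discussed in this abstract, the following minimal sketch draws a binary feature matrix from the standard (exchangeable) IBP using the usual sequential "restaurant" construction. It does not implement the pIBP, the binary factor model, or anything specific to the paper. The function names, the concentration parameter `alpha`, and the example values are assumptions made for illustration.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > threshold:
        count += 1
        prod *= rng.random()
    return count


def sample_ibp(n_subjects, alpha, seed=0):
    """Return a binary matrix (list of rows) drawn from the standard IBP.

    Subject i takes an existing feature k with probability m_k / i, where m_k
    is the number of earlier subjects with that feature, and then adds
    Poisson(alpha / i) brand-new features.
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of subjects that already have feature k
    rows = []    # rows[i] = set of feature indices owned by subject i
    for i in range(1, n_subjects + 1):
        owned = {k for k, m in enumerate(counts) if rng.random() < m / i}
        n_new = sample_poisson(alpha / i, rng)
        owned.update(range(len(counts), len(counts) + n_new))
        counts.extend([0] * n_new)
        for k in owned:
            counts[k] += 1
        rows.append(owned)
    n_features = len(counts)
    return [[1 if k in row else 0 for k in range(n_features)] for row in rows]


Z = sample_ibp(n_subjects=6, alpha=2.0)
for row in Z:
    print(row)
```

The number of columns is random and grows with the number of subjects, which is the sense in which the prior is placed on infinitely many binary features. Every subject is treated symmetrically here, which is the exchangeability assumption that the pIBP relaxes.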