Self-Dictionary Sparse Regression for Hyperspectral Unmixing: Greedy Pursuit and Pure Pixel Search are Related
This paper considers a recently emerged hyperspectral unmixing formulation
based on sparse regression of a self-dictionary multiple measurement vector
(SD-MMV) model, wherein the measured hyperspectral pixels are used as the
dictionary. Operating under the pure pixel assumption, this SD-MMV formalism is
special in that it allows simultaneous identification of the endmember spectral
signatures and the number of endmembers. Previous SD-MMV studies mainly focus
on convex relaxations. In this study, we explore the alternative of greedy
pursuit, which generally provides efficient and simple algorithms. In
particular, we design a greedy SD-MMV algorithm using simultaneous orthogonal
matching pursuit. Intriguingly, the proposed greedy algorithm is shown to be
closely related to some existing pure pixel search algorithms, especially the
successive projection algorithm (SPA). Thus, a link between SD-MMV and pure
pixel search is revealed. We then perform exact recovery analyses, and prove
that the proposed greedy algorithm is robust to noise---including its
identification of the (unknown) number of endmembers---under a sufficiently low
noise level. The identification performance of the proposed greedy algorithm is
demonstrated through both synthetic and real-data experiments.
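The successive projection algorithm (SPA) named in the abstract can be illustrated with a minimal sketch. This is a textbook-style version, not the paper's greedy SD-MMV algorithm: it assumes the number of endmembers `r` is given (the paper also estimates it), noiseless data, and a hypothetical toy scene in which the first three pixels are pure. All names (`successive_projection`, `E`, `A`, `X`) are illustrative.

```python
import numpy as np

def successive_projection(X, r):
    """Pick r column indices of X (bands x pixels) as pure-pixel candidates."""
    R = np.array(X, dtype=float)            # residual; copy so X is untouched
    picked = []
    for _ in range(r):
        norms = np.sum(R * R, axis=0)       # squared norm of each residual pixel
        j = int(np.argmax(norms))           # greedy step: largest residual
        picked.append(j)
        u = R[:, j] / np.sqrt(norms[j])     # unit vector along the chosen pixel
        R = R - np.outer(u, u @ R)          # project every pixel off that direction
    return picked

# Hypothetical toy scene: 3 endmembers, 50 pixels, pixels 0..2 are pure.
rng = np.random.default_rng(0)
E = rng.random((20, 3))                     # assumed endmember spectra
A = rng.dirichlet(np.ones(3), size=50).T    # abundances, columns sum to one
A[:, :3] = np.eye(3)                        # plant the pure pixels
X = E @ A
idx = successive_projection(X, 3)
```

In this noiseless setting the selected indices are the planted pure pixels, in some order. Each iteration picks the pixel with the largest residual norm and then removes its direction, which is the same select-then-project pattern a simultaneous orthogonal matching pursuit step follows.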
A machine learning pipeline for discriminant pathways identification
Motivation: Identifying the molecular pathways more prone to disruption
during a pathological process is a key task in network medicine and, more
generally, in systems biology.
Results: In this work we propose a pipeline that couples a machine learning
solution for molecular profiling with a recent network comparison method. The
pipeline can identify changes occurring between specific sub-modules of
networks built in a case-control biomarker study, discriminating key groups of
genes whose interactions are modified by an underlying condition. The proposal
is independent of the classification algorithm used. Three applications on
genome-wide data are presented, regarding children's susceptibility to air
pollution and two neurodegenerative diseases: Parkinson's and Alzheimer's.
Availability: Details about the software used for the experiments discussed
in this paper are provided in the Appendix.
Graphical LASSO Based Model Selection for Time Series
We propose a novel graphical model selection (GMS) scheme for
high-dimensional stationary time series or discrete-time processes. The method is
based on a natural generalization of the graphical LASSO (gLASSO), introduced
originally for GMS based on i.i.d. samples, and estimates the conditional
independence graph (CIG) of a time series from a finite length observation. The
gLASSO for time series is defined as the solution of an l1-regularized maximum
(approximate) likelihood problem. We solve this optimization problem using the
alternating direction method of multipliers (ADMM). Our approach is
nonparametric as we do not assume a finite dimensional (e.g., an
autoregressive) parametric model for the observed process. Instead, we require
the process to be sufficiently smooth in the spectral domain. For Gaussian
processes, we characterize the performance of our method theoretically by
deriving an upper bound on the probability that our algorithm fails to
correctly identify the CIG. Numerical experiments demonstrate the ability of
our method to recover the correct CIG from a limited number of samples.
Provable Sparse Tensor Decomposition
We propose a novel sparse tensor decomposition method, namely the Tensor
Truncated Power (TTP) method, that incorporates variable selection into the
estimation of decomposition components. The sparsity is achieved via an
efficient truncation step embedded in the tensor power iteration. Our method
applies to a broad family of high dimensional latent variable models, including
high dimensional Gaussian mixture and mixtures of sparse regressions. A
thorough theoretical investigation is further conducted. In particular, we show
that the final decomposition estimator is guaranteed to achieve a local
statistical rate, and further strengthen it to the global statistical rate by
introducing a proper initialization procedure. In high dimensional regimes, the
obtained statistical rate significantly improves on those shown for existing
non-sparse decomposition methods. The empirical advantages of TTP are confirmed
in extensive simulations and two real applications: click-through rate
prediction and high-dimensional gene clustering. Comment: To appear in JRSS-
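The idea of a truncation step embedded in a tensor power iteration can be sketched as follows. This is a simplified illustration, not the paper's TTP procedure: it assumes a third-order tensor that is approximately rank one and symmetric, a known sparsity level `k`, and a plain all-ones starting vector in place of the paper's initialization. The function names and the toy tensor are hypothetical.

```python
import numpy as np

def truncate(v, k):
    """Keep the k largest-magnitude entries of v and zero out the rest."""
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-k:]
    out[keep] = v[keep]
    return out

def truncated_power_iteration(T, k, v0, n_iter=30):
    """Power iteration on a 3-way tensor with a hard truncation each step."""
    v = v0 / np.linalg.norm(v0)
    for _ in range(n_iter):
        w = np.einsum("ijk,j,k->i", T, v, v)   # power step: contract T with v twice
        w = truncate(w, k)                     # sparsify the update
        v = w / np.linalg.norm(w)              # renormalize
    lam = np.einsum("ijk,i,j,k->", T, v, v, v) # estimated component weight
    return lam, v

# Hypothetical toy tensor: weight 2, sparse unit vector supported on {0, 1, 2}.
rng = np.random.default_rng(1)
d = 30
u = np.zeros(d)
u[:3] = 1.0 / np.sqrt(3.0)
T = 2.0 * np.einsum("i,j,k->ijk", u, u, u)
T = T + 0.01 * rng.standard_normal((d, d, d))  # small noise, not symmetrized
lam, v = truncated_power_iteration(T, k=3, v0=np.ones(d))
support = sorted(np.flatnonzero(v).tolist())
```

On this toy input the recovered support matches the planted one and `lam` is close to the planted weight of 2. Truncation limits the noise that enters each update to `k` coordinates instead of all `d`.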
A Constrained L1 Minimization Approach to Sparse Precision Matrix Estimation
A constrained L1 minimization method is proposed for estimating a sparse
inverse covariance matrix based on a sample of $n$ iid $p$-variate random
variables. The resulting estimator is shown to enjoy a number of desirable
properties. In particular, it is shown that the rate of convergence between the
estimator and the true $s$-sparse precision matrix under the spectral norm is
$s\sqrt{\log p / n}$ when the population distribution has either exponential-type
tails or polynomial-type tails. Convergence rates under the elementwise
$\ell_\infty$ norm and Frobenius norm are also presented. In addition, graphical
model selection is considered. The procedure is easily implementable by linear
programming. Numerical performance of the estimator is investigated using both
simulated and real data. In particular, the procedure is applied to analyze a
breast cancer dataset. The procedure performs favorably in comparison to
existing methods. Comment: To appear in the Journal of the American Statistical Association.
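A constrained L1 minimization of this kind can be posed column by column as a linear program, as sketched below. The abstract does not spell out the program, so the formulation here is an assumption: minimize the L1 norm of a column subject to an elementwise bound `lam` on the residual, then symmetrize by keeping the smaller-magnitude entry of each pair. The sketch uses `scipy.optimize.linprog`, and for simplicity it feeds in a known covariance matrix rather than a sample estimate. The names `l1_column` and `sparse_precision` are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def l1_column(S, j, lam):
    """Solve min ||b||_1 s.t. ||S b - e_j||_inf <= lam as an LP with b = u - v."""
    p = S.shape[0]
    e = np.zeros(p)
    e[j] = 1.0
    c = np.ones(2 * p)                         # sum(u) + sum(v) equals ||b||_1
    A = np.block([[S, -S], [-S, S]])           # rows encode +/-(S b - e_j) <= lam
    b_ub = np.concatenate([lam + e, lam - e])
    res = linprog(c, A_ub=A, b_ub=b_ub, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p] - res.x[p:]

def sparse_precision(S, lam):
    """Estimate each column separately, then symmetrize."""
    p = S.shape[0]
    B = np.column_stack([l1_column(S, j, lam) for j in range(p)])
    # Keep the smaller-magnitude entry of each (i, j) / (j, i) pair.
    return np.where(np.abs(B) <= np.abs(B.T), B, B.T)

# Hypothetical toy problem: a tridiagonal (hence sparse) precision matrix.
p = 6
Omega = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
Sigma = np.linalg.inv(Omega)                   # stands in for a sample covariance
Omega_hat = sparse_precision(Sigma, lam=0.01)
max_err = float(np.max(np.abs(Omega_hat - Omega)))
```

Because each column is an independent LP, the columns can be solved in parallel. With real data, `Sigma` would be replaced by a sample covariance and `lam` would need tuning, for example by cross-validation.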