
    Robust Sparse Canonical Correlation Analysis

    Canonical correlation analysis (CCA) is a multivariate statistical method that describes the associations between two sets of variables. The objective is to find linear combinations of the variables in each data set that have maximal correlation. This paper discusses a method for Robust Sparse CCA. Sparse estimation produces canonical vectors with some of their elements estimated as exactly zero, which improves their interpretability. We also robustify the method so that it can cope with outliers in the data. To estimate the canonical vectors, we convert the CCA problem into an alternating regression framework and use the sparse Least Trimmed Squares estimator. We illustrate the good performance of the Robust Sparse CCA method in several simulation studies and two real data examples.
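
    As a point of reference for the objective described above, the following is a minimal sketch of classical CCA (neither robust nor sparse, so it is not the method of the paper). It finds the first pair of canonical vectors by whitening the two covariance blocks and taking an SVD. The function name `first_canonical_pair` and the small `ridge` term are illustrative assumptions, not taken from the abstract.

```python
import numpy as np

def first_canonical_pair(X, Y, ridge=1e-8):
    """Classical CCA sketch: first canonical vectors (a, b) and correlation rho.

    X is (n, p), Y is (n, q). `ridge` is a hypothetical stabiliser added to the
    covariance diagonals so the Cholesky factorisations succeed.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)          # centre each variable
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1) + ridge * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + ridge * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    # Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)

    # Whitened cross-covariance M = Lx^{-1} Sxy Ly^{-T}
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T

    # Leading singular triplet gives the maximal correlation
    U, s, Vt = np.linalg.svd(M)
    a = np.linalg.solve(Lx.T, U[:, 0])   # map back to original coordinates
    b = np.linalg.solve(Ly.T, Vt[0, :])
    return a, b, s[0]
```

    The projections `X @ a` and `Y @ b` are then the two linear combinations whose sample correlation is (up to the ridge term) the returned `rho`. A sparse variant would additionally force some entries of `a` and `b` to zero.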

    Correlated Component Analysis for diffuse component separation with error estimation on simulated Planck polarization data

    We present a data analysis pipeline for CMB polarization experiments, running from multi-frequency maps to the power spectra. We focus mainly on component separation and, for the first time, we work out the covariance matrix accounting for errors associated with the separation itself. This allows us to propagate such errors and evaluate their contributions to the uncertainties on the final products. The pipeline is optimized for intermediate and small scales, but could be easily extended to lower multipoles. We exploit realistic simulations of the sky, tailored for the Planck mission. The component separation is achieved by exploiting the Correlated Component Analysis in the harmonic domain, which we demonstrate to be superior to the real-space application (Bonaldi et al. 2006). We present two techniques to estimate the uncertainties on the spectral parameters of the separated components. The component separation errors are then propagated by means of Monte Carlo simulations to obtain the corresponding contributions to uncertainties on the component maps and on the CMB power spectra. For the Planck polarization case they are found to be subdominant compared to noise. Comment: 17 pages, accepted in MNRA

    A D.C. Programming Approach to the Sparse Generalized Eigenvalue Problem

    In this paper, we consider the sparse eigenvalue problem wherein the goal is to obtain a sparse solution to the generalized eigenvalue problem. We achieve this by constraining the cardinality of the solution to the generalized eigenvalue problem and obtain sparse principal component analysis (PCA), sparse canonical correlation analysis (CCA) and sparse Fisher discriminant analysis (FDA) as special cases. Unlike the $\ell_1$-norm approximation to the cardinality constraint, which previous methods have used in the context of sparse PCA, we propose a tighter approximation that is related to the negative log-likelihood of a Student's t-distribution. The problem is then framed as a d.c. (difference of convex functions) program and is solved as a sequence of convex programs by invoking the majorization-minimization method. The resulting algorithm is proved to exhibit global convergence behavior, i.e., for any random initialization, the sequence (subsequence) of iterates generated by the algorithm converges to a stationary point of the d.c. program. The performance of the algorithm is empirically demonstrated on both sparse PCA (finding few relevant genes that explain as much variance as possible in a high-dimensional gene dataset) and sparse CCA (cross-language document retrieval and vocabulary selection for music retrieval) applications. Comment: 40 page
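
    For orientation, the cardinality-constrained generalized eigenvalue problem mentioned above can be written in a standard form as follows. The symbols $A$, $B$, $k$, and the covariance blocks $\Sigma_{xx}$, $\Sigma_{yy}$, $\Sigma_{xy}$ are notation assumed here for illustration; the abstract does not fix them, and the paper may normalize the constraint differently.

```latex
% Sparse generalized eigenvalue problem (assumed notation):
% A symmetric, B symmetric positive definite, k the sparsity budget.
\max_{x \in \mathbb{R}^n} \; x^\top A x
\quad \text{subject to} \quad
x^\top B x \le 1, \qquad \|x\|_0 \le k .

% Sparse PCA: A is a covariance matrix and B = I.
% Sparse CCA: stack the two canonical vectors, x = (w_x, w_y), and take
A = \begin{pmatrix} 0 & \Sigma_{xy} \\ \Sigma_{xy}^\top & 0 \end{pmatrix},
\qquad
B = \begin{pmatrix} \Sigma_{xx} & 0 \\ 0 & \Sigma_{yy} \end{pmatrix}.
```

    Here $\|x\|_0$ counts the nonzero entries of $x$. This count is the term that the $\ell_1$-norm, or the tighter surrogate proposed in the abstract, replaces to make the problem tractable.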