12 research outputs found

    Robust Lasso-Zero for sparse corruption and model selection with missing covariates

    Full text link
    We propose Robust Lasso-Zero, an extension of the Lasso-Zero methodology [Descloux and Sardy, 2018], initially introduced for sparse linear models, to the sparse corruptions problem. We give theoretical guarantees on the sign recovery of the parameters for a slightly simplified version of the estimator, called Thresholded Justice Pursuit. The use of Robust Lasso-Zero is showcased for variable selection with missing values in the covariates. In addition to not requiring the specification of a model for the covariates, nor estimating their covariance matrix or the noise variance, the method has the great advantage of handling missing-not-at-random values without specifying a parametric model. Numerical experiments and a medical application underline the relevance of Robust Lasso-Zero in such a context with few available competitors. The method is easy to use and implemented in the R library lass0.

    Sparse Support Recovery with Thresholded Basis Pursuit and Lasso-Zero, and an Extension to Handle Missing Data

    No full text
    This thesis addresses the problem of variable selection in the high-dimensional linear regression model. It complements the existing literature on the Thresholded Basis Pursuit (TBP) estimator and, more generally, on the underlying idea of overfitting the model and then thresholding the resulting coefficients. First, new theoretical guarantees are proved for the recovery of the sign vector by TBP. Second, an extension of TBP, called Lasso-Zero, is introduced. Its novelty lies in the use of several noise dictionaries, concatenated to the regression matrix to account for the presence of noise during the overfitting step. Finally, a robust extension of Lasso-Zero is proposed for variable selection in the presence of missing data.

    Robust Lasso-Zero for sparse corruption and model selection with missing covariates

    No full text
    We propose Robust Lasso-Zero, an extension of the Lasso-Zero methodology, initially introduced for sparse linear models, to the sparse corruptions problem. We give theoretical guarantees on the sign recovery of the parameters for a slightly simplified version of the estimator, called Thresholded Justice Pursuit. The use of Robust Lasso-Zero is showcased for variable selection with missing values in the covariates. In addition to not requiring the specification of a model for the covariates, nor estimating their covariance matrix or the noise variance, the method has the great advantage of handling missing-not-at-random values without specifying a parametric model. Numerical experiments and a medical application underline the relevance of Robust Lasso-Zero in such a context with few available competitors. The method is easy to use and implemented in the R library lass0.

    Fitted lasso logistic regression for distinction between lung cancer and controls.

    No full text
    (A) The ROC curve and AUC value for prediction on the whole sample (= training set) are shown for the optimal model based on 27 peptides (λ_min), AUC = 0.961. (B) The ROC curve of the second-best, 18-peptide model (λ_1se) is shown, AUC = 0.931. (C-D) Fitted scores (estimates of the probability that the patient has lung cancer) are shown for the 27-peptide model (C) and the 18-peptide model (D).

    Comparison of ELISA signal intensities of controls and lung cancer patients using peptides and protein fragments for capturing anti-BARD1 autoimmune antibodies.

    No full text
    (A) Log-transformed signal intensities for peptides are presented from left to right in increasing order of their p-values, obtained by applying Wilcoxon’s rank-sum test. For 15 peptides the resulting p-values were lower than 0.01, and for 29 peptides lower than 0.05. The signal intensities tend to be significantly higher for lung cancer samples than for controls. The ten peptides that scored highest in the 27-peptide model and are shared with the 18-peptide model are marked with red stars. (B) Log-transformed signal intensities of fragments are presented from left to right in increasing order of their p-values.

    Comparison of ROC curves for early or limited disease and late lung cancer.

    No full text
    During each of the 200 repetitions of random sub-sampling validation (Fig 4, http://www.plosone.org/article/info:doi/10.1371/journal.pone.0182356#pone.0182356.g004), the lung cancer patients of the training set were divided into two groups according to stage of the disease, I-III and IV, and 200 ROC curves were computed for each group separately. The average ROC curves and respective AUCs are shown. The error bars indicate ± 1 standard error of the mean.

    BARD1 protein structure and epitopes.

    No full text
    The top line shows the FL BARD1 exon structure, with the protein motifs RING, Ankyrin (ANK) repeats, and BRCT domains indicated. Grey lines underneath show BARD1 isoforms, with dotted lines representing the respective missing exons. Brown bars at the bottom represent protein fragments used for ELISA experiments.