
    A consistent and numerically efficient variable selection method for sparse Poisson regression with applications to learning and signal recovery

    We propose an adaptive $\ell_1$-penalized estimator in the framework of Generalized Linear Models with identity link and Poisson data, by taking advantage of a globally quadratic approximation of the Kullback-Leibler divergence. We prove that this approximation is asymptotically unbiased and that the proposed estimator has the variable selection consistency property in a deterministic matrix design framework. Moreover, we present a numerically efficient strategy for the computation of the proposed estimator, making it suitable for the analysis of massive count datasets. We show with two numerical experiments that the method can be applied to both statistical learning and signal recovery problems.

    Learning and inverse problems: from theory to solar physics applications

    The problem of approximating a function from a set of discrete measurements has been extensively studied since the seventies. Our theoretical analysis proposes a formalization of the function approximation problem which allows dealing with inverse problems and supervised kernel learning as two sides of the same coin. The proposed formalization takes into account arbitrary noisy data (deterministically or statistically defined) and arbitrary loss functions (possibly seen as a log-likelihood), handling both direct and indirect measurements. The core idea of this part relies on the analogy between statistical learning and inverse problems. One of the main pieces of evidence for the connection between these two areas is that regularization methods, usually developed for ill-posed inverse problems, can be used for solving learning problems. Furthermore, the convergence rate analyses of spectral regularization provided in these two areas share the same source conditions, but are carried out with either an increasing number of samples in learning theory or a decreasing noise level in inverse problems. More generally, regularization via sparsity-enhancing methods is widely used in both areas, and it is possible to apply well-known $\ell_1$-penalized methods for solving both learning and inverse problems. In the first part of the Thesis, we analyze such a connection at three levels: (1) at an infinite dimensional level, we define an abstract function approximation problem from which the two problems can be derived; (2) at a discrete level, we provide a unified formulation according to a suitable definition of sampling; and (3) at a convergence rates level, we provide a comparison between convergence rates given in the two areas, by quantifying the relation between the noise level and the number of samples. In the second part of the Thesis, we focus on a specific class of problems where measurements are distributed according to a Poisson law. We provide a data-driven, asymptotically unbiased, and globally quadratic approximation of the Kullback-Leibler divergence, and we propose Lasso-type methods for solving sparse Poisson regression problems: PRiL (Poisson Reweighted Lasso) and an adaptive version of this method, APRiL (Adaptive Poisson Reweighted Lasso), proving consistency properties in estimation and variable selection, respectively. Finally, we consider two problems in solar physics: (1) the problem of forecasting solar flares (learning application) and (2) the desaturation problem of solar flare images (inverse problem application). The first application concerns the prediction of solar storms using images of the magnetic field on the Sun, in particular physics-based features extracted from active regions in data provided by the Helioseismic and Magnetic Imager (HMI) on board the Solar Dynamics Observatory (SDO). The second application concerns the reconstruction problem of Extreme Ultra-Violet (EUV) solar flare images recorded by a second instrument on board SDO, the Atmospheric Imaging Assembly (AIA). We propose a novel sparsity-enhancing method, SE-DESAT, to reconstruct images affected by saturation and diffraction, without using any a priori estimate of the background solar activity.

    Inference of Sparse Networks with Unobserved Variables. Application to Gene Regulatory Networks

    Networks are a unifying framework for modeling complex systems, and network inference problems are frequently encountered in many fields. Here, I develop and apply a generative approach to network inference (RCweb) for the case when the network is sparse and the latent (not observed) variables affect the observed ones. From all possible factor analysis (FA) decompositions explaining the variance in the data, RCweb selects the FA decomposition that is consistent with a sparse underlying network. The sparsity constraint is imposed by a novel method that significantly outperforms (in terms of accuracy, robustness to noise, complexity scaling, and computational efficiency) Bayesian methods and MLE methods using l1 norm relaxation such as K-SVD and l1-based sparse principal component analysis (PCA). Results from simulated models demonstrate that RCweb exactly recovers the model structures for sparsity as low (as non-sparse) as 50% and with a ratio of unobserved to observed variables as high as 2. RCweb is robust to noise, with a gradual decrease in the parameter ranges as the noise level increases. Comment: 8 pages, 5 figures