
    DC Proximal Newton for Non-Convex Optimization Problems

    We introduce a novel algorithm for solving learning problems where both the loss function and the regularizer are non-convex but belong to the class of difference of convex (DC) functions. Our contribution is a new general-purpose proximal Newton algorithm able to deal with such a situation. The algorithm consists of obtaining a descent direction from an approximation of the loss function and then performing a line search to ensure sufficient descent. A theoretical analysis shows that the limit points of the iterates of the proposed algorithm are stationary points of the DC objective function. Numerical experiments show that our approach is more efficient than the current state of the art on a problem with a convex loss function and a non-convex regularizer. We also illustrate the benefit of our algorithm on a high-dimensional transductive learning problem where both the loss function and the regularizer are non-convex.
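
    As a rough illustration of the kind of scheme involved, the sketch below applies the simpler first-order DC proximal-gradient idea (linearise the concave part, then take a proximal step on the remaining convex part) to a hypothetical least-squares problem with a log-sum penalty. The paper's actual algorithm additionally builds a Newton-type metric and runs a line search; those are omitted here, and all names and hyper-parameters are illustrative.

```python
import numpy as np

# Simplified DC proximal-gradient sketch. Hypothetical problem:
#   min_x 0.5*||Ax - b||^2 + lam * sum_i log(1 + |x_i|/theta),
# with the log-sum penalty split as a difference of convex functions
#   (lam/theta)*||x||_1  -  h(x),   h convex,
# so each iteration linearises h and prox-steps on the l1 term.

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def dc_prox_grad(A, b, lam=0.1, theta=1.0, n_iter=200):
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L for the quadratic loss
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad_loss = A.T @ (A @ x - b)
        # gradient of the convex part h of the DC split, linearised here
        grad_h = (lam / theta) * np.sign(x) * np.abs(x) / (theta + np.abs(x))
        x = soft_threshold(x - step * (grad_loss - grad_h), step * lam / theta)
    return x
```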

    Non-convex regularization in remote sensing

    In this paper, we study the effect of different regularizers and their implications in high-dimensional image classification and sparse linear unmixing. Although kernelization and sparse methods are widely accepted solutions for processing high-dimensional data, we present here a study of the impact of the form of regularization used and of its parametrization. We consider regularization via the traditional squared (ℓ2) and sparsity-promoting (ℓ1) norms, as well as less conventional non-convex regularizers (ℓp and the Log-Sum Penalty). We compare their properties and advantages on several classification and linear unmixing tasks and provide advice on the choice of the best regularizer for the problem at hand. Finally, we also provide a fully functional toolbox for the community.
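
    For concreteness, here is a minimal sketch of the four penalty families discussed above, evaluated on a weight vector; the hyper-parameter names lam, p and theta are illustrative, not the paper's notation.

```python
import numpy as np

# The four penalty families compared in the paper, evaluated on a
# weight vector w. Hyper-parameter names are illustrative.

def l2_penalty(w, lam):                  # ridge: smooth, convex
    return lam * np.sum(w ** 2)

def l1_penalty(w, lam):                  # lasso: convex, sparsity-promoting
    return lam * np.sum(np.abs(w))

def lp_penalty(w, lam, p=0.5):           # non-convex for 0 < p < 1
    return lam * np.sum(np.abs(w) ** p)

def log_sum_penalty(w, lam, theta=1.0):  # non-convex surrogate of l0
    return lam * np.sum(np.log(1.0 + np.abs(w) / theta))
```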

    ℓp-ℓq penalty for sparse linear and sparse multiple kernel multi-task learning

    Recently, there has been a lot of interest in the multi-task learning (MTL) problem under the constraint that tasks share a common sparsity profile. Such a problem can be addressed through a regularization framework where the regularizer induces a joint-sparsity pattern between task decision functions. We follow this principled framework and focus on ℓp-ℓq mixed norms (with 0 ≤ p ≤ 1 and 1 ≤ q ≤ 2) as sparsity-inducing penalties. Our motivation for addressing such a large class of penalties is to adapt the penalty to the problem at hand, thus leading to better performance and better sparsity patterns. For solving the problem in the general multiple kernel case, we first derive a variational formulation of the ℓ1-ℓq penalty, which helps us in proposing an alternating optimization algorithm. Although very simple, the latter algorithm provably converges to the global minimum of the ℓ1-ℓq penalized problem. For the linear case, we extend existing work on accelerated proximal gradient methods to this penalty. Our contribution in this context is an efficient scheme for computing the ℓ1-ℓq proximal operator. Then, for the more general case where 0 < p < 1, we solve the resulting non-convex problem through a majorization-minimization approach. The resulting algorithm is an iterative scheme which, at each iteration, solves a weighted ℓ1-ℓq sparse MTL problem. Empirical evidence from toy and real-world datasets dealing with BCI single-trial EEG classification and protein subcellular localization shows the benefit of the proposed approaches and algorithms.
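
    As a hedged illustration of two of the building blocks mentioned above, specialised to q = 2: block soft-thresholding, which is the ℓ1-ℓ2 proximal operator, and one majorization-minimization reweighting step for 0 < p < 1. This is a sketch under that specialisation, not the paper's exact scheme.

```python
import numpy as np

# Rows of W index features (shared across tasks), columns index tasks.

def prox_l1_l2(W, lam):
    """Block soft-thresholding: prox of lam * sum_j ||W[j, :]||_2."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    return W * np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))

def mm_weights(W, p=0.5, eps=1e-8):
    """Majorisation-minimisation reweighting for 0 < p < 1: each row
    norm gets weight p * ||W[j, :]||^(p - 1) in the next weighted
    l1-l2 subproblem (eps guards against zero rows)."""
    norms = np.linalg.norm(W, axis=1)
    return p * (norms + eps) ** (p - 1.0)
```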

    Optimal Transport with Adaptive Regularisation

    Regularising the primal formulation of optimal transport (OT) with a strictly convex term improves numerical tractability and leads to a denser transport plan. Many formulations impose a global constraint on the transport plan, for instance by relying on entropic regularisation. As it is more expensive to diffuse mass for outlier points than for central ones, this typically results in a significant imbalance in the way mass is spread across the points. This can be detrimental for applications where a minimum amount of smoothing is required per point. To remedy this, we introduce OT with Adaptive RegularIsation (OTARI), a new formulation of OT that imposes constraints on the mass going into and/or out of each point. We then showcase the benefits of this approach for domain adaptation.
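
    For context, here is a minimal NumPy sketch of the entropically regularised baseline that OTARI departs from: a single global regulariser eps is applied uniformly to all points and the plan is obtained by Sinkhorn iterations. OTARI's per-point constraints require a different solver, which is not reproduced here.

```python
import numpy as np

# Entropic-OT baseline: one global regulariser eps for every point.

def sinkhorn(a, b, M, eps=0.1, n_iter=500):
    """a, b: source/target histograms; M: cost matrix; returns the plan."""
    K = np.exp(-M / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)   # scale columns to match target marginal b
        u = a / (K @ v)     # scale rows to match source marginal a
    return u[:, None] * K * v[None, :]
```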

    Unsupervised variable selection for kernel methods in systems biology

    Kernel methods have proven useful and successful for analysing large-scale multi-omics datasets [Schölkopf et al., 2004]. However, as stated in [Hofmann et al., 2015, Mariette et al., 2017], these methods usually suffer from a lack of interpretability, since the information carried by thousands of descriptors is summarized in a few similarity measures that can be strongly influenced by a large number of irrelevant descriptors. To address this issue, feature selection is a widely used strategy: it consists in selecting the most promising features during or prior to the analysis. However, most existing methods are proposed in a supervised framework [Tibshirani, 1996, Robnik-Sikonja and Kononenko, 2003, Lin and Tang, 2006]. In the unsupervised framework, far fewer methods have been proposed, because there is no objective criterion with which to assess the quality of a given feature. Existing proposals thus aim either at preserving the similarities between individuals as well as possible, like the SPEC approach [Zhao and Liu, 2007], or at recovering a latent cluster structure, like MCFS [Cai et al., 2010], NDFS [Li et al., 2012] and UDFS [Yang et al., 2011]. In this communication, we present a feature selection algorithm that explicitly takes advantage of the kernel structure in an unsupervised fashion.
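
    A hedged sketch of what unsupervised kernel-based feature scoring could look like: each feature is ranked by the centred alignment between the kernel it induces alone and the kernel built from all features. The kernel choice (Gaussian) and the alignment criterion are assumptions for illustration, not the algorithm of the communication.

```python
import numpy as np

# Score features by centred kernel alignment with the full-data kernel.

def rbf_kernel(X, gamma=1.0):
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * d2)

def centre(K):
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def alignment_scores(X, gamma=1.0):
    K_full = centre(rbf_kernel(X, gamma))
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        K_j = centre(rbf_kernel(X[:, [j]], gamma))
        denom = np.linalg.norm(K_j) * np.linalg.norm(K_full)
        scores[j] = np.sum(K_j * K_full) / max(denom, 1e-12)
    return scores  # higher = better aligned with the global similarity
```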