Sparsity by Redundancy: Solving $L_1$ with a Simple Reparametrization

Abstract

We identify and prove a general principle: $L_1$ sparsity can be achieved with a redundant parametrization plus an $L_2$ penalty. Our results lead to a simple algorithm, \textit{spred}, that seamlessly integrates $L_1$ regularization into any modern deep learning framework. Practically, we (1) demonstrate the efficiency of \textit{spred} on conventional tasks such as lasso and sparse coding, (2) benchmark the method for nonlinear feature selection on six gene selection tasks, and (3) illustrate how it achieves structured and unstructured sparsity in deep learning in an end-to-end manner. Conceptually, our result bridges the gap between understanding the inductive bias of the redundant parametrizations common in deep learning and conventional statistical learning.
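
The sketch below illustrates the core idea in PyTorch under stated assumptions; it is not the authors' reference implementation, and the layer name \textit{SpredLinear} and all hyperparameters are hypothetical. A weight matrix is reparametrized as an elementwise product $W = U \odot V$, so ordinary $L_2$ weight decay on $U$ and $V$ induces an effective $L_1$ penalty on $W$ (since $\min_{uv=w}(u^2+v^2) = 2|w|$), letting a plain SGD optimizer perform lasso-like training.

```python
import torch
import torch.nn as nn

class SpredLinear(nn.Module):
    """Linear layer with weights reparametrized as W = U * V (elementwise).
    Standard L2 weight decay on U and V then acts like an L1 penalty on W."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.u = nn.Parameter(0.1 * torch.randn(out_features, in_features))
        self.v = nn.Parameter(0.1 * torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        # Effective weight is the Hadamard product of the two factors.
        return x @ (self.u * self.v).t() + self.bias

# Hypothetical lasso-style regression on random data: weight decay of
# strength wd on (u, v) corresponds to roughly an L1 penalty of strength
# wd on the effective weights u * v.
model = SpredLinear(in_features=100, out_features=1)
opt = torch.optim.SGD(model.parameters(), lr=1e-2, weight_decay=1e-3)

X, y = torch.randn(512, 100), torch.randn(512, 1)
for _ in range(1000):
    opt.zero_grad()
    loss = ((model(X) - y) ** 2).mean()
    loss.backward()
    opt.step()

# Fraction of effective weights driven (numerically) to zero.
sparsity = ((model.u * model.v).abs() < 1e-6).float().mean()
```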
