
    Stacking-based Deep Neural Network: Deep Analytic Network on Convolutional Spectral Histogram Features

    A stacking-based deep neural network (S-DNN) resembles a deep neural network (DNN) in its very deep, feedforward architecture. The typical S-DNN aggregates a variable number of individually learnable modules in series to assemble a DNN-like alternative for the targeted object recognition tasks. This work devises such an S-DNN instantiation, dubbed the deep analytic network (DAN), on top of spectral histogram (SH) features. The DAN learning principle relies on ridge regression together with some key DNN constituents, specifically the rectified linear unit, fine-tuning, and normalization. The DAN is evaluated on three repositories of varying domains: FERET (faces), MNIST (handwritten digits), and CIFAR10 (natural objects). The empirical results show that DAN improves on the SH baseline performance given a sufficiently deep stack of layers.
    Comment: 5 pages
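    The DAN learning principle lends itself to a short sketch: each layer is a ridge-regression module whose rectified, normalized output feeds the next layer. The NumPy code below is a minimal illustration under that reading, with random features standing in for the spectral-histogram inputs and with layer count, ridge parameter, and normalization chosen for illustration rather than taken from the paper:

```python
import numpy as np

def ridge_fit(X, Y, lam=1e-2):
    """Closed-form ridge regression: W = (X^T X + lam*I)^{-1} X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def dan_fit(X, Y_onehot, n_layers=3, lam=1e-2):
    """Stack ridge-regression modules; each layer's rectified, normalized
    output feeds the next layer (a DAN-like sketch, not the authors' code)."""
    weights, H = [], X
    for _ in range(n_layers):
        W = ridge_fit(H, Y_onehot, lam)
        weights.append(W)
        H = np.maximum(H @ W, 0.0)                               # ReLU
        H /= np.linalg.norm(H, axis=1, keepdims=True) + 1e-8     # normalization
    return weights

def dan_predict(X, weights):
    H = X
    for i, W in enumerate(weights):
        H = H @ W
        if i < len(weights) - 1:                                 # hidden layers only
            H = np.maximum(H, 0.0)
            H /= np.linalg.norm(H, axis=1, keepdims=True) + 1e-8
    return H.argmax(axis=1)

# Toy run on random features standing in for spectral-histogram features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))
y = rng.integers(0, 10, size=200)
weights = dan_fit(X, np.eye(10)[y], n_layers=3)
print((dan_predict(X, weights) == y).mean())   # training accuracy of the stack
```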

    Structured sparsity-inducing norms through submodular functions

    Sparse methods for supervised learning aim at finding good linear predictors from as few variables as possible, i.e., with small cardinality of their supports. This combinatorial selection problem is often turned into a convex optimization problem by replacing the cardinality function by its convex envelope (tightest convex lower bound), in this case the L1-norm. In this paper, we investigate more general set-functions than the cardinality, that may incorporate prior knowledge or structural constraints which are common in many applications: namely, we show that for nondecreasing submodular set-functions, the corresponding convex envelope can be obtained from its Lovász extension, a common tool in submodular analysis. This defines a family of polyhedral norms, for which we provide generic algorithmic tools (subgradients and proximal operators) and theoretical results (conditions for support recovery or high-dimensional inference). By selecting specific submodular functions, we can give a new interpretation to known norms, such as those based on rank-statistics or grouped norms with potentially overlapping groups; we also define new norms, in particular ones that can be used as non-factorial priors for supervised learning.
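    The Lovász extension mentioned above has a simple algorithmic form: sort the coordinates of the input point in decreasing order and accumulate the marginal gains of the set function along the resulting chain of sets. The sketch below evaluates it for an arbitrary set function; the cardinality example is our own choice, included because its extension at a nonnegative point recovers the $\ell_1$-norm, matching the convex-envelope story in the abstract:

```python
import numpy as np

def lovasz_extension(F, w):
    """Evaluate the Lovász extension of a set function F at a point w in R^d:
    sort coordinates of w in decreasing order and accumulate the marginal
    gains of F along the resulting chain of sets."""
    order = np.argsort(-w)
    value, prev, chain = 0.0, 0.0, []
    for j in order:
        chain.append(int(j))
        current = F(frozenset(chain))
        value += w[j] * (current - prev)
        prev = current
    return value

# Example: for the cardinality function F(S) = |S| (a nondecreasing submodular
# function), the extension at a nonnegative point equals the l1-norm.
cardinality = lambda S: float(len(S))
w = np.array([0.5, 2.0, 1.0])
print(lovasz_extension(cardinality, w), np.abs(w).sum())   # both print 3.5
```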

    Non-convex regularization in remote sensing

    In this paper, we study the effect of different regularizers and their implications in high-dimensional image classification and sparse linear unmixing. Although kernelization or sparse methods are globally accepted solutions for processing data in high dimensions, we present here a study of the impact of the form of regularization used and of its parametrization. We consider regularization via the traditional squared $\ell_2$ and sparsity-promoting $\ell_1$ norms, as well as more unconventional nonconvex regularizers ($\ell_p$ and the Log-Sum Penalty). We compare their properties and advantages on several classification and linear unmixing tasks and provide advice on the choice of the best regularizer for the problem at hand. Finally, we also provide a fully functional toolbox for the community.
    Comment: 11 pages, 11 figures
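    The regularizers compared in the paper fit in a few lines each; the sketch below evaluates the squared $\ell_2$, $\ell_1$, $\ell_p$, and Log-Sum penalties on a toy coefficient vector (the exponent $p$ and the smoothing constant are illustrative values, not the paper's settings):

```python
import numpy as np

# Penalty values for a coefficient vector under the regularizers discussed in
# the paper; the exponent p and the smoothing constant eps are illustrative.
def l2_squared(x):
    return np.sum(x ** 2)

def l1(x):
    return np.sum(np.abs(x))

def lp(x, p=0.5):
    return np.sum(np.abs(x) ** p)            # non-convex for p < 1

def log_sum_penalty(x, eps=0.1):
    return np.sum(np.log(1.0 + np.abs(x) / eps))

x = np.array([0.0, 0.01, 0.5, 2.0])
for name, penalty in [("l2^2", l2_squared), ("l1", l1),
                      ("lp, p=0.5", lp), ("log-sum", log_sum_penalty)]:
    print(f"{name:>10s}: {penalty(x):.3f}")
```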

    Deep Learning Meets Sparse Regularization: A Signal Processing Perspective

    Deep learning has been wildly successful in practice, and most state-of-the-art machine learning methods are based on neural networks. Lacking, however, is a rigorous mathematical theory that adequately explains the amazing performance of deep neural networks. In this article, we present a relatively new mathematical framework that provides the beginning of a deeper understanding of deep learning. This framework precisely characterizes the functional properties of neural networks that are trained to fit data. The key mathematical tools supporting this framework include transform-domain sparse regularization, the Radon transform of computed tomography, and approximation theory, all techniques deeply rooted in signal processing. This framework explains the effect of weight decay regularization in neural network training, the use of skip connections and low-rank weight matrices in network architectures, the role of sparsity in neural networks, and why neural networks can perform well in high-dimensional problems.
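    As one concrete piece of that picture, weight decay under plain SGD amounts to an $\ell_2$ penalty on the network weights; the PyTorch sketch below trains a small ReLU network with it (architecture, data, and hyperparameters are our illustrative choices, not the article's):

```python
import torch
import torch.nn as nn

# A small ReLU network trained with weight decay (an l2 penalty on the
# weights under plain SGD); architecture, data, and hyperparameters are
# illustrative, not taken from the article.
net = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.SGD(net.parameters(), lr=1e-2, weight_decay=1e-4)

X, y = torch.randn(256, 10), torch.randn(256, 1)
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(net(X), y)
    loss.backward()
    opt.step()
print(loss.item())
```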

    Stable low-rank matrix recovery via null space properties

    The problem of recovering a matrix of low rank from an incomplete and possibly noisy set of linear measurements arises in a number of areas. In order to derive rigorous recovery results, the measurement map is usually modeled probabilistically. We derive sufficient conditions on the minimal amount of measurements ensuring recovery via convex optimization. We establish our results via certain properties of the null space of the measurement map. In the setting where the measurements are realized as Frobenius inner products with independent standard Gaussian random matrices, we show that $10 r (n_1 + n_2)$ measurements are enough to uniformly and stably recover an $n_1 \times n_2$ matrix of rank at most $r$. We then significantly generalize this result by only requiring independent mean-zero, variance-one entries with four finite moments, at the cost of replacing $10$ by some universal constant. We also study the case of recovering Hermitian rank-$r$ matrices from measurement matrices proportional to rank-one projectors. For $m \geq C r n$ rank-one projective measurements onto independent standard Gaussian vectors, we show that nuclear norm minimization uniformly and stably reconstructs Hermitian rank-$r$ matrices with high probability. Next, we partially de-randomize this by establishing an analogous statement for projectors onto independent elements of a complex projective 4-design, at the cost of a slightly higher sampling rate $m \geq C r n \log n$. Moreover, if the Hermitian matrix to be recovered is known to be positive semidefinite, then we show that the nuclear norm minimization approach may be replaced by minimizing the $\ell_2$-norm of the residual subject to the positive semidefinite constraint. Then no estimate of the noise level is required a priori. We discuss applications in quantum physics and the phase retrieval problem.
    Comment: 26 pages
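    A toy instance of the Gaussian-measurement setting can be prototyped directly with cvxpy: minimize the nuclear norm subject to consistency with Frobenius inner-product measurements. The problem sizes and sampling rate below are small illustrative choices rather than the paper's constants, and the noiseless equality-constrained formulation is one of several the abstract considers:

```python
import numpy as np
import cvxpy as cp

# Toy instance: recover a rank-2 matrix from Frobenius inner products with
# independent standard Gaussian measurement matrices.
rng = np.random.default_rng(0)
n1, n2, r = 20, 15, 2
X0 = rng.normal(size=(n1, r)) @ rng.normal(size=(r, n2))   # rank-r ground truth

m = 4 * r * (n1 + n2)                      # fewer measurements than n1*n2 entries
A = rng.normal(size=(m, n1, n2))           # Gaussian measurement matrices A_i
y = np.einsum("ijk,jk->i", A, X0)          # y_i = <A_i, X0> (Frobenius inner product)

# Nuclear norm minimization subject to (noiseless) measurement consistency.
X = cp.Variable((n1, n2))
constraints = [cp.sum(cp.multiply(A[i], X)) == y[i] for i in range(m)]
cp.Problem(cp.Minimize(cp.normNuc(X)), constraints).solve()
print("relative error:", np.linalg.norm(X.value - X0) / np.linalg.norm(X0))
```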