135,902 research outputs found

    Talk 1: Convolutional neural networks against the curse of dimensionality

    Get PDF
    Convolutional Neural Networks are a powerful class of non-linear representations that have shown through numerous supervised learning tasks their ability to extract rich information from images, speech and text, with excellent statistical generalization. These are examples of truly high-dimensional signals, in which classical statistical models suffer from the curse of dimensionality, referring to their inability to generalize well unless provided with exponentially large amounts of training data. In this talk we will start by studying statistical models defined from wavelet scattering networks, a class of CNNs where the convolutional filter banks are given by complex, multi-resolution wavelet families. The reasons for such success lie on their ability to preserve discriminative information while being stable with respect to high-dimensional deformations, providing a framework that partially extends to trained CNNs. We will give conditions under which signals can be recovered from their scattering coefficients, and will discuss a family of Gibbs processes defined by CNN sufficient statistics, from which one can sample image and auditory textures. Although the scattering recovery is non-convex and corresponds to a generalized phase recovery problem, gradient descent algorithms show good empirical performance and enjoy weak convergence properties. We will discuss connections with non-linear compressed sensing, applications to texture synthesis, inverse problems such as super-resolution, as well as an application to sentence modeling, where convolutions are generalized using associative trees to generate robust sentence representations

    Geodesics on the manifold of multivariate generalized Gaussian distributions with an application to multicomponent texture discrimination

    Get PDF
    We consider the Rao geodesic distance (GD) based on the Fisher information as a similarity measure on the manifold of zero-mean multivariate generalized Gaussian distributions (MGGD). The MGGD is shown to be an adequate model for the heavy-tailed wavelet statistics in multicomponent images, such as color or multispectral images. We discuss the estimation of MGGD parameters using various methods. We apply the GD between MGGDs to color texture discrimination in several classification experiments, taking into account the correlation structure between the spectral bands in the wavelet domain. We compare the performance, both in terms of texture discrimination capability and computational load, of the GD and the Kullback-Leibler divergence (KLD). Likewise, both uni- and multivariate generalized Gaussian models are evaluated, characterized by a fixed or a variable shape parameter. The modeling of the interband correlation significantly improves classification efficiency, while the GD is shown to consistently outperform the KLD as a similarity measure

    Multivariate texture discrimination based on geodesics to class centroids on a generalized Gaussian Manifold

    Get PDF
    A texture discrimination scheme is proposed wherein probability distributions are deployed on a probabilistic manifold for modeling the wavelet statistics of images. We consider the Rao geodesic distance (GD) to the class centroid for texture discrimination in various classification experiments. We compare the performance of GD to class centroid with the Euclidean distance in a similar context, both in terms of accuracy and computational complexity. Also, we compare our proposed classification scheme with the k-nearest neighbor algorithm. Univariate and multivariate Gaussian and Laplace distributions, as well as generalized Gaussian distributions with variable shape parameter are each evaluated as a statistical model for the wavelet coefficients. The GD to the centroid outperforms the Euclidean distance and yields superior discrimination compared to the k-nearest neighbor approach

    Nonlinear unmixing of hyperspectral images: Models and algorithms

    Get PDF
    When considering the problem of unmixing hyperspectral images, most of the literature in the geoscience and image processing areas relies on the widely used linear mixing model (LMM). However, the LMM may be not valid, and other nonlinear models need to be considered, for instance, when there are multiscattering effects or intimate interactions. Consequently, over the last few years, several significant contributions have been proposed to overcome the limitations inherent in the LMM. In this article, we present an overview of recent advances in nonlinear unmixing modeling

    Unsupervised Visual and Textual Information Fusion in Multimedia Retrieval - A Graph-based Point of View

    Full text link
    Multimedia collections are more than ever growing in size and diversity. Effective multimedia retrieval systems are thus critical to access these datasets from the end-user perspective and in a scalable way. We are interested in repositories of image/text multimedia objects and we study multimodal information fusion techniques in the context of content based multimedia information retrieval. We focus on graph based methods which have proven to provide state-of-the-art performances. We particularly examine two of such methods : cross-media similarities and random walk based scores. From a theoretical viewpoint, we propose a unifying graph based framework which encompasses the two aforementioned approaches. Our proposal allows us to highlight the core features one should consider when using a graph based technique for the combination of visual and textual information. We compare cross-media and random walk based results using three different real-world datasets. From a practical standpoint, our extended empirical analysis allow us to provide insights and guidelines about the use of graph based methods for multimodal information fusion in content based multimedia information retrieval.Comment: An extended version of the paper: Visual and Textual Information Fusion in Multimedia Retrieval using Semantic Filtering and Graph based Methods, by J. Ah-Pine, G. Csurka and S. Clinchant, submitted to ACM Transactions on Information System

    Image analysis and statistical modelling for measurement and quality assessment of ornamental horticulture crops in glasshouses

    Get PDF
    Image analysis for ornamental crops is discussed with examples from the bedding plant industry. Feed-forward artificial neural networks are used to segment top and side view images of three contrasting species of bedding plants. The segmented images provide objective measurements of leaf and flower cover, colour, uniformity and leaf canopy height. On each imaging occasion, each pack was scored for quality by an assessor panel and it is shown that image analysis can explain 88.5%, 81.7% and 70.4% of the panel quality scores for the three species, respectively. Stereoscopy for crop height and uniformity is outlined briefly. The methods discussed here could be used for crop grading at marketing or for monitoring and assessment of growing crops within a glasshouse during all stages of production
    corecore