
    Sparse Modeling for Image and Vision Processing

    In recent years, a large amount of multi-disciplinary research has been conducted on sparse models and their applications. In statistics and machine learning, the sparsity principle is used to perform model selection, that is, automatically selecting a simple model from a large collection of candidates. In signal processing, sparse coding consists of representing data as linear combinations of a few dictionary elements. These tools have since been widely adopted by several scientific communities, such as neuroscience, bioinformatics, and computer vision. The goal of this monograph is to offer a self-contained view of sparse modeling for visual recognition and image processing. More specifically, we focus on applications where the dictionary is learned and adapted to data, yielding a compact representation that has been successful in various contexts. Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics and Vision.
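
    As a concrete illustration of learning a dictionary adapted to data and sparsely coding image patches over it, the following sketch uses scikit-learn; the random image, patch size, number of atoms, and sparsity level are illustrative assumptions, not the monograph's setup.

```python
# Minimal dictionary-learning / sparse-coding sketch on image patches.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import extract_patches_2d

rng = np.random.default_rng(0)
image = rng.random((64, 64))                       # stand-in for a grayscale image

# Extract and center 8x8 patches; each patch is one training sample.
patches = extract_patches_2d(image, (8, 8), max_patches=500, random_state=0)
X = patches.reshape(len(patches), -1)
X -= X.mean(axis=1, keepdims=True)

# Learn a dictionary; each patch is then coded with only a few atoms (OMP).
dico = MiniBatchDictionaryLearning(
    n_components=100, alpha=1.0, batch_size=32,
    transform_algorithm="omp", transform_n_nonzero_coefs=5, random_state=0,
)
codes = dico.fit(X).transform(X)                   # sparse coefficients, shape (500, 100)
D = dico.components_                               # dictionary atoms, shape (100, 64)

reconstruction = codes @ D                         # each patch approximated by a few atoms
```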

    Graph Spectral Image Processing

    The recent advent of graph signal processing (GSP) has spurred intensive study of signals that live naturally on irregular data kernels described by graphs (e.g., social networks, wireless sensor networks). Although a digital image contains pixels that reside on a regularly sampled 2D grid, if one can design an appropriate underlying graph connecting pixels with weights that reflect the image structure, then one can interpret the image (or image patch) as a signal on a graph and apply GSP tools to process and analyze it in the graph spectral domain. In this article, we overview recent graph spectral techniques in GSP specifically for image/video processing. The topics covered include image compression, image restoration, image filtering, and image segmentation.
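
    To make the pixel-graph idea concrete, here is a minimal sketch that builds an assumed 4-connected grid graph over a patch, weights edges by intensity similarity, and uses the Laplacian eigenbasis as a graph Fourier transform for a simple spectral low-pass filter; the patch size, Gaussian weighting, and cut-off are assumptions for illustration.

```python
# Interpreting an image patch as a graph signal and filtering it spectrally.
import numpy as np

def grid_graph_laplacian(patch, sigma=0.1):
    """Combinatorial Laplacian L = D - W of a 4-connected graph over patch pixels."""
    h, w = patch.shape
    n = h * w
    W = np.zeros((n, n))
    for i in range(h):
        for j in range(w):
            for di, dj in ((0, 1), (1, 0)):          # right and down neighbours
                ii, jj = i + di, j + dj
                if ii < h and jj < w:
                    a, b = i * w + j, ii * w + jj
                    # Edge weight reflects image structure: similar pixels couple strongly.
                    wgt = np.exp(-((patch[i, j] - patch[ii, jj]) ** 2) / (2 * sigma ** 2))
                    W[a, b] = W[b, a] = wgt
    return np.diag(W.sum(axis=1)) - W

rng = np.random.default_rng(0)
patch = rng.random((8, 8))                           # stand-in for an image patch
L = grid_graph_laplacian(patch)

# Graph Fourier transform: project the pixel signal onto the Laplacian eigenbasis.
evals, evecs = np.linalg.eigh(L)                     # graph frequencies and basis vectors
spectrum = evecs.T @ patch.reshape(-1)               # graph spectral coefficients

# Simple graph spectral low-pass filter: keep the lowest-frequency components.
k = 16
smoothed_patch = (evecs[:, :k] @ spectrum[:k]).reshape(patch.shape)
```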

    Keyframe-based monocular SLAM: design, survey, and future directions

    Extensive research in the field of monocular SLAM over the past fifteen years has yielded workable systems that have found their way into various applications in robotics and augmented reality. Although filter-based monocular SLAM systems were common at one time, the more efficient keyframe-based solutions are becoming the de facto methodology for building a monocular SLAM system. The objective of this paper is threefold: first, the paper serves as a guideline for those seeking to design their own monocular SLAM system according to specific environmental constraints. Second, it presents a survey that covers the various keyframe-based monocular SLAM systems in the literature, detailing the components of their implementation and critically assessing the specific design choices made in each proposed solution. Third, the paper provides insight into the direction of future research in this field, to address the major limitations still facing monocular SLAM, namely issues of illumination changes, initialization, highly dynamic motion, poorly textured scenes, repetitive textures, map maintenance, and failure recovery.
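
    For intuition on what "keyframe-based" means in practice, below is a toy sketch of one common keyframe-insertion heuristic (overlap of tracked map points with the last keyframe, plus a minimum translation baseline); both the heuristic and its thresholds are generic assumptions, not the criteria of any particular system surveyed.

```python
# Toy keyframe-insertion heuristic for a keyframe-based monocular SLAM front end.
from dataclasses import dataclass
import numpy as np

@dataclass
class Frame:
    pose_t: np.ndarray   # camera position, shape (3,)
    tracked_ids: set     # IDs of map points tracked in this frame

def should_insert_keyframe(frame, last_keyframe, min_overlap=0.5, min_baseline=0.1):
    """Insert a keyframe when tracked-point overlap drops or the baseline grows (illustrative thresholds)."""
    if not last_keyframe.tracked_ids:
        return True
    overlap = len(frame.tracked_ids & last_keyframe.tracked_ids) / len(last_keyframe.tracked_ids)
    baseline = np.linalg.norm(frame.pose_t - last_keyframe.pose_t)
    return overlap < min_overlap or baseline > min_baseline

# Example: noticeable motion plus few shared map points triggers a new keyframe.
kf = Frame(np.zeros(3), {1, 2, 3, 4, 5})
cur = Frame(np.array([0.3, 0.0, 0.0]), {4, 5, 6, 7})
print(should_insert_keyframe(cur, kf))   # True
```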

    Using the Earth Mover's Distance for perceptually meaningful visual saliency

    Visual saliency is one of the mechanisms that guide our visual attention, or where we look. This topic has seen a great deal of research in recent years, starting with biologically inspired models, followed by information-theoretic and, more recently, statistics-based models. This dissertation looks at a state-of-the-art statistical model and studies what effects the histogram construction method and histogram distance measure have on detecting saliency. Equi-width histograms, which have constant bin size; equi-depth histograms, which have constant density per bin; and diagonal histograms, whose bin widths are determined from constant diagonal portions of the empirical cumulative distribution function (ecdf), are used to calculate saliency scores on a publicly available dataset. Cross-bin distances are introduced and compared with the currently employed bin-to-bin distances by calculating saliency scores on the same dataset. An exhaustive experiment with combinations of all histogram construction methods and histogram distance measures is performed. It was found that using the equi-depth histogram improves various saliency metrics. It is also shown that employing cross-bin histogram distances improves the contrast of the resulting saliency maps, making them more perceptually meaningful but lowering their saliency scores in the process. A novel improvement is made to the model which removes its implicit center bias; this also generates more perceptually meaningful saliency maps but lowers saliency scores. A new scoring method is proposed which aims to reconcile this disparity between perceptual quality and saliency scores.
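
    To illustrate the histogram constructions and the bin-to-bin versus cross-bin distinction, here is a small sketch comparing an equi-width and an equi-depth (quantile-edge) histogram using a chi-square distance, alongside the 1D Earth Mover's Distance from SciPy; the sample data, bin counts, and the choice of chi-square as the bin-to-bin distance are assumptions.

```python
# Equi-width vs equi-depth histograms, bin-to-bin vs cross-bin distances.
import numpy as np
from scipy.stats import wasserstein_distance

def equi_width_hist(x, bins=16):
    """Histogram with constant bin width over [0, 1]."""
    h, _ = np.histogram(x, bins=bins, range=(0.0, 1.0))
    return h / h.sum()

def equi_depth_edges(x, bins=16):
    """Quantile bin edges, giving roughly constant mass per bin (equi-depth)."""
    return np.unique(np.quantile(x, np.linspace(0.0, 1.0, bins + 1)))

def chi_square(p, q):
    """A bin-to-bin distance: only corresponding bins are compared."""
    return 0.5 * np.sum((p - q) ** 2 / (p + q + 1e-12))

rng = np.random.default_rng(0)
center = rng.beta(2, 5, 500)       # stand-in intensities for a center patch
surround = rng.beta(5, 2, 500)     # stand-in intensities for its surround

# Equi-width construction.
d_width = chi_square(equi_width_hist(center), equi_width_hist(surround))

# Equi-depth construction (shared quantile edges computed from the pooled samples).
edges = equi_depth_edges(np.concatenate([center, surround]))
hc, _ = np.histogram(center, bins=edges)
hs, _ = np.histogram(surround, bins=edges)
d_depth = chi_square(hc / hc.sum(), hs / hs.sum())

# Cross-bin distance: the 1D Earth Mover's Distance accounts for how far
# probability mass must move between bins, not just per-bin differences.
d_emd = wasserstein_distance(center, surround)

print(f"chi-square equi-width: {d_width:.3f}  equi-depth: {d_depth:.3f}  EMD: {d_emd:.3f}")
```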