139 research outputs found

    Graph Spectral Image Processing

    Full text link
    Recent advent of graph signal processing (GSP) has spurred intensive studies of signals that live naturally on irregular data kernels described by graphs (e.g., social networks, wireless sensor networks). Though a digital image contains pixels that reside on a regularly sampled 2D grid, if one can design an appropriate underlying graph connecting pixels with weights that reflect the image structure, then one can interpret the image (or image patch) as a signal on a graph, and apply GSP tools for processing and analysis of the signal in graph spectral domain. In this article, we overview recent graph spectral techniques in GSP specifically for image / video processing. The topics covered include image compression, image restoration, image filtering and image segmentation

    Unsupervised Learning from Shollow to Deep

    Get PDF
    Machine learning plays a pivotal role in most state-of-the-art systems in many application research domains. With the rising of deep learning, massive labeled data become the solution of feature learning, which enables the model to learn automatically. Unfortunately, the trained deep learning model is hard to adapt to other datasets without fine-tuning, and the applicability of machine learning methods is limited by the amount of available labeled data. Therefore, the aim of this thesis is to alleviate the limitations of supervised learning by exploring algorithms to learn good internal representations, and invariant feature hierarchies from unlabelled data. Firstly, we extend the traditional dictionary learning and sparse coding algorithms onto hierarchical image representations in a principled way. To achieve dictionary atoms capture additional information from extended receptive fields and attain improved descriptive capacity, we present a two-pass multi-resolution cascade framework for dictionary learning and sparse coding. This cascade method allows collaborative reconstructions at different resolutions using only the same dimensional dictionary atoms. The jointly learned dictionary comprises atoms that adapt to the information available at the coarsest layer, where the support of atoms reaches a maximum range, and the residual images, where the supplementary details refine progressively a reconstruction objective. Our method generates flexible and accurate representations using only a small number of coefficients, and is efficient in computation. In the following work, we propose to incorporate the traditional self-expressiveness property into deep learning to explore better representation for subspace clustering. This architecture is built upon deep auto-encoders, which non-linearly map the input data into a latent space. Our key idea is to introduce a novel self-expressive layer between the encoder and the decoder to mimic the ``self-expressiveness'' property that has proven effective in traditional subspace clustering. Being differentiable, our new self-expressive layer provides a simple but effective way to learn pairwise affinities between all data points through a standard back-propagation procedure. Being nonlinear, our neural-network based method is able to cluster data points having complex (often nonlinear) structures. However, Subspace clustering algorithms are notorious for their scalability issues because building and processing large affinity matrices are demanding. We propose two methods to tackle this problem. One method is based on kk-Subspace Clustering, where we introduce a method that simultaneously learns an embedding space along subspaces within it to minimize a notion of reconstruction error, thus addressing the problem of subspace clustering in an end-to-end learning paradigm. This in turn frees us from the need of having an affinity matrix to perform clustering. The other way starts from using a feed forward network to replace the spectral clustering and learn the affinities of each data from "self-expressive" layer. We introduce the Neural Collaborative Subspace Clustering, where it benefits from a classifier which determines whether a pair of points lies on the same subspace under supervision of "self-expressive" layer. Essential to our model is the construction of two affinity matrices, one from the classifier and the other from a notion of subspace self-expressiveness, to supervise training in a collaborative scheme. In summary, we make constributions on how to perform the unsupervised learning in several tasks in this thesis. It starts from traditional sparse coding and dictionary learning perspective in low-level vision. Then, we exploit how to incorporate unsupervised learning in convolutional neural networks without label information and make subspace clustering to large scale dataset. Furthermore, we also extend the clustering on dense prediction task (saliency detection)

    A PDE Approach to Data-driven Sub-Riemannian Geodesics in SE(2)

    Get PDF
    We present a new flexible wavefront propagation algorithm for the boundary value problem for sub-Riemannian (SR) geodesics in the roto-translation group SE(2)=R2ā‹ŠS1SE(2) = \mathbb{R}^2 \rtimes S^1 with a metric tensor depending on a smooth external cost C:SE(2)ā†’[Ī“,1]\mathcal{C}:SE(2) \to [\delta,1], Ī“>0\delta>0, computed from image data. The method consists of a first step where a SR-distance map is computed as a viscosity solution of a Hamilton-Jacobi-Bellman (HJB) system derived via Pontryagin's Maximum Principle (PMP). Subsequent backward integration, again relying on PMP, gives the SR-geodesics. For C=1\mathcal{C}=1 we show that our method produces the global minimizers. Comparison with exact solutions shows a remarkable accuracy of the SR-spheres and the SR-geodesics. We present numerical computations of Maxwell points and cusp points, which we again verify for the uniform cost case C=1\mathcal{C}=1. Regarding image analysis applications, tracking of elongated structures in retinal and synthetic images show that our line tracking generically deals with crossings. We show the benefits of including the sub-Riemannian geometry.Comment: Extended version of SSVM 2015 conference article "Data-driven Sub-Riemannian Geodesics in SE(2)

    Recent Advances in Signal Processing

    Get PDF
    The signal processing task is a very critical issue in the majority of new technological inventions and challenges in a variety of applications in both science and engineering fields. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian. They have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward both students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five different areas depending on the application at hand. These five categories are ordered to address image processing, speech processing, communication systems, time-series analysis, and educational packages respectively. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity

    Robust density modelling using the student's t-distribution for human action recognition

    Full text link
    The extraction of human features from videos is often inaccurate and prone to outliers. Such outliers can severely affect density modelling when the Gaussian distribution is used as the model since it is highly sensitive to outliers. The Gaussian distribution is also often used as base component of graphical models for recognising human actions in the videos (hidden Markov model and others) and the presence of outliers can significantly affect the recognition accuracy. In contrast, the Student's t-distribution is more robust to outliers and can be exploited to improve the recognition rate in the presence of abnormal data. In this paper, we present an HMM which uses mixtures of t-distributions as observation probabilities and show how experiments over two well-known datasets (Weizmann, MuHAVi) reported a remarkable improvement in classification accuracy. Ā© 2011 IEEE

    Towards Data-Driven Large Scale Scientific Visualization and Exploration

    Get PDF
    Technological advances have enabled us to acquire extremely large datasets but it remains a challenge to store, process, and extract information from them. This dissertation builds upon recent advances in machine learning, visualization, and user interactions to facilitate exploration of large-scale scientific datasets. First, we use data-driven approaches to computationally identify regions of interest in the datasets. Second, we use visual presentation for effective user comprehension. Third, we provide interactions for human users to integrate domain knowledge and semantic information into this exploration process. Our research shows how to extract, visualize, and explore informative regions on very large 2D landscape images, 3D volumetric datasets, high-dimensional volumetric mouse brain datasets with thousands of spatially-mapped gene expression profiles, and geospatial trajectories that evolve over time. The contribution of this dissertation include: (1) We introduce a sliding-window saliency model that discovers regions of user interest in very large images; (2) We develop visual segmentation of intensity-gradient histograms to identify meaningful components from volumetric datasets; (3) We extract boundary surfaces from a wealth of volumetric gene expression mouse brain profiles to personalize the reference brain atlas; (4) We show how to efficiently cluster geospatial trajectories by mapping each sequence of locations to a high-dimensional point with the kernel distance framework. We aim to discover patterns, relationships, and anomalies that would lead to new scientific, engineering, and medical advances. This work represents one of the first steps toward better visual understanding of large-scale scientific data by combining machine learning and human intelligence

    Entropy in Image Analysis II

    Get PDF
    Image analysis is a fundamental task for any application where extracting information from images is required. The analysis requires highly sophisticated numerical and analytical methods, particularly for those applications in medicine, security, and other fields where the results of the processing consist of data of vital importance. This fact is evident from all the articles composing the Special Issue "Entropy in Image Analysis II", in which the authors used widely tested methods to verify their results. In the process of reading the present volume, the reader will appreciate the richness of their methods and applications, in particular for medical imaging and image security, and a remarkable cross-fertilization among the proposed research areas

    Reversible Image Watermarking Using Modified Quadratic Difference Expansion and Hybrid Optimization Technique

    Get PDF
    With increasing copyright violation cases, watermarking of digital images is a very popular solution for securing online media content. Since some sensitive applications require image recovery after watermark extraction, reversible watermarking is widely preferred. This article introduces a Modified Quadratic Difference Expansion (MQDE) and fractal encryption-based reversible watermarking for securing the copyrights of images. First, fractal encryption is applied to watermarks using Tromino's L-shaped theorem to improve security. In addition, Cuckoo Search-Grey Wolf Optimization (CSGWO) is enforced on the cover image to optimize block allocation for inserting an encrypted watermark such that it greatly increases its invisibility. While the developed MQDE technique helps to improve coverage and visual quality, the novel data-driven distortion control unit ensures optimal performance. The suggested approach provides the highest level of protection when retrieving the secret image and original cover image without losing the essential information, apart from improving transparency and capacity without much tradeoff. The simulation results of this approach are superior to existing methods in terms of embedding capacity. With an average PSNR of 67 dB, the method shows good imperceptibility in comparison to other schemes

    3D Mesh Simplification. A survey of algorithms and CAD model simplification tests

    Get PDF
    Simpliļ¬cation of highly detailed CAD models is an important step when CAD models are visualized or by other means utilized in augmented reality applications. Without simpliļ¬cation, CAD models may cause severe processing and storage is- sues especially in mobile devices. In addition, simpliļ¬ed models may have other advantages like better visual clarity or improved reliability when used for visual pose tracking. The geometry of CAD models is invariably presented in form of a 3D mesh. In this paper, we survey mesh simpliļ¬cation algorithms in general and focus especially to algorithms that can be used to simplify CAD models. We test some commonly known algorithms with real world CAD data and characterize some new CAD related simpliļ¬cation algorithms that have not been surveyed in previous mesh simpliļ¬cation reviews.Siirretty Doriast
    • ā€¦
    corecore