1,473 research outputs found

    Steered mixture-of-experts for light field images and video : representation and coding

    Get PDF
    Research in light field (LF) processing has heavily increased over the last decade. This is largely driven by the desire to achieve the same level of immersion and navigational freedom for camera-captured scenes as it is currently available for CGI content. Standardization organizations such as MPEG and JPEG continue to follow conventional coding paradigms in which viewpoints are discretely represented on 2-D regular grids. These grids are then further decorrelated through hybrid DPCM/transform techniques. However, these 2-D regular grids are less suited for high-dimensional data, such as LFs. We propose a novel coding framework for higher-dimensional image modalities, called Steered Mixture-of-Experts (SMoE). Coherent areas in the higher-dimensional space are represented by single higher-dimensional entities, called kernels. These kernels hold spatially localized information about light rays at any angle arriving at a certain region. The global model consists thus of a set of kernels which define a continuous approximation of the underlying plenoptic function. We introduce the theory of SMoE and illustrate its application for 2-D images, 4-D LF images, and 5-D LF video. We also propose an efficient coding strategy to convert the model parameters into a bitstream. Even without provisions for high-frequency information, the proposed method performs comparable to the state of the art for low-to-mid range bitrates with respect to subjective visual quality of 4-D LF images. In case of 5-D LF video, we observe superior decorrelation and coding performance with coding gains of a factor of 4x in bitrate for the same quality. At least equally important is the fact that our method inherently has desired functionality for LF rendering which is lacking in other state-of-the-art techniques: (1) full zero-delay random access, (2) light-weight pixel-parallel view reconstruction, and (3) intrinsic view interpolation and super-resolution

    Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)

    Get PDF
    The implicit objective of the biennial "international - Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST) is to foster collaboration between international scientific teams by disseminating ideas through both specific oral/poster presentations and free discussions. For its second edition, the iTWIST workshop took place in the medieval and picturesque town of Namur in Belgium, from Wednesday August 27th till Friday August 29th, 2014. The workshop was conveniently located in "The Arsenal" building within walking distance of both hotels and town center. iTWIST'14 has gathered about 70 international participants and has featured 9 invited talks, 10 oral presentations, and 14 posters on the following themes, all related to the theory, application and generalization of the "sparsity paradigm": Sparsity-driven data sensing and processing; Union of low dimensional subspaces; Beyond linear and convex inverse problem; Matrix/manifold/graph sensing/processing; Blind inverse problems and dictionary learning; Sparsity and computational neuroscience; Information theory, geometry and randomness; Complexity/accuracy tradeoffs in numerical methods; Sparsity? What's next?; Sparse machine learning and inference.Comment: 69 pages, 24 extended abstracts, iTWIST'14 website: http://sites.google.com/site/itwist1

    Hierarchical visual perception and two-dimensional compressive sensing for effective content-based color image retrieval

    Get PDF
    Content-based image retrieval (CBIR) has been an active research theme in the computer vision community for over two decades. While the field is relatively mature, significant research is still required in this area to develop solutions for practical applications. One reason that practical solutions have not yet been realized could be due to a limited understanding of the cognitive aspects of the human vision system. Inspired by three cognitive properties of human vision, namely, hierarchical structuring, color perception and embedded compressive sensing, a new CBIR approach is proposed. In the proposed approach, the Hue, Saturation and Value (HSV) color model and the Similar Gray Level Co-occurrence Matrix (SGLCM) texture descriptors are used to generate elementary features. These features then form a hierarchical representation of the data to which a two-dimensional compressive sensing (2D CS) feature mining algorithm is applied. Finally, a weighted feature matching method is used to perform image retrieval. We present a comprehensive set of results of applying our proposed Hierarchical Visual Perception Enabled 2D CS approach using publicly available datasets and demonstrate the efficacy of our techniques when compared with other recently published, state-of-the-art approaches

    Dynamic Path-Controllable Deep Unfolding Network for Compressive Sensing

    Full text link
    Deep unfolding network (DUN) that unfolds the optimization algorithm into a deep neural network has achieved great success in compressive sensing (CS) due to its good interpretability and high performance. Each stage in DUN corresponds to one iteration in optimization. At the test time, all the sampling images generally need to be processed by all stages, which comes at a price of computation burden and is also unnecessary for the images whose contents are easier to restore. In this paper, we focus on CS reconstruction and propose a novel Dynamic Path-Controllable Deep Unfolding Network (DPC-DUN). DPC-DUN with our designed path-controllable selector can dynamically select a rapid and appropriate route for each image and is slimmable by regulating different performance-complexity tradeoffs. Extensive experiments show that our DPC-DUN is highly flexible and can provide excellent performance and dynamic adjustment to get a suitable tradeoff, thus addressing the main requirements to become appealing in practice. Codes are available at https://github.com/songjiechong/DPC-DUN.Comment: TIP, 202

    Secured and progressive transmission of compressed images on the Internet: application to telemedicine

    Get PDF
    International audienceWithin the framework of telemedicine, the amount of images leads first to use efficient lossless compression methods for the aim of storing information. Furthermore, multiresolution scheme including Region of Interest (ROI) processing is an important feature for a remote access to medical images. What is more, the securization of sensitive data (e.g. metadata from DICOM images) constitutes one more expected functionality: indeed the lost of IP packets could have tragic effects on a given diagnosis. For this purpose, we present in this paper an original scalable image compression technique (LAR method) used in association with a channel coding method based on the Mojette Transform, so that a hierarchical priority encoding system is elaborated. This system provides a solution for secured transmission of medical images through low-bandwidth networks such as the Internet

    Data compression techniques applied to high resolution high frame rate video technology

    Get PDF
    An investigation is presented of video data compression applied to microgravity space experiments using High Resolution High Frame Rate Video Technology (HHVT). An extensive survey of methods of video data compression, described in the open literature, was conducted. The survey examines compression methods employing digital computing. The results of the survey are presented. They include a description of each method and assessment of image degradation and video data parameters. An assessment is made of present and near term future technology for implementation of video data compression in high speed imaging system. Results of the assessment are discussed and summarized. The results of a study of a baseline HHVT video system, and approaches for implementation of video data compression, are presented. Case studies of three microgravity experiments are presented and specific compression techniques and implementations are recommended

    Edge-enhanced dual discriminator generative adversarial network for fast MRI with parallel imaging using multi-view information

    Get PDF
    In clinical medicine, magnetic resonance imaging (MRI) is one of the most important tools for diagnosis, triage, prognosis, and treatment planning. However, MRI suffers from an inherent slow data acquisition process because data is collected sequentially in k-space. In recent years, most MRI reconstruction methods proposed in the literature focus on holistic image reconstruction rather than enhancing the edge information. This work steps aside this general trend by elaborating on the enhancement of edge information. Specifically, we introduce a novel parallel imaging coupled dual discriminator generative adversarial network (PIDD-GAN) for fast multi-channel MRI reconstruction by incorporating multi-view information. The dual discriminator design aims to improve the edge information in MRI reconstruction. One discriminator is used for holistic image reconstruction, whereas the other one is responsible for enhancing edge information. An improved U-Net with local and global residual learning is proposed for the generator. Frequency channel attention blocks (FCA Blocks) are embedded in the generator for incorporating attention mechanisms. Content loss is introduced to train the generator for better reconstruction quality. We performed comprehensive experiments on Calgary-Campinas public brain MR dataset and compared our method with state-of-the-art MRI reconstruction methods. Ablation studies of residual learning were conducted on the MICCAI13 dataset to validate the proposed modules. Results show that our PIDD-GAN provides high-quality reconstructed MR images, with well-preserved edge information. The time of single-image reconstruction is below 5ms, which meets the demand of faster processing

    A Tutorial on Speckle Reduction in Synthetic Aperture Radar Images

    Get PDF
    Speckle is a granular disturbance, usually modeled as a multiplicative noise, that affects synthetic aperture radar (SAR) images, as well as all coherent images. Over the last three decades, several methods have been proposed for the reduction of speckle, or despeckling, in SAR images. Goal of this paper is making a comprehensive review of despeckling methods since their birth, over thirty years ago, highlighting trends and changing approaches over years. The concept of fully developed speckle is explained. Drawbacks of homomorphic filtering are pointed out. Assets of multiresolution despeckling, as opposite to spatial-domain despeckling, are highlighted. Also advantages of undecimated, or stationary, wavelet transforms over decimated ones are discussed. Bayesian estimators and probability density function (pdf) models in both spatial and multiresolution domains are reviewed. Scale-space varying pdf models, as opposite to scale varying models, are promoted. Promising methods following non-Bayesian approaches, like nonlocal (NL) filtering and total variation (TV) regularization, are reviewed and compared to spatial- and wavelet-domain Bayesian filters. Both established and new trends for assessment of despeckling are presented. A few experiments on simulated data and real COSMO-SkyMed SAR images highlight, on one side the costperformance tradeoff of the different methods, on the other side the effectiveness of solutions purposely designed for SAR heterogeneity and not fully developed speckle. Eventually, upcoming methods based on new concepts of signal processing, like compressive sensing, are foreseen as a new generation of despeckling, after spatial-domain and multiresolution-domain method
    • …
    corecore