
    Local Inverse Tone Curve Learning for High Dynamic Range Image Scalable Compression

    Get PDF
    This paper presents a scalable high dynamic range (HDR) image coding scheme in which the base layer is a low dynamic range (LDR) version of the image that may have been generated by an arbitrary Tone Mapping Operator (TMO). No restriction is imposed on the TMO, which can be either global or local, so as to fully respect the artistic intent of the producer. Our method successfully handles the case of complex local TMOs thanks to a block-wise and non-linear approach. A novel template-based Inter Layer Prediction (ILP) is designed in order to perform the inverse tone mapping of a block without the need to transmit any additional parameters to the decoder. This method enables the use of a more accurate inverse tone mapping model than the simple linear regression commonly used for block-wise ILP. In addition, this paper shows that a linear adjustment of the initially predicted block can further improve the overall coding performance by using an efficient encoding scheme for the scaling parameters. Our experiments have shown an average bitrate saving of 47% on the HDR enhancement layer, compared to previous local ILP methods.
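
    As a rough illustration of the template-based ILP idea above, the following Python sketch fits an inverse tone curve on causal template pixels available in both layers and applies it to the current LDR block, so no curve parameters need to be transmitted. The block and template shapes, the polynomial order, and all names are illustrative assumptions, not the paper's exact model.

        import numpy as np

        def predict_hdr_block(ldr_block, ldr_template, hdr_template, degree=2):
            """Template-based inter-layer prediction (a minimal sketch)."""
            # Fit a low-order polynomial LDR -> HDR mapping on template pixels
            # that are already decoded in both layers.
            coeffs = np.polyfit(ldr_template.ravel(), hdr_template.ravel(), degree)
            # Apply the learned inverse tone curve to the current LDR block.
            return np.polyval(coeffs, ldr_block)

        # Hypothetical data: an 8x8 LDR block and an L-shaped causal template.
        rng = np.random.default_rng(0)
        ldr_template = rng.uniform(0.0, 1.0, 48)
        hdr_template = ldr_template ** 2.2 * 100.0  # stand-in inverse-TMO behaviour
        hdr_pred = predict_hdr_block(rng.uniform(0.0, 1.0, (8, 8)),
                                     ldr_template, hdr_template)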

    Lightning-Fast Dual-Layer Lossless Coding for Radiance Format High Dynamic Range Images

    Full text link
    This paper proposes a fast dual-layer lossless coding method for high dynamic range images (HDRIs) in the Radiance format. The coding, which consists of a base layer and a lossless enhancement layer, provides a standard dynamic range image (SDRI) without requiring an additional algorithm at the decoder and can losslessly decode the HDRI by adding the residual signals (residuals) between the HDRI and the SDRI to the SDRI, if desired. To suppress the dynamic range of the residuals in the enhancement layer, the coding directly uses the mantissa and exponent information from the Radiance format. To further reduce the residual energy, each mantissa is modeled (estimated) as a linear function, i.e., a simple linear regression, of the encoded-decoded SDRI in each region with the same exponent. This is called the simple linear regressive mantissa estimator. Experimental results show that, compared with existing methods, our coding reduces the average bitrate by approximately 1.57-6.68% and significantly reduces the average encoder implementation time by approximately 87.13-98.96%.
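
    A minimal sketch of the per-exponent mantissa estimation described above, assuming the mantissa and exponent planes and the decoded SDRI are same-shaped integer arrays; the function name and the fallback for degenerate regions are illustrative, not from the paper.

        import numpy as np

        def mantissa_residuals(mantissa, exponent, sdri):
            """Per-exponent simple linear regression of the mantissa (a sketch).

            For each region sharing one exponent value, the HDR mantissa
            is estimated as a linear function of the encoded-decoded SDRI;
            only the (small) residuals go to the enhancement layer.
            """
            residuals = np.zeros(mantissa.shape, dtype=np.int32)
            for e in np.unique(exponent):
                m = exponent == e                  # pixels in this exponent region
                x = sdri[m].astype(float)
                y = mantissa[m].astype(float)
                # Degenerate regions (a single pixel) fall back to the mean.
                a, b = np.polyfit(x, y, 1) if x.size > 1 else (0.0, y.mean())
                pred = np.rint(a * x + b).astype(np.int32)
                residuals[m] = mantissa[m].astype(np.int32) - pred
            return residuals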

    Scaling optical computing in synthetic frequency dimension using integrated cavity acousto-optics

    Full text link
    Optical computing with integrated photonics brings a pivotal paradigm shift to data-intensive computing technologies. However, the scaling of on-chip photonic architectures using spatially distributed schemes faces the challenge imposed by the fundamental limit of integration density. Synthetic dimensions of light offer the opportunity to extend the length of operand vectors within a single photonic component. Here, we show that large-scale, complex-valued matrix-vector multiplications on synthetic frequency lattices can be performed using an ultra-efficient, silicon-based nanophotonic cavity acousto-optic modulator. By harnessing the resonantly enhanced strong electro-optomechanical coupling, we achieve, in a single such modulator, full-range phase-coherent frequency conversions across the entire synthetic lattice, which constitute a fully connected linear computing layer. Our demonstrations open up a route towards the experimental realization of frequency-domain integrated optical computing systems that simultaneously feature very large-scale data processing and small device footprints.
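
    Purely as a numerical analogy for how phase-coherent frequency conversions act as a linear layer (not a model of the device physics), one can picture the modulator's mode couplings as a complex matrix applied to an input vector encoded on N frequency modes; the sizes and values below are arbitrary.

        import numpy as np

        # Encode an input vector on N synthetic frequency modes.
        N = 8
        rng = np.random.default_rng(1)
        x = rng.normal(size=N) + 1j * rng.normal(size=N)

        # Conversion efficiencies and phases between every pair of modes
        # form a complex coupling matrix; applying it is one pass through
        # a fully connected linear layer.
        T = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
        y = T @ x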

    Survey of image-based representations and compression techniques

    Get PDF
    In this paper, we survey the techniques for image-based rendering (IBR) and for compressing image-based representations. Unlike traditional three-dimensional (3-D) computer graphics, in which the 3-D geometry of the scene is known, IBR techniques render novel views directly from input images. IBR techniques can be classified into three categories according to how much geometric information is used: rendering without geometry, rendering with implicit geometry (i.e., correspondence), and rendering with explicit geometry (either approximate or accurate). We discuss the characteristics of these categories and their representative techniques. IBR techniques demonstrate a surprisingly diverse range in their use of images and geometry in representing 3-D scenes. We explore the issues in trading off the use of images and geometry by revisiting plenoptic-sampling analysis and the notions of view dependency and geometric proxies. Finally, we highlight compression techniques specifically designed for image-based representations. Such compression techniques are important in making IBR techniques practical.
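
    To make the "rendering without geometry" end of the spectrum concrete, here is a toy Python sketch of novel-view synthesis as pure resampling of a 4D light field; the array layout and the nearest-camera lookup are illustrative simplifications, not a technique taken from the survey itself.

        import numpy as np

        def render_novel_view(light_field, u, v):
            """Nearest-camera lookup in a 4D light field (a toy sketch).

            light_field has shape (U, V, H, W): a U x V grid of images.
            Rendering 'without geometry' reduces to resampling them.
            """
            U, V = light_field.shape[:2]
            ui = int(np.clip(round(u), 0, U - 1))
            vi = int(np.clip(round(v), 0, V - 1))
            return light_field[ui, vi]

        # Hypothetical 4 x 4 camera grid of 32 x 32 grayscale images.
        lf = np.random.default_rng(2).uniform(size=(4, 4, 32, 32))
        view = render_novel_view(lf, u=1.7, v=2.2)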

    Inverse tone mapping

    Get PDF
    The introduction of High Dynamic Range Imaging in computer graphics has produced a change in imaging comparable to the introduction of colour photography, or perhaps even greater. Light can now be captured, stored, processed, and finally visualised without losing information. Moreover, new applications that can exploit the physical values of the light have been introduced, such as re-lighting of synthetic/real objects or enhanced visualisation of scenes. However, these new processing and visualisation techniques cannot be applied to the movies and pictures that photography and cinematography have produced over more than one hundred years. This thesis introduces a general framework for expanding legacy content into High Dynamic Range content. The expansion is achieved while avoiding artefacts, producing images suitable for visualisation and for re-lighting of synthetic/real objects. Moreover, a methodology based on psychophysical experiments and computational metrics is presented for measuring the performance of expansion algorithms. Finally, a compression scheme for High Dynamic Range Textures, inspired by the framework, is proposed and evaluated.
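
    As a minimal illustration of what such an expansion operator does, the sketch below linearises a gamma-encoded LDR image and inverts Reinhard's global tone curve; the choice of operator, the gamma value, and the scale l_max are assumptions for illustration, and the artefact-avoidance machinery the thesis develops is omitted.

        import numpy as np

        def expand_ldr(ldr, gamma=2.2, l_max=1000.0):
            """A minimal global inverse tone mapping (expansion) sketch.

            Linearises gamma-encoded LDR values in [0, 1] and inverts
            Reinhard's global operator L_d = L / (1 + L); l_max sets
            the overall output scale.
            """
            lin = np.clip(ldr, 0.0, 1.0) ** gamma   # undo display gamma
            lin = np.minimum(lin, 1.0 - 1e-4)       # keep the inverse finite
            return l_max * lin / (1.0 - lin)        # inverse of L / (1 + L)

        hdr = expand_ldr(np.random.default_rng(3).uniform(size=(4, 4)))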

    High Dynamic Range Visual Content Compression

    Get PDF
    This thesis addresses the research questions of High Dynamic Range (HDR) visual content compression. HDR representations are intended to represent the actual physical value of the light rather than an exposed value. Current HDR compression schemes are extensions of legacy Low Dynamic Range (LDR) compression, using Tone-Mapping Operators (TMO) to reduce the dynamic range of the HDR content. However, introducing a TMO increases the overall computational complexity and can cause temporal artifacts. Furthermore, these compression schemes fail to compress non-salient regions differently from salient regions, even though the Human Visual System (HVS) perceives them differently. The main contribution of this thesis is a novel mapping-free, visual saliency-guided HDR content compression scheme. Firstly, the relationship between Discrete Wavelet Transform (DWT) lifting steps and TMOs is explored. A novel approach to compressing HDR images with the Joint Photographic Experts Group (JPEG) 2000 codec while remaining backward compatible with LDR is proposed. This approach exploits the reversibility of tone mapping and the scalability of the DWT. Secondly, the importance of the TMO in HDR compression is evaluated. A mapping-free post HDR image compression scheme based on the JPEG and JPEG2000 standard codecs for current HDR image formats is proposed. This approach exploits the structure of HDR formats. It has equivalent compression performance and the lowest computational complexity compared to existing HDR lossy compression schemes (50% lower than the state of the art). Finally, the shortcomings of current HDR visual saliency models and of HDR visual saliency-guided compression are explored. A spatial saliency model for HDR visual content is proposed that outperforms others by 10% on the spatial visual prediction task with 70% lower computational complexity. Furthermore, experiments suggest that more than 90% of temporal saliency is predicted by the proposed spatial model. Moreover, the proposed saliency model can be used to guide HDR compression by applying different quantization factors according to the intensity of the predicted saliency map.
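
    A small sketch of the guiding step only, under the assumption that saliency is given as a [0, 1] map and that coding proceeds in fixed-size blocks; the quantisation range and block size are made-up parameters, not the thesis's settings.

        import numpy as np

        def saliency_to_qfactors(saliency, q_min=10, q_max=50, block=16):
            """Saliency-guided quantisation (a sketch of the guiding step).

            Averages a [0, 1] saliency map over coding blocks and assigns
            a smaller quantisation factor (finer coding) to more salient
            blocks.
            """
            h, w = saliency.shape
            qf = np.empty((h // block, w // block), dtype=int)
            for i in range(qf.shape[0]):
                for j in range(qf.shape[1]):
                    s = saliency[i*block:(i+1)*block, j*block:(j+1)*block].mean()
                    qf[i, j] = int(round(q_max - s * (q_max - q_min)))
            return qf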

    Robust density modelling using the student's t-distribution for human action recognition

    Full text link
    The extraction of human features from videos is often inaccurate and prone to outliers. Such outliers can severely affect density modelling when the Gaussian distribution is used as the model, since it is highly sensitive to outliers. The Gaussian distribution is also often used as the base component of graphical models for recognising human actions in videos (hidden Markov models and others), and the presence of outliers can significantly affect the recognition accuracy. In contrast, the Student's t-distribution is more robust to outliers and can be exploited to improve the recognition rate in the presence of abnormal data. In this paper, we present an HMM which uses mixtures of t-distributions as observation probabilities and show, through experiments on two well-known datasets (Weizmann, MuHAVi), a remarkable improvement in classification accuracy. © 2011 IEEE
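
    The robustness argument can be seen numerically: with SciPy, a single outlier is penalised far less under a heavy-tailed Student's t log-density than under a Gaussian, so it distorts the HMM observation scores far less. The degrees of freedom and the data below are arbitrary.

        import numpy as np
        from scipy.stats import norm, t

        # One outlier among otherwise small observations.
        data = np.array([0.1, -0.2, 0.05, 8.0])
        # Gaussian log-density: the outlier scores about -32.9 ...
        print(norm.logpdf(data, loc=0.0, scale=1.0))
        # ... while the heavier-tailed t (3 d.o.f.) gives it about -7.2,
        # so one abnormal sample barely moves the total log-likelihood.
        print(t.logpdf(data, df=3, loc=0.0, scale=1.0))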

    Towards one video encoder per individual: guided High Efficiency Video Coding

    Get PDF

    Compression, Modeling, and Real-Time Rendering of Realistic Materials and Objects

    Get PDF
    The realism of a scene essentially depends on the quality of the geometry, the illumination, and the materials that are used. Whereas many sources for the creation of three-dimensional geometry exist and numerous algorithms for the approximation of global illumination have been presented, the acquisition and rendering of realistic materials remains a challenging problem. Realistic materials are very important in computer graphics, because they describe the reflectance properties of surfaces, which are based on the interaction of light and matter. In the real world, an enormous diversity of materials can be found, with widely differing properties. One important objective in computer graphics is to understand these processes, to formalize them, and finally to simulate them. Various analytical models already exist for this purpose, but their parameterization remains difficult, as the number of parameters is usually very high. They also fail for very complex materials that occur in the real world. Measured materials, on the other hand, suffer from long acquisition times and huge input data sizes. Although very efficient statistical compression algorithms have been presented, most of them do not allow for editability, such as altering the diffuse color or mesostructure. In this thesis, a material representation is introduced that makes it possible to edit these features, so that the acquisition results can be re-used to easily and quickly create variations of the original material. These variations may be subtle but also substantial, allowing for a wide spectrum of material appearances. The approach presented in this thesis is not based on compression, but on a decomposition of the surface into several materials with different reflection properties. Based on a microfacet model, the light-matter interaction is represented by a function that can be stored in an ordinary two-dimensional texture. Additionally, depth information, local rotations, and the diffuse color are stored in these textures. As a result of the decomposition, some of the original information is inevitably lost; therefore, an algorithm for the efficient simulation of subsurface scattering is presented as well. Another contribution of this work is a novel perception-based simplification metric that includes the material of an object. This metric comprises features of the human visual system, for example trichromatic color perception or reduced resolution. The proposed metric allows for a more aggressive simplification in regions where geometric metrics do not simplify.
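
    For context on the microfacet model mentioned above, here is a generic single-lobe Cook-Torrance evaluation in Python (GGX distribution, Schlick Fresnel, Smith-style visibility); this is a common stand-in, not necessarily the exact lobe or parameterisation used in the thesis.

        import numpy as np

        def ggx_specular(n, v, l, roughness, f0):
            """Single-lobe microfacet specular term (an illustrative sketch).

            Cook-Torrance form D*G*F / (4 (n.v)(n.l)) for unit vectors
            n (normal), v (view), l (light).
            """
            h = (v + l) / np.linalg.norm(v + l)                  # half vector
            a2 = roughness ** 4                                  # alpha^2, alpha = roughness^2
            ndh, ndv, ndl = (max(np.dot(n, x), 1e-4) for x in (h, v, l))
            d = a2 / (np.pi * (ndh * ndh * (a2 - 1.0) + 1.0) ** 2)   # GGX NDF
            f = f0 + (1.0 - f0) * (1.0 - np.dot(v, h)) ** 5          # Schlick Fresnel
            k = (roughness + 1.0) ** 2 / 8.0
            g = (ndv / (ndv * (1 - k) + k)) * (ndl / (ndl * (1 - k) + k))  # Smith G
            return d * f * g / (4.0 * ndv * ndl)

        # Example evaluation with unit vectors and a dielectric base reflectance.
        n = np.array([0.0, 0.0, 1.0])
        v = np.array([0.0, 0.6, 0.8])
        l = np.array([0.6, 0.0, 0.8])
        print(ggx_specular(n, v, l, roughness=0.3, f0=0.04))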