
    A comparative survey on high dynamic range video compression

    High dynamic range (HDR) video compression has until now been approached either by using the high profile of the existing state-of-the-art H.264/AVC (Advanced Video Coding) codec or by separately encoding low dynamic range (LDR) video and the residue that results from estimating the HDR video from the LDR video. Although the latter approach has the distinctive advantage of providing backward compatibility with 8-bit LDR displays, the superiority of one approach over the other in terms of the rate-distortion trade-off has not yet been verified. In this paper, we first give a detailed overview of the methods in these two approaches. Then, we experimentally compare the two approaches with respect to different objective and perceptual metrics, such as HDR mean square error (HDR MSE), perceptually uniform peak signal-to-noise ratio (PU PSNR) and the HDR visible difference predictor (HDR VDP). We first conclude that the methods optimized for backward compatibility with 8-bit LDR displays are superior to the method designed for the high profile encoder, for both 8-bit and 12-bit mappings, in terms of all metrics. Second, using higher bit depths with a high profile encoder gives better rate-distortion performance than employing an 8-bit mapping with an 8-bit encoder for the same method, in particular when the dynamic range of the video sequence is high. Third, rather than encoding the residue signal in backward-compatible methods, changing the quantization step size of the LDR layer encoder would be sufficient to achieve a required quality. In other words, the quality of tone mapping is more important than residue encoding for the performance of HDR image and video coding.

    Fully-automatic inverse tone mapping algorithm based on dynamic mid-level tone mapping

    High Dynamic Range (HDR) displays can show images with higher color contrast levels and peak luminosities than common Low Dynamic Range (LDR) displays. However, most existing video content is recorded and/or graded in LDR format. To show LDR content on HDR displays, it needs to be up-scaled using a so-called inverse tone mapping algorithm. Several techniques for inverse tone mapping have been proposed in recent years, ranging from simple approaches based on global and local operators to more advanced algorithms such as neural networks. Drawbacks of existing techniques for inverse tone mapping include the need for human intervention, the high computation time of the more advanced algorithms, limited peak brightness, and the failure to preserve artistic intentions. In this paper, we propose a fully-automatic inverse tone mapping operator based on mid-level mapping that is capable of real-time video processing. Our proposed algorithm expands LDR images into HDR images with peak brightness over 1000 nits, preserving the artistic intentions inherent to the HDR domain. We assessed our results using the full-reference objective quality metrics HDR-VDP-2.2 and DRIM, and by carrying out a subjective pair-wise comparison experiment. We compared our results with those obtained with the most recent methods found in the literature. Experimental results demonstrate that our proposed method outperforms the current state of the art among simple inverse tone mapping methods and that its performance is similar to that of more complex and time-consuming advanced techniques.

    Video surveillance systems-current status and future trends

    This survey attempts to document the present status of video surveillance systems. The main components of a surveillance system are presented and studied thoroughly. Algorithms for image enhancement, object detection, object tracking, object recognition and item re-identification are presented. The most common modalities utilized by surveillance systems are discussed, with emphasis on video, in terms of available resolutions and new imaging approaches such as High Dynamic Range video. The most important features and analytics are presented, along with the most common approaches for image/video quality enhancement. Distributed computational infrastructures (Cloud, Fog and Edge Computing) are discussed, describing the advantages and disadvantages of each approach. The most important deep learning algorithms are presented, along with the smart analytics that they utilize. Augmented reality and the role it can play in a surveillance system are reported, followed by a discussion of the challenges and future trends of surveillance.

    A Video Upgradation of Low Vision AVI Video by Individual Pixel Channel Intensity Measurement and Its Enhancement

    Over the past few decades, researchers and scholars have done quality work in video and image processing, and a wide range of outcomes has been discovered and invented, including improvements in resolution and sensitivity. Apart from this work, many aspects remain unexplored, such as recording high dynamic range images and videos in low-light conditions, especially when the light is very low. When the intensity of the noise is greater than that of the signal, traditional denoising techniques cannot work properly. Many approaches have been designed and developed to enhance low-light video, but low contrast and noise remain a barrier to visually pleasing videos in low-light conditions. Capturing video at social gatherings, concerts, parties and musical events, in dark forests and in security monitoring situations is still an unsolved problem. In such conditions, enhancing low-light video is a tedious and tough job. This paper proposes a new approach to video enhancement. Work is ongoing to find a technique for better visibility of video.

    Focus Is All You Need: Loss Functions For Event-based Vision

    Event cameras are novel vision sensors that output pixel-level brightness changes ("events") instead of traditional video frames. These asynchronous sensors offer several advantages over traditional cameras, such as high temporal resolution, very high dynamic range, and no motion blur. To unlock the potential of such sensors, motion compensation methods have recently been proposed. We present a collection and taxonomy of twenty-two objective functions to analyze event alignment in motion compensation approaches (Fig. 1). We call them Focus Loss Functions since they have strong connections with functions used in traditional shape-from-focus applications. The proposed loss functions allow bringing mature computer vision tools to the realm of event cameras. We compare the accuracy and runtime performance of all loss functions on a publicly available dataset, and conclude that the variance, the gradient and the Laplacian magnitudes are among the best loss functions. The applicability of the loss functions is shown on multiple tasks: rotational motion, depth and optical flow estimation. The proposed focus loss functions make it possible to unlock the outstanding properties of event cameras. Comment: 29 pages, 19 figures, 4 tables
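
To make the idea of a variance-based focus function concrete, here is a minimal sketch. It is not the authors' implementation: the event format `(x, y, t)`, the constant-flow warp, and the names `event_image` and `variance_score` are assumptions made for illustration. The sketch warps events to a reference time under a candidate flow, accumulates them into a count image, and uses the image variance as an alignment score, on the assumption that better-aligned events give a sharper, higher-variance image.

```python
from statistics import pvariance


def event_image(events, width, height, flow=(0.0, 0.0), t_ref=0.0):
    """Accumulate events into a count image after motion compensation.

    events: iterable of (x, y, t) tuples (hypothetical format).
    flow:   candidate constant optical flow in pixels per second (assumption).
    """
    image = [[0] * width for _ in range(height)]
    for x, y, t in events:
        # Warp each event back to the reference time along the candidate flow.
        xw = round(x - (t - t_ref) * flow[0])
        yw = round(y - (t - t_ref) * flow[1])
        if 0 <= xw < width and 0 <= yw < height:
            image[yw][xw] += 1
    return image


def variance_score(image):
    """Population variance of all pixel counts; higher means sharper."""
    return pvariance([value for row in image for value in row])


# Toy data: one point moving right at 2 px/s, emitting an event every 0.5 s.
events = [(2 + 2 * t, 1, t) for t in (0.0, 0.5, 1.0, 1.5, 2.0)]

aligned = event_image(events, width=8, height=3, flow=(2.0, 0.0))
unaligned = event_image(events, width=8, height=3, flow=(0.0, 0.0))
```

With the correct flow all five events land on one pixel, so `variance_score(aligned)` exceeds `variance_score(unaligned)`, where the events are smeared over five pixels. A motion estimator could search over candidate flows for the one that maximizes this score.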

    Algorithms for compression of high dynamic range images and video

    Recent advances in sensor and display technologies have brought about High Dynamic Range (HDR) imaging capability. Modern multiple-exposure HDR sensors can achieve a dynamic range of 100-120 dB, and LED and OLED display devices have contrast ratios of 10^5:1 to 10^6:1. Despite these advances, image/video compression algorithms and the associated hardware are still based on Standard Dynamic Range (SDR) technology, i.e. they operate within an effective dynamic range of up to 70 dB for 8-bit gamma-corrected images. Furthermore, the existing infrastructure for content distribution is also designed for SDR, which creates interoperability problems with true HDR capture and display equipment. Current solutions to this problem include tone mapping the HDR content to fit SDR. However, this approach leads to image quality problems when strong dynamic range compression is applied. Even though some HDR-only solutions have been proposed in the literature, they are not interoperable with the current SDR infrastructure and are thus typically used in closed systems. Given these observations, a research gap was identified: the need for efficient algorithms for the compression of still images and video that can store the full dynamic range and colour gamut of HDR images while remaining backward compatible with existing SDR infrastructure. To improve the usability of SDR content, it is vital that any such algorithms accommodate different tone mapping operators, including those that are spatially non-uniform. In the course of the research presented in this thesis, a novel two-layer CODEC architecture is introduced for both HDR image and video coding. Furthermore, a universal and computationally efficient approximation of the tone mapping operator is developed and presented.
It is shown that the use of perceptually uniform colourspaces for the internal representation of pixel data improves the compression efficiency of the algorithms. Furthermore, the proposed novel approaches to the compression of metadata for the tone mapping operator are shown to improve compression performance for low-bitrate video content. Multiple compression algorithms are designed, implemented and compared, and quality-complexity trade-offs are identified. Finally, practical aspects of implementing the developed algorithms are explored by automating the design space exploration flow and integrating the high-level systems design framework with domain-specific tools for synthesis and simulation of multiprocessor systems. Directions for further work are also presented.
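
The decibel figures and contrast ratios quoted in this abstract can be related with a short calculation. The sketch below assumes the 20·log10 convention commonly used for image sensor dynamic range; the abstract does not state which convention it uses, and the function names are illustrative only.

```python
import math


def contrast_ratio_to_db(ratio):
    """Dynamic range in dB for a contrast ratio, assuming 20 * log10(ratio)."""
    if ratio <= 0:
        raise ValueError("contrast ratio must be positive")
    return 20.0 * math.log10(ratio)


def db_to_contrast_ratio(db):
    """Inverse conversion under the same 20 * log10 assumption."""
    return 10.0 ** (db / 20.0)


low_db = contrast_ratio_to_db(1e5)   # contrast ratio 10^5:1
high_db = contrast_ratio_to_db(1e6)  # contrast ratio 10^6:1
```

Under this assumption, contrast ratios of 10^5:1 and 10^6:1 correspond to about 100 dB and 120 dB, which matches the sensor range quoted in the abstract.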

    Fully automatic segmentation of glottis and vocal folds in endoscopic laryngeal high-speed videos using a deep Convolutional LSTM Network

    The objective investigation of the dynamic properties of vocal fold vibrations demands the recording and further quantitative analysis of laryngeal high-speed video (HSV). Quantification of the vocal fold vibration patterns requires, as a first step, the segmentation of the glottal area within each video frame, from which the vibrating edges of the vocal folds are usually derived. Consequently, the outcome of any further vibration analysis depends on the quality of this initial segmentation process. In this work we propose for the first time a procedure to fully automatically segment not only the time-varying glottal area but also the vocal fold tissue directly from laryngeal HSV using a deep Convolutional Neural Network (CNN) approach. Eighteen different CNN configurations were trained and evaluated on a total of 13,000 HSV frames obtained from 56 healthy and 74 pathologic subjects. The segmentation quality of the best performing CNN model, which uses Long Short-Term Memory (LSTM) cells to also take the temporal context into account, was investigated in depth on 15 test video sequences comprising 100 consecutive images each. As performance measures, the Dice Coefficient (DC) as well as the precisions of four anatomical landmark positions were used. Over all test data, a mean DC of 0.85 was obtained for the glottis and 0.91 and 0.90 for the right and left vocal fold (VF), respectively. The grand average precision of the identified landmarks amounts to 2.2 pixels and is in the same range as comparable manual expert segmentations, which can be regarded as the Gold Standard. The method proposed here requires no user interaction and overcomes the limitations of current semiautomatic or computationally expensive approaches.
Thus, it also allows for the analysis of long HSV sequences and holds the promise of facilitating the objective analysis of vocal fold vibrations in clinical routine. The dataset used here, including the ground truth, will be provided freely to all scientific groups to allow quantitative benchmarking of segmentation approaches in the future.
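
For readers unfamiliar with the Dice Coefficient reported above, the following is a minimal sketch of the standard definition, DC = 2|A ∩ B| / (|A| + |B|), applied to two binary masks. The mask layout (nested lists of 0/1) and the function name are assumptions for illustration; the abstract does not describe the authors' evaluation code.

```python
def dice_coefficient(predicted, truth):
    """Dice coefficient of two binary masks given as nested lists of 0/1."""
    flat_pred = [bool(v) for row in predicted for v in row]
    flat_true = [bool(v) for row in truth for v in row]
    if len(flat_pred) != len(flat_true):
        raise ValueError("masks must contain the same number of pixels")
    # Pixels labelled foreground in both masks.
    intersection = sum(p and t for p, t in zip(flat_pred, flat_true))
    total = sum(flat_pred) + sum(flat_true)
    # Two empty masks agree perfectly; define the score as 1.0 in that case.
    return 1.0 if total == 0 else 2.0 * intersection / total


predicted_mask = [[0, 1, 1],
                  [0, 1, 0]]
truth_mask = [[0, 1, 0],
              [0, 1, 1]]
score = dice_coefficient(predicted_mask, truth_mask)
```

Here each mask has three foreground pixels and two of them overlap, so the score is 2·2 / (3 + 3) ≈ 0.67. A score of 1.0 means the two segmentations coincide exactly.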

    Algorithms for the enhancement of dynamic range and colour constancy of digital images & video

    One of the main objectives in digital imaging is to mimic the capabilities of the human eye and, perhaps, to go beyond them in certain aspects. However, the human visual system is so versatile, complex, and only partially understood that no imaging technology to date has been able to accurately reproduce its capabilities. The extraordinary capabilities of the human eye have become a crucial shortcoming in digital imaging, since digital photography, video recording, and computer vision applications have continued to demand more realistic and accurate imaging reproduction and analytic capabilities. Over the decades, researchers have tried to solve the colour constancy problem and to extend the dynamic range of digital imaging devices by proposing a number of algorithms and instrumentation approaches. Nevertheless, no unique solution has been identified; this is partially due to the wide range of computer vision applications that require colour constancy and high dynamic range imaging, and to the complexity with which the human visual system achieves effective colour constancy and dynamic range capabilities. The aim of the research presented in this thesis is to enhance the overall image quality within the image signal processor of digital cameras by achieving colour constancy and extending dynamic range capabilities. This is achieved by developing a set of advanced image-processing algorithms that are robust to a number of practical challenges and feasible to implement within an image signal processor used in consumer electronics imaging devices. The experiments conducted in this research show that the proposed algorithms surpass state-of-the-art methods in the fields of dynamic range and colour constancy.
Moreover, this unique set of image processing algorithms shows that, when used within an image signal processor, they enable digital camera devices to mimic the human visual system's dynamic range and colour constancy capabilities, which is the ultimate goal of any state-of-the-art technique or commercial imaging device.

    Real-Time HDR Panorama Video

    Interest in wide field of view panorama video is increasing. In this respect, we have an application that uses an array of cameras overlooking a soccer stadium. The inputs of these cameras are stitched together to provide a panoramic view of the stadium. One of the challenges we face is that large parts of the field are obscured by shadows on sunny days. Such circumstances cause unsatisfactory video quality. We have therefore implemented and evaluated multiple algorithms related to high dynamic range (HDR) video. The evaluation shows that a combination of several approaches gives the most useful results in our scenario.