473 research outputs found

    Video modeling via implicit motion representations

    Video modeling refers to the development of analytical representations that explain the intensity distribution in video signals. Based on such a representation, we can develop algorithms for particular video-related tasks. Video modeling therefore provides a foundation that bridges video data and related tasks. Although many video models have been proposed in the past decades, the rise of new applications calls for more efficient and accurate video modeling approaches.
    Most existing video modeling approaches are based on explicit motion representations, where motion information is expressed by correspondence-based representations (i.e., motion velocity or displacement). Although this is conceptually simple, the limitations of those representations and the suboptimality of motion estimation techniques can degrade such approaches, especially when handling complex motion or non-ideal video observations. In this thesis, we investigate video modeling without explicit motion representation. Motion information is implicitly embedded into the spatio-temporal dependency among pixels or patches instead of being explicitly described by motion vectors.
    Firstly, we propose a parametric model based on spatio-temporal adaptive localized learning (STALL). We formulate video modeling as a linear regression problem in which motion information is embedded within the regression coefficients. The coefficients are adaptively learned within a local space-time window under the LMMSE criterion. By incorporating spatio-temporal resampling and a Bayesian fusion scheme, we enhance the modeling capability of STALL on more general videos. Under the STALL framework, video processing algorithms for a variety of applications can be developed by adjusting model parameters (i.e., the size and topology of the model support and training window). We apply STALL to three video processing problems. The simulation results show that motion information can be efficiently exploited by our implicit motion representation, and that resampling and fusion enhance the modeling capability of STALL.
    Secondly, we propose a nonparametric video modeling approach that does not depend on explicit motion estimation. Assuming the video sequence is composed of many overlapping space-time patches, we embed motion-related information into the relationships among video patches and develop a generic sparsity-based prior for typical video sequences. First, we extend block matching to more general kNN-based patch clustering, which provides an implicit and distributed representation of motion information. We then enforce the sparsity constraint on a higher-dimensional data array generated by packing the patches in each set of similar patches, and solve the inference problem by iteratively updating the kNN array and the desired signal. Finally, we present a Bayesian fusion approach to fuse multiple-hypothesis inferences. Simulation results in video error concealment, denoising, and deartifacting are reported to demonstrate the modeling capability of this approach.
    Finally, we summarize the two proposed video modeling approaches and point out the prospects of implicit motion representations in applications ranging from low-level to high-level problems.
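
The regression idea behind STALL can be illustrated with a minimal sketch. This is not the thesis implementation: the function name `local_ls_predict`, the square 3x3 support in the previous frame, the training-window size, and the use of an ordinary least-squares fit as a sample-based stand-in for the LMMSE criterion are all assumptions made for illustration.

```python
import numpy as np

def local_ls_predict(prev, curr, y, x, radius=1, train=3):
    """Predict curr[y, x] from a (2*radius+1)^2 neighborhood in prev.

    The regression coefficients are fit by least squares over a local
    training window in the current frame (hypothetical sizes), so any
    motion is absorbed into the coefficients instead of a motion vector.
    """
    def support(frame, r, c):
        # Flattened spatial neighborhood centered on (r, c).
        return frame[r - radius:r + radius + 1, c - radius:c + radius + 1].ravel()

    rows, targets = [], []
    for r in range(y - train, y + train + 1):
        for c in range(x - train, x + train + 1):
            if (r, c) == (y, x):
                continue  # hold out the pixel being predicted
            rows.append(support(prev, r, c))
            targets.append(curr[r, c])

    A = np.asarray(rows, dtype=float)
    b = np.asarray(targets, dtype=float)
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(support(prev, y, x) @ coeffs), coeffs
```

For a frame that is a one-pixel horizontal shift of its predecessor, the fitted coefficients concentrate on the neighbor the pixel moved from, which is the sense in which motion is represented implicitly. The pixel (y, x) must lie at least `radius + train` pixels from the frame border.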

    Mitigation of H.264 and H.265 Video Compression for Reliable PRNU Estimation

    The photo-response non-uniformity (PRNU) is a distinctive image sensor characteristic, and an imaging device inadvertently introduces its sensor's PRNU into all media it captures. Therefore, the PRNU can be regarded as a camera fingerprint and used for source attribution. The imaging pipeline in a camera, however, involves various processing steps that are detrimental to PRNU estimation. For photographic images, these challenges have been successfully addressed and the method for estimating a sensor's PRNU pattern is well established. However, various additional challenges related to the generation of videos remain largely untackled. From this perspective, this work introduces methods to mitigate the disruptive effects of the widely deployed H.264 and H.265 video compression standards on PRNU estimation. Our approach intervenes in the decoding process to eliminate a filtering procedure applied at the decoder to reduce blockiness. It also uses decoding parameters to develop a weighting scheme that adjusts the contribution of video frames, at the macroblock level, to the PRNU estimation process. Results obtained on videos captured by 28 cameras show that our approach increases the PRNU matching metric by up to more than five times compared with the conventional estimation method tailored for photos.
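
A macroblock-level weighting of this kind might be sketched as follows. The abstract does not state how the weights are derived from decoding parameters, so the weights here are hypothetical inputs; the box-filter denoiser and the ratio-form estimator (weighted sum of residual times frame over weighted sum of squared frame) are also assumptions, chosen because that ratio form is commonly used for photo PRNU estimation.

```python
import numpy as np

def box_denoise(frame, k=3):
    """Crude k x k mean filter, a stand-in for a real denoiser."""
    pad = k // 2
    padded = np.pad(frame, pad, mode="reflect")
    out = np.zeros_like(frame, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + frame.shape[0], dx:dx + frame.shape[1]]
    return out / (k * k)

def macroblock_weights(block_weights, block=16):
    """Expand one (hypothetical) weight per macroblock to a per-pixel map."""
    return np.kron(block_weights, np.ones((block, block)))

def weighted_prnu(frames, weights, eps=1e-8):
    """Weighted PRNU estimate: sum(w * W * I) / sum(w * I^2),
    where W = I - denoise(I) is the noise residual of frame I."""
    num = np.zeros_like(frames[0], dtype=float)
    den = np.zeros_like(frames[0], dtype=float)
    for frame, w in zip(frames, weights):
        frame = frame.astype(float)
        residual = frame - box_denoise(frame)
        num += w * residual * frame
        den += w * frame ** 2
    return num / (den + eps)
```

Setting a macroblock's weight to zero removes that block of that frame from the estimate entirely, so heavily quantized regions can be down-weighted without discarding the whole frame.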

    Adaptive filtering techniques for acquisition noise and coding artifacts of digital pictures

    The quality of digital pictures is often degraded by various processes (e.g., acquisition or capturing, compression, filtering, and transmission). In digital image/video processing systems, random noise in images is mainly generated during the capturing process, while artifacts (or distortions) are generated during compression or filtering. This dissertation examines digital image/video quality degradation and proposes post-processing techniques for reducing coding artifacts and acquisition noise in images and videos. Three major issues associated with image/video degradation are addressed in this work.
    The first issue is the temporal fluctuation artifact in digitally compressed videos. In the state-of-the-art video coding standard, H.264/AVC, temporal fluctuations are noticeable between intra picture frames or between an intra picture frame and neighbouring inter picture frames. To resolve this problem, a novel robust statistical temporal filtering technique is proposed. It utilises a re-descending robust statistical model with an outlier rejection feature to reduce the temporal fluctuations while preserving picture details and motion sharpness. PSNR and sum of squared differences (SSD) results show that the proposed filters improve on other benchmark filters. Even for videos containing high motion, the proposed temporal filter performs well in fluctuation reduction and motion clarity preservation compared with other baseline temporal filters.
    The second issue concerns both the spatial and temporal artifacts (e.g., blocking, ringing, and temporal fluctuation artifacts) appearing in compressed video. To address this issue, a novel joint spatial and temporal filtering framework is constructed for artifact reduction. Both the spatial and the temporal filters employ a re-descending robust statistical model (RRSM) in the filtering processes. The robust statistical spatial filter (RSSF) reduces spatial blocking and ringing artifacts, whilst the robust statistical temporal filter (RSTF) suppresses the temporal fluctuations. Performance evaluations demonstrate that the proposed joint spatio-temporal filter is superior to the H.264 loop filter in terms of spatial and temporal artifact reduction and motion clarity preservation.
    The third issue is random noise, commonly modeled as mixed Gaussian and impulse noise (MGIN), which appears in the image/video acquisition process. An effective way to estimate MGIN is through a robust estimator, the normalized median absolute deviation (MADN). The MADN estimator is used to separate the MGIN model into impulse and additive Gaussian noise portions. Based on this estimation, the proposed filtering process is composed of a modified median filter for impulse noise reduction and a DCT-based denoising filter for additive Gaussian noise reduction. However, this DCT-based denoising filter produces temporal fluctuations in videos. To solve this problem, a temporal filter is added to the filtering process, yielding another joint spatio-temporal filtering scheme aimed at the best visual quality of denoised videos. Extensive experiments show that the proposed joint spatio-temporal filtering scheme outperforms other benchmark filters in noise and distortion suppression.
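
The MADN estimator mentioned above has a standard form, sketched below. The constant 1.4826 makes the median absolute deviation a consistent estimate of the standard deviation for Gaussian data. The helper `split_impulses` and its threshold of three robust standard deviations are assumptions for illustration only; the abstract does not specify how the impulse and Gaussian portions are separated.

```python
import numpy as np

def madn(x):
    """Normalized median absolute deviation: a robust estimate of the
    Gaussian standard deviation that is insensitive to impulse outliers."""
    x = np.asarray(x, dtype=float).ravel()
    med = np.median(x)
    return 1.4826 * np.median(np.abs(x - med))

def split_impulses(residual, k=3.0):
    """Flag samples further than k robust sigmas from the median as
    impulse candidates (hypothetical rule); the rest are treated as
    Gaussian-corrupted."""
    residual = np.asarray(residual, dtype=float)
    sigma = madn(residual)
    mask = np.abs(residual - np.median(residual)) > k * sigma
    return sigma, mask
```

In a pipeline like the one described, the flagged samples would go to the median-type impulse filter, and the robust sigma would set the strength of the Gaussian denoiser.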