
    PEA265: Perceptual Assessment of Video Compression Artifacts

    Full text link
    The most widely used video encoders share a common hybrid coding framework that includes block-based motion estimation/compensation and block-based transform coding. Despite their high coding efficiency, the encoded videos often exhibit visually annoying artifacts, denoted as Perceivable Encoding Artifacts (PEAs), which significantly degrade the visual Quality-of-Experience (QoE) of end users. To monitor and improve visual QoE, it is crucial to develop subjective and objective measures that can identify and quantify various types of PEAs. In this work, we make the first attempt to build a large-scale subject-labelled database composed of H.265/HEVC compressed videos containing various PEAs. The database, namely the PEA265 database, includes 4 types of spatial PEAs (i.e. blurring, blocking, ringing and color bleeding) and 2 types of temporal PEAs (i.e. flickering and floating). Each type contains at least 60,000 image or video patches with positive and negative labels. To objectively identify these PEAs, we train Convolutional Neural Networks (CNNs) using the PEA265 database. It appears that the state-of-the-art ResNeXt is capable of identifying each type of PEA with high accuracy. Furthermore, we define PEA pattern and PEA intensity measures to quantify PEA levels of compressed video sequences. We believe that the PEA265 database and our findings will benefit the future development of video quality assessment methods and perceptually motivated video encoders. Comment: 10 pages, 15 figures, 4 tables
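
    As an illustration of the patch-level detection task described above, the following is a minimal sketch of fine-tuning a torchvision ResNeXt as a binary detector for a single PEA type; the patch tensors, labels and hyperparameters are hypothetical placeholders, not the PEA265 training setup.

```python
# Minimal sketch: fine-tuning a ResNeXt as a binary detector for one PEA type
# (e.g. blocking). The patches and labels below are dummy placeholders.
import torch
import torch.nn as nn
from torchvision import models

def build_pea_detector(num_classes: int = 2) -> nn.Module:
    # torchvision ships resnext50_32x4d; replace the final layer with a
    # 2-way head (artifact present / absent) for one PEA type.
    model = models.resnext50_32x4d(weights=None)
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model

model = build_pea_detector()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

# One illustrative training step on a dummy batch of 64x64 patches.
patches = torch.randn(8, 3, 64, 64)     # stand-in for labelled PEA patches
labels = torch.randint(0, 2, (8,))      # 1 = artifact present, 0 = absent
optimizer.zero_grad()
loss = criterion(model(patches), labels)
loss.backward()
optimizer.step()
```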

    Visual motion processing and human tracking behavior

    Full text link
    The accurate visual tracking of a moving object is a fundamental human skill: it reduces the relative slip and instability of the object's image on the retina, thus granting stable, high-quality vision. In order to optimize tracking performance across time, a quick estimate of the object's global motion properties needs to be fed to the oculomotor system and dynamically updated. Concurrently, performance can be greatly improved in terms of latency and accuracy by taking into account predictive cues, especially under variable conditions of visibility and in the presence of ambiguous retinal information. Here, we review several recent studies focusing on the integration of retinal and extra-retinal information for the control of human smooth pursuit. By dynamically probing tracking performance with well-established paradigms from the visual perception and oculomotor literature, we provide the basis to test theoretical hypotheses within the framework of dynamic probabilistic inference. In particular, we present the applications of these results in light of state-of-the-art computer vision algorithms.
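
    As a toy illustration of the dynamic probabilistic inference framework mentioned above, the sketch below runs a one-dimensional Kalman filter that fuses noisy retinal-slip measurements with the prediction carried over from the previous estimate; the noise parameters are illustrative assumptions, not the authors' model.

```python
# Toy sketch of dynamic probabilistic inference for pursuit: a 1-D Kalman
# filter fuses a noisy retinal velocity measurement with the prediction from
# the previous estimate. Noise values are illustrative only.
import numpy as np

def kalman_track(measurements, q=0.01, r=1.0):
    """Filtered velocity estimates for a sequence of retinal-slip measurements.
    q: process noise variance, r: measurement noise variance."""
    v_est, p_est = 0.0, 1.0              # initial estimate and its variance
    estimates = []
    for z in measurements:
        p_pred = p_est + q               # predict: constant-velocity model
        k = p_pred / (p_pred + r)        # Kalman gain
        v_est = v_est + k * (z - v_est)  # update with the retinal measurement
        p_est = (1.0 - k) * p_pred
        estimates.append(v_est)
    return np.array(estimates)

# A target moving at 10 deg/s observed through noisy retinal slip signals.
rng = np.random.default_rng(0)
noisy_slip = 10.0 + rng.normal(0.0, 1.0, size=50)
print(kalman_track(noisy_slip)[-5:])     # estimates converge near 10 deg/s
```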

    Wavelet-based video codec using human visual system coefficients for 3G mobiles

    Get PDF
    A new wavelet-based video codec that uses human visual system coefficients is presented. In the INTRA mode of operation, a wavelet transform splits the input frame into a number of subbands. Human visual system coefficients are designed for handheld videophone devices and used to regulate the quantization step size in the pixel quantization of the high-frequency subbands' coefficients. The quantized coefficients are coded using a quadtree coding scheme. In the INTER mode of operation, the displaced frame difference is generated and a wavelet transform decorrelates it into a number of subbands. These subbands are coded using an adaptive vector quantization scheme. Results indicate a significant improvement in frame quality compared to Motion JPEG2000.
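
    The INTRA-mode idea above (subband decomposition with HVS-weighted quantization steps) can be sketched with PyWavelets as follows; the per-level weights and base step size are illustrative assumptions, not the coefficients designed in the paper.

```python
# Sketch of the INTRA-mode idea: a 2-D wavelet decomposition followed by
# uniform quantization of the detail subbands, with the step size scaled by a
# per-level weight standing in for the HVS coefficients (values are made up).
import numpy as np
import pywt

frame = np.random.rand(144, 176)                         # QCIF-sized test frame

# Two-level DWT: coeffs = [LL2, (LH2, HL2, HH2), (LH1, HL1, HH1)]
coeffs = pywt.wavedec2(frame, 'bior4.4', level=2)

hvs_weights = [1.0, 2.0]           # hypothetical: finest subbands get larger steps
base_step = 0.02

quantized = [coeffs[0]]            # keep the LL band untouched in this sketch
for i, detail in enumerate(coeffs[1:]):                  # coarse-to-fine levels
    step = base_step * hvs_weights[i]
    quantized.append(tuple(np.round(band / step) * step for band in detail))

reconstructed = pywt.waverec2(quantized, 'bior4.4')
print('max reconstruction error:', np.abs(reconstructed[:144, :176] - frame).max())
```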

    Mental and sensorimotor extrapolation fare better than motion extrapolation in the offset condition

    Get PDF
    Evidence for motion extrapolation at motion offset is scarce. In contrast, there is abundant evidence that subjects mentally extrapolate the future trajectory of weak motion signals at motion offset. Further, pointing movements overshoot at motion offset. We believe that mental and sensorimotor extrapolation are sufficient to solve the problem of perceptual latencies. Both have the advantage of being much more flexible than motion extrapolation.

    Evaluation and optimization of central vision compensation techniques

    Get PDF
    Low-cost, non-invasive, safe, and reliable electronic vision enhancement systems (EVES) and their methods have been in huge medical and industrial demand in the early 21st century. Two unique vision compensation and enhancement algorithms are reviewed and compared, qualitatively optimizing the view of a restricted (or truncated) image. The first is described as the convex or fish-eye technique, and the second is the cartoon superimposition or Peli technique (after the leading author of that research). The novelty in this dissertation lies in presenting and analyzing both of these in comparison to a novel technique, motivated by a characterization of quality vision parameters (the distribution of photoreceptors in the eye), in an attempt to account for and compensate for the reported viewing difficulties and low image-quality measures associated with these two existing methods. This partial cartoon technique introduces the invisible image to the immediate left and right of the truncated image as a cartoon superimposed onto the respective sides of the truncated image, yet only on a partial basis so as not to distract from the central view of the image. It is generated and evaluated using Matlab to warp sample grayscale images according to predefined parameters such as warping method, cartoon and other warping parameters, and different grayscale values, comparing both the static and movie modes. Warped images are quantitatively compared by evaluating the Root-Mean-Square Error (RMSE) and the Universal Image Quality Index (UIQI), both representing image distortion and quality measures of warped images relative to the originals, for five different scenes: landscape, close-up, obstacle, text, and home (or low-illumination) views. Remapped images are also evaluated through surveys performed on 115 subjects, where improvement is assessed using measures of image detail and distortion. It is finally concluded that the presented partial cartoon method exhibits superior image quality for all objective measures, as well as for a majority of subjective distortion measures. Justification is provided as to why the technique does not offer superior subjective detail measures. Further improvements are suggested, as well as additional techniques and research.
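
    The two objective measures named above, RMSE and the Universal Image Quality Index (due to Wang and Bovik), can be sketched as follows; for simplicity the UIQI is computed globally rather than averaged over a sliding window, and the test images are synthetic stand-ins.

```python
# Sketch of the two objective measures: Root-Mean-Square Error and a global
# Universal Image Quality Index (the usual definition averages the index over
# a sliding window; a single global value is used here for brevity).
import numpy as np

def rmse(original, warped):
    diff = original.astype(float) - warped.astype(float)
    return float(np.sqrt(np.mean(diff ** 2)))

def uiqi(original, warped):
    x = original.astype(float).ravel()
    y = warped.astype(float).ravel()
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = np.mean((x - mx) * (y - my))
    return float(4 * cov * mx * my / ((vx + vy) * (mx ** 2 + my ** 2)))

rng = np.random.default_rng(1)
scene = rng.integers(0, 256, size=(128, 128)).astype(float)   # stand-in scene
remapped = np.clip(scene + rng.normal(0, 10, scene.shape), 0, 255)
print('RMSE:', rmse(scene, remapped), 'UIQI:', uiqi(scene, remapped))
```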

    New prediction schemes for scalable wavelet video coding

    Get PDF
    A Scalable Video Coder (SVC) can be conceived according to different kinds of spatio-temporal decomposition structures, which can be designed to produce a multiresolution spatio-temporal subband hierarchy that is then coded with a progressive or quality-scalable coding technique [1-5]. A classification of SVC architectures has been suggested by the MPEG Ad-Hoc Group on SVC [6]. The so-called t+2D schemes (one example is [2]) first perform a motion-compensated temporal filtering (MCTF), producing temporal subband frames, and then apply the spatial DWT to each of these frames. Alternatively, in a 2D+t scheme (one example is [7]), a spatial DWT is applied first to each video frame and the MCTF is then performed on the spatial subbands. A third approach, named 2D+t+2D, uses a first-stage DWT to produce reference video sequences at various resolutions; t+2D transforms are then performed on each resolution level of the obtained spatial pyramid. Each scheme has shown its pros and cons [8,9] in terms of coding performance. From a theoretical point of view, the critical aspects of the above SVC schemes mainly reside in: i) the coherence and trustworthiness of the motion estimation at various scales (especially for t+2D schemes); ii) the difficulty of compensating for the shift-variant nature of the wavelet transform (especially for 2D+t schemes); iii) the performance of inter-scale prediction (ISP) mechanisms (especially for 2D+t+2D schemes). In this document we recall the STool scheme principles, already presented in [10]. We present an STool SVC architecture and compare it with other SVC schemes. Some main advancements and new solutions are detailed and the related results presented. Our software implementations are based on the VidWav reference software [11,12].
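
    The difference in decomposition order between the t+2D and 2D+t schemes can be sketched as follows; the temporal step is a plain pairwise Haar split without motion compensation, so this only illustrates the ordering of the transforms, not a real MCTF.

```python
# Sketch contrasting the t+2D and 2D+t decomposition orders on a tiny group
# of frames. The temporal step is a plain Haar split (no motion compensation).
import numpy as np
import pywt

frames = np.random.rand(4, 64, 64)                 # a 4-frame GOP

def temporal_haar(gop):
    """Pairwise Haar split along time: low- and high-pass temporal subbands."""
    lo = (gop[0::2] + gop[1::2]) / np.sqrt(2)
    hi = (gop[0::2] - gop[1::2]) / np.sqrt(2)
    return lo, hi

# t+2D: temporal filtering first, then a spatial DWT of each temporal subband.
t_lo, t_hi = temporal_haar(frames)
t2d = [pywt.dwt2(f, 'haar') for f in np.concatenate([t_lo, t_hi])]

# 2D+t: spatial DWT of every frame first, then temporal filtering of the
# resulting LL bands (the finer spatial subbands would be handled likewise).
ll = np.stack([pywt.dwt2(f, 'haar')[0] for f in frames])
s_lo, s_hi = temporal_haar(ll)
print(len(t2d), s_lo.shape, s_hi.shape)
```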

    SoftCast: Clean-slate Scalable Wireless Video

    Get PDF
    Video broadcast and mobile video challenge the conventional wireless design. In broadcast and mobile scenarios, the bit rate supported by the channel differs across receivers and varies quickly over time. The conventional design, however, forces the source to pick a single bit rate and degrades sharply when the channel cannot support the chosen rate. This paper presents SoftCast, a clean-slate design for wireless video in which the source transmits one video stream that each receiver decodes to a video quality commensurate with its specific instantaneous channel quality. To do so, SoftCast ensures the samples of the digital video signal transmitted on the channel are linearly related to the pixels' luminance. Thus, when channel noise perturbs the transmitted signal samples, the perturbation naturally translates into approximation in the original video pixels. Hence, a receiver with a good channel (low noise) obtains a high-fidelity video, and a receiver with a bad channel (high noise) obtains a low-fidelity video. We implement SoftCast using the GNURadio software and the USRP platform. Results from a 20-node testbed show that SoftCast improves the average video quality (i.e., PSNR) across broadcast receivers in our testbed by up to 5.5 dB. Even for a single receiver, it eliminates video glitches caused by mobility and increases robustness to packet loss by an order of magnitude.
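
    The core property described above, that the transmitted samples are a linear function of pixel luminance so quality degrades gracefully with channel noise, can be sketched as follows; this omits SoftCast's 3-D DCT, power allocation and whitening, and only shows the linear mapping under two hypothetical noise levels.

```python
# Sketch of SoftCast's core property: when transmitted samples are a linear
# function of pixel luminance, channel noise maps directly into pixel noise,
# so decoded quality scales with channel SNR instead of collapsing.
import numpy as np

def psnr(ref, rec):
    mse = np.mean((ref.astype(float) - rec.astype(float)) ** 2)
    return 10 * np.log10(255.0 ** 2 / mse)

rng = np.random.default_rng(7)
frame = rng.integers(0, 256, size=(64, 64)).astype(float)   # stand-in luminance

scale = 0.1                                   # linear mapping: sample = scale * pixel
tx = scale * frame

for noise_std in (0.1, 1.0):                  # good channel vs bad channel
    rx = tx + rng.normal(0.0, noise_std, tx.shape)
    decoded = np.clip(rx / scale, 0, 255)     # linear inverse at the receiver
    print(f'channel noise {noise_std}: PSNR {psnr(frame, decoded):.1f} dB')
```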
