1,213 research outputs found

    Saliency-Aware Spatio-Temporal Artifact Detection for Compressed Video Quality Assessment

    Full text link
    Compressed videos often exhibit visually annoying artifacts, known as Perceivable Encoding Artifacts (PEAs), which dramatically degrade video visual quality. Subjective and objective measures capable of identifying and quantifying various types of PEAs are critical in improving visual quality. In this paper, we investigate the influence of four spatial PEAs (i.e. blurring, blocking, bleeding, and ringing) and two temporal PEAs (i.e. flickering and floating) on video quality. For spatial artifacts, we propose a visual saliency model with a low computational cost and higher consistency with human visual perception. In terms of temporal artifacts, self-attention based TimeSFormer is improved to detect temporal artifacts. Based on the six types of PEAs, a quality metric called Saliency-Aware Spatio-Temporal Artifacts Measurement (SSTAM) is proposed. Experimental results demonstrate that the proposed method outperforms state-of-the-art metrics. We believe that SSTAM will be beneficial for optimizing video coding techniques

    Hierarchical representations for spatio-temporal visual attention: modeling and understanding

    Get PDF
    Mención Internacional en el título de doctorDentro del marco de la Inteligencia Artificial, la Visión Artificial es una disciplina científica que tiene como objetivo simular automaticamente las funciones del sistema visual humano, tratando de resolver tareas como la localización y el reconocimiento de objetos, la detección de eventos o el seguimiento de objetos....Programa Oficial de Doctorado en Multimedia y ComunicacionesPresidente: Luis Salgado Álvarez de Sotomayor.- Secretario: Ascensión Gallardo Antolín.- Vocal: Jenny Benois Pinea

    Spatiotemporal Saliency Detection: State of Art

    Get PDF
    Saliency detection has become a very prominent subject for research in recent time. Many techniques has been defined for the saliency detection.In this paper number of techniques has been explained that include the saliency detection from the year 2000 to 2015, almost every technique has been included.all the methods are explained briefly including their advantages and disadvantages. Comparison between various techniques has been done. With the help of table which includes authors name,paper name,year,techniques,algorithms and challenges. A comparison between levels of acceptance rates and accuracy levels are made

    PIM: Video Coding using Perceptual Importance Maps

    Full text link
    Human perception is at the core of lossy video compression, with numerous approaches developed for perceptual quality assessment and improvement over the past two decades. In the determination of perceptual quality, different spatio-temporal regions of the video differ in their relative importance to the human viewer. However, since it is challenging to infer or even collect such fine-grained information, it is often not used during compression beyond low-level heuristics. We present a framework which facilitates research into fine-grained subjective importance in compressed videos, which we then utilize to improve the rate-distortion performance of an existing video codec (x264). The contributions of this work are threefold: (1) we introduce a web-tool which allows scalable collection of fine-grained perceptual importance, by having users interactively paint spatio-temporal maps over encoded videos; (2) we use this tool to collect a dataset with 178 videos with a total of 14443 frames of human annotated spatio-temporal importance maps over the videos; and (3) we use our curated dataset to train a lightweight machine learning model which can predict these spatio-temporal importance regions. We demonstrate via a subjective study that encoding the videos in our dataset while taking into account the importance maps leads to higher perceptual quality at the same bitrate, with the videos encoded with importance maps preferred 1.8×1.8 \times over the baseline videos. Similarly, we show that for the 18 videos in test set, the importance maps predicted by our model lead to higher perceptual quality videos, 2×2 \times preferred over the baseline at the same bitrate
    corecore