150 research outputs found

    Automatic Video Quality Measurement System And Method Based On Spatial-temporal Coherence Metrics

    An automatic video quality (AVQ) metric system for evaluating the quality of processed video and deriving an estimate of a subjectively determined function called Mean Time Between Failures (MTBF). The AVQ system has a blockiness metric, a streakiness metric, and a blurriness metric. The blockiness metric can be used to measure compression artifacts in processed video. The streakiness metric can be used to measure network artifacts in the processed video. The blurriness metric can measure the degradation (i.e., blurriness) of the images in the processed video to detect compression artifacts. Georgia Tech Research Corporation
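    The abstract names the three component metrics but gives no formulas. A minimal sketch of what a blockiness measure and a linear pooling stage might look like; the boundary-gradient proxy, the weights, and the mapping toward an MTBF-style estimate are assumptions, not the patented method:

```python
import numpy as np

def blockiness(frame: np.ndarray, block: int = 8) -> float:
    """Ratio of luminance gradients at 8x8 block boundaries to gradients
    elsewhere; values well above 1 suggest visible compression blocking."""
    d = np.abs(np.diff(frame.astype(float), axis=1))  # horizontal gradients
    at_boundaries = d[:, block - 1::block].mean()     # gradients straddling block edges
    return float(at_boundaries / (d.mean() + 1e-6))

def avq_impairment(blocking: float, streaking: float, blurring: float,
                   weights=(0.4, 0.3, 0.3)) -> float:
    # Hypothetical linear pooling of the three metrics; the patent's mapping
    # from impairment level to an MTBF estimate is not given in the abstract.
    return sum(w * m for w, m in zip(weights, (blocking, streaking, blurring)))
```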

    PEA265: Perceptual Assessment of Video Compression Artifacts

    The most widely used video encoders share a common hybrid coding framework that includes block-based motion estimation/compensation and block-based transform coding. Despite their high coding efficiency, the encoded videos often exhibit visually annoying artifacts, denoted as Perceivable Encoding Artifacts (PEAs), which significantly degrade the visual Quality-of-Experience (QoE) of end users. To monitor and improve visual QoE, it is crucial to develop subjective and objective measures that can identify and quantify various types of PEAs. In this work, we make the first attempt to build a large-scale subject-labelled database composed of H.265/HEVC compressed videos containing various PEAs. The database, namely the PEA265 database, includes 4 types of spatial PEAs (i.e. blurring, blocking, ringing and color bleeding) and 2 types of temporal PEAs (i.e. flickering and floating), each with at least 60,000 image or video patches carrying positive and negative labels. To objectively identify these PEAs, we train Convolutional Neural Networks (CNNs) using the PEA265 database. It appears that the state-of-the-art ResNeXt is capable of identifying each type of PEA with high accuracy. Furthermore, we define PEA pattern and PEA intensity measures to quantify the PEA levels of compressed video sequences. We believe that the PEA265 database and our findings will benefit the future development of video quality assessment methods and perceptually motivated video encoders. Comment: 10 pages, 15 figures, 4 tables
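    The abstract describes per-patch binary classification for six PEA types. A minimal sketch of that setup, assuming a torchvision ResNeXt variant with a two-class head; the paper's exact architecture, input handling, and training recipe are not given in the abstract:

```python
import torch.nn as nn
from torchvision.models import resnext50_32x4d

# Six per-type binary detectors over labelled patches, as the database
# structure above suggests. The temporal PEAs (flickering, floating) would in
# practice need multi-frame input rather than single still patches.
PEA_TYPES = ["blurring", "blocking", "ringing", "color_bleeding",
             "flickering", "floating"]

def make_pea_detector() -> nn.Module:
    model = resnext50_32x4d(weights=None)          # hypothetical variant choice
    model.fc = nn.Linear(model.fc.in_features, 2)  # patch is artifact / clean
    return model

detectors = {pea: make_pea_detector() for pea in PEA_TYPES}
# A PEA intensity measure could then be, e.g., the fraction of a frame's
# patches that a detector flags as positive.
```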

    Video Quality Metrics


    DEMI : deep video quality estimation model using perceptual video quality dimensions

    Existing works in the field of quality assessment focus separately on gaming and non-gaming content. Along with traditional modeling approaches, deep-learning-based approaches have been used to develop quality models, owing to their high prediction accuracy. In this paper, we present a deep-learning-based quality estimation model covering both gaming and non-gaming videos. The model is developed in three phases. First, a convolutional neural network (CNN) is trained on an objective metric, which allows the CNN to learn video artifacts such as blurriness and blockiness. Next, the model is fine-tuned on a small image quality dataset using blockiness and blurriness ratings. Finally, a Random Forest pools the frame-level predictions together with temporal information to predict the overall video quality. The lightweight, low-complexity nature of the model makes it suitable for real-time applications covering both gaming and non-gaming content, while achieving performance similar to the existing state-of-the-art model NDNetGaming. The model implementation for testing is available on GitHub.
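    The third phase above pools frame-level predictions and temporal information with a Random Forest. A minimal sketch of that pooling stage, assuming simple mean/std/percentile temporal features and synthetic stand-in data, since the abstract does not specify the pooled features:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def temporal_features(frame_scores: np.ndarray) -> np.ndarray:
    """Summarize per-frame CNN quality predictions for one video."""
    return np.array([
        frame_scores.mean(),
        frame_scores.std(),               # temporal quality fluctuation
        np.percentile(frame_scores, 10),  # worst stretches dominate annoyance
        np.percentile(frame_scores, 90),
    ])

# Synthetic stand-in data: 50 videos x 300 frame scores, MOS labels on 1-5.
rng = np.random.default_rng(0)
X = np.stack([temporal_features(rng.random(300)) for _ in range(50)])
y = rng.random(50) * 4 + 1

pooler = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
video_quality = pooler.predict(X[:1])     # overall quality for one video
```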

    VIDEO PREPROCESSING BASED ON HUMAN PERCEPTION FOR TELESURGERY

    Video transmission plays a critical role in robotic telesurgery because of its high bandwidth and high quality requirements. The goal of this dissertation is to find a preprocessing method, based on human visual perception, for telesurgical video, so that when preprocessed image sequences are passed to the video encoder, bandwidth can be reallocated from non-essential surrounding regions to the region of interest, ensuring excellent image quality in critical regions (e.g. the surgical region). It can also be considered a quality control scheme that gracefully degrades video quality in the presence of network congestion. The proposed preprocessing method has two major parts. First, we propose a time-varying attention map whose value is highest at the gazing point and falls off progressively towards the periphery. Second, we propose adaptive spatial filtering, the parameters of which are adjusted according to the attention map. By adding visual adaptation to the spatial filtering, telesurgical video data can be compressed efficiently because the algorithm removes a high degree of visual redundancy. Our experimental results show that with the proposed preprocessing method, over half of the bandwidth can be saved with no significant visual effect for the observer. We have also developed an optimal parameter selection algorithm, so that when network bandwidth is limited, the overall visual distortion after preprocessing is minimized.
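    A minimal sketch of the two-stage idea for a grayscale frame: a Gaussian-shaped attention map centred on the gaze point, followed by progressively stronger low-pass filtering as attention falls off. The falloff constant, thresholds, and sigma range are illustrative assumptions, not the dissertation's tuned parameters:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def attention_map(h: int, w: int, gaze: tuple, falloff: float = 5e-4) -> np.ndarray:
    """Attention is 1.0 at the gaze point and decays toward the periphery."""
    ys, xs = np.mgrid[0:h, 0:w]
    d2 = (ys - gaze[0]) ** 2 + (xs - gaze[1]) ** 2
    return np.exp(-falloff * d2)

def foveate(frame: np.ndarray, gaze: tuple, max_sigma: float = 6.0,
            levels: int = 4) -> np.ndarray:
    """Blur low-attention regions in discrete steps of increasing sigma."""
    att = attention_map(frame.shape[0], frame.shape[1], gaze)
    out = frame.astype(float).copy()
    for k in range(1, levels + 1):
        blurred = gaussian_filter(frame.astype(float), sigma=max_sigma * k / levels)
        mask = att < 1.0 - k / (levels + 1.0)  # lower attention -> stronger blur
        out[mask] = blurred[mask]
    return out  # the encoder spends fewer bits on the already-smoothed periphery
```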

    No-reference video quality assessment model based on artifact metrics for digital transmission applications

    Doctoral thesis (Tese de doutorado), Universidade de Brasília, Instituto de Ciências Exatas, Departamento de Ciência da Computação, 2017. The main causes of reduced visual quality in digital imaging systems are the unwanted degradations introduced during the processing and transmission steps. Measuring the quality of a video, however, implies a direct or indirect comparison between a test video and its reference video. In most applications, psychophysical experiments with human subjects are the most reliable means of determining the quality of a video. Although more reliable, these methods are time-consuming and difficult to incorporate into an automated quality-control service. As an alternative, objective metrics, i.e. algorithms, are generally used to estimate video quality automatically. To develop an objective metric, it is important to understand how the perceptual characteristics of a set of artifacts are related to their physical strengths and to the perceived annoyance. To study the characteristics of different types of artifacts commonly found in compressed videos (i.e. blockiness, blurriness, and packet loss), we performed six psychophysical experiments that independently measure the strength and overall annoyance of these artifact signals when presented alone or in combination. We analyzed the data from these experiments and proposed several models of overall annoyance based on combinations of the perceptual strengths of the individual artifact signals and their interactions.
    Inspired by the experimental results, we proposed a no-reference video quality metric based on several features extracted from the videos (e.g. DCT information, cross-correlation of sub-sampled images, average absolute differences between block image pixels, intensity variation between neighbouring pixels, and visual attention). A non-linear regression model using a support vector regression (SVR) technique combines all the features to obtain an overall quality estimate. Our metric performed considerably better than the tested artifact metrics and better than some full-reference metrics.
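    A minimal sketch of the final stage described above: per-video artifact features regressed onto subjective scores with an RBF-kernel SVR. The feature extraction itself is not specified in the abstract, so the data here is a synthetic stand-in and the hyperparameters are illustrative:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Five columns standing in for the extracted features (DCT statistics,
# cross-correlation of sub-sampled images, block-boundary differences,
# neighbouring-pixel variation, visual attention); labels stand in for MOS.
rng = np.random.default_rng(0)
X = rng.random((120, 5))
y = rng.random(120) * 4 + 1

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
model.fit(X, y)
quality = model.predict(X[:1])   # no-reference quality estimate for one video
```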

    Video restoration and enhancement: algorithms and applications


    Finding perceptually optimal operating points of a real time interactive video-conferencing system

    This research aims to address the issues real-time video-conferencing systems face in locating a perceptually optimal operating point under various network and conversational conditions. To determine the perceptually optimal operating point of a video-conferencing system, we must first be able to conduct a fair assessment of the quality of the system's current operating point and compare it with another operating point to determine which is better in terms of perceptual quality. At present, however, no single objective quality metric can accurately and fully describe the perceptual quality of a real-time video conversation. Hence there is a need for a controlled environment in which tests can be conducted, different metrics studied, and the best trade-offs between them identified. We begin by studying the components of a typical real-time video-conferencing setup and the impact that various network and conversation conditions have on overall perceptual quality. We also examine the different metrics available to measure those impacts. We then create a platform to perform black-box testing on current video-conferencing systems and observe how they handle changes in operating conditions. The platform is used to conduct a brief evaluation of the performance of Skype, a popular commercial video-conferencing system; however, Skype's system parameters cannot be modified. The main contribution of this thesis is the design of a new testbed that provides a controlled environment for determining the perceptually optimal operating point of a video conversation under specified network and conversation conditions. The testbed allows us to modify parameters such as frame rate and frame size, which was not previously possible. It takes as input two recorded videos of the two speakers in a face-to-face conversation, along with the desired output video parameters, such as frame rate, frame size and delay. A video generation algorithm, designed as part of the testbed, handles modifications to the frame rate and frame size of the videos, as well as delays inserted into the recorded conversation to simulate the effects of network delays. The most important issue addressed is the generation of new frames to fill the gaps created by a change in frame rate or an inserted delay: unlike voice, where a period of silence can simply be inserted, video requires new frames to be generated for these situations. The testbed uses a packetization strategy based on an uneven packet transmission rate (UPTR) that handles the packetization of interleaved video and audio data; it also uses piggybacking to provide redundancy when required. Losses can be injected either randomly or based on packet traces collected via PlanetLab. The processed videos are then placed side by side to give the viewpoint of a third party observing the video conversation from the site of the first speaker. The first speaker is therefore observed to have a faster reaction time, unaffected by network delays, than the second speaker, who is simulated to be located at the remote end. The video of the second speaker also reflects the degradations in perceptual quality induced by the network conditions, whereas the video of the first speaker is of perfect quality.
    With the testbed, we can generate output videos for different operating points under the same network and conversational conditions, and thus compare two operating points. We demonstrate how the testbed can be used to evaluate the effects of various parameters on overall perceptual quality. Lastly, we present the results of applying an existing efficient search algorithm, previously used to estimate the perceptually optimal mouth-to-ear delay (MED) of a Voice-over-IP (VoIP) conversation, to a video conversation. This is achieved by using the network simulator to conduct a series of subjective and objective tests that identify the perceptually optimal MED under specific network and conversational conditions.
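    The frame-generation problem described above, reduced to its simplest policy: where audio can pad a gap with silence, video must produce a frame for every output slot. A minimal sketch that repeats the nearest earlier source frame; the testbed's actual generation algorithm is not detailed in the abstract:

```python
# Retime a recorded video to new output parameters. Gaps opened by a rate
# change or an inserted network delay are filled by repeating the nearest
# earlier frame; a real implementation could interpolate instead.
def retime(frames: list, src_fps: float, dst_fps: float, delay_s: float = 0.0) -> list:
    duration = len(frames) / src_fps
    n_out = int(round((duration + delay_s) * dst_fps))
    out = []
    for i in range(n_out):
        t = i / dst_fps - delay_s   # time within the source recording
        if t < 0:
            out.append(frames[0])   # delay not yet elapsed: freeze the first frame
        else:
            out.append(frames[min(int(t * src_fps), len(frames) - 1)])
    return out

# e.g. retime(frames, src_fps=30, dst_fps=15, delay_s=0.2) halves the rate and
# prepends 0.2 s of frozen frames to simulate a network delay.
```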

    One Transform To Compute Them All: Efficient Fusion-Based Full-Reference Video Quality Assessment

    The Video Multimethod Assessment Fusion (VMAF) algorithm has recently emerged as a state-of-the-art approach to video quality prediction, and it now pervades the streaming and social media industry. However, since VMAF requires the evaluation of a heterogeneous set of quality models, it is computationally expensive. Given other advances in hardware-accelerated encoding, quality assessment is emerging as a significant bottleneck in video compression pipelines. Towards alleviating this burden, we propose a novel Fusion of Unified Quality Evaluators (FUNQUE) framework that enables computation sharing and uses a transform that is sensitive to visual perception to boost accuracy. Further, we expand the FUNQUE framework to define a collection of improved low-complexity fused-feature models that advance the state of the art in video quality prediction with respect to both accuracy, by 4.2% to 5.3%, and computational efficiency, by factors of 3.8 to 11.
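    The core efficiency idea above is computation sharing: transform each frame once and let every fused feature read the same subbands. A minimal sketch using PyWavelets, assuming a Haar decomposition and an SSIM-flavoured subband feature as stand-ins for the framework's perceptually weighted transform and feature set:

```python
import numpy as np
import pywt

def shared_subbands(frame: np.ndarray, levels: int = 2) -> list:
    """Decompose a frame once; every fused feature reuses these subbands."""
    coeffs = pywt.wavedec2(frame.astype(float), "haar", level=levels)
    return [band for detail in coeffs[1:] for band in detail]  # detail subbands

def band_similarity(ref_bands: list, dis_bands: list, c: float = 1e-3) -> float:
    # One cheap SSIM-flavoured feature computed on the shared subbands; other
    # fused features (VIF-like, etc.) would read the same lists, so the
    # expensive filtering is paid only once per frame.
    return float(np.mean([(2 * (r * d).mean() + c) /
                          ((r * r).mean() + (d * d).mean() + c)
                          for r, d in zip(ref_bands, dis_bands)]))

# ref_bands = shared_subbands(ref_frame); dis_bands = shared_subbands(dis_frame)
```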