
    Massive Online Crowdsourced Study of Subjective and Objective Picture Quality

    Full text link
    Most publicly available image quality databases have been created under highly controlled conditions by introducing graded simulated distortions onto high-quality photographs. However, images captured using typical real-world mobile camera devices are usually afflicted by complex mixtures of multiple distortions, which are not necessarily well modeled by the synthetic distortions found in existing databases. The originators of existing legacy databases usually conducted human psychometric studies to obtain statistically meaningful sets of human opinion scores on images in a stringently controlled visual environment, resulting in small data collections relative to other kinds of image analysis databases. Towards overcoming these limitations, we designed and created a new database that we call the LIVE In the Wild Image Quality Challenge Database, which contains widely diverse authentic image distortions on a large number of images captured using a representative variety of modern mobile devices. We also designed and implemented a new online crowdsourcing system, which we have used to conduct a very large-scale, multi-month image quality assessment subjective study. Our database consists of over 350,000 opinion scores on 1,162 images evaluated by over 7,000 unique human observers. Despite the lack of control over the experimental environments of the numerous study participants, we demonstrate excellent internal consistency of the subjective dataset. We also evaluate several top-performing blind Image Quality Assessment algorithms on it and present insights on how mixtures of distortions challenge both end users and automatic perceptual quality prediction models. Comment: 16 pages
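    The internal consistency claimed above is commonly assessed by randomly splitting the subject pool in half and correlating the mean opinion scores (MOS) produced by each half. A minimal sketch of that check, assuming a hypothetical subjects-by-images rating matrix with NaN entries where a subject did not rate an image:

```python
import numpy as np
from scipy.stats import spearmanr

def split_half_consistency(ratings, n_trials=25, seed=0):
    """Mean Spearman correlation between MOS vectors computed from
    two random halves of the subject pool.

    ratings: (n_subjects, n_images) array with NaN where a subject
    did not rate an image (typical for crowdsourced designs).
    """
    rng = np.random.default_rng(seed)
    n_subjects = ratings.shape[0]
    corrs = []
    for _ in range(n_trials):
        perm = rng.permutation(n_subjects)
        half_a, half_b = perm[: n_subjects // 2], perm[n_subjects // 2:]
        mos_a = np.nanmean(ratings[half_a], axis=0)  # per-image MOS, half A
        mos_b = np.nanmean(ratings[half_b], axis=0)  # per-image MOS, half B
        valid = ~np.isnan(mos_a) & ~np.isnan(mos_b)
        rho, _ = spearmanr(mos_a[valid], mos_b[valid])
        corrs.append(rho)
    return float(np.mean(corrs))
```

    Repeating the split over many random trials and averaging gives a stable estimate; values near 1 indicate high inter-group agreement.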

    Large-Scale Study of Perceptual Video Quality

    Full text link
    The great variations of videographic skills, camera designs, compression and processing protocols, and displays lead to an enormous variety of video impairments. Current no-reference (NR) video quality models are unable to handle this diversity of distortions. This is true in part because available video quality assessment databases contain very limited content at fixed resolutions, were captured using a small number of camera devices by a few videographers, and have been subjected to a modest number of distortions. As such, these databases fail to adequately represent real-world videos, which contain very different kinds of content obtained under highly diverse imaging conditions and are subject to authentic, often commingled distortions that are impossible to simulate. As a result, NR video quality predictors tested on real-world video data often perform poorly. Towards advancing NR video quality prediction, we constructed a large-scale video quality assessment database containing 585 videos of unique content, captured by a large number of users, with wide ranges of levels of complex, authentic distortions. We collected a large number of subjective video quality scores via crowdsourcing. A total of 4,776 unique participants took part in the study, yielding more than 205,000 opinion scores, resulting in an average of 240 recorded human opinions per video. We demonstrate the value of the new resource, which we call the LIVE Video Quality Challenge Database (LIVE-VQC), by conducting a comparison of leading NR video quality predictors on it. This study is the largest video quality assessment study ever conducted along several key dimensions: number of unique contents, capture devices, distortion types and combinations of distortions, study participants, and recorded subjective scores. The database is available for download at this link: http://live.ece.utexas.edu/research/LIVEVQC/index.html
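    Comparisons of NR predictors on databases like LIVE-VQC are conventionally reported as Spearman (SROCC) and Pearson (PLCC) correlations between model predictions and MOS, with PLCC computed after a nonlinear logistic mapping. A sketch of that standard evaluation protocol (the 5-parameter logistic is the common VQEG-style choice, not something specific to this paper):

```python
import numpy as np
from scipy.stats import spearmanr, pearsonr
from scipy.optimize import curve_fit

def logistic5(x, b1, b2, b3, b4, b5):
    # 5-parameter logistic used to map raw predictions onto the MOS scale
    return b1 * (0.5 - 1.0 / (1.0 + np.exp(b2 * (x - b3)))) + b4 * x + b5

def benchmark(predictions, mos):
    """Return (SROCC, PLCC) for one NR model against subjective MOS."""
    srocc, _ = spearmanr(predictions, mos)
    p0 = [np.max(mos), 1.0, np.mean(predictions), 0.1, 0.1]
    params, _ = curve_fit(logistic5, predictions, mos, p0=p0, maxfev=20000)
    plcc, _ = pearsonr(logistic5(predictions, *params), mos)
    return srocc, plcc
```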

    Learning to Predict Streaming Video QoE: Distortions, Rebuffering and Memory

    Full text link
    Mobile streaming video data accounts for a large and increasing percentage of wireless network traffic. The available bandwidths of modern wireless networks are often unstable, leading to difficulties in delivering smooth, high-quality video. Streaming service providers such as Netflix and YouTube attempt to adapt their systems in response to these bandwidth limitations by changing the video bitrate or, failing that, allowing playback interruptions (rebuffering). Being able to predict end users' quality of experience (QoE) resulting from these adjustments could lead to perceptually-driven network resource allocation strategies that would deliver streaming content of higher quality to clients, while being cost effective for providers. Existing objective QoE models only consider the effects on user QoE of video quality changes or playback interruptions. For streaming applications, adaptive network strategies may involve a combination of dynamic bitrate allocation along with playback interruptions when the available bandwidth reaches a very low value. Towards effectively predicting user QoE, we propose Video Assessment of TemporaL Artifacts and Stalls (Video ATLAS): a machine learning framework in which we combine a number of QoE-related features, including objective quality features, rebuffering-aware features, and memory-driven features, to make QoE predictions. We evaluated our learning-based QoE prediction model on the recently designed LIVE-Netflix Video QoE Database, which consists of practical playout patterns where the videos are afflicted by both quality changes and rebuffering events, and found that it provides improved performance over state-of-the-art video quality metrics while generalizing well on different datasets. The proposed algorithm is made publicly available at http://live.ece.utexas.edu/research/Quality/VideoATLAS release_v2.rar. Comment: under review in Transactions on Image Processing
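    The general recipe described here, regressing subjective QoE from concatenated quality, rebuffering, and memory features, can be sketched with a generic learner. The feature values below are hypothetical placeholders, not the paper's exact feature definitions:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Hypothetical per-video feature rows:
# [mean objective quality, rebuffering ratio,
#  time since last stall (s), recency-weighted quality]
X_train = np.array([[65.2, 0.00, 30.0, 70.1],
                    [48.7, 0.12,  2.0, 45.3],
                    [55.0, 0.05, 10.0, 52.8]])
y_train = np.array([72.0, 35.0, 50.0])  # subjective QoE scores

qoe_model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
qoe_model.fit(X_train, y_train)
print(qoe_model.predict([[60.0, 0.02, 15.0, 58.0]]))
```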

    An Augmented Autoregressive Approach to HTTP Video Stream Quality Prediction

    Full text link
    HTTP-based video streaming technologies allow for flexible rate selection strategies that account for time-varying network conditions. Such rate changes may adversely affect the user's Quality of Experience; hence, online prediction of the time-varying subjective quality can lead to perceptually optimised bitrate allocation policies. Recent studies have proposed to use dynamic network approaches for continuous-time prediction; yet they consider neither multiple video quality models as inputs nor forecasting ensembles. Here we address the problem of predicting continuous-time subjective quality using multiple inputs fed to a non-linear autoregressive network. By considering multiple network configurations and by applying simple averaging forecasting techniques, we are able to considerably improve prediction performance and decrease forecasting errors.
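    A simplified sketch of the idea: autoregressive prediction from past subjective scores plus exogenous objective-quality inputs, with simple averaging over an ensemble of network configurations. A generic MLP stands in for the paper's nonlinear autoregressive network, and all data below are synthetic:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_lagged(exog, subj, lags=3):
    # Autoregressive features: past subjective scores plus the
    # current exogenous objective-quality inputs.
    X, y = [], []
    for t in range(lags, len(subj)):
        X.append(np.concatenate([subj[t - lags:t], exog[t]]))
        y.append(subj[t])
    return np.array(X), np.array(y)

rng = np.random.default_rng(0)
T = 200
exog = rng.normal(size=(T, 2))                          # two objective metrics
subj = np.convolve(exog[:, 0], np.ones(5) / 5, "same")  # synthetic quality trace

X, y = make_lagged(exog, subj)
ensemble = [MLPRegressor(hidden_layer_sizes=(h,), max_iter=5000,
                         random_state=s).fit(X[:150], y[:150])
            for h, s in [(8, 0), (16, 1), (32, 2)]]
# Simple forecast averaging across network configurations
forecast = np.mean([m.predict(X[150:]) for m in ensemble], axis=0)
```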

    Automatic Channel Network Extraction from Remotely Sensed Images by Singularity Analysis

    Full text link
    Quantitative analysis of channel networks plays an important role in river studies. To provide a quantitative representation of channel networks, we propose a new method that extracts channels from remotely sensed images and estimates their widths. Our fully automated method is based on a recently proposed Multiscale Singularity Index that responds strongly to curvilinear structures but weakly to edges. The algorithm produces a channel map using a single image in which water and non-water pixels have contrast, such as a Landsat near-infrared band image or a water index defined on multiple bands. The proposed method provides a robust alternative to the procedures used in remote sensing of fluvial geomorphology and makes classification and analysis of channel networks easier. The source code of the algorithm is available at: http://live.ece.utexas.edu/research/cne/. Comment: IEEE Geosci. Remote Sens. Lett., in review
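    The key property of the index, responding strongly to ridge-like curvilinear structures while suppressing edges, can be approximated with multiscale Hessian (second-derivative) filtering. This sketch uses generic scale-normalized ridge strength, not the paper's exact singularity index:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def multiscale_ridge_response(img, scales=(1, 2, 4, 8)):
    """Max-over-scales ridge strength from the smaller eigenvalue of
    the Gaussian-smoothed Hessian (strong on bright curvilinear
    structures such as channels, weak on plain edges)."""
    img = img.astype(float)
    best = np.zeros_like(img)
    for s in scales:
        Ixx = gaussian_filter(img, s, order=(0, 2))   # d2/dx2 (axis 1)
        Iyy = gaussian_filter(img, s, order=(2, 0))   # d2/dy2 (axis 0)
        Ixy = gaussian_filter(img, s, order=(1, 1))   # mixed derivative
        root = np.sqrt(((Ixx - Iyy) / 2.0) ** 2 + Ixy ** 2)
        lam2 = (Ixx + Iyy) / 2.0 - root           # smaller Hessian eigenvalue
        response = np.maximum(-lam2, 0.0) * s**2  # gamma-normalize over scale
        best = np.maximum(best, response)
    return best
```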

    A Probabilistic Quality Representation Approach to Deep Blind Image Quality Prediction

    Full text link
    Blind image quality assessment (BIQA) remains a very challenging problem due to the unavailability of a reference image. Deep learning based BIQA methods have been attracting increasing attention in recent years, yet it remains a difficult task to train a robust deep BIQA model because of the very limited number of training samples with human subjective scores. Most existing methods learn a regression network to minimize the prediction error of a scalar image quality score. However, such a scheme ignores the fact that an image will receive divergent subjective scores from different subjects, which cannot be adequately represented by a single scalar number. This is particularly true of complex, real-world distorted images. Moreover, images may broadly differ in their distributions of assigned subjective scores. Recognizing this, we propose a new representation of perceptual image quality, called the probabilistic quality representation (PQR), to describe the image subjective score distribution, whereby a more robust loss function can be employed to train a deep BIQA model. The proposed PQR method is shown not only to speed up the convergence of deep model training, but also to greatly improve the achievable level of quality prediction accuracy relative to scalar quality score regression methods. The source code is available at https://github.com/HuiZeng/BIQA_Toolbox. Comment: Added the link to the source code
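    The core PQR move, replacing a scalar regression target with a distribution over quality levels so that a distribution-aware loss can be used, can be sketched as follows. The Gaussian assumption, bin layout, and cross-entropy loss here are illustrative stand-ins, not the paper's exact construction:

```python
import numpy as np
import torch
import torch.nn.functional as F

def mos_to_distribution(mos, std, bins=np.linspace(0, 100, 11)):
    # Discretize an assumed Gaussian opinion-score distribution
    # (mean = MOS, sigma = inter-subject std) onto fixed quality bins.
    p = np.exp(-0.5 * ((bins - mos) / max(std, 1e-3)) ** 2)
    return p / p.sum()

target = torch.tensor(mos_to_distribution(62.0, 8.0), dtype=torch.float32)
logits = torch.randn(11, requires_grad=True)  # 11-way quality head output
# Cross-entropy against the soft target distribution instead of
# an L2 loss against a single scalar score.
loss = F.cross_entropy(logits.unsqueeze(0), target.unsqueeze(0))
loss.backward()
```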

    Weeping and Gnashing of Teeth: Teaching Deep Learning in Image and Video Processing Classes

    Full text link
    In this rather informal paper and talk, I will discuss my own experiences, feelings, and evolution as an Image Processing and Digital Video educator trying to navigate the Deep Learning revolution. I will discuss my own ups and downs in trying to deal with extremely rapid technological change, and how I have reacted to, and dealt with, the consequent dramatic changes in the relevance of the topics I've taught for three decades. I have arranged the discussion in terms of the stages, over time, of my progression in dealing with these sea changes. Comment: 5 pages

    UGC-VQA: Benchmarking Blind Video Quality Assessment for User Generated Content

    Full text link
    Recent years have witnessed an explosion of user-generated content (UGC) videos shared and streamed over the Internet, thanks to the evolution of affordable and reliable consumer capture devices and the tremendous popularity of social media platforms. Accordingly, there is a great need for accurate video quality assessment (VQA) models for UGC/consumer videos to monitor, control, and optimize this vast content. Blind quality prediction of in-the-wild videos is quite challenging, since the quality degradations of UGC content are unpredictable, complicated, and often commingled. Here we contribute to advancing the UGC-VQA problem by conducting a comprehensive evaluation of leading no-reference/blind VQA (BVQA) features and models on a fixed evaluation architecture, yielding new empirical insights on both subjective video quality studies and VQA model design. By employing a feature selection strategy on top of leading VQA model features, we are able to extract 60 of the 763 statistical features used by the leading models to create a new fusion-based BVQA model, dubbed the VIDeo quality EVALuator (VIDEVAL), which effectively balances the trade-off between VQA performance and efficiency. Our experimental results show that VIDEVAL achieves state-of-the-art performance at considerably lower computational cost than other leading models. Our study protocol also defines a reliable benchmark for the UGC-VQA problem, which we believe will facilitate further research on deep learning-based VQA modeling, as well as perceptually optimized, efficient UGC video processing, transcoding, and streaming. To promote reproducible research and public evaluation, an implementation of VIDEVAL has been made available online: https://github.com/tu184044109/VIDEVAL_release. Comment: 13 pages, 11 figures, 11 tables
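    VIDEVAL's recipe, selecting a compact feature subset from a large pool and fusing it with a learned regressor, can be sketched generically. The univariate selector and SVR below are common stand-ins rather than the exact selection procedure from the paper, and the data are synthetic:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 763))  # hypothetical pool of 763 statistical features
y = X[:, :5].sum(axis=1) + 0.1 * rng.normal(size=200)  # synthetic MOS

videval_like = make_pipeline(
    StandardScaler(),
    SelectKBest(f_regression, k=60),  # keep a compact 60-feature subset
    SVR(kernel="rbf", C=10.0),
)
videval_like.fit(X, y)
```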

    180-degree Outpainting from a Single Image

    Full text link
    Presenting context images to a viewer's peripheral vision is one of the most effective techniques to enhance immersive visual experiences. However, most images only present a narrow view, since the field-of-view (FoV) of standard cameras is small. To overcome this limitation, we propose a deep learning approach that learns to predict a 180° panoramic image from a narrow-view image. Specifically, we design a foveated framework that applies different strategies to the near-periphery and mid-periphery regions. Two networks are trained separately, and then are employed jointly to sequentially perform narrow-to-90° generation and 90°-to-180° generation. The generated outputs are then fused with their aligned inputs to produce expanded equirectangular images for viewing. Our experimental results show that single-view-to-panoramic image generation using deep learning is both feasible and promising.
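    The final fusion step, blending the aligned narrow-FoV input back into the generated panorama, can be sketched as feathered alpha blending on the equirectangular canvas; the paper's actual fusion may differ:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def fuse_with_input(generated, aligned_input, valid_mask, feather=20.0):
    """Blend the aligned narrow-FoV input back into the generated
    equirectangular panorama with a feathered seam.

    generated, aligned_input: (H, W, 3) float arrays on the same canvas.
    valid_mask: (H, W) array, 1 where the input has real pixels.
    """
    dist = distance_transform_edt(valid_mask)  # distance to the seam
    alpha = np.clip(dist / feather, 0.0, 1.0)[..., None]
    return alpha * aligned_input + (1.0 - alpha) * generated
```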

    A Markov Decision Model for Adaptive Scheduling of Stored Scalable Videos

    Full text link
    We propose two scheduling algorithms that seek to optimize the quality of scalably coded videos that have been stored at a video server before transmission. The first scheduling algorithm is derived from a Markov Decision Process (MDP) formulation developed here. We model the dynamics of the channel as a Markov chain and reduce the problem of dynamic video scheduling to a tractable Markov decision problem over a finite state space. Based on the MDP formulation, a near-optimal scheduling policy is computed that minimizes the mean square error. Using insights taken from the development of the optimal MDP-based scheduling policy, the second proposed scheduling algorithm is an online scheduling method that only requires easily measurable knowledge of the channel dynamics, and is thus viable in practice. Simulation results show that the performance of both scheduling algorithms is close to a performance upper bound, also derived in this paper. Comment: 14 pages
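    A finite-state MDP of this kind is typically solved by value iteration. Below is a generic sketch that minimizes expected discounted per-step cost (standing in for the mean square error objective) under Markov channel dynamics; all shapes and names are hypothetical:

```python
import numpy as np

def value_iteration(P, cost, gamma=0.95, tol=1e-8):
    """P: (A, S, S) channel/buffer transition matrices, one per action.
    cost: (S, A) expected per-step cost (e.g., a distortion/MSE proxy).
    Returns the optimal value function and the greedy policy."""
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)
    while True:
        # Q(s, a) = cost(s, a) + gamma * E[V(s') | s, a]
        Q = cost + gamma * np.einsum("asj,j->sa", P, V)
        V_new = Q.min(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmin(axis=1)
        V = V_new
```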