Massive Online Crowdsourced Study of Subjective and Objective Picture Quality
Most publicly available image quality databases have been created under
highly controlled conditions by introducing graded simulated distortions onto
high-quality photographs. However, images captured using typical real-world
mobile camera devices are usually afflicted by complex mixtures of multiple
distortions, which are not necessarily well-modeled by the synthetic
distortions found in existing databases. The originators of existing legacy
databases usually conducted human psychometric studies to obtain statistically
meaningful sets of human opinion scores on images in a stringently controlled
visual environment, resulting in small data collections relative to other kinds
of image analysis databases. Towards overcoming these limitations, we designed
and created a new database that we call the LIVE In the Wild Image Quality
Challenge Database, which contains widely diverse authentic image distortions
on a large number of images captured using a representative variety of modern
mobile devices. We also designed and implemented a new online crowdsourcing
system, which we have used to conduct a very large-scale, multi-month image
quality assessment subjective study. Our database consists of over 350000
opinion scores on 1162 images evaluated by over 7000 unique human observers.
Despite the lack of control over the experimental environments of the numerous
study participants, we demonstrate excellent internal consistency of the
subjective dataset. We also evaluate several top-performing blind Image Quality
Assessment algorithms on it and present insights on how mixtures of distortions
challenge both end users as well as automatic perceptual quality prediction
models.
Comment: 16 pages
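As a rough illustration of how internal consistency is typically verified in subjective studies of this kind, the sketch below repeatedly splits the subjects into two halves and correlates the mean opinion scores computed from each half. The (subjects × images) score-matrix layout and the function name are assumptions of this sketch, not the database's actual format.

    import numpy as np
    from scipy.stats import spearmanr

    def split_half_consistency(scores, n_trials=25, seed=0):
        # `scores`: (num_subjects, num_images) array with NaN where a
        # subject did not rate an image (a hypothetical layout).
        rng = np.random.default_rng(seed)
        n_subjects = scores.shape[0]
        sroccs = []
        for _ in range(n_trials):
            perm = rng.permutation(n_subjects)
            half_a, half_b = perm[:n_subjects // 2], perm[n_subjects // 2:]
            mos_a = np.nanmean(scores[half_a], axis=0)  # per-image MOS, half A
            mos_b = np.nanmean(scores[half_b], axis=0)  # per-image MOS, half B
            valid = ~np.isnan(mos_a) & ~np.isnan(mos_b)
            sroccs.append(spearmanr(mos_a[valid], mos_b[valid])[0])
        # A high median correlation indicates internally consistent ratings.
        return float(np.median(sroccs))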
Large-Scale Study of Perceptual Video Quality
The great variations of videographic skills, camera designs, compression and
processing protocols, and displays lead to an enormous variety of video
impairments. Current no-reference (NR) video quality models are unable to
handle this diversity of distortions. This is true in part because available
video quality assessment databases contain very limited content at fixed
resolutions, were captured using a small number of camera devices by a few
videographers, and have been subjected to a modest number of distortions. As
such, these databases fail to adequately represent real world videos, which
contain very different kinds of content obtained under highly diverse imaging
conditions and are subject to authentic, often commingled distortions that are
impossible to simulate. As a result, NR video quality predictors tested on
real-world video data often perform poorly. Towards advancing NR video quality
prediction, we constructed a large-scale video quality assessment database
containing 585 videos of unique content, captured by a large number of users,
with wide ranges of levels of complex, authentic distortions. We collected a
large number of subjective video quality scores via crowdsourcing. A total of
4776 unique participants took part in the study, yielding more than 205000
opinion scores, resulting in an average of 240 recorded human opinions per
video. We demonstrate the value of the new resource, which we call the LIVE
Video Quality Challenge Database (LIVE-VQC), by conducting a comparison of
leading NR video quality predictors on it. This study is the largest video
quality assessment study ever conducted along several key dimensions: number of
unique contents, capture devices, distortion types and combinations of
distortions, study participants, and recorded subjective scores. The database
is available for download at:
http://live.ece.utexas.edu/research/LIVEVQC/index.html
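For context, a minimal sketch of how per-video mean opinion scores (MOS) are derived from raw crowdsourced ratings, and how an NR predictor can then be benchmarked against them with standard correlation measures. The table layout, column names, and all numbers are illustrative placeholders, not the LIVE-VQC data format.

    import pandas as pd
    from scipy.stats import pearsonr, spearmanr

    # Hypothetical long-format table of raw ratings: one row per
    # (participant, video) pair; column names are illustrative only.
    ratings = pd.DataFrame({
        "video_id":    ["v001", "v001", "v002", "v002", "v003", "v003"],
        "participant": ["p1",   "p2",   "p1",   "p3",   "p2",   "p3"],
        "score":       [72.0,   65.0,   31.0,   40.0,   55.0,   49.0],
    })

    # Per-video mean opinion score (MOS) and rating count.
    mos = ratings.groupby("video_id")["score"].agg(["mean", "count"])

    # Benchmarking an NR predictor then reduces to correlating its
    # outputs against MOS (these predictions are placeholders).
    predictions = pd.Series({"v001": 68.0, "v002": 38.0, "v003": 50.0})
    aligned = mos["mean"].reindex(predictions.index)
    print("SROCC:", spearmanr(predictions, aligned)[0])
    print("PLCC:", pearsonr(predictions, aligned)[0])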
Learning to Predict Streaming Video QoE: Distortions, Rebuffering and Memory
Mobile streaming video data accounts for a large and increasing percentage of
wireless network traffic. The available bandwidths of modern wireless networks
are often unstable, leading to difficulties in delivering smooth, high-quality
video. Streaming service providers such as Netflix and YouTube attempt to adapt
their systems to adjust in response to these bandwidth limitations by changing
the video bitrate or, failing that, allowing playback interruptions
(rebuffering). Being able to predict end users' quality of experience (QoE)
resulting from these adjustments could lead to perceptually-driven network
resource allocation strategies that would deliver streaming content of higher
quality to clients, while being cost-effective for providers. Existing
objective QoE models consider only the effects of video quality changes or
playback interruptions on user QoE. For streaming applications, adaptive network
strategies may involve a combination of dynamic bitrate allocation along with
playback interruptions when the available bandwidth reaches a very low value.
Towards effectively predicting user QoE, we propose Video Assessment of
TemporaL Artifacts and Stalls (Video ATLAS): a machine learning framework where
we combine a number of QoE-related features, including objective quality
features, rebuffering-aware features and memory-driven features to make QoE
predictions. We evaluated our learning-based QoE prediction model on the
recently designed LIVE-Netflix Video QoE Database which consists of practical
playout patterns, where the videos are afflicted by both quality changes and
rebuffering events, and found that it provides improved performance over
state-of-the-art video quality metrics while generalizing well on different
datasets. The proposed algorithm is made publicly available at
http://live.ece.utexas.edu/research/Quality/VideoATLAS_release_v2.rar
Comment: under review in Transactions on Image Processing
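A minimal sketch of the feature-fusion idea, assuming per-video feature vectors that combine an objective quality measure, a rebuffering-aware feature, and a memory feature, fed to a support vector regressor (one plausible learner; the exact feature definitions and values below are illustrative stand-ins, not the paper's feature set).

    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVR

    # Toy per-video feature vectors in the spirit of the framework:
    # [mean objective quality, fraction of time spent rebuffering,
    #  time since the last impairment (a memory feature)].
    # Definitions and values are illustrative, not the paper's set.
    X = np.array([
        [0.82, 0.00, 1.0],
        [0.74, 0.10, 0.3],
        [0.65, 0.25, 0.1],
        [0.90, 0.00, 1.0],
    ])
    y = np.array([78.0, 55.0, 34.0, 86.0])   # subjective QoE scores

    # Standardize features, then regress QoE with an RBF-kernel SVR.
    model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
    model.fit(X, y)
    print(model.predict([[0.70, 0.15, 0.2]]))

Standardizing before the regressor matters here because the three feature families live on very different scales.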
An Augmented Autoregressive Approach to HTTP Video Stream Quality Prediction
HTTP-based video streaming technologies allow for flexible rate selection
strategies that account for time-varying network conditions. Such rate changes
may adversely affect the user's Quality of Experience; hence online prediction
of the time-varying subjective quality can lead to perceptually optimized
bitrate allocation policies. Recent studies have proposed to use dynamic
network approaches for continuous-time prediction; yet they neither consider
multiple video quality models as inputs nor employ forecasting ensembles.
Here we address the problem of predicting continuous-time subjective quality
using multiple inputs fed to a non-linear autoregressive network. By
considering multiple network configurations and by applying simple averaging
forecasting techniques, we are able to considerably improve prediction
performance and decrease forecasting errors.
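The approach can be sketched with a simplified stand-in: linear ridge regressors over lagged inputs replace the nonlinear autoregressive network, and the forecasting ensemble is a plain average over models trained with different lag orders. All traces, lag choices, and names below are illustrative assumptions.

    import numpy as np
    from sklearn.linear_model import Ridge

    def make_lagged(inputs, target, lags):
        # Stack lagged copies of each input trace (plus the past target)
        # as regression features for one-step-ahead prediction.
        rows, ys = [], []
        for t in range(lags, len(target)):
            feats = [s[t - k] for s in inputs for k in range(1, lags + 1)]
            feats += [target[t - k] for k in range(1, lags + 1)]
            rows.append(feats)
            ys.append(target[t])
        return np.array(rows), np.array(ys)

    # Toy continuous-time traces: two objective quality models and the
    # subjective quality trace to be forecast (all illustrative).
    rng = np.random.default_rng(0)
    t_axis = np.linspace(0, 6, 200)
    q1 = np.sin(t_axis) + 0.05 * rng.standard_normal(200)
    q2 = np.cos(t_axis) + 0.05 * rng.standard_normal(200)
    subj = 0.6 * q1 + 0.4 * q2

    # Forecasting ensemble: average predictions from models trained with
    # different lag orders (standing in for different network configs).
    # Trailing windows align across lag orders, so the final 20
    # predictions refer to the same time steps.
    preds = []
    for lags in (3, 5, 8):
        X, y = make_lagged([q1, q2], subj, lags)
        model = Ridge(alpha=1.0).fit(X[:150], y[:150])
        preds.append(model.predict(X[-20:]))
    ensemble = np.mean(preds, axis=0)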
Automatic Channel Network Extraction from Remotely Sensed Images by Singularity Analysis
Quantitative analysis of channel networks plays an important role in river
studies. To provide a quantitative representation of channel networks, we
propose a new method that extracts channels from remotely sensed images and
estimates their widths. Our fully automated method is based on a recently
proposed Multiscale Singularity Index that responds strongly to curvilinear
structures but weakly to edges. The algorithm produces a channel map, using a
single image where water and non-water pixels have contrast, such as a Landsat
near-infrared band image or a water index defined on multiple bands. The
proposed method provides a robust alternative to the procedures that are used
in remote sensing of fluvial geomorphology and makes classification and
analysis of channel networks easier. The source code of the algorithm is
available at: http://live.ece.utexas.edu/research/cne/
Comment: IEEE Geosci. Remote Sens. Lett., in review
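A rough sketch of such a pipeline: compute a water index with contrast between water and non-water pixels, apply a multiscale curvilinear-structure filter, and threshold the response into a channel map. Note that scikit-image's sato ridge filter is used below as a stand-in for the paper's Multiscale Singularity Index, and NDWI is one common water index, not necessarily the paper's choice.

    import numpy as np
    from skimage.filters import sato, threshold_otsu

    def channel_map(green, nir):
        # Normalized difference water index: high over water pixels.
        ndwi = (green - nir) / (green + nir + 1e-8)
        # Multiscale ridge filter that responds to curvilinear
        # structures; a rough substitute for the singularity index.
        response = sato(ndwi, sigmas=range(1, 8), black_ridges=False)
        # Threshold the response to obtain a binary channel map.
        return response > threshold_otsu(response)

    # Usage with two co-registered float bands (e.g. Landsat green/NIR):
    # channels = channel_map(green_band, nir_band)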
A Probabilistic Quality Representation Approach to Deep Blind Image Quality Prediction
Blind image quality assessment (BIQA) remains a very challenging problem due
to the unavailability of a reference image. Deep learning based BIQA methods
have been attracting increasing attention in recent years, yet it remains a
difficult task to train a robust deep BIQA model because of the very limited
number of training samples with human subjective scores. Most existing methods
learn a regression network to minimize the prediction error of a scalar image
quality score. However, such a scheme ignores the fact that an image will
receive divergent subjective scores from different subjects, which cannot be
adequately represented by a single scalar number. This is particularly true on
complex, real-world distorted images. Moreover, images may broadly differ in
their distributions of assigned subjective scores. Recognizing this, we propose
a new representation of perceptual image quality, called probabilistic quality
representation (PQR), to describe the image subjective score distribution,
whereby a more robust loss function can be employed to train a deep BIQA model.
The proposed PQR method is shown to not only speed up the convergence of deep
model training, but to also greatly improve the achievable level of quality
prediction accuracy relative to scalar quality score regression methods. The
source code is available at https://github.com/HuiZeng/BIQA_Toolbox
Comment: Added the link to the source code
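A minimal sketch of the distributional idea, assuming each image's subjective scores are summarized by a mean and standard deviation and softly binned over a small set of quality anchors; the Gaussian construction, anchor count, and loss below are assumptions of this sketch, and the paper's exact PQR construction may differ in detail.

    import torch
    import torch.nn.functional as F

    def quality_distribution(mos, std, anchors):
        # Softly bin an image's subjective scores over quality anchors,
        # assuming a Gaussian score distribution (an assumption of this
        # sketch; the paper's construction may differ in detail).
        logits = -((anchors - mos) ** 2) / (2 * std ** 2)
        return torch.softmax(logits, dim=-1)

    anchors = torch.linspace(0, 100, steps=5)   # discrete quality levels
    target = quality_distribution(torch.tensor(63.0),
                                  torch.tensor(12.0), anchors)

    # Training minimizes cross-entropy between the network's predicted
    # distribution and the target, instead of an L2 loss on a scalar.
    pred_logits = torch.randn(1, 5)             # stand-in network output
    loss = F.cross_entropy(pred_logits, target.unsqueeze(0))

Replacing a scalar regression target with a distribution lets the loss account for inter-subject disagreement, which a single mean opinion score discards.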
Weeping and Gnashing of Teeth: Teaching Deep Learning in Image and Video Processing Classes
In this rather informal paper and talk I will discuss my own experiences,
feelings, and evolution as an Image Processing and Digital Video educator
trying to navigate the Deep Learning revolution. I will discuss my own ups and
downs of trying to deal with extremely rapid technological changes, and how I
have reacted to, and dealt with consequent dramatic changes in the relevance of
the topics I've taught for three decades. I have arranged the discussion in
terms of the stages, over time, of my progression dealing with these sea
changes.
Comment: 5 pages
UGC-VQA: Benchmarking Blind Video Quality Assessment for User Generated Content
Recent years have witnessed an explosion of user-generated content (UGC)
videos shared and streamed over the Internet, thanks to the evolution of
affordable and reliable consumer capture devices, and the tremendous popularity
of social media platforms. Accordingly, there is a great need for accurate
video quality assessment (VQA) models for UGC/consumer videos to monitor,
control, and optimize this vast content. Blind quality prediction of
in-the-wild videos is quite challenging, since the quality degradations of UGC
content are unpredictable, complicated, and often commingled. Here we
contribute to advancing the UGC-VQA problem by conducting a comprehensive
evaluation of leading no-reference/blind VQA (BVQA) features and models on a
fixed evaluation architecture, yielding new empirical insights on both
subjective video quality studies and VQA model design. By employing a feature
selection strategy on top of leading VQA model features, we are able to extract
60 of the 763 statistical features used by the leading models to create a new
fusion-based BVQA model, dubbed the VIDeo quality EVALuator (VIDEVAL),
that effectively balances the trade-off between
VQA performance and efficiency. Our experimental results show that VIDEVAL
achieves state-of-the-art performance at considerably lower computational cost
than other leading models. Our study protocol also defines a reliable benchmark
for the UGC-VQA problem, which we believe will facilitate further research on
deep learning-based VQA modeling, as well as perceptually-optimized efficient
UGC video processing, transcoding, and streaming. To promote reproducible
research and public evaluation, an implementation of VIDEVAL has been made
available online: https://github.com/tu184044109/VIDEVAL_release
Comment: 13 pages, 11 figures, 11 tables
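A toy sketch of the select-then-fuse recipe, with a univariate SelectKBest stage standing in for the paper's feature selection strategy and a support vector regressor as the fusion model; the data and dimensions are shrunken placeholders, not the paper's 763-feature pool or subjective labels.

    import numpy as np
    from sklearn.feature_selection import SelectKBest, f_regression
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVR

    # Stand-in data: rows are videos, columns are statistical features
    # pooled from several BVQA models (shrunk from the paper's 763
    # features and real MOS labels to keep the toy example small).
    rng = np.random.default_rng(0)
    X = rng.standard_normal((120, 80))
    y = X[:, :5].sum(axis=1) + 0.1 * rng.standard_normal(120)

    model = make_pipeline(
        StandardScaler(),
        SelectKBest(f_regression, k=10),   # feature selection stage
        SVR(kernel="rbf", C=10.0),         # fusion regressor
    )
    model.fit(X, y)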
180-degree Outpainting from a Single Image
Presenting context images to a viewer's peripheral vision is one of the most
effective techniques to enhance immersive visual experiences. However, most
images only present a narrow view, since the field-of-view (FoV) of standard
cameras is small. To overcome this limitation, we propose a deep learning
approach that learns to predict a 180° panoramic image from a narrow-view
image. Specifically, we design a foveated framework that applies different
strategies to the near-periphery and mid-periphery regions. Two networks are
trained separately, and then are employed jointly to sequentially perform
narrow-to-90° generation and 90°-to-180° generation. The
generated outputs are then fused with their aligned inputs to produce expanded
equirectangular images for viewing. Our experimental results show that
single-view-to-panoramic image generation using deep learning is both feasible
and promising.
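The two-stage pipeline can be sketched as below. The toy generators merely widen the canvas and apply a couple of convolutions, standing in for the paper's trained networks, and the "fusion" step simply pastes the known narrow view back into the center columns.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class StageNet(nn.Module):
        # Toy generator that doubles the horizontal field of view;
        # a placeholder for the paper's trained networks.
        def __init__(self):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 3, 3, padding=1),
            )

        def forward(self, x):
            pad = x.shape[-1] // 2
            canvas = F.pad(x, (pad, pad))   # blank periphery to fill in
            return self.conv(canvas)

    def outpaint_180(narrow, to_90, to_180):
        # Sequential narrow -> 90° -> 180° generation, then a simplistic
        # "fusion": paste the known view back into the center columns.
        wide = to_180(to_90(narrow))
        w, W = narrow.shape[-1], wide.shape[-1]
        x0 = (W - w) // 2
        wide[..., x0:x0 + w] = narrow
        return wide

    to_90, to_180 = StageNet(), StageNet()
    x = torch.rand(1, 3, 128, 64)            # narrow-view input
    pano = outpaint_180(x, to_90, to_180)    # (1, 3, 128, 256) output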
A Markov Decision Model for Adaptive Scheduling of Stored Scalable Videos
We propose two scheduling algorithms that seek to optimize the quality of
scalably coded videos that have been stored at a video server before
transmission. The first scheduling algorithm is derived from a Markov Decision
Process (MDP) formulation developed here. We model the dynamics of the channel
as a Markov chain and reduce the problem of dynamic video scheduling to a
tractable Markov decision problem over a finite state space. Based on the MDP
formulation, a near-optimal scheduling policy is computed that minimizes the
mean square error. Using insights taken from the development of the optimal
MDP-based scheduling policy, the second proposed scheduling algorithm is an
online scheduling method that only requires easily measurable knowledge of the
channel dynamics, and is thus viable in practice. Simulation results show that
the performance of both scheduling algorithms is close to a performance upper
bound also derived in this paper.
Comment: 14 pages
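The reduction to a tractable finite-state problem can be illustrated with textbook value iteration. The transition matrix, the cost table (standing in for expected mean squared error), and the discount factor below are toy assumptions, not the paper's channel model; note the channel transitions are independent of the action, matching the idea that the channel evolves on its own.

    import numpy as np

    # Toy finite MDP: states are channel-quality levels, actions are
    # which scalable layer to transmit. P[a, s, s'] gives channel
    # transition probabilities; cost[s, a] stands in for the expected
    # mean squared error. All numbers are illustrative.
    P = np.array([
        [[0.7, 0.2, 0.1], [0.3, 0.4, 0.3], [0.1, 0.3, 0.6]],
        [[0.7, 0.2, 0.1], [0.3, 0.4, 0.3], [0.1, 0.3, 0.6]],
    ])
    cost = np.array([[4.0, 2.5], [2.0, 1.5], [1.0, 1.2]])
    gamma = 0.95

    # Value iteration over the finite state space.
    V = np.zeros(3)
    for _ in range(500):
        Q = cost + gamma * np.einsum("ast,t->sa", P, V)
        V = Q.min(axis=1)
    policy = Q.argmin(axis=1)   # near-optimal scheduling policy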