208 research outputs found
Effective Aesthetics Prediction with Multi-level Spatially Pooled Features
We propose an effective deep learning approach to aesthetic quality
assessment that relies on a new type of pre-trained features, and apply it to
the AVA data set, currently the largest aesthetics database. While previous
approaches miss some of the information in the original images due to taking
small crops, down-scaling, or warping the originals during training, we propose
the first method that efficiently supports full-resolution images as input and
can be trained on variable input sizes. This allows us to significantly improve
upon the state of the art, increasing the Spearman rank-order correlation
coefficient (SRCC) with ground-truth mean opinion scores (MOS) from the best
previously reported value of 0.612 to 0.756. To achieve this performance, we
extract multi-level spatially pooled (MLSP) features from all convolutional
blocks of a pre-trained InceptionResNet-v2 network, and train a custom shallow
Convolutional Neural Network (CNN) architecture on these new features.
Comment: To appear in CVPR 201
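As a rough illustration of the core idea, the sketch below extracts multi-level
spatially pooled features in Keras. It is not the authors' code: the choice of
tap layers and the plain global average pooling are illustrative assumptions,
whereas the paper pools features from all convolutional blocks.

```python
# A minimal sketch (not the authors' code) of MLSP feature extraction:
# global-average-pool the activations of several intermediate blocks of a
# pre-trained InceptionResNet-v2 and concatenate them into one vector.
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import InceptionResNetV2
from tensorflow.keras.applications.inception_resnet_v2 import preprocess_input

# Fully convolutional, so variable input sizes are supported.
base = InceptionResNetV2(weights="imagenet", include_top=False,
                         input_shape=(None, None, 3))

# Hypothetical tap points; the paper taps every convolutional block.
tap_names = ["mixed_5b", "block35_10_ac", "block17_20_ac", "conv_7b_ac"]
taps = [base.get_layer(name).output for name in tap_names]
pooled = [tf.keras.layers.GlobalAveragePooling2D()(t) for t in taps]
features = tf.keras.layers.Concatenate()(pooled)  # fixed-size vector per image

extractor = tf.keras.Model(base.input, features)

# Pooling removes the spatial dimensions, so any resolution works here.
img = np.random.rand(1, 384, 512, 3).astype("float32") * 255.0
vec = extractor.predict(preprocess_input(img))
print(vec.shape)  # (1, total number of pooled channels)
```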
Subjective Annotation for a Frame Interpolation Benchmark using Artefact Amplification
Current benchmarks for optical flow algorithms evaluate the estimation either
directly by comparing the predicted flow fields with the ground truth or
indirectly by using the predicted flow fields for frame interpolation and then
comparing the interpolated frames with the actual frames. In the latter case,
objective quality measures such as the mean squared error are typically
employed. However, it is well known that for image quality assessment, the
actual quality experienced by the user cannot be fully deduced from such simple
measures. Hence, we conducted a subjective quality assessment crowdsourcing
study for the interpolated frames provided by one of the optical flow
benchmarks, the Middlebury benchmark. We collected forced-choice paired
comparisons between interpolated images and corresponding ground truth. To
increase the sensitivity of observers when judging minute differences in paired
comparisons, we introduced a new method to the field of full-reference quality
assessment, called artefact amplification. From the crowdsourcing data, we
reconstructed absolute quality scale values according to Thurstone's model. As
a result, we obtained a re-ranking of the 155 participating algorithms w.r.t.
the visual quality of the interpolated frames. This re-ranking not only shows
the necessity of visual quality assessment as another evaluation metric for
optical flow and frame interpolation benchmarks; the results also provide the
ground truth for designing novel image quality assessment (IQA) methods
dedicated to the perceptual quality of interpolated images. As a first step, we
proposed such a new full-reference method, called WAE-IQA. By weighting the
local differences between an interpolated image and its ground truth, WAE-IQA
performed slightly better than the currently best FR-IQA approach from the
literature.
Comment: arXiv admin note: text overlap with arXiv:1901.0536
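As a rough illustration of the scale reconstruction step, the following sketch
applies Thurstone Case V scaling to a toy paired-comparison count matrix. The
data and the simple row-mean solver are assumptions for illustration, not the
study's exact reconstruction procedure.

```python
# A toy sketch of Thurstone Case V scaling from forced-choice paired
# comparisons; the count matrix below is made-up data.
import numpy as np
from scipy.stats import norm

# counts[i, j] = how often item i was preferred over item j
counts = np.array([[0., 8., 9.],
                   [2., 0., 7.],
                   [1., 3., 0.]])

trials = counts + counts.T
with np.errstate(divide="ignore", invalid="ignore"):
    p = np.where(trials > 0, counts / trials, 0.5)  # win proportions
p = np.clip(p, 0.01, 0.99)   # keep z-scores finite for unanimous pairs

z = norm.ppf(p)              # probit of each win proportion
scale = z.mean(axis=1)       # Case V scale value: row mean of the z-matrix
print(scale - scale.min())   # shift so the worst item sits at 0
```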
Recovering Missing Coefficients in DCT-Transformed Images
A general method for recovering missing DCT coefficients in DCT-transformed
images is presented in this work. We model the DCT coefficient recovery
problem as an optimization problem and recover all missing DCT coefficients via
linear programming. The visual quality of the recovered image gradually
decreases as the number of missing DCT coefficients increases. For some images,
the quality is surprisingly good even when more than 10 of the most significant
DCT coefficients are missing. When only the DC coefficient is missing, the
proposed algorithm outperforms existing methods, according to experimental
results on 200 test images. The proposed recovery method can be used for the
cryptanalysis of DCT-based selective encryption schemes and other applications.
Comment: 4 pages, 4 figure
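A rough sketch of the recovery idea on a single 8x8 block follows. The paper
poses a linear program; this sketch uses cvxpy's closely related convex
total-variation objective, and which coefficients are "missing" is an
arbitrary assumption here.

```python
# Recover missing DCT coefficients of one 8x8 block by convex optimization:
# fix the known coefficients as linear constraints, minimize smoothness.
import numpy as np
import cvxpy as cp
from scipy.fft import dctn, idctn

rng = np.random.default_rng(0)
img = rng.random((8, 8))              # stand-in image block in [0, 1]
coef = dctn(img, norm="ortho")

known = np.ones((8, 8), dtype=bool)
known[0, 1] = known[1, 0] = known[1, 1] = False   # pretend these are missing

# The DCT is linear and orthonormal, so each known coefficient is a linear
# constraint on the pixel block: <basis_k, x> == coef_k.
basis = np.stack([idctn(np.eye(64)[k].reshape(8, 8), norm="ortho")
                  for k in range(64)])

x = cp.Variable((8, 8))
constraints = [x >= 0, x <= 1]        # valid pixel range
for k in range(64):
    i, j = divmod(k, 8)
    if known[i, j]:
        constraints.append(cp.sum(cp.multiply(basis[k], x)) == coef[i, j])

prob = cp.Problem(cp.Minimize(cp.tv(x)), constraints)
prob.solve()
print(np.abs(x.value - img).max())    # recovery error for the toy block
```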
An Improved DC Recovery Method from AC Coefficients of DCT-Transformed Images
Motivated by the work of Uehara et al. [1], we investigate an improved method
to recover DC coefficients from AC coefficients of DCT-transformed images,
which finds applications in the cryptanalysis of selective multimedia
encryption. The proposed under/over-flow rate minimization (FRM) method employs
an optimization process to obtain a statistically more accurate estimate of the
unknown DC coefficients, thus achieving better recovery performance.
Experimental results on 200 test images show that the proposed DC recovery
method significantly improves the quality of most recovered images in terms of
PSNR and several state-of-the-art objective image quality assessment (IQA)
metrics such as SSIM and MS-SSIM.
Comment: 6 pages, 6 figures, ICIP 201
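A simplified sketch of the under/over-flow intuition, as read from the
abstract (not the authors' algorithm): choose the DC value whose inverse DCT
produces the fewest pixels outside [0, 255]. The candidate grid and the JPEG
level shift of 128 are illustrative assumptions.

```python
# Grid-search a DC estimate that minimizes pixel under/overflows.
import numpy as np
from scipy.fft import idctn

def recover_dc(ac_block, candidates=np.arange(-1024.0, 1024.0, 8.0)):
    """ac_block: 8x8 orthonormal-DCT block with its DC entry zeroed out."""
    best_dc, best_violations = 0.0, np.inf
    for dc in candidates:
        block = ac_block.copy()
        block[0, 0] = dc
        pixels = idctn(block, norm="ortho") + 128.0   # undo the level shift
        violations = np.count_nonzero((pixels < 0) | (pixels > 255))
        if violations < best_violations:
            best_dc, best_violations = dc, violations
    return best_dc
```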
DeepFL-IQA: Weak Supervision for Deep IQA Feature Learning
Multi-level deep features have been driving state-of-the-art methods for
aesthetics and image quality assessment (IQA). However, most IQA benchmarks are
comprised of artificially distorted images, for which features derived from
ImageNet under-perform. We propose a new IQA dataset and a weakly supervised
feature learning approach to train features more suitable for IQA of
artificially distorted images. The dataset, KADIS-700k, is far more extensive
than similar works, consisting of 140,000 pristine images and 25 distortion
types, totaling 700k distorted versions. Our weakly supervised feature learning
is designed as multi-task training, using eleven existing
full-reference IQA metrics as proxies for differential mean opinion scores. We
also introduce a benchmark database, KADID-10k, of artificially degraded
images, each subjectively annotated by 30 crowd workers. We make use of our
derived image feature vectors for (no-reference) image quality assessment by
training and testing a shallow regression network on this database and five
other benchmark IQA databases. Our method, termed DeepFL-IQA, performs better
than other feature-based no-reference IQA methods and also better than all
tested full-reference IQA methods on KADID-10k. For the other five benchmark
IQA databases, DeepFL-IQA matches the performance of the best existing
end-to-end deep learning-based methods on average.
Comment: dataset url: http://database.mmsp-kn.d
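A minimal sketch of the weak-supervision idea: regress the scores of several
full-reference IQA metrics from one shared feature vector as a multi-task
proxy for subjective quality. Layer sizes, dropout rate, and the feature
dimension below are assumptions, not the paper's exact architecture.

```python
# Multi-task regression: one shared trunk, one head per proxy FR-IQA metric.
import tensorflow as tf

N_METRICS = 11         # eleven FR-IQA metrics serve as proxy labels
FEATURE_DIM = 16928    # hypothetical length of the pre-extracted feature

inputs = tf.keras.Input(shape=(FEATURE_DIM,))
shared = tf.keras.layers.Dense(512, activation="relu")(inputs)
shared = tf.keras.layers.Dropout(0.25)(shared)
# one small regression head per proxy metric
heads = [tf.keras.layers.Dense(1, name=f"metric_{i}")(shared)
         for i in range(N_METRICS)]

model = tf.keras.Model(inputs, heads)
model.compile(optimizer="adam", loss=["mse"] * N_METRICS)
# model.fit(features, [labels[:, i] for i in range(N_METRICS)], ...)
```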
Critical analysis on the reproducibility of visual quality assessment using deep features
Data used to train supervised machine learning models are commonly split into
independent training, validation, and test sets. In this paper we illustrate
that intricate cases of data leakage have occurred in the no-reference video
and image quality assessment literature. We show that the reported results of
several recently published journal papers, which are well above the best
performances in related works, cannot be reached. Our analysis shows that
information from the test set was inappropriately used in the training process
in different ways. When correcting for the data leakage, the performances of
the approaches drop below the state-of-the-art by a large margin. Additionally,
we investigate end-to-end variations of the discussed approaches, which do not
improve upon the originals.
Comment: 20 pages, 7 figures, PLOS ONE journal. arXiv admin note: substantial
text overlap with arXiv:2005.0440
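One concrete way to avoid the leakage pattern described here is to split by
source content rather than by individual image, so that all distorted versions
of a pristine image fall on one side of the split. A minimal sketch with
scikit-learn, using made-up dataset shapes:

```python
# Leakage-free splitting: group all distortions of one source image together.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

n_sources, n_distortions = 100, 5
source_id = np.repeat(np.arange(n_sources), n_distortions)  # group per image
X = np.arange(source_id.size)                               # image indices

splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, test_idx = next(splitter.split(X, groups=source_id))

# No source contributes to both sides of the split:
assert not set(source_id[train_idx]) & set(source_id[test_idx])
```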
Localization of Just Noticeable Difference for Image Compression
The just noticeable difference (JND) is the minimal difference between
stimuli that can be detected by a person. The picture-wise just noticeable
difference (PJND) for a given reference image and a compression algorithm
represents the minimal level of compression that causes noticeable differences
in the reconstruction. These differences can only be observed in some specific
regions within the image, dubbed JND-critical regions. Identifying these
regions can improve the development of image compression algorithms. Because
visual perception varies among individuals, determining the PJND
values and JND-critical regions for a target population of consumers requires
subjective assessment experiments involving a sufficiently large number of
observers. In this paper, we propose a novel framework for conducting such
experiments using crowdsourcing. By applying this framework, we created a novel
PJND dataset, KonJND++, consisting of 300 source images, compressed versions
thereof under JPEG or BPG compression, and an average of 43 ratings of PJND and
129 self-reported locations of JND-critical regions for each source image. Our
experiments demonstrate the effectiveness and reliability of our proposed
framework, which can easily be adapted to collect large-scale datasets.
The source code and dataset are available at
https://github.com/angchen-dev/LocJND
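To make the PJND notion concrete, here is an illustrative sketch (not the
paper's crowdsourcing protocol) that binary-searches the JPEG quality level at
which differences from the reference are still noticeable. `observer_notices`
is a hypothetical stand-in for a human judgment, assumed monotone in quality.

```python
# Binary search for a picture-wise JND threshold over JPEG quality levels.
from io import BytesIO
from PIL import Image

def jpeg_compress(img: Image.Image, quality: int) -> Image.Image:
    buf = BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf)

def find_pjnd(img, observer_notices, lo=1, hi=100):
    """Return the highest JPEG quality at which differences are noticed."""
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if observer_notices(img, jpeg_compress(img, mid)):
            lo = mid        # still noticeable: threshold is at mid or above
        else:
            hi = mid - 1    # not noticeable: search lower qualities
    return lo
```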