Data Analysis in Multimedia Quality Assessment: Revisiting the Statistical Tests
Assessment of multimedia quality relies heavily on subjective assessment, and
is typically done by human subjects in the form of preferences or continuous
ratings. Such data is crucial for analysis of different multimedia processing
algorithms as well as validation of objective (computational) methods for the
said purpose. To that end, statistical testing provides a theoretical framework
towards drawing meaningful inferences, and making well grounded conclusions and
recommendations. While parametric tests (such as t test, ANOVA, and error
estimates like confidence intervals) are popular and widely used in the
community, there appears to be a certain degree of confusion in the application
of such tests. Specifically, the assumptions of normality and homogeneity of
variance are often not well understood. Therefore, the main goal of this paper
is to revisit them from a theoretical perspective and in the process provide
useful insights into their practical implications. Experimental results on both
simulated and real data are presented to support the arguments made. Software
implementing the said recommendations is also made publicly available, in order
to achieve the goal of reproducible research.
On the perceptual similarity of realistic looking tone mapped High Dynamic Range images
High Dynamic Range (HDR) images are usually displayed on conventional Low Dynamic Range (LDR) displays because of the limited availability of HDR displays. To convert the large dynamic luminance range into eight bit quantized values, parameterized Tone Mapping Operators (TMO) are applied. Human observers are able to optimize the parameters in order to get the highest Quality of Experience (QoE) by judging the displayed LDR images on a realism scale. In the study presented in this paper, two TMOs with three parameters each were evaluated by observers in a subjective experiment. Although the chosen parameter settings vary widely, the chosen images appear to have the same QoE for the observers. In order to assess this similarity objectively, three commonly used image quality measurement algorithms were applied. Their agreement with the preference of the observers was analyzed and it was found that the Visual Difference Predictor (VDP) outperforms the Structural Similarity Index and the Root Mean Square Error. A threshold value for VDP is derived that indicates when two LDR images appear to have the same Quality of Experience.
A robust image watermarking technique based on quantization noise visibility thresholds
A tremendous amount of digital multimedia data is broadcast daily over the internet. Since digital data can be very quickly and easily duplicated, intellectual property right protection techniques have become important and first appeared about fifty years ago (see I.J. Cox, M.L. Miller, The First 50 Years of Electronic Watermarking, EURASIP J. Appl. Signal Process. 2 (2002) 126-132, for an extended review). Digital watermarking was born. Since its inception, many watermarking techniques have appeared, in all possible transformed spaces. However, an important gap in the watermarking literature concerns human visual system models. Several human visual system (HVS) model based watermarking techniques were designed in the late 1990s. Due to the weak robustness results, especially concerning geometrical distortions, interest in such studies has declined. In this paper, we intend to take advantage of recent advances in HVS models and watermarking techniques to revisit this issue. We will demonstrate that it is possible to resist many attacks, including geometrical distortions, in HVS based watermarking algorithms. The perceptual model used here takes into account advanced features of the HVS identified from psychophysics experiments conducted in our laboratory. This model has been successfully applied in quality assessment and image coding schemes (M. Carnec, P. Le Callet, D. Barba, An image quality assessment method based on perception of structural information, IEEE Internat. Conf. Image Process. 3 (2003) 185-188; N. Bekkat, A. Saadane, D. Barba, Masking effects in the quality assessment of coded images, in: SPIE Human Vision and Electronic Imaging V, 3959 (2000) 211-219). In this paper the human visual system model is used to create a perceptual mask in order to optimize the watermark strength. The optimal watermark obtained satisfies both invisibility and robustness requirements.
Contrary to most watermarking schemes using advanced perceptual masks, in order to best thwart the de-synchronization problem induced by geometrical distortions, we propose here a Fourier domain embedding and detection technique optimizing the amplitude of the watermark. Finally, the robustness of the scheme obtained is assessed against all attacks provided by the Stirmark benchmark. This work proposes a new digital rights management technique using an advanced human visual system model that is able to resist various kinds of attacks, including many geometrical distortions.
Investigating Epipolar Plane Image Representations for Objective Quality Evaluation of Light Field Images
With the ongoing advances in Light Field (LF) technology, research in LF acquisition, compression, and processing has gained momentum. This has increased the need for objective quality evaluation of LF content. Many processing algorithms are still optimized against peak signal to noise ratio (PSNR). Lately, several attempts have been made to improve objective quality evaluation, such as extending 2D metrics to the 4D LF domain. However, there is still considerable room for improvement. In this paper, we experiment with existing 2D image quality metrics on the Epipolar Plane Image (EPI) representations of LF content to reveal characteristics of LF related distortions. We discuss the challenges and suggest possible directions towards LF image quality evaluation on EPI representations.
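As a rough illustration of the idea in this abstract, the sketch below slices a horizontal EPI out of a small light field and scores it with PSNR. It is not the authors' pipeline: the nested-list layout `light_field[v][u][y][x]`, the helper names `horizontal_epi`, `flatten` and `psnr`, and the 8-bit peak value of 255 are all assumptions made for this example.

```python
import math


def horizontal_epi(light_field, v, y):
    """Stack row y of every view along the horizontal angular axis u.

    Assumed (hypothetical) layout: light_field[v][u][y][x].
    The result is a 2D image indexed as epi[u][x].
    """
    return [list(view[y]) for view in light_field[v]]


def flatten(image):
    """Turn a 2D list of pixel rows into one flat list of pixels."""
    return [pixel for row in image for pixel in row]


def psnr(reference, distorted, peak=255.0):
    """Peak signal to noise ratio in dB between two 2D images."""
    ref, dist = flatten(reference), flatten(distorted)
    if not ref or len(ref) != len(dist):
        raise ValueError("images must be non-empty and the same size")
    mse = sum((r - d) ** 2 for r, d in zip(ref, dist)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy light field: 1 vertical view position, 2 horizontal ones, 2x2 pixels.
reference_lf = [[[[10, 20], [30, 40]], [[50, 60], [70, 80]]]]
distorted_lf = [[[[10, 20], [31, 40]], [[50, 60], [70, 82]]]]

epi_ref = horizontal_epi(reference_lf, v=0, y=1)
epi_dist = horizontal_epi(distorted_lf, v=0, y=1)
print(psnr(epi_ref, epi_dist))
```

Any other 2D metric could be substituted for `psnr` here, since the EPI is handled as an ordinary 2D image.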
Video Quality Model based on a spatiotemporal features extraction for H.264-coded HDTV sequences
As a contribution to the design of an objective quality metric in the specific context of High Definition Television (HDTV), this paper proposes a video quality evaluation model. A spatio-temporal segmentation of sequences provides features that are used with the bitrate to predict the subjective evaluation of the H.264-distorted sequences. In addition, subjective tests have been conducted to provide the mean observer's quality appreciation and assess the model against reality. Existing video quality algorithms have been compared to our model. They are outperformed on every performance criterion.
How is Gaze Influenced by Image Transformations? Dataset and Model
Data size is the bottleneck for developing deep saliency models, because
collecting eye-movement data is very time consuming and expensive. Most
current studies on human attention and saliency modeling have used high quality
stereotype stimuli. In the real world, however, captured images undergo various
types of transformations. Can we use these transformations to augment existing
saliency datasets? Here, we first create a novel saliency dataset including
fixations of 10 observers over 1900 images degraded by 19 types of
transformations. Second, by analyzing eye movements, we find that observers
look at different locations over transformed versus original images. Third, we
utilize the new data over transformed images, called data augmentation
transformation (DAT), to train deep saliency models. We find that label
preserving DATs with negligible impact on human gaze boost saliency prediction,
whereas some other DATs that severely impact human gaze degrade the
performance. These label preserving valid augmentation transformations provide
a solution to enlarge existing saliency datasets. Finally, we introduce a novel
saliency model based on generative adversarial network (dubbed GazeGAN). A
modified UNet is proposed as the generator of the GazeGAN, which combines
classic skip connections with a novel center-surround connection (CSC), in
order to leverage multi level features. We also propose a histogram loss based
on Alternative Chi Square Distance (ACS HistLoss) to refine the saliency map in
terms of luminance distribution. Extensive experiments and comparisons over 3
datasets indicate that GazeGAN achieves the best performance in terms of
popular saliency evaluation metrics, and is more robust to various
perturbations. Our code and data are available at:
https://github.com/CZHQuality/Sal-CFS-GAN
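The abstract does not spell out the histogram loss, so the following is only a minimal sketch of one common definition of the alternative chi-square distance, d = 2 * sum((a - b)^2 / (a + b)), applied to normalized luminance histograms. The function names, the bin count, and the [0, 1] value range are assumptions for illustration; the loss in the linked repository may differ, and this plain-Python version is not differentiable as a training loss would need to be.

```python
def normalized_histogram(values, bins=8, lo=0.0, hi=1.0):
    """Histogram of values in [lo, hi], normalized to sum to 1."""
    if not values:
        raise ValueError("values must not be empty")
    counts = [0] * bins
    for value in values:
        index = int((value - lo) / (hi - lo) * bins)
        index = max(0, min(index, bins - 1))  # clamp edge cases into range
        counts[index] += 1
    total = sum(counts)
    return [count / total for count in counts]


def alt_chi_square(hist_a, hist_b):
    """Alternative chi-square distance: 2 * sum((a - b)^2 / (a + b))."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(hist_a, hist_b):
        if a + b > 0:  # skip bins that are empty in both histograms
            total += (a - b) ** 2 / (a + b)
    return 2.0 * total


# Hypothetical saliency maps flattened to luminance values in [0, 1].
predicted = [0.1, 0.1, 0.6, 0.9]
ground_truth = [0.1, 0.4, 0.6, 0.9]

distance = alt_chi_square(
    normalized_histogram(predicted, bins=4),
    normalized_histogram(ground_truth, bins=4),
)
print(distance)
```

The distance is zero when the two luminance distributions match and grows as they diverge, which is the property a histogram-based refinement term relies on.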