481 research outputs found

Data Analysis in Multimedia Quality Assessment: Revisiting the Statistical Tests

Author: Le Callet Patrick
Krasula Lukas
Narwaria Manish
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/06/2017

Assessment of multimedia quality relies heavily on subjective assessment, and is typically done by human subjects in the form of preferences or continuous ratings. Such data is crucial for the analysis of different multimedia processing algorithms as well as for the validation of objective (computational) methods for the said purpose. To that end, statistical testing provides a theoretical framework for drawing meaningful inferences and making well-grounded conclusions and recommendations. While parametric tests (such as the t-test and ANOVA) and error estimates (such as confidence intervals) are popular and widely used in the community, there appears to be a certain degree of confusion in the application of such tests. Specifically, the assumptions of normality and homogeneity of variance are often not well understood. Therefore, the main goal of this paper is to revisit them from a theoretical perspective and in the process provide useful insights into their practical implications. Experimental results on both simulated and real data are presented to support the arguments made. Software implementing the said recommendations is also made publicly available, in order to achieve the goal of reproducible research.
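To make the homogeneity-of-variance assumption concrete, here is a minimal Python sketch that contrasts the pooled-variance (Student) t statistic with Welch's t statistic, which does not assume equal variances. It is not the software released with the paper; the function names and the two rating lists are hypothetical and serve only as illustration.

```python
import math
import statistics


def student_t(a, b):
    """Pooled-variance (Student) t statistic; assumes both groups share one variance."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


def welch_t(a, b):
    """Welch t statistic with Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


# Hypothetical opinion scores for two processing conditions (not real data).
ratings_a = [4, 5, 4, 5, 4, 5]        # low spread
ratings_b = [1, 5, 2, 5, 1, 4, 3, 5]  # high spread

t_pooled = student_t(ratings_a, ratings_b)
t_welch, df_welch = welch_t(ratings_a, ratings_b)
print(f"Student t = {t_pooled:.3f} (df = {len(ratings_a) + len(ratings_b) - 2})")
print(f"Welch   t = {t_welch:.3f} (df = {df_welch:.2f})")
```

With equal group sizes the two statistics coincide and only the degrees of freedom differ; with unequal sizes and unequal variances, as in the toy data, both the statistic and the degrees of freedom change.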

arXiv.org e-Print Archive

Crossref

Hal-Diderot

On the perceptual similarity of realistic looking tone mapped High Dynamic Range images

Author: Barkowsky Marcus
Le Callet Patrick
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/2010

High Dynamic Range (HDR) images are usually displayed on conventional Low Dynamic Range (LDR) displays because of the limited availability of HDR displays. For the conversion of the large dynamic luminance range into eight-bit quantized values, parameterized Tone Mapping Operators (TMO) are applied. Human observers are able to optimize the parameters in order to get the highest Quality of Experience (QoE) by judging the displayed LDR images on a realism scale. In the study presented in this paper, two TMOs with three parameters each were evaluated by observers in a subjective experiment. Although the chosen parameter settings vary widely, the chosen images appear to have the same QoE for the observers. In order to assess this similarity objectively, three commonly used image quality measurement algorithms were applied. Their agreement with the preference of the observers was analyzed, and it was found that the Visual Difference Predictor (VDP) outperforms the Structural Similarity Index and the Root Mean Square Error. A threshold value for VDP is derived that indicates when two LDR images appear to have the same Quality of Experience.

A robust image watermarking technique based on quantization noise visibility thresholds

Author: Autrusseau Florent
Le Callet Patrick
Publication venue: 'Elsevier BV'
Publication date: 01/06/2007

A tremendous amount of digital multimedia data is broadcast daily over the internet. Since digital data can be very quickly and easily duplicated, intellectual property right protection techniques have become important; they first appeared about fifty years ago (see I.J. Cox, M.L. Miller, The First 50 Years of Electronic Watermarking, EURASIP J. Appl. Signal Process. 2 (2002) 126-132, for an extended review). Digital watermarking was born. Since its inception, many watermarking techniques have appeared, in all possible transformed spaces. However, an important gap in the watermarking literature concerns human visual system models. Several human visual system (HVS) model based watermarking techniques were designed in the late 1990s. Due to the weak robustness results, especially concerning geometrical distortions, the interest in such studies has declined. In this paper, we intend to take advantage of recent advances in HVS models and watermarking techniques to revisit this issue. We will demonstrate that it is possible to resist many attacks, including geometrical distortions, in HVS based watermarking algorithms. The perceptual model used here takes into account advanced features of the HVS identified from psychophysics experiments conducted in our laboratory. This model has been successfully applied in quality assessment and image coding schemes (M. Carnec, P. Le Callet, D. Barba, An image quality assessment method based on perception of structural information, IEEE Internat. Conf. Image Process. 3 (2003) 185-188; N. Bekkat, A. Saadane, D. Barba, Masking effects in the quality assessment of coded images, in: SPIE Human Vision and Electronic Imaging V, 3959 (2000) 211-219). In this paper the human visual system model is used to create a perceptual mask in order to optimize the watermark strength. The optimal watermark obtained satisfies both invisibility and robustness requirements.
Contrary to most watermarking schemes using advanced perceptual masks, in order to best thwart the de-synchronization problem induced by geometrical distortions, we propose here a Fourier domain embedding and detection technique optimizing the amplitude of the watermark. Finally, the robustness of the scheme obtained is assessed against all attacks provided by the Stirmark benchmark. This work proposes a new digital rights management technique using an advanced human visual system model that is able to resist various kinds of attacks, including many geometrical distortions.

Investigating Epipolar Plane Image Representations for Objective Quality Evaluation of Light Field Images

Author: Ak Ali
Le Callet Patrick
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/10/2019

With the ongoing advances in Light Field (LF) technology, research in LF acquisition, compression, and processing has gained momentum. This has increased the need for objective quality evaluation of LF content. Many processing algorithms are still optimized against peak signal-to-noise ratio (PSNR). Lately, several attempts have been made to improve objective quality evaluation, such as extending 2D metrics to the 4D LF domain. However, there is still considerable room for improvement. In this paper, we experiment with existing 2D image quality metrics on the Epipolar Plane Image (EPI) representations of LF content to reveal characteristics of LF related distortions. We discuss the challenges and suggest possible directions towards LF image quality evaluation on EPI representations.
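As a rough illustration of applying a 2D metric to an EPI, the following Python sketch slices a horizontal EPI out of a toy 4D light field and computes PSNR between the reference and distorted slices. It is not the authors' pipeline: the indexing convention `lf[u][v][y][x]`, the function names, and the synthetic data are all assumptions made for this example.

```python
import math


def horizontal_epi(lf, v, y):
    """Slice a horizontal EPI: fix angular row v and image row y.

    Assumes the light field is indexed as lf[u][v][y][x]; the result is a
    2D list with one row per horizontal view u and one column per pixel x.
    """
    return [list(lf[u][v][y]) for u in range(len(lf))]


def psnr(ref, dist, peak=255.0):
    """PSNR in dB between two equally sized 2D lists of sample values."""
    squared_error = 0.0
    count = 0
    for ref_row, dist_row in zip(ref, dist):
        for r, d in zip(ref_row, dist_row):
            squared_error += (r - d) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical slices
    return 10 * math.log10(peak ** 2 / mse)


# Toy light field: 3x3 views of 2x4 pixels (synthetic values, not real data).
U, V, H, W = 3, 3, 2, 4
reference = [[[[10 * x + 5 * u + y for x in range(W)] for y in range(H)]
              for v in range(V)] for u in range(U)]
# Hypothetical distortion: a constant offset of 2 on every sample.
distorted = [[[[s + 2 for s in row] for row in view] for view in views]
             for views in reference]

epi_ref = horizontal_epi(reference, v=1, y=0)
epi_dist = horizontal_epi(distorted, v=1, y=0)
print(f"EPI size: {len(epi_ref)} x {len(epi_ref[0])}")
print(f"PSNR on EPI: {psnr(epi_ref, epi_dist):.2f} dB")
```

Any other 2D metric could be substituted for `psnr` here, since the EPI is handled as an ordinary 2D image.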

Crossref

Hal-Diderot

Video Quality Model based on a spatiotemporal features extraction for H.264-coded HDTV sequences

Author: Barba Dominique
Le Callet Patrick
Péchard Stéphane
Publication venue: HAL CCSD
Publication date: 07/11/2007

As a contribution to the design of an objective quality metric in the specific context of High Definition Television (HDTV), this paper proposes a video quality evaluation model. A spatio-temporal segmentation of sequences provides features that are used with the bitrate to predict the subjective evaluation of the H.264-distorted sequences. In addition, subjective tests have been conducted to provide the mean observer's quality appreciation and assess the model against reality. Existing video quality algorithms have been compared to our model. They are outperformed on every performance criterion.

How is Gaze Influenced by Image Transformations? Dataset and Model

Author: Borji Ali
Le Callet Patrick
Che Zhaohui
Guo Guodong
Min Xiongkuo
Zhai Guangtao
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

Data size is the bottleneck for developing deep saliency models, because collecting eye-movement data is very time consuming and expensive. Most current studies on human attention and saliency modeling have used high quality stereotype stimuli. In the real world, however, captured images undergo various types of transformations. Can we use these transformations to augment existing saliency datasets? Here, we first create a novel saliency dataset including fixations of 10 observers over 1900 images degraded by 19 types of transformations. Second, by analyzing eye movements, we find that observers look at different locations over transformed versus original images. Third, we utilize the new data over transformed images, called data augmentation transformation (DAT), to train deep saliency models. We find that label preserving DATs with negligible impact on human gaze boost saliency prediction, whereas some other DATs that severely impact human gaze degrade the performance. These label preserving valid augmentation transformations provide a solution to enlarge existing saliency datasets. Finally, we introduce a novel saliency model based on a generative adversarial network (dubbed GazeGAN). A modified UNet is proposed as the generator of the GazeGAN, which combines classic skip connections with a novel center-surround connection (CSC) in order to leverage multi-level features. We also propose a histogram loss based on Alternative Chi Square Distance (ACS HistLoss) to refine the saliency map in terms of luminance distribution. Extensive experiments and comparisons over 3 datasets indicate that GazeGAN achieves the best performance in terms of popular saliency evaluation metrics, and is more robust to various perturbations. Our code and data are available at: https://github.com/CZHQuality/Sal-CFS-GAN

arXiv.org e-Print Archive