A practical guide and software for analysing pairwise comparison experiments
The most popular strategies for capturing subjective judgments from humans involve the construction of a unidimensional relative measurement scale, representing order preferences or judgments about a set of objects or conditions. This information is generally captured by means of direct scoring, either in the form of a Likert or cardinal scale, or by comparative judgments in pairs or sets. Pairwise comparisons are becoming increasingly popular because of the simplicity of the experimental procedure. However, this strategy requires non-trivial data analysis to aggregate the comparisons into a quality scale and analyse the results in order to take full advantage of the collected data. This paper explains the process of translating pairwise comparison data into a measurement scale, discusses the benefits and limitations of such scaling methods, and introduces publicly available software written in Matlab. We improve on existing scaling methods by introducing outlier analysis, providing methods for computing confidence intervals and statistical testing, and introducing a prior that reduces estimation error when the number of observers is low. Most of our examples focus on image quality assessment. Code is available at https://github.com/mantiuk/pwcm
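As an illustration of the scaling step, the sketch below fits a Thurstone Case V style observer model to a matrix of comparison counts by maximum likelihood. It is a minimal Python stand-in for the Matlab toolbox, without the prior, outlier analysis, or confidence intervals described above; the function name `scale_pairwise` and the example counts are our own.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def scale_pairwise(C):
    """Maximum-likelihood quality scale from a pairwise comparison matrix.

    C[i, j] = number of times condition i was chosen over condition j.
    Assumes P(i beats j) = Phi(q_i - q_j), a Thurstone-style Gaussian
    observer model (one common convention for the noise scale).
    """
    n = C.shape[0]

    def neg_log_likelihood(q_free):
        q = np.concatenate(([0.0], q_free))   # fix q_0 = 0 to remove the shift ambiguity
        d = q[:, None] - q[None, :]           # all pairwise score differences
        p = norm.cdf(d).clip(1e-9, 1 - 1e-9)  # win probabilities, clipped for stability
        return -np.sum(C * np.log(p))

    res = minimize(neg_log_likelihood, np.zeros(n - 1), method="L-BFGS-B")
    return np.concatenate(([0.0], res.x))     # scale values, first condition at 0

# Example: 3 conditions, 20 comparisons per pair (made-up counts)
C = np.array([[0, 15, 18],
              [5,  0, 12],
              [2,  8,  0]])
print(scale_pairwise(C))
```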
Predicting visible flicker in temporally changing images
Novel display algorithms such as low-persistence displays, black frame insertion, and temporal resolution multiplexing introduce temporal change into images at 40-180 Hz, on the boundary of the temporal integration of the visual system. This can lead to flicker, a highly objectionable artifact known to induce viewer discomfort. The critical flicker frequency (CFF) alone does not model this phenomenon well, as flicker sensitivity varies with contrast and spatial frequency; a content-aware model is required. In this paper, we introduce a visual model for predicting flicker visibility in temporally changing images. The model performs a multi-scale analysis on the difference between consecutive frames, normalizing values with the spatio-temporal contrast sensitivity function as approximated by the pyramid of visibility. The output of the model is a 2D detection probability map. We ran a subjective flicker marking experiment to fit the model parameters, and then analyze the difference between two display algorithms, black frame insertion and temporal resolution multiplexing, to demonstrate the application of our model.
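To make the model's structure concrete, here is a heavily simplified sketch of the pipeline described above: band-pass decomposition of the frame difference, normalization by a spatio-temporal sensitivity term, a psychometric function per scale, and probability summation across scales. The `csf_placeholder` function and the parameter values are toy stand-ins of our own, not the fitted pyramid-of-visibility model from the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def csf_placeholder(spatial_freq, temporal_freq):
    """Toy stand-in for a spatio-temporal CSF: sensitivity falls off
    with both spatial and temporal frequency (NOT the paper's model)."""
    return 100.0 * np.exp(-0.3 * spatial_freq) * np.exp(-0.05 * temporal_freq)

def flicker_probability_map(frame_a, frame_b, ppd, refresh_hz, levels=4, beta=3.5):
    """Sketch of a multi-scale flicker-visibility predictor.

    frame_a, frame_b : consecutive frames in linear luminance (cd/m^2)
    ppd              : display resolution in pixels per visual degree
    refresh_hz       : temporal frequency of the alternation
    Returns a 2D detection probability map.
    """
    diff = frame_a - frame_b
    mean_lum = 0.5 * (frame_a + frame_b) + 1e-6
    p_no_detect = np.ones_like(diff)
    for level in range(levels):
        sigma = 2.0 ** level
        # difference-of-Gaussians band of the frame difference
        band = gaussian_filter(diff, sigma) - gaussian_filter(diff, 2 * sigma)
        contrast = np.abs(band) / mean_lum       # local contrast of the change
        freq = ppd / (4.0 * sigma)               # rough peak frequency of this band
        sensitivity = csf_placeholder(freq, refresh_hz)
        # Weibull-style psychometric function: detection probability per band
        p_detect = 1.0 - np.exp(-(contrast * sensitivity) ** beta)
        p_no_detect *= 1.0 - p_detect            # probability summation over scales
    return 1.0 - p_no_detect
```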
Transformation Consistency Regularization – A Semi-supervised Paradigm for Image-to-Image Translation
Scarcity of labeled data has motivated the development of semi-supervised
learning methods, which learn from large portions of unlabeled data alongside a
few labeled samples. Consistency Regularization between model's predictions
under different input perturbations, particularly has shown to provide
state-of-the art results in a semi-supervised framework. However, most of these
method have been limited to classification and segmentation applications. We
propose Transformation Consistency Regularization, which delves into a more
challenging setting of image-to-image translation, which remains unexplored by
semi-supervised algorithms. The method introduces a diverse set of geometric
transformations and enforces the model's predictions for unlabeled data to be
invariant to those transformations. We evaluate the efficacy of our algorithm
on three different applications: image colorization, denoising and
super-resolution. Our method is significantly data efficient, requiring only
around 10 - 20% of labeled samples to achieve similar image reconstructions to
its fully-supervised counterpart. Furthermore, we show the effectiveness of our
method in video processing applications, where knowledge from a few frames can
be leveraged to enhance the quality of the rest of the movie
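A minimal sketch of the consistency term, assuming a PyTorch image-to-image model and using 90-degree rotations as the geometric transforms (the paper employs a more diverse set); the name `tcr_loss` and the L2 penalty are our own choices, and in training this term would be added to the usual supervised loss on the labeled samples.

```python
import torch
import torchvision.transforms.functional as TF

def tcr_loss(model, unlabeled, angles=(0, 90, 180, 270)):
    """Transformation consistency loss on an unlabeled batch (sketch).

    Enforces T(f(x)) ~= f(T(x)) for geometric transforms T, i.e. the
    model's output should be equivariant to the transformation.
    unlabeled : (N, C, H, W) tensor of unlabeled images.
    """
    base = model(unlabeled)                    # prediction on untransformed input
    loss = 0.0
    for angle in angles[1:]:
        pred_of_transformed = model(TF.rotate(unlabeled, angle))  # f(T(x))
        transformed_pred = TF.rotate(base, angle)                 # T(f(x))
        loss = loss + torch.mean((pred_of_transformed - transformed_pred) ** 2)
    return loss / (len(angles) - 1)

# Usage sketch: total = supervised_loss + lambda_tcr * tcr_loss(model, unlabeled)
```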
Robust estimation of exposure ratios in multi-exposure image stacks
Merging multi-exposure image stacks into a high dynamic range (HDR) image
requires knowledge of accurate exposure times. When exposure times are
inaccurate, for example, when they are extracted from a camera's EXIF metadata,
the reconstructed HDR images reveal banding artifacts at smooth gradients. To
remedy this, we propose to estimate exposure ratios directly from the input
images. We formulate the estimation of exposure ratios as an optimization problem in which pixels are selected from pairs of exposures to minimize the estimation error caused by camera noise. When pixel values are represented in the logarithmic domain, the problem can be solved efficiently using a linear solver. We demonstrate that the estimation can easily be made robust to pixel misalignment caused by camera or object motion by collecting pixels from multiple spatial tiles. The proposed automatic exposure estimation and alignment eliminates banding artifacts in popular datasets and is essential for applications that require physically accurate reconstructions, such as measuring the modulation transfer function of a display. The code for the method is available.
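The log-domain formulation reduces each pairwise ratio to a simple shift, which the sketch below estimates from well-exposed pixels. It omits the noise-aware pixel selection and per-tile pooling of the paper, using a plain median over valid pixels for robustness instead; the function name and thresholds are our own.

```python
import numpy as np

def estimate_exposure_ratios(stack, lo=0.05, hi=0.90):
    """Estimate exposure ratios between consecutive images (sketch).

    stack : list of linear (not gamma-encoded) images normalized to [0, 1],
            ordered from shortest to longest exposure.
    For pixels well exposed in BOTH images of a pair, b ~= r * a where r is
    the exposure ratio, so in the log domain r becomes a constant shift:
    log(b) - log(a) ~= log(r). The least-squares solution is the mean log
    difference; a median is used here for extra robustness to outliers.
    """
    ratios = []
    for a, b in zip(stack[:-1], stack[1:]):
        valid = (a > lo) & (a < hi) & (b > lo) & (b < hi)  # drop noisy/clipped pixels
        log_ratio = np.log(b[valid]) - np.log(a[valid])
        ratios.append(np.exp(np.median(log_ratio)))
    return np.array(ratios)
```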
Noise-Aware Merging of High Dynamic Range Image Stacks Without Camera Calibration
A near-optimal reconstruction of the radiance of a High Dynamic Range scene
from an exposure stack can be obtained by modeling the camera noise
distribution. The latent radiance is then estimated using Maximum Likelihood
Estimation. However, this requires a well-calibrated noise model of the camera, which is difficult to obtain in practice. We show that an unbiased estimate of comparable variance can be obtained with a simpler Poisson noise estimator, which does not require knowledge of camera-specific noise parameters. We demonstrate this empirically for four different cameras, ranging from a smartphone camera to a full-frame mirrorless camera. Our experimental results are consistent for simulated as well as real images, and across different camera settings.
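Under a pure Poisson photon-noise model, the maximum-likelihood radiance estimate has a particularly simple closed form: the sum of unsaturated pixel values divided by the sum of their exposure times. A minimal sketch, assuming linear pixel values normalized so that 1.0 is the saturation level:

```python
import numpy as np

def merge_poisson(stack, exposure_times, saturation=0.95):
    """Noise-aware HDR merge with a Poisson estimator (sketch).

    stack          : (K, H, W) array of linear pixel values in [0, 1]
    exposure_times : length-K exposure times in seconds
    No camera-specific noise calibration is needed: for Poisson counts
    n_k ~ Pois(radiance * t_k), the MLE is sum(n_k) / sum(t_k) over the
    unsaturated exposures of each pixel.
    """
    stack = np.asarray(stack, dtype=np.float64)
    t = np.asarray(exposure_times, dtype=np.float64).reshape(-1, 1, 1)
    usable = stack < saturation                            # exclude clipped pixels
    total_signal = np.sum(np.where(usable, stack, 0.0), axis=0)
    total_time = np.sum(np.where(usable, t, 0.0), axis=0)
    return total_signal / np.maximum(total_time, 1e-12)    # estimated radiance map
```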
Distilling Style from Image Pairs for Global Forward and Inverse Tone Mapping
Many image enhancement or editing operations, such as forward and inverse
tone mapping or color grading, do not have a unique solution, but instead a
range of solutions, each representing a different style. Despite this, existing
learning-based methods attempt to learn a unique mapping, disregarding this
style. In this work, we show that information about the style can be distilled
from collections of image pairs and encoded into a 2- or 3-dimensional vector.
This gives us not only an efficient representation but also an interpretable
latent space for editing the image style. We represent the global color mapping
between a pair of images as a custom normalizing flow, conditioned on a
polynomial basis of the pixel color. We show that such a network is more
effective than PCA or VAE at encoding image style in low-dimensional space and
lets us obtain an accuracy close to 40 dB, which is about a 7-10 dB improvement over state-of-the-art methods. Published in the European Conference on Visual Media Production (CVMP '22).
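For intuition about the polynomial-basis representation of a global color mapping, the sketch below fits such a mapping between an image pair by least squares. It is only a simplified stand-in: the paper instead learns an invertible normalizing flow conditioned on this kind of basis, which is what yields the low-dimensional, interpretable style vector. All names here are our own.

```python
import numpy as np

def poly_basis(rgb):
    """Degree-2 polynomial basis of pixel color (one common choice)."""
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    return np.stack([np.ones_like(r), r, g, b,
                     r * r, g * g, b * b, r * g, r * b, g * b], axis=1)

def fit_global_color_mapping(src, dst):
    """Least-squares fit of a global color mapping between an image pair.

    src, dst : (N, 3) arrays of corresponding linear RGB pixels.
    Returns a (10, 3) coefficient matrix M so that poly_basis(src) @ M
    approximates dst, i.e. the mapping is global: it depends only on the
    pixel's color, not its position.
    """
    A = poly_basis(src)
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return M

def apply_mapping(M, rgb):
    """Apply a fitted global color mapping to (N, 3) RGB pixels."""
    return poly_basis(rgb) @ M
```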
Exploiting the limitations of spatio-temporal vision for more efficient VR rendering
Ever-higher virtual reality (VR) display resolutions and good-quality anti-aliasing make rendering in VR prohibitively expensive. The generation of these complex frames 90 times per second in a binocular setup demands substantial computational power. Wireless transmission of the frames from the GPU to the VR headset poses another challenge, requiring high-bandwidth dedicated links.
The effect of display brightness and viewing distance: A dataset for visually lossless image compression
Visibility of image artifacts depends on the viewing conditions, such as display brightness and the distance to the display. However, most image and video quality metrics operate under the assumption of a single standard viewing condition, without considering luminance or viewing distance. To address this limitation, we isolate brightness and distance as the components impacting the visibility of artifacts and collect a new dataset for visually lossless image compression. The dataset includes images encoded with JPEG and WebP at the quality level that makes compression artifacts imperceptible to an average observer. The visibility thresholds are collected under two luminance conditions: 10 cd/m², simulating a dimmed mobile phone, and 220 cd/m², a typical peak luminance of modern computer displays; and two distance conditions: 30 and 60 pixels per visual degree. The dataset was used to evaluate the ability of existing image quality and visibility metrics to account for display brightness and viewing distance. Our experiments include two deep neural network architectures proposed to control image compression for visually lossless coding.
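Pixels per visual degree couples display resolution with viewing distance, so the two distance conditions above can be reproduced on any display. A small worked example (the 24-inch monitor figures are our own illustrative assumption):

```python
import math

def pixels_per_degree(distance_m, pixel_pitch_m):
    """Pixels per visual degree for a display viewed at a given distance.

    One visual degree spans roughly 2 * d * tan(0.5 deg) on the screen;
    dividing that span by the pixel pitch gives the pixels it contains.
    """
    span = 2.0 * distance_m * math.tan(math.radians(0.5))
    return span / pixel_pitch_m

# Example: a 24" 1920x1080 monitor has a pixel pitch of about 0.277 mm,
# so the 60 ppd condition corresponds to viewing it from roughly 0.95 m.
print(pixels_per_degree(0.95, 0.000277))   # ~60 ppd
```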