54 research outputs found

    Flexible modeling of next-generation displays using a differentiable toolkit

    We introduce an open-source toolkit for simulating optics and visual perception. The toolkit offers differentiable functions that ease the optimization process in design. In addition, the toolkit supports applications ranging from calculating holograms for holographic displays to foveation in computer graphics. We believe this toolkit offers a gateway to removing overheads in scientific research related to next-generation displays.

    Beyond Flicker, Beyond Blur: View-coherent Metameric Light Fields for Foveated Display

    Ventral metamers, pairs of images that may differ substantially in the periphery but are perceptually identical, offer exciting new possibilities in foveated rendering and image compression, as well as insights into the human visual system. However, existing literature has mainly focused on creating metamers of static images. In this work, we develop a method for creating sequences of metameric frames, specifically light fields, with enforced consistency along the temporal, or angular, dimension. This greatly expands the potential applications for these metamers, and extending metamers along the third dimension offers further new potential for compression.

    Content-prioritised video coding for British Sign Language communication.

    Video communication of British Sign Language (BSL) is important for remote interpersonal communication and for the equal provision of services for deaf people. However, the use of video telephony and video conferencing applications for BSL communication is limited by inadequate video quality. BSL is a highly structured, linguistically complete, natural language system that expresses vocabulary and grammar visually and spatially using a complex combination of facial expressions (such as eyebrow movements, eye blinks and mouth/lip shapes), hand gestures, body movements and finger-spelling that change in space and time. Accurate natural BSL communication places specific demands on visual media applications which must compress video image data for efficient transmission. Current video compression schemes apply methods to reduce statistical redundancy and perceptual irrelevance in video image data based on a general model of Human Visual System (HVS) sensitivities. This thesis presents novel video image coding methods developed to achieve the conflicting requirements for high image quality and efficient coding. Novel methods of prioritising visually important video image content for optimised video coding are developed to exploit the HVS spatial and temporal response mechanisms of BSL users (determined by Eye Movement Tracking) and the characteristics of BSL video image content. The methods implement an accurate model of HVS foveation, applied in the spatial and temporal domains, at the pre-processing stage of a current standard-based system (H.264). Comparison of the performance of the developed and standard coding systems, using methods of video quality evaluation developed for this thesis, demonstrates improved perceived quality at low bit rates. BSL users, broadcasters and service providers benefit from the perception of high quality video over a range of available transmission bandwidths. The research community benefits from a new approach to video coding optimisation and better understanding of the communication needs of deaf people.

    Far touch: integrating visual and haptic perceptual processing on wearables

    The evolution of electronic computers seems to have now reached the ubiquitous realm of wearable computing. Although a vast gamut of systems has been proposed so far, we believe most systems lack proper feedback for the user. In this dissertation, we not only contribute to solving the feedback problem, but we also consider the design of a system to acquire and reproduce the sense of touch. In order for such a system to be feasible, a few important problems need to be considered. Here, we address two of them. First, we know that wireless streaming of high resolution video to a head-mounted display requires a high compression ratio. Second, we know that the choice of a proper feedback for the user depends on his/her ability to perceive it confidently across different scenarios. In order to solve the first problem, we propose a new limit that promises theoretically achievable data reduction ratios up to approximately 9:1 with no perceptual loss in typical scenarios. Also, we introduce a novel Gaussian foveation scheme that provides experimentally achievable gains up to approximately 2 times the compression ratio of typical compression schemes with less perceptual loss than in typical transmissions. The background material of both the limit and the foveation scheme includes a proposed pointwise retina-based constraint called pixel efficiency, that can be globally processed to reveal the perceptual efficiency of a display, and can be used together with a lossy parameter to locally control the spatial resolution of a foveated image. In order to solve the second problem, we provide an estimation of difference threshold that suggests that typically humans are able to discriminate between at least 6 different frequencies of an electrotactile stimulation. Also, we propose a novel sequence of experiments that suggests that a change from active touch to passive touch, or from a visual-haptic environment to a haptic environment, typically yields a reduction in the sensitivity index d' and an increase in the response bias c.
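For readers unfamiliar with the sensitivity index d' and response bias c mentioned in the abstract above, the following is a minimal sketch of how these two quantities are commonly computed under the standard equal-variance Gaussian signal detection model. The function name `sdt_measures` and the hit and false-alarm rates are hypothetical illustrations; they are not taken from the dissertation, whose exact estimation procedure is not given here.

```python
from statistics import NormalDist


def sdt_measures(hit_rate, false_alarm_rate):
    """Return (d_prime, criterion_c) for the equal-variance Gaussian model.

    Both rates must lie strictly between 0 and 1, because the inverse
    normal CDF is undefined at the endpoints.
    """
    z = NormalDist().inv_cdf  # z-transform (inverse standard normal CDF)
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa                # sensitivity index d'
    criterion_c = -0.5 * (z_hit + z_fa)   # response bias c
    return d_prime, criterion_c


# Hypothetical rates, chosen only to illustrate the direction of the effect.
active = sdt_measures(0.90, 0.10)   # e.g. an "active touch" condition
passive = sdt_measures(0.70, 0.10)  # e.g. a "passive touch" condition

print("active : d' = %.3f, c = %.3f" % active)
print("passive: d' = %.3f, c = %.3f" % passive)
```

With these made-up rates, the second condition has a lower d' and a higher c than the first, which is the pattern the abstract describes.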

    Learning GAN-based Foveated Reconstruction to Recover Perceptually Important Image Features

    A foveated image can be entirely reconstructed from a sparse set of samples distributed according to the retinal sensitivity of the human visual system, which rapidly decreases with increasing eccentricity. The use of Generative Adversarial Networks has recently been shown to be a promising solution for such a task, as they can successfully hallucinate missing image information. As in the case of other supervised learning approaches, the definition of the loss function and the training strategy heavily influence the quality of the output. In this work, we consider the problem of efficiently guiding the training of foveated reconstruction techniques such that they are more aware of the capabilities and limitations of the human visual system, and thus can reconstruct visually important image features. Our primary goal is to make the training procedure less sensitive to distortions that humans cannot detect and to focus on penalizing perceptually important artifacts. Given the nature of GAN-based solutions, we focus on the sensitivity of human vision to hallucination in the case of input samples with different densities. We propose psychophysical experiments, a dataset, and a procedure for training foveated image reconstruction. The proposed strategy renders the generator network flexible by penalizing only perceptually important deviations in the output. As a result, the method emphasizes the recovery of perceptually important image features. We evaluated our strategy and compared it with alternative solutions by using a newly trained objective metric, a recent foveated video quality metric, and user experiments. Our evaluations revealed significant improvements in the perceived image reconstruction quality compared with the standard GAN-based training approach.

    Space-variant picture coding

    Space-variant picture coding techniques exploit the strong spatial non-uniformity of the human visual system in order to increase coding efficiency in terms of perceived quality per bit. This thesis extends space-variant coding research in two directions. The first of these directions is in foveated coding. Past foveated coding research has been dominated by the single-viewer, gaze-contingent scenario. However, for research into the multi-viewer and probability-based scenarios, this thesis presents a missing piece: an algorithm for computing an additive multi-viewer sensitivity function based on an established eye resolution model, and, from this, a blur map that is optimal in the sense of discarding frequencies in least-noticeable-first order. Furthermore, for the application of a blur map, a novel algorithm is presented for the efficient computation of high-accuracy smoothly space-variant Gaussian blurring, using a specialised filter bank which approximates perfect space-variant Gaussian blurring to arbitrarily high accuracy and at greatly reduced cost compared to the brute force approach of employing a separate low-pass filter at each image location. The second direction is that of artificially increasing the depth-of-field of an image, an idea borrowed from photography with the advantage of allowing an image to be reduced in bitrate while retaining or increasing overall aesthetic quality. Two synthetic depth of field algorithms are presented herein, with the desirable properties of aiming to mimic occlusion effects as occur in natural blurring, and of handling any number of blurring and occlusion levels with the same level of computational complexity. The merits of this coding approach have been investigated by subjective experiments to compare it with single-viewer foveated image coding. The results found the depth-based preblurring to generally be significantly preferable to the same level of foveation blurring.
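To make the filter-bank idea in the abstract above concrete, here is a rough one-dimensional sketch. It is not the thesis's specialised filter bank, whose design is not described here. Instead it swaps in a generic approximation: the signal is blurred uniformly at a few fixed sigma values, and each output sample is linearly interpolated between the two bank outputs whose sigmas bracket the value requested by the blur map. All names (`gaussian_kernel`, `blur_uniform`, `blur_space_variant`), the bank sigmas, and the blur map are assumptions made for illustration.

```python
import math


def gaussian_kernel(sigma):
    """Normalised 1-D Gaussian kernel truncated at 3 sigma (identity if sigma <= 0)."""
    if sigma <= 0:
        return [1.0]
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_uniform(signal, sigma):
    """Blur the whole signal with one sigma, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


def blur_space_variant(signal, sigma_map, bank_sigmas):
    """Approximate per-sample blurring by blending a small bank of uniform blurs.

    bank_sigmas must be strictly increasing. Blending two Gaussian-blurred
    copies is only an approximation of a Gaussian at the intermediate sigma.
    """
    bank = [blur_uniform(signal, s) for s in bank_sigmas]
    out = []
    for i, sigma in enumerate(sigma_map):
        # Clamp the requested sigma to the range covered by the bank.
        sigma = min(max(sigma, bank_sigmas[0]), bank_sigmas[-1])
        hi = next(b for b in range(len(bank_sigmas)) if bank_sigmas[b] >= sigma)
        lo = max(hi - 1, 0)
        if hi == lo:
            out.append(bank[lo][i])
        else:
            t = (sigma - bank_sigmas[lo]) / (bank_sigmas[hi] - bank_sigmas[lo])
            out.append((1 - t) * bank[lo][i] + t * bank[hi][i])
    return out


# Hypothetical example: a step edge, with the gaze point at index 4 so that
# the requested blur grows with distance from that sample.
signal = [0.0] * 8 + [1.0] * 8
sigma_map = [abs(i - 4) * 0.5 for i in range(len(signal))]
bank_sigmas = [0.0, 1.0, 2.0, 4.0]
foveated = blur_space_variant(signal, sigma_map, bank_sigmas)
print([round(v, 3) for v in foveated])
```

The cost is one uniform blur per bank entry plus a per-sample blend, rather than a separate filter at every location. That is the kind of saving the abstract attributes to its filter-bank approach, although this sketch makes no claim about matching its accuracy.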