411 research outputs found

    Naturalistic depth perception and binocular vision

    Humans continuously move both their eyes to redirect their foveae to objects at new depths. To correctly execute these complex combinations of saccades, vergence eye movements and accommodation changes, the visual system makes use of multiple sources of depth information, including binocular disparity and defocus. Furthermore, during development, both fine-tuning of oculomotor control and correct eye growth are likely driven by complex interactions between eye movements, accommodation, and the distributions of defocus and depth information across the retina. I have employed photographs of natural scenes taken with a commercial plenoptic camera to examine depth perception while varying perspective, blur and binocular disparity. Using a gaze-contingent display with these natural images, I have shown that disparity and peripheral blur interact to modify eye movements and facilitate binocular fusion. By decoupling visual feedback for each eye, I have found it possible to induce both conjugate and disconjugate changes in saccadic adaptation, which helps us understand to what degree the eyes can be individually controlled. To understand the aetiology of myopia, I have developed geometric models of emmetropic and myopic eye shape, from which I have derived psychophysically testable predictions about visual function. I have then tested the myopic against the emmetropic visual system and have found that some aspects of visual function decrease in the periphery at a faster rate in best-corrected myopic observers than in emmetropes. To study the effects of different depth cues on visual development, I have investigated accommodation response and sensitivity to blur in normal and myopic subjects. This body of work furthers our understanding of oculomotor control and 3D perception, has applied implications regarding discomfort in the use of virtual reality, and provides clinically relevant insights regarding the development of refractive error and potential approaches to prevent incorrect emmetropization.
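
    The two depth signals at the heart of this work, binocular disparity and defocus, can be tied to viewing geometry in a few lines of code. The sketch below (Python, illustrative only and not taken from the thesis; the 63 mm interpupillary distance and the thin-lens accommodation approximation are my assumptions) computes the disparity and the dioptric defocus of an object at a given distance while the observer fixates and accommodates at another distance.

        import math

        def vergence_angle(distance_m, ipd_m=0.063):
            """Vergence angle (radians) needed to fixate a point at distance_m."""
            return 2.0 * math.atan(ipd_m / (2.0 * distance_m))

        def binocular_disparity_arcmin(object_m, fixation_m, ipd_m=0.063):
            """Disparity of an object relative to the fixation point, in arcmin.
            Positive = uncrossed (object farther than fixation)."""
            delta = vergence_angle(fixation_m, ipd_m) - vergence_angle(object_m, ipd_m)
            return math.degrees(delta) * 60.0

        def defocus_dioptres(object_m, fixation_m):
            """Dioptric defocus of an object when the eye accommodates at the fixation distance."""
            return abs(1.0 / object_m - 1.0 / fixation_m)

        fixation = 0.5  # observer fixates and accommodates at 50 cm
        for obj in (0.4, 0.5, 1.0, 2.0):
            print(f"object at {obj:3.1f} m: "
                  f"{binocular_disparity_arcmin(obj, fixation):7.1f} arcmin disparity, "
                  f"{defocus_dioptres(obj, fixation):4.2f} D defocus")

    With these conventions, objects nearer than fixation produce crossed (negative) disparities and increasing dioptric blur, the two signals whose interaction the gaze-contingent experiments above manipulate.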

    Binocular fusion and invariant category learning due to predictive remapping during scanning of a depthful scene with eye movements

    How does the brain maintain stable fusion of 3D scenes when the eyes move? Every eye movement causes each retinal position to process a different set of scenic features, and thus the brain needs to binocularly fuse new combinations of features at each position after an eye movement. Despite these breaks in retinotopic fusion due to each movement, previously fused representations of a scene in depth often appear stable. The 3D ARTSCAN neural model proposes how the brain does this by unifying concepts about how multiple cortical areas in the What and Where cortical streams interact to coordinate processes of 3D boundary and surface perception, spatial attention, invariant object category learning, predictive remapping, eye movement control, and learned coordinate transformations. The model explains data from single-neuron and psychophysical studies of covert visual attention shifts prior to eye movements. The model further clarifies how perceptual, attentional, and cognitive interactions among multiple brain regions (LGN, V1, V2, V3A, V4, MT, MST, PPC, LIP, ITp, ITa, SC) may accomplish predictive remapping as part of the process whereby view-invariant object categories are learned. These results build upon earlier neural models of 3D vision and figure-ground separation and the learning of invariant object categories as the eyes freely scan a scene. A key process concerns how an object's surface representation generates a form-fitting distribution of spatial attention, or attentional shroud, in parietal cortex that helps maintain the stability of multiple perceptual and cognitive processes. Predictive eye movement signals maintain the stability of the shroud, as well as of binocularly fused perceptual boundaries and surface representations.
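
    The predictive-remapping idea behind this stability can be illustrated with a toy sketch. The Python snippet below is an assumption on my part, not the published 3D ARTSCAN equations (which use gain fields driven by eye movement commands rather than an explicit array shift): it pre-shifts a retinotopic map, such as an attentional shroud, by the vector of a planned saccade so that the map is already aligned with the retinal image when the eyes land.

        import numpy as np

        def predictive_remap(retinotopic_map, saccade_vector):
            """Pre-shift a retinotopic activity map by a planned saccade (dx, dy).

            A saccade of (dx, dy) shifts the image on the retina by (-dx, -dy),
            so a corollary-discharge copy of the motor command can shift the
            stored map the same way before the eyes actually move."""
            dx, dy = saccade_vector
            return np.roll(retinotopic_map, shift=(-dy, -dx), axis=(0, 1))

        # Toy attentional shroud centred on an attended object at the fovea.
        shroud = np.zeros((11, 11))
        shroud[5, 5] = 1.0
        remapped = predictive_remap(shroud, saccade_vector=(3, 0))  # plan a rightward saccade
        print(np.argwhere(remapped == 1.0))  # [[5 2]]: the object's future retinal position

    Because the map is shifted before the eyes move, anything referenced through it, such as fused boundaries and surfaces, stays in register across the saccade, which is the stability property the model is built to explain.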

    Visual Perception in Simulated Reality


    Activity in area V3A predicts positions of moving objects

    No description supplied.

    Media Space: an analysis of spatial practices in planar pictorial media.

    The thesis analyses the visual space displayed in pictures, film, television and digital interactive media. The argument is developed that depictions are informed by the objectives of the artefact as much as by any simple visual correspondence to the observed world. The simple concept of ‘realism’ is therefore anatomised and a more pragmatic theory proposed which resolves some of the traditional controversies concerning the relation between depiction and vision. This is then applied to the special problems of digital interactive media. An introductory chapter outlines the topic area and the main argument and provides an initial definition of terms. To provide a foundation for the ensuing arguments, a brief account is given of two existing and contrasted approaches to the notion of space: that of perception science which gives priority to acultural aspects, and that of visual culture which emphasises aspects which are culturally contingent. An existing approach to spatial perception (that of JJ Gibson originating in the 1940s and 50s) is applied to spatial depiction in order to explore the differences between seeing and picturing, and also to emphasise the many different cues for spatial perception beyond those commonly considered (such as binocularity and linear perspective). At this stage a simple framework of depiction is introduced which identifies five components or phases: the objectives of the picture, the idea chosen to embody the objectives, the model (essentially, the visual ‘subject matter’), the characteristics of the view and finally the substantive picture or depiction itself. This framework draws attention to the way in which each of the five phases presents an opportunity for decision-making about representation. The framework is used and refined throughout the thesis. Since pictures are considered in some everyday sense to be ‘realistic’ (otherwise, in terms of this thesis, they would not count as depictions), the nature of realism is considered at some length. The apparently unitary concept is broken down into several different types of realism and it is argued that, like the different spatial cues, each lends itself to particular objectives intended for the artefact. From these several types, two approaches to realism are identified, one prioritising the creation of a true illusion (that the picture is in fact a scene) and the other (of which there are innumerably more examples both across cultures and over historical time) one which evokes aspects of vision without aiming to exactly imitate the optical stimulus of the scene. Various reasons for the latter approach, and the variety of spatial practices to which it leads, are discussed. In addition to analysing traditional pictures, computer graphics images are discussed in conjunction with the claims for realism offered by their authors. In the process, informational and affective aspects of picture-making are distinguished, a distinction which it is argued is useful and too seldom made. Discussion of still pictures identifies the evocation of movement (and other aspects of time) as one of the principal motives for departing from attempts at straightforward optical matching. The discussion proceeds to the subject of film where, perhaps surprisingly now that the depiction of movement is possible, the lack of straightforward imitation of the optical is noteworthy again. This is especially true of the relationship between shots rather than within them; the reasons for this are analysed. 
This reinforces the argument that the spatial form of the fiction film, like that of other kinds of depiction, arises from its objectives, presenting realism once again as a contingent concept. The separation of depiction into two broad classes – one which aims to negate its own mediation, to seem transparent to what it depicts, and one which presents the fact of depiction ostensively to the viewer – is carried through from still pictures, via film, into a discussion of factual television and finally of digital interactive media. The example of factual television is chosen to emphasise how, despite the similarities between the technologies of film and television, spatial practices within some television genres contrast strongly with those of the mainstream fiction film. By considering historic examples, it is shown that many of the spatial practices now familiar in factual television were gradually expunged from the classical film when the latter became centred on the concerns of narrative fiction. By situating the spaces of interactive media in the context of other kinds of pictorial space, questions are addressed concerning the transferability of spatial usages from traditional media to those which are interactive. During the thesis the spatial practices of still-picture-making, film and television are characterised as ‘mature’ and ‘expressive’ (terms which are defined in the text). By contrast the spatial practices of digital interactive media are seen to be immature and inexpressive. It is argued that this is to some degree inevitable given the context in which interactive media artefacts are made and experienced – the lack of a shared ‘language’ or languages in any new media. Some of the difficult spatial problems which digital interactive media need to overcome are identified, especially where, as is currently normal, interaction is based on the relation between a pointer and visible objects within a depiction. The range of existing practice in digital interactive media is classified in a seven-part taxonomy, which again makes use of the objective-idea-model-view-picture framework, and again draws out the difference between self-concealing approaches to depiction and those which offer awareness of depiction as a significant component of the experience. The analysis indicates promising lines of enquiry for the future and emphasises the need for further innovation. Finally the main arguments are summarised and the thesis concludes with a short discussion of the implications for design arising from the key concepts identified – expressivity and maturity, pragmatism and realism.

    Augmented Reality and Its Application

    Augmented Reality (AR) provides an interactive experience of a real-world environment in which real-world objects and elements are enhanced by computer-generated perceptual information. It has many potential applications in education, medicine, and engineering, among other fields. This book explores these potential uses, presenting case studies and investigations of AR for vocational training, emergency response, interior design, architecture, and much more.

    Neural dynamics of invariant object recognition: relative disparity, binocular fusion, and predictive eye movements

    How does the visual cortex learn invariant object categories as an observer scans a depthful scene? Two neural processes that contribute to this ability are modeled in this thesis. The first model clarifies how an object is represented in depth. Cortical area V1 computes absolute disparity, which is the horizontal difference in retinal location of an image in the left and right foveas. Many cells in cortical area V2 compute relative disparity, which is the difference in absolute disparity of two visible features. Relative, but not absolute, disparity is unaffected by the distance of visual stimuli from an observer, and by vergence eye movements. In a laminar cortical model of V2, shunting lateral inhibition of disparity-sensitive layer 4 cells causes a peak shift in cell responses that transforms absolute disparity from V1 into relative disparity in V2. The second model simulates how the brain maintains stable percepts of a 3D scene during binocular eye movements. The visual cortex initiates the formation of a 3D boundary and surface representation by binocularly fusing corresponding features from the left and right retinotopic images. However, after each saccadic eye movement, every scenic feature projects to a different combination of retinal positions than before the saccade. Yet the 3D representation, resulting from the prior fusion, is stable through the post-saccadic re-fusion. One key to stability is predictive remapping: the system anticipates the new retinal positions of features entailed by eye movements by using gain fields that are updated by eye movement commands. The 3D ARTSCAN model developed here simulates how perceptual, attentional, and cognitive processes across different brain regions within the What and Where visual processing streams interact to coordinate predictive remapping, stable 3D boundary and surface perception, spatial attention, and the learning of object categories that are invariant to changes in an object's retinal projections. Such invariant learning helps the system to avoid treating each new view of the same object as a distinct object to be learned. The thesis hereby shows how a process that enables invariant object category learning can be extended to also enable stable 3D scene perception.
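
    The vergence invariance that motivates the V1-to-V2 transformation can be checked numerically. In the sketch below (Python, a worked example of the viewing geometry rather than the laminar model itself; the 63 mm interpupillary distance is an assumed value), the absolute disparities of two features change whenever the fixation distance changes, but their difference, the relative disparity, does not.

        import math

        IPD = 0.063  # interpupillary distance in metres (assumed value)

        def vergence(d):
            """Binocular subtense (radians) of a point at viewing distance d."""
            return 2.0 * math.atan(IPD / (2.0 * d))

        def absolute_disparity(d, fixation):
            """Absolute disparity (radians) of a point at distance d while fixating at `fixation`."""
            return vergence(fixation) - vergence(d)

        feature_a, feature_b = 0.8, 1.2  # two visible features at different depths (metres)
        for fixation in (0.5, 0.8, 2.0):  # three different vergence states
            abs_a = absolute_disparity(feature_a, fixation)
            abs_b = absolute_disparity(feature_b, fixation)
            print(f"fixate {fixation:3.1f} m: absolute {abs_a:+.5f}, {abs_b:+.5f} rad; "
                  f"relative {abs_b - abs_a:+.5f} rad")

    The relative value printed on each line is identical across fixation distances, which is the invariance that lets V2 cells code depth order regardless of where the eyes are converged.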

    Efficient streaming for high fidelity imaging

    Researchers and practitioners of graphics, visualisation and imaging have an ever-expanding list of technologies to account for, including (but not limited to) HDR, VR, 4K, 360°, light field and wide colour gamut. As these technologies move from theory to practice, the methods of encoding and transmitting this information need to become more advanced and capable year on year, placing greater demands on latency, bandwidth, and encoding performance. High dynamic range (HDR) video is still in its infancy; the tools for capture, transmission and display of true HDR content are still restricted to professional technicians. Meanwhile, computer graphics are nowadays near-ubiquitous, but to achieve the highest fidelity in real or even reasonable time a user must be located at or near a supercomputer or other specialist workstation. These physical requirements mean that it is not always possible to demonstrate these graphics in any given place at any time, and when the graphics in question are intended to provide a virtual reality experience, the constraints on performance and latency are even tighter. This thesis presents an overall framework for adapting upcoming imaging technologies for efficient streaming, constituting novel work across three areas of imaging technology. Over the course of the thesis, high dynamic range capture, transmission and display are considered, before specifically focusing on the transmission and display of high fidelity rendered graphics, including HDR graphics. Finally, this thesis considers the technical challenges posed by upcoming head-mounted displays (HMDs). In addition, a full literature review is presented across all three of these areas, detailing state-of-the-art methods for approaching all three problem sets. In the area of high dynamic range capture, transmission and display, a framework is presented and evaluated for efficient processing, streaming and encoding of high dynamic range video using general-purpose graphics processing unit (GPGPU) technologies. For remote rendering, state-of-the-art methods of augmenting a streamed graphical render are adapted to incorporate HDR video and high fidelity graphics rendering, specifically with regards to path tracing. Finally, a novel method is proposed for streaming graphics to an HMD for virtual reality (VR). This method utilises 360° projections to transmit and reproject stereo imagery to an HMD with minimal latency, with an adaptation for the rapid local production of depth maps.
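
    The final contribution, reprojecting a streamed 360° frame on the headset, comes down to mapping each view direction into the transmitted projection. The sketch below (Python, a simplified illustration rather than the thesis implementation; the axis convention and the nearest-neighbour lookup are assumptions) converts an HMD view direction into equirectangular texture coordinates and samples the corresponding pixel.

        import math

        def direction_to_equirect_uv(direction):
            """Map a unit view direction (x, y, z) to (u, v) coordinates in an
            equirectangular frame, with u covering longitude and v latitude.
            Convention assumed here: +z forward, +x right, +y up."""
            x, y, z = direction
            longitude = math.atan2(x, z)                   # -pi .. pi around the vertical axis
            latitude = math.asin(max(-1.0, min(1.0, y)))   # -pi/2 .. pi/2
            u = 0.5 + longitude / (2.0 * math.pi)
            v = 0.5 - latitude / math.pi
            return u, v

        def sample_equirect(frame, direction):
            """Nearest-neighbour lookup of a pixel in an H x W equirectangular frame
            (a list of rows) for a given view direction."""
            height, width = len(frame), len(frame[0])
            u, v = direction_to_equirect_uv(direction)
            col = min(width - 1, int(u * width))
            row = min(height - 1, int(v * height))
            return frame[row][col]

        # Toy 4x8 "frame": each pixel just records its own (row, col) index.
        frame = [[(r, c) for c in range(8)] for r in range(4)]
        print(sample_equirect(frame, (0.0, 0.0, 1.0)))  # looking straight ahead -> centre of the frame
        print(sample_equirect(frame, (0.0, 1.0, 0.0)))  # looking straight up -> top row

    In a real pipeline this lookup would be a filtered texture fetch run per eye on the headset; the sketch is only meant to show that compensating for head rotation between received frames is a purely local, low-latency operation.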