
    Edges and bars: where do people see features in 1-D images?

    There have been two main approaches to feature detection in human and computer vision, based either on the luminance distribution and its spatial derivatives, or on the spatial distribution of local contrast energy. Thus, bars and edges might arise from peaks of luminance and luminance gradient respectively, or bars and edges might be found at peaks of local energy, where local phases are aligned across spatial frequency. This basic issue of definition is important because it guides more detailed models and interpretations of early vision. Which approach better describes the perceived positions of features in images? We used the class of 1-D images defined by Morrone and Burr in which the amplitude spectrum is that of a (partially blurred) square-wave and all Fourier components have a common phase. Observers used a cursor to mark where bars and edges were seen for different test phases (Experiment 1) or judged the spatial alignment of contours that had different phases (e.g. 0° and 45°; Experiment 2). The feature positions defined by both tasks shifted systematically to the left or right according to the sign of the phase offset, increasing with the degree of blur. These shifts were well predicted by the location of luminance peaks (bars) and gradient peaks (edges), but not by energy peaks which (by design) predicted no shift at all. These results encourage models based on a Gaussian-derivative framework, but do not support the idea that human vision uses points of phase alignment to find local, first-order features. Nevertheless, we argue that both approaches are presently incomplete and a better understanding of early vision may combine insights from both.
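    The contrast between the two schemes is easy to make concrete. The sketch below (our illustration in Python/NumPy; the harmonic count, blur constant and 45° test phase are assumptions, not the paper's stimulus specification) builds a Morrone-Burr style 1-D profile and locates the luminance peak, the gradient peak and the local-energy peak. Shifting the common phase moves the first two but, by construction, not the third.

        import numpy as np
        from scipy.signal import hilbert

        x = np.linspace(-np.pi, np.pi, 4096, endpoint=False)
        phi = np.deg2rad(45)   # common phase of all Fourier components
        blur = 0.05            # Gaussian attenuation of harmonics ("partial blur")

        # Square-wave amplitude spectrum (odd harmonics, amplitude 1/n), common phase.
        f = sum((1.0 / n) * np.exp(-0.5 * (n * blur) ** 2) * np.cos(n * x + phi)
                for n in range(1, 64, 2))

        grad = np.gradient(f, x)       # luminance gradient
        energy = np.abs(hilbert(f))    # local energy (analytic-signal envelope)

        print("bar  (luminance peak):", x[np.argmax(f)])
        print("edge (gradient peak): ", x[np.argmax(np.abs(grad))])
        print("energy peak:          ", x[np.argmax(energy)])
        # With phi != 0 the luminance and gradient peaks shift left or right,
        # while the energy peak stays at the point of phase alignment (x = 0).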

    Paradoxical psychometric functions ("swan functions") are explained by dilution masking in four stimulus dimensions

    The visual system dissects the retinal image into millions of local analyses along numerous visual dimensions. However, our perceptions of the world are not fragmentary, so further processes must be involved in stitching it all back together. Simply summing up the responses would not work, because this would signal an increase in image contrast with an increase in the number of mechanisms stimulated. Here, we consider a generic model of signal combination and counter-suppression designed to address this problem. The model is derived and tested for simple stimulus pairings (e.g. A + B), but is readily extended over multiple analysers. The model can account for nonlinear contrast transduction, dilution masking, and signal combination at threshold and above. It also predicts nonmonotonic psychometric functions, where sensitivity to signal A in the presence of pedestal B first declines with increasing signal strength (paradoxically dropping below 50% correct in two-interval forced choice) but then rises back up again, producing a contour that follows the wings and neck of a swan. We looked for and found these "swan" functions in four different stimulus dimensions (ocularity, space, orientation, and time), providing some support for our proposal.
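    The paradoxical dip can be illustrated with a toy calculation (our own schematic with arbitrary constants, not the paper's fitted equations): two channels, each divisively suppressed by the other component, feed a summed decision variable, and a cumulative-normal readout converts the response difference into two-interval performance.

        import numpy as np
        from scipy.stats import norm

        def resp(a, b, p=2.4, q=2.0, z=5.0):
            """Toy combination rule: each component's response is divisively
            suppressed by the other component (counter-suppression)."""
            return a**p / (z + b**q) + b**p / (z + a**q)

        b, sigma = 5.0, 3.0                    # pedestal contrast, decision noise
        for a in (0, 1, 2, 5, 10, 15, 25):
            d = resp(a, b) - resp(0, b)        # signal interval minus null interval
            print(f"A = {a:2d}:  Pc = {norm.cdf(d / sigma):.2f}")
        # Pc starts at 0.50, dips well below 0.5 at intermediate A (dilution
        # masking), then climbs back toward 1.0 -- the "swan" shape.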

    Binocular summation revisited: beyond √2

    Our ability to detect faint images is better with two eyes than with one, but how great is this improvement? A meta-analysis of 65 studies published across more than five decades shows definitively that psychophysical binocular summation (the ratio of binocular to monocular contrast sensitivity) is significantly greater than the canonical value of √2. Several methodological factors were also found to affect summation estimates. Binocular summation was significantly affected by both the spatial and temporal frequency of the stimulus, and stimulus speed (the ratio of temporal to spatial frequency) systematically predicts summation levels, with slow speeds (high spatial and low temporal frequencies) producing the strongest summation. We furthermore show that empirical summation estimates are affected by the ratio of monocular sensitivities, which varies across individuals, and is abnormal in visual disorders such as amblyopia. A simple modeling framework is presented to interpret the results of summation experiments. In combination with the empirical results, this model suggests that there is no single value for binocular summation, but instead that summation ratios depend on methodological factors that influence the strength of a nonlinearity occurring early in the visual pathway, before binocular combination of signals. Best practice methodological guidelines are proposed for obtaining accurate estimates of neural summation in future studies, including those involving patient groups with impaired binocular vision.
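    One standard way to formalise this (a common reading, not necessarily the authors' exact model) is Minkowski combination, S_bin = (S_L^m + S_R^m)^(1/m), where the exponent m reflects the strength of the early nonlinearity. Equal eyes then give a summation ratio of 2^(1/m), so m = 2 yields the canonical √2 and smaller m yields stronger summation, while unequal monocular sensitivities shrink the measured ratio:

        def summation_ratio(s_left, s_right, m):
            """Ratio of binocular to best-eye sensitivity under Minkowski
            combination: S_bin = (S_L^m + S_R^m)^(1/m)."""
            s_bin = (s_left**m + s_right**m) ** (1.0 / m)
            return s_bin / max(s_left, s_right)

        # Equal eyes: m = 2 gives sqrt(2) ~ 1.41; smaller m gives more summation.
        for m in (2.0, 1.6, 1.3):
            print(f"m = {m}: equal eyes -> {summation_ratio(1.0, 1.0, m):.2f}")

        # Unequal monocular sensitivities (e.g. amblyopia) reduce the ratio.
        print(f"m = 1.3, 2:1 eyes -> {summation_ratio(1.0, 0.5, 1.3):.2f}")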

    Contrast and lustre: a model that accounts for eleven different forms of contrast discrimination in binocular vision

    Our goal here is a more complete understanding of how information about luminance contrast is encoded and used by the binocular visual system. In two-interval forced-choice experiments we assessed observers' ability to discriminate changes in contrast that could be an increase or decrease of contrast in one or both eyes, or an increase in one eye coupled with a decrease in the other (termed IncDec). The base or pedestal contrasts were either in-phase or out-of-phase in the two eyes. The opposed changes in the IncDec condition did not cancel each other out, implying that along with binocular summation, information is also available from mechanisms that do not sum the two eyes' inputs. These might be monocular mechanisms. With a binocular pedestal, monocular increments of contrast were much easier to see than monocular decrements. These findings suggest that there are separate binocular (B) and monocular (L,R) channels, but only the largest of the three responses, max(L,B,R), is available to perception and decision. Results from contrast discrimination and contrast matching tasks were described very accurately by this model. Stimuli, data, and model responses can all be visualized in a common binocular contrast space, allowing a more direct comparison between models and data. Some results with out-of-phase pedestals were not accounted for by the max model of contrast coding, but were well explained by an extended model in which gratings of opposite polarity create the sensation of lustre. Observers can discriminate changes in lustre alongside changes in contrast.
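    The max(L, B, R) rule is easy to sketch (the transducer form and constants below are placeholder assumptions, not the paper's fitted values). The toy shows the key asymmetry: a monocular increment raises the binocular channel's response, but a large monocular decrement is partly hidden because the unchanged eye's channel floors the max.

        def transduce(c, p=2.4, q=2.0, z=5.0):
            """Generic nonlinear contrast transducer (placeholder form)."""
            c = abs(c)
            return c**p / (z + c**q)

        def response(c_left, c_right):
            L = transduce(c_left)               # left monocular channel
            R = transduce(c_right)              # right monocular channel
            B = transduce(c_left + c_right)     # binocular summing channel
            return max(L, B, R)                 # only the largest response is seen

        base = response(20, 20)                              # binocular pedestal
        print("inc (25, 20):", round(response(25, 20) - base, 2))
        print("dec (15, 20):", round(response(15, 20) - base, 2))
        print("dec ( 0, 20):", round(response(0, 20) - base, 2))
        # Even with the left eye blanked, the right eye's channel keeps the
        # max from falling further, compressing large monocular decrements.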

    Adaptation reveals a neural code for the visual location of orientation change

    We apply an adaptation technique to explore the neural code for the visual location of textures defined by modulation of orientation over space. In showing that adaptation to textures modulated around one orientation shifts the perceived location of textures modulated around a different orientation, we demonstrate the existence of a neural code for the location of orientation change that generalises across orientation content. Using competitive adaptation, we characterise the neural processes underlying this code as single-opponent for orientation, that is, with concentric excitatory/inhibitory receptive areas tuned to a single orientation.

    From filters to features: Scale-space analysis of edge and blur coding in human vision

    To make vision possible, the visual nervous system must represent the most informative features in the light pattern captured by the eye. Here we use Gaussian scale-space theory to derive a multiscale model for edge analysis and we test it in perceptual experiments. At all scales there are two stages of spatial filtering. An odd-symmetric, Gaussian first-derivative filter provides the input to a Gaussian second-derivative filter. Crucially, the output at each stage is half-wave rectified before feeding forward to the next. This creates nonlinear channels selectively responsive to one edge polarity while suppressing spurious or "phantom" edges. The two stages have properties analogous to simple and complex cells in the visual cortex. Edges are found as peaks in a scale-space response map that is the output of the second stage. The position and scale of the peak response identify the location and blur of the edge. The model predicts remarkably accurately our results on human perception of edge location and blur for a wide range of luminance profiles, including the surprising finding that blurred edges look sharper when their length is made shorter. The model enhances our understanding of early vision by integrating computational, physiological, and psychophysical approaches.
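    The cascade can be sketched directly (our illustration; the scale grid and the σ²-style scale normalisation are assumptions standing in for the paper's exact parameterisation). Applied to a blurred step, the scale-space peak of the second-stage response recovers both the edge's position and, through the peak scale, its blur.

        import numpy as np
        from scipy.ndimage import gaussian_filter1d

        x = np.arange(-256, 256)
        true_blur = 8.0
        edge = gaussian_filter1d((x > 0).astype(float), true_blur)  # blurred step

        best_val, best_x, best_scale = -np.inf, None, None
        for scale in range(1, 25):
            s1 = np.maximum(gaussian_filter1d(edge, scale, order=1), 0)  # odd stage, rectified
            s2 = np.maximum(-gaussian_filter1d(s1, scale, order=2), 0)   # even stage, rectified
            s2 *= scale ** 2          # scale normalisation so the peak scale tracks blur
            i = int(np.argmax(s2))
            if s2[i] > best_val:
                best_val, best_x, best_scale = s2[i], x[i], scale

        print(f"edge found at x = {best_x}; blur estimate ~ scale {best_scale}")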

    Motion sharpening and contrast: Gain control precedes compressive non-linearity?

    Blurred edges appear sharper in motion than when they are stationary. We (Vision Research 38 (1998) 2099-2108) have previously shown how such distortions in perceived edge blur may be accounted for by a model which assumes that luminance contrast is encoded by a local contrast transducer whose response becomes progressively more compressive as speed increases. If the form of the transducer is fixed (independent of contrast) for a given speed, then a strong prediction of the model is that motion sharpening should increase with increasing contrast. We measured the sharpening of periodic patterns over a large range of contrasts, blur widths and speeds. The results indicate that whilst sharpening increases with speed, it is practically invariant with contrast. The contrast invariance of motion sharpening is not explained by an early, static compressive non-linearity alone. However, several alternative explanations are also inconsistent with these results. We show that if a dynamic contrast gain control precedes the static non-linear transducer, then motion sharpening, its speed dependence, and its invariance with contrast can be predicted with reasonable accuracy.
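    That conclusion rests on a simple piece of reasoning which a toy calculation makes explicit (our own, with arbitrary constants): a fixed saturating transducer clips high-contrast edges harder than low-contrast ones, so sharpening should grow with contrast, whereas dividing the input by a contrast-dependent gain first roughly equalises what reaches the transducer.

        import numpy as np
        from scipy.ndimage import gaussian_filter1d

        x = np.arange(-200, 200)
        profile = gaussian_filter1d(np.where(x > 0, 1.0, -1.0), 20)  # blurred edge

        def rise_distance(w):
            """Width between 25% and 75% of the plateau-to-plateau swing."""
            hi = np.searchsorted(w, 0.5 * w.max())
            lo = np.searchsorted(w, 0.5 * w.min())
            return x[hi] - x[lo]

        def transducer(v, s=20.0):
            return v / (s + np.abs(v))     # static saturating nonlinearity

        for contrast in (5, 20, 80):
            v = contrast * profile
            plain = transducer(v)                               # transducer alone
            gained = transducer(v / (0.05 + contrast / 40.0))   # toy gain control first
            print(contrast, rise_distance(plain), rise_distance(gained))
        # Without gain control the edge narrows as contrast rises; with it,
        # the rise distance is nearly contrast-invariant, as in the data.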

    Orientation tuning of a two-stimulus afterimage: Implications for theories of filling-in.

    Sequential viewing of 2 orthogonally related gratings produces an afterimage related to the first grating (Vidyasagar, Buzas, Kisvarday, & Eysel, 1999; Francis & Rothmayer, 2003). We investigated how the appearance of the afterimage depended on the relative orientations of the 2 stimulus gratings. We first analyze the theoretical explanation of the appearance of the afterimage that was proposed by Francis and Rothmayer (2003). From the analysis, we show that the model must predict a rapid drop in afterimage occurrence as the gratings deviate from orthogonal. We also show that the model predicts that the shape of the afterimage should always be orthogonal to the second grating. We then report on 2 experiments that test the properties of the model and find that the experimental data are strikingly different from the model predictions. From these discrepancies we identify the key deficits of the current version of the model.

    A common contrast pooling rule for suppression within and between the eyes

    Recent work has revealed multiple pathways for cross-orientation suppression in cat and human vision. In particular, ipsiocular and interocular pathways appear to assert their influence before binocular summation in human vision but have different (1) spatial tuning, (2) temporal dependencies, and (3) adaptation after-effects. Here we use mask components that fall outside the excitatory passband of the detecting mechanism to investigate the rules for pooling multiple mask components within these pathways. We measured psychophysical contrast masking functions for vertical 1 cycle/deg sine-wave gratings in the presence of left or right oblique (±45 deg) 3 cycles/deg mask gratings with contrast C%, or a plaid made from their sum, where each component (i) had contrast 0.5Ci%. Masks and targets were presented to two eyes (binocular), one eye (monoptic), or different eyes (dichoptic). Binocular masking functions superimposed when plotted against C, but in the monoptic and dichoptic conditions, the grating produced slightly more suppression than the plaid when Ci ≥ 16%. We tested contrast gain control models involving two types of contrast combination on the denominator: (1) spatial pooling of the mask after a local nonlinearity (to calculate either root mean square contrast or energy) and (2) "linear suppression" (Holmes & Meese, 2004, Journal of Vision 4, 1080-1089), involving the linear sum of the mask component contrasts. Monoptic and dichoptic masking were typically better fit by the spatial pooling models, but binocular masking was not: it demanded strict linear summation of the Michelson contrast across mask orientation. Another scheme, in which suppressive pooling followed compressive contrast responses to the mask components (e.g., oriented cortical cells), was ruled out by all of our data. We conclude that the different processes that underlie monoptic and dichoptic masking use the same type of contrast pooling within their respective suppressive fields, but the effects do not sum to predict the binocular case.
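    The grating-versus-plaid logic is easy to reproduce with a toy gain-control denominator (placeholder constants, not the paper's fitted models). Splitting mask contrast C across two plaid components leaves a linear-sum denominator unchanged but halves an energy denominator, so only energy-style pooling predicts stronger suppression from the grating:

        def masked_response(ct, mask, pooling, p=2.4, q=2.0, z=5.0, w=0.1):
            if pooling == "energy":
                suppression = sum(c ** 2 for c in mask)   # pool after local squaring
            else:
                suppression = sum(mask)                   # linear sum of contrasts
            return ct**p / (z + ct**q + w * suppression)

        C = 32.0
        for pooling in ("energy", "linear"):
            grating = masked_response(10.0, [C], pooling)
            plaid = masked_response(10.0, [C / 2, C / 2], pooling)
            print(f"{pooling}: grating {grating:.2f}, plaid {plaid:.2f}")
        # Energy pooling: the grating suppresses more than the plaid (as in the
        # monoptic and dichoptic data). Linear pooling: they suppress equally
        # (as the binocular data demanded).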

    Motion, flash, and flicker: A unified spatiotemporal model of perceived edge sharpening

    Blurred edges appear sharper in motion than when they are stationary. We proposed a model of this motion sharpening that invokes a local, nonlinear contrast transducer function (Hammett et al., 1998, Vision Research 38, 2099-2108). Response saturation in the transducer compresses or 'clips' the input spatial waveform, rendering the edges as sharper. To explain the increasing distortion of drifting edges at higher speeds, the degree of nonlinearity must increase with speed or temporal frequency. A dynamic contrast gain control before the transducer can account for both the speed dependence and approximate contrast invariance of motion sharpening (Hammett et al., 2003, Vision Research, in press). We show here that this model also predicts perceived sharpening of briefly flashed and flickering edges, and we show that the model can account fairly well for experimental data from all three modes of presentation (motion, flash, and flicker). At moderate durations and lower temporal frequencies the gain control attenuates the input signal, thus protecting it from later compression by the transducer. The gain control is somewhat sluggish, and so it suffers both a slow onset and a loss of power at high temporal frequencies. Consequently, brief presentations and high temporal frequencies of drift and flicker are less protected from distortion, and show greater perceptual sharpening.
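    The "sluggish" gain pool can be caricatured as a leaky integrator of the contrast envelope that divides the signal before the transducer (our illustration; the time constant and semi-saturation constant are arbitrary). Brief flashes and fast flicker then reach the compressive stage less attenuated than long or slow presentations:

        import numpy as np

        dt, tau = 1.0, 80.0        # ms; tau makes the gain pool sluggish

        def gain_pool(env):
            """Leaky integrator (exponential low-pass) of the contrast envelope."""
            pool = np.zeros_like(env)
            for i in range(1, len(env)):
                pool[i] = pool[i - 1] + (dt / tau) * (env[i - 1] - pool[i - 1])
            return pool

        t = np.arange(0.0, 2000.0, dt)   # ms

        for dur in (20, 400):            # brief flashes outrun the pool
            env = np.where(t < dur, 1.0, 0.0)
            drive = env / (0.2 + gain_pool(env))
            print(f"{dur} ms flash: mean drive = {drive[t < dur].mean():.2f}")

        for freq in (2, 16):             # the pool tracks slow but not fast flicker
            env = 0.5 * (1 + np.sin(2 * np.pi * freq * t / 1000.0))
            drive = env / (0.2 + gain_pool(env))
            print(f"{freq} Hz flicker: peak steady-state drive = {drive[t > 1000].max():.2f}")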