697 research outputs found

    Curve Reconstruction via the Global Statistics of Natural Curves

    Reconstructing the missing parts of a curve has been the subject of much computational research, with applications in image inpainting, object synthesis, etc. Approaches to this problem are typically based on processes that seek visually pleasing or perceptually plausible completions. In this work we focus on reconstructing the underlying physically likely shape by utilizing the global statistics of natural curves. More specifically, we develop a reconstruction model that seeks the mean physical curve for a given inducer configuration. This simple model is straightforward to compute and receptive to diverse additional information, but it requires enough samples for all curve configurations, a practical requirement that limits its effective utilization. To address this practical issue we explore and exploit statistical geometrical properties of natural curves, and in particular, we show that in many cases the mean curve is scale invariant and oftentimes extensible. This, in turn, allows us to boost the number of examples and thus the robustness of the statistics and its applicability. The reconstruction results are not only more physically plausible but also lead to important insights on the reconstruction problem, including an elegant explanation of why certain inducer configurations are more likely to yield consistent perceptual completions than others. Comment: CVPR versio

    Recovering metric properties of objects through spatiotemporal interpolation

    Spatiotemporal interpolation (STI) refers to perception of complete objects from fragmentary information across gaps in both space and time. It differs from static interpolation in that requirements for interpolation are not met in any static frame. It has been found that STI produced objective performance advantages in a shape discrimination paradigm for both illusory and occluded objects when contours met conditions of spatiotemporal relatability. Here we report psychophysical studies testing whether spatiotemporal interpolation allows recovery of metric properties of objects. Observers viewed virtual triangles specified only by sequential partial occlusions of background elements by their vertices (the STI condition) and made forced-choice judgments of the object's size relative to a reference standard. We found that length could often be accurately recovered for conditions where fragments were relatable and formed illusory triangles. In the first control condition, three moving dots located at the vertices provided the same spatial and timing information as the virtual object in the STI condition but did not induce perception of interpolated contours or a coherent object. In the second control condition, oriented line segments were added to the dots and to the mid-points between the dots in a way that did not induce perception of interpolated contours. Control stimuli did not lead to accurate size judgments. We conclude that spatiotemporal interpolation can produce representations, from fragmentary information, of metric properties in addition to shape.

    Multigranularity Representations for Human Inter-Actions: Pose, Motion and Intention

    Tracking people and their body pose in videos is a central problem in computer vision. Standard tracking representations reason about temporal coherence of detected people and body parts. They have difficulty tracking targets under partial occlusions or rare body poses, where detectors often fail, since the number of training examples is often too small to deal with the exponential variability of such configurations. We propose tracking representations that track and segment people and their body pose in videos by exploiting information at multiple detection and segmentation granularities when available: whole body, parts, or point trajectories. Detections and motion estimates provide contradictory information in the case of false-alarm detections or leaking motion affinities. We consolidate contradictory information via graph steering, an algorithm for simultaneous detection and co-clustering in a two-granularity graph of motion trajectories and detections that corrects motion leakage between correctly detected objects while being robust to false alarms or spatially inaccurate detections. We first present a motion segmentation framework that exploits long-range motion of point trajectories and large spatial support of image regions. We show that the resulting video segments adapt to targets under partial occlusions and deformations. Second, we augment motion-based representations with object detection for dealing with motion leakage. We demonstrate how to combine dense optical flow trajectory affinities with repulsions from confident detections to reach a global consensus of detection and tracking in crowded scenes. Third, we study human motion and pose estimation. We segment hard-to-detect, fast-moving body limbs from their surrounding clutter and match them against pose exemplars to detect body pose under fast motion. We employ on-the-fly human body kinematics to improve tracking of body joints under wide deformations. We use motion segmentability of body parts for re-ranking a set of body joint candidate trajectories and jointly infer multi-frame body pose and video segmentation. We show empirically that such a multi-granularity tracking representation is worthwhile, obtaining significantly more accurate multi-object tracking and detailed body pose estimation on popular datasets.

    Early completion of occluded objects

    We show that early vision can use monocular cues to rapidly complete partially occluded objects. Visual search for easily detected fragments becomes difficult when the completed shape is similar to others in the display; conversely, search for fragments that are difficult to detect becomes easy when the completed shape is distinctive. Results indicate that completion occurs via the occlusion-triggered removal of occlusion edges and linking of associated regions. We fail to find evidence for a visible filling-in of contours or surfaces, but do find evidence for a "functional" filling-in that prevents the constituent fragments from being rapidly accessed. As such, it is only the completed structures, and not the fragments themselves, that serve as the basis for rapid recognition.

    Object completion effects in attention and memory


    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation [Landman et al, 2003 Vision Research 43 149–164]. Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) objects may be stored in, and retrieved from, a pre-attentional store during this task, which lends further weight to that argument.

    Cortical Dynamics of 3-D Figure-Ground Perception of 2-D Pictures

    This article develops the FACADE theory of 3-D vision and figure-ground separation to explain data concerning how 2-D pictures give rise to 3-D percepts of occluding and occluded objects. These percepts include pop-out of occluding figures and amodal completion of occluded figures in response to line drawings, to Bregman-Kanizsa displays in which the relative contrasts of occluding and occluded surfaces are reversed, to White displays from which either transparent or opaque occlusion percepts can obtain, to Egusa and Kanizsa square displays in which brighter regions look closer, and to Kanizsa stratification displays in which bistable reversals of occluding and occluded surfaces occur, and in which real contours and illusory contours compete to alter the reversal percept. The model describes how changes in contrast can alter a percept without a change in geometry, and conversely. More generally it shows how geometrical and contrastive properties of a picture can either cooperate or compete when forming the boundaries and surface representations that subserve conscious percepts. Spatially long-range cooperation and spatially short-range competition work together to separate the boundaries of occluding figures from their occluded neighbors. This boundary ownership process is sensitive to image T-junctions at which occluded figures contact occluding figures, but there are no explicit T-junction detectors in the network. Rather, the contextual balance of boundary cooperation and competition strengthens some boundaries while breaking others. These boundaries control the filling-in of color within multiple, depth-sensitive surface representations. Feedback between surface and boundary representations strengthens consistent boundaries while inhibiting inconsistent ones. It is suggested how both the boundary and the surface representations of occluded objects may be amodally completed, even while the surface representations of unoccluded objects become visible through modal completion. Distinct functional roles for conscious modal and amodal representations in object recognition, spatial attention, and reaching behaviors are discussed. Model interactions are interpreted in terms of visual, temporal, and parietal cortex. Model concepts provide a mechanistic neural explanation and revision of such Gestalt principles as good continuation, stratification, and non-accidental solution. Office of Naval Research (N00014-91-J-4100, N00014-95-I-0409, N00014-95-I-0657, N00014-92-J-11015

    The Impact of 2-D and 3-D Grouping Cues on Depth From Binocular Disparity

    Stereopsis is a powerful source of information about the relative depth of objects in the world. Humans can see depth from binocular disparity in isolation, without any other depth cues. However, many different stimulus properties can dramatically influence the depth we perceive. For example, there is an abundance of research showing that the configuration of a stimulus can impact the percept of depth, in some cases diminishing the amount of depth experienced. Much of the previous research has focused on discrimination thresholds; in one example, stereoacuity for a pair of vertical lines was shown to be markedly reduced when these lines were connected to form a rectangle apparently slanted in depth (e.g., McKee, 1983). The contribution of Gestalt figural grouping to this phenomenon has not been studied. This dissertation addresses the role that perceptual grouping plays in the recovery of suprathreshold depth from disparity. First, I measured the impact of perceptual closure on depth magnitude. Observers estimated the separation in depth of a pair of vertical lines as the amount of perceptual closure was varied. In a series of experiments, I characterized the 2-D and 3-D properties that contribute to 3-D closure and the estimates of apparent depth. Estimates of perceived depth were highly correlated with the strength of subjective closure. Furthermore, I highlighted the perceptual consequences (both costs and benefits) of a new disparity-based grouping cue that interacts with perceived closure, which I call good stereoscopic continuation. This cue was shown to promote detection in a visual search task but to reduce depth percepts compared to isolated features. Taken together, the results reported here show that specific 2-D and 3-D grouping constraints are required to promote recovery of a 3-D object. As a consequence, quantitative depth is reduced, but the object is rapidly detected in a visual search task. I propose that these phenomena are the result of object-based disparity smoothing operations that enhance object cohesion.