73,655 research outputs found

    Figure-Ground Organization Emerges in a Deep Net with a Feedback Loop

    Get PDF
    We used a deep net to model how object-specific activation at the high levels of a hierarchical neural network could be fed back to modify representations at lower levels. We first identified a subset of nodes in the uppermost hidden layer that were preferentially activated by images of people. We then ran a procedure to recursively modify an image so as to increase activation of the 'person-selective' nodes. The image was modified by choosing a rectangular region (of random size and position) and reducing contrast in that region. The modification was kept if the activation of the 'person-selective' nodes became larger relative to the activation of the remaining nodes in that layer, and discarded otherwise. This process led to appearance modification according to learned statistics, which included: (i) recovery of figural details in the occlusion zone, (ii) modification of figural details in the un-occluded zone according to what is consistent with object-category statistics, and (iii) suppression of distractors in the background. We also tried this process with the classic ambiguous face-vase image of Rubin. Depending on the focus of the feedback signals, either the faces or the central figure would be developed in detail. These results indicate that feedback of object-specific information can be used to facilitate figure-ground segregation and drive low-level representations towards enhancing perceptual interpretation.
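    A minimal sketch of the recursive modification loop described above, assuming a hypothetical `top_hidden(image)` function that returns the uppermost hidden-layer activations and a `person_idx` index set for the person-selective nodes (both are stand-ins, not the authors' code):

```python
# Sketch of the feedback-driven image modification loop; `top_hidden`
# and `person_idx` are assumed interfaces, not the paper's implementation.
import numpy as np

def person_score(acts, person_idx):
    """Activation of person-selective nodes relative to the rest of the layer."""
    mask = np.zeros(acts.shape, dtype=bool)
    mask[person_idx] = True
    return acts[mask].mean() - acts[~mask].mean()

def feedback_modify(image, top_hidden, person_idx, steps=1000, rng=None):
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    best = person_score(top_hidden(image), person_idx)
    for _ in range(steps):
        # Pick a rectangle of random size and position.
        y0, x0 = rng.integers(h), rng.integers(w)
        y1, x1 = rng.integers(y0 + 1, h + 1), rng.integers(x0 + 1, w + 1)
        trial = image.copy()
        patch = trial[y0:y1, x0:x1]
        # Reduce contrast in that region (pull pixel values toward the mean).
        trial[y0:y1, x0:x1] = patch.mean() + 0.5 * (patch - patch.mean())
        score = person_score(top_hidden(trial), person_idx)
        if score > best:                  # keep the change only if the
            image, best = trial, score    # person-selective score rose
    return image
```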

    Mass segregation of different populations inside the cluster NGC 6101

    Get PDF
    We have used ESO telescopes at La Silla and the Hubble Space Telescope (HST) in order to obtain accurate B, V, I CCD photometry for the stars located within 200" (≈ 2 half-mass radii, r_h = 1.71') of the center of the cluster NGC 6101. Color-magnitude diagrams extending from the red-giant tip to about 5 magnitudes below the main-sequence turnoff (MSTO; V = 20.05 ± 0.05) have been constructed. The following results have been obtained from the analysis of the CMDs: a) The overall morphology of the main branches confirms previous results from the literature, in particular the existence of a sizeable population of 73 "blue stragglers", which had already been partly detected (27 of them). They are considerably more concentrated than either the subgiant-branch or the main-sequence stars, and have the same spatial distribution as the horizontal-branch stars (84% probability from a K-S test). A hypothesis on the possible BSS progeny is also presented. b) The HB is narrow and the bulk of its stars are blue, as expected for a typical metal-poor globular cluster. c) The derived magnitudes for the HB and the MSTO, V(ZAHB) = 16.59 ± 0.10 and V(TO) = 20.05 ± 0.05, coupled with the values E(B-V) = 0.1, [Fe/H] = -1.80, and Y = 0.23, yield a distance modulus (m-M)_V = 16.23 and an age similar to other "old" metal-poor globular clusters. In particular, from the comparison with theoretical isochrones, we derive for this cluster an age of 13 Gyr. d) By using the large statistical sample of red-giant-branch (RGB) stars, we detected with high accuracy the position of the bump in the RGB luminosity function. This observational feature has been compared with theoretical prescriptions, yielding good agreement within the current theoretical and observational uncertainties. Comment: 13 pages, 17 figures; uses documentclass 'aa' v5.01 with package 'graphicx'. Accepted for publication in Astronomy & Astrophysics.
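    As a quick check on the quoted numbers, the cluster distance follows from the apparent distance modulus once extinction is removed; a back-of-envelope sketch, assuming the standard Galactic ratio A_V = 3.1 E(B-V):

```python
# Distance from the quoted apparent distance modulus, assuming the
# standard extinction law A_V = 3.1 * E(B-V) (an assumption here, not
# a value stated in the abstract).
m_M_V = 16.23                        # apparent distance modulus (m-M)_V
A_V   = 3.1 * 0.1                    # extinction for E(B-V) = 0.1
m_M_0 = m_M_V - A_V                  # true (dereddened) distance modulus
d_kpc = 10 ** (m_M_0 / 5 + 1) / 1e3  # distance modulus definition, in kpc
print(f"(m-M)_0 = {m_M_0:.2f}, d ~ {d_kpc:.1f} kpc")  # ~15.3 kpc
```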

    A neural model of border-ownership from kinetic occlusion

    Full text link
    Camouflaged animals that have very similar textures to their surroundings are difficult to detect when stationary. However, when an animal moves, humans readily see a figure at a different depth than the background. How do humans perceive a figure breaking camouflage, even though the texture of the figure and its background may be statistically identical in luminance? We present a model that demonstrates how the primate visual system performs figure–ground segregation in extreme cases of breaking camouflage based on motion alone. Border-ownership signals develop as an emergent property in model V2 units whose receptive fields lie near kinetically defined borders that separate the figure and background. Model simulations support border-ownership as a general mechanism by which the visual system performs figure–ground segregation, regardless of whether figure–ground boundaries are defined by luminance or motion contrast. The gradient of motion- and luminance-related border-ownership signals explains the perceived depth ordering of the foreground and background surfaces. Our model predicts that V2 neurons that are sensitive to kinetic edges are selective to border-ownership (magnocellular B cells). A distinct population of model V2 neurons is selective to border-ownership in figures defined by luminance contrast (parvocellular B cells). B cells in model V2 receive feedback from neurons in V4 and MT with larger receptive fields to bias border-ownership signals toward the figure. We predict that neurons in V4 and MT sensitive to kinetically defined figures play a crucial role in determining whether the foreground surface accretes, deletes, or produces a shearing motion with respect to the background. This work was supported in part by CELEST (NSF SBE-0354378 and OMA-0835976), the Office of Naval Research (ONR N00014-11-1-0535), and the Air Force Office of Scientific Research (AFOSR FA9550-12-1-0436). Published version.
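    A toy one-dimensional illustration (not the paper's circuit) of the core idea that feedback from units with larger receptive fields biases the competition between paired border-ownership cells; `figure_evidence` stands in for the V4/MT feedback signal:

```python
# Toy border-ownership competition: at each edge location, two B cells
# compete, one "owning" each side, and each is boosted by figure
# evidence pooled over its preferred side. Purely illustrative.
import numpy as np

def border_ownership(edge, figure_evidence, pool=5):
    """edge, figure_evidence: 1-D arrays. Returns +1 (figure to the right),
    -1 (figure to the left), or 0 (no edge) at each position."""
    own = np.zeros(len(edge))
    for i in np.flatnonzero(edge > 0):
        left  = figure_evidence[max(0, i - pool):i].sum()   # feedback, left side
        right = figure_evidence[i + 1:i + 1 + pool].sum()   # feedback, right side
        b_left  = edge[i] * (1 + left)    # B cell preferring figure-left
        b_right = edge[i] * (1 + right)   # B cell preferring figure-right
        own[i] = np.sign(b_right - b_left)  # winning B cell sets ownership
    return own
```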

    Exploiting surroundedness for saliency detection: a boolean map approach

    Full text link
    We demonstrate the usefulness of surroundedness for eye-fixation prediction by proposing a Boolean Map based Saliency model (BMS). In our formulation, an image is characterized by a set of binary images, which are generated by randomly thresholding the image's feature maps in a whitened feature space. Based on a Gestalt principle of figure-ground segregation, BMS computes a saliency map by discovering surrounded regions via topological analysis of the Boolean maps. Furthermore, we draw a connection between BMS and the Minimum Barrier Distance to provide insight into why and how BMS properly captures the surroundedness cue via Boolean maps. The strength of BMS is verified by its simplicity, efficiency, and superior performance compared with 10 state-of-the-art methods on seven eye-tracking benchmark datasets. US National Science Foundation (1059218, 1029430). http://cs-people.bu.edu/jmzhang/BMS/BMS_iccv13_preprint.pdf Accepted manuscript.
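    A minimal sketch of the Boolean-map idea on a single feature channel (BMS proper thresholds feature maps in a whitened color space and adds post-processing): connected regions that do not touch the image border count as surrounded, and hence as salient.

```python
# Boolean-map saliency sketch: threshold a feature map at random levels,
# then mark regions that do not touch the image border as surrounded.
import numpy as np
from scipy import ndimage

def bms_saliency(feature_map, n_thresh=24):
    """Average of 'surrounded' masks over randomly thresholded Boolean maps."""
    sal = np.zeros(feature_map.shape, dtype=float)
    lo, hi = feature_map.min(), feature_map.max()
    for t in np.random.uniform(lo, hi, n_thresh):
        # Use each Boolean map and its complement, as BMS does.
        for bmap in (feature_map > t, feature_map <= t):
            labels, _ = ndimage.label(bmap)
            # Labels of components that touch any image border.
            border = np.unique(np.concatenate([labels[0], labels[-1],
                                               labels[:, 0], labels[:, -1]]))
            surrounded = bmap & ~np.isin(labels, border)  # enclosed regions
            sal += surrounded
    return sal / sal.max() if sal.max() > 0 else sal
```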

    Texture Segregation By Visual Cortex: Perceptual Grouping, Attention, and Learning

    Get PDF
    A neural model is proposed of how laminar interactions in the visual cortex may learn and recognize object texture and form boundaries. The model brings together five interacting processes: region-based texture classification, contour-based boundary grouping, surface filling-in, spatial attention, and object attention. The model shows how form boundaries can determine regions in which surface filling-in occurs; how surface filling-in interacts with spatial attention to generate a form-fitting distribution of spatial attention, or attentional shroud; how the strongest shroud can inhibit weaker shrouds; and how the winning shroud regulates learning of texture categories, and thus the allocation of object attention. The model can discriminate abutted textures with blurred boundaries and is sensitive to texture-boundary attributes such as discontinuities in orientation and texture-flow curvature, as well as to relative orientations of texture elements. The model quantitatively fits a large set of human psychophysical data on orientation-based textures. The object-boundary output of the model is compared to computer vision algorithms using a set of human-segmented photographic images. The model classifies textures and suppresses noise using a multiple-scale oriented filterbank and a distributed Adaptive Resonance Theory (dART) classifier. The matched signal between the bottom-up texture inputs and top-down learned texture categories is utilized by oriented competitive and cooperative grouping processes to generate texture boundaries that control surface filling-in and spatial attention. Top-down modulatory attentional feedback from boundary and surface representations to early filtering stages results in enhanced texture boundaries and more efficient learning of texture within attended surface regions. Surface-based attention also provides a self-supervising training signal for learning new textures. The importance of surface-based attentional feedback in texture learning and classification is tested using a set of textured images from the Brodatz micro-texture album. Benchmark classification accuracies vary from 95.1% to 98.6% with attention, and from 90.6% to 93.2% without attention. Air Force Office of Scientific Research (F49620-01-1-0397, F49620-01-1-0423); National Science Foundation (SBE-0354378); Office of Naval Research (N00014-01-1-0624).
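    A sketch of the front end only: a multiple-scale oriented (Gabor) filterbank of the kind the model uses for texture classification, with illustrative scale and orientation parameters; the dART classifier and the grouping, filling-in, and attention stages are not reproduced here.

```python
# Multiple-scale oriented filterbank front end (illustrative parameters).
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(sigma, theta, wavelength, size=None):
    size = size or int(6 * sigma) | 1            # odd kernel width
    ax = np.arange(size) - size // 2
    x, y = np.meshgrid(ax, ax)
    xr = x * np.cos(theta) + y * np.sin(theta)   # rotate coordinates
    return (np.exp(-(x**2 + y**2) / (2 * sigma**2))
            * np.cos(2 * np.pi * xr / wavelength))

def oriented_features(image, scales=(2, 4, 8), n_orient=4):
    """Stack of rectified filter responses, one per (scale, orientation)."""
    feats = [np.abs(fftconvolve(image,
                                gabor_kernel(s, o * np.pi / n_orient, 4 * s),
                                mode="same"))
             for s in scales for o in range(n_orient)]
    return np.stack(feats)                       # shape: (scales*orients, H, W)
```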

    Fine-Grained Car Detection for Visual Census Estimation

    Full text link
    Targeted socioeconomic policies require an accurate understanding of a country's demographic makeup. To that end, the United States spends more than 1 billion dollars a year gathering census data such as race, gender, education, occupation, and unemployment rates. Compared to the traditional method of collecting surveys across many years, which is costly and labor-intensive, data-driven, machine-learning approaches are cheaper and faster, with the potential to detect trends in close to real time. In this work, we leverage the ubiquity of Google Street View images and develop a computer vision pipeline to predict income, per capita carbon emissions, crime rates, and other city attributes from a single source of publicly available visual data. We first detect cars in 50 million images across 200 of the largest US cities and train a model to predict demographic attributes using the detected cars. To facilitate our work, we have collected the largest and most challenging fine-grained dataset reported to date, consisting of over 2,600 classes of cars comprising images from Google Street View and other web sources, classified by car experts to account for even the most subtle visual differences. We use this data to construct the largest-scale fine-grained detection system reported to date. Our prediction results correlate well with ground-truth income data (r = 0.82), Massachusetts vehicle-registration records, and sources investigating crime rates, income segregation, per capita carbon emissions, and other market research. Finally, we learn interesting relationships between cars and neighborhoods, allowing us to perform the first large-scale sociological analysis of cities using computer vision techniques. Comment: AAAI 2017.
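    A sketch of the evaluation step only: the Pearson correlation between car-derived predictions and ground-truth city attributes. The arrays below are hypothetical placeholders, not the study's data.

```python
# Pearson correlation between predicted and ground-truth attributes.
import numpy as np

def pearson_r(pred, truth):
    pred, truth = np.asarray(pred, float), np.asarray(truth, float)
    pz = (pred - pred.mean()) / pred.std()    # standardize predictions
    tz = (truth - truth.mean()) / truth.std() # standardize ground truth
    return (pz * tz).mean()                   # mean product of z-scores = r

predicted_income = [54e3, 61e3, 48e3, 72e3]   # hypothetical per-city values
census_income    = [52e3, 65e3, 45e3, 70e3]   # hypothetical ground truth
# The abstract reports r = 0.82 on the real income data.
print(f"r = {pearson_r(predicted_income, census_income):.2f}")
```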

    On the Computational Modeling of Human Vision

    Full text link