
    Segmentation-Aware Convolutional Networks Using Local Attention Masks

    We introduce an approach to integrate segmentation information within a convolutional neural network (CNN). This counteracts the tendency of CNNs to smooth information across regions and increases their spatial precision. To obtain segmentation information, we set up a CNN to provide an embedding space in which region co-membership can be estimated from Euclidean distance. We use these embeddings to compute a local attention mask relative to every neuron position. We incorporate such masks in CNNs and replace the convolution operation with a "segmentation-aware" variant that allows a neuron to selectively attend to inputs coming from its own region. We call the resulting network a segmentation-aware CNN because it adapts its filters at each image point according to local segmentation cues. We demonstrate the merit of our method on two widely different dense prediction tasks involving classification (semantic segmentation) and regression (optical flow). Our results show that in semantic segmentation we can match the performance of DenseCRFs while being faster and simpler, and in optical flow we obtain clearly sharper responses than networks that do not use local attention masks. In both cases, segmentation-aware convolution yields systematic improvements over strong baselines. Source code for this work is available online at http://cs.cmu.edu/~aharley/segaware
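    The following is a minimal sketch (not the authors' released code) of the core idea, assuming PyTorch; the function and argument names (`segmentation_aware_conv`, `feats`, `embeds`, `lam`) are illustrative. Each output pixel masks the inputs in its kernel window by how close their embeddings lie, in Euclidean distance, to its own embedding, so the convolution attends mostly to inputs from the same region.

```python
# Hypothetical sketch of segmentation-aware convolution; names are illustrative.
import torch
import torch.nn.functional as F

def segmentation_aware_conv(feats, embeds, weight, bias=None, lam=1.0):
    """feats:  (B, C, H, W) input feature map
    embeds: (B, E, H, W) per-pixel embeddings (region co-membership cue)
    weight: (C_out, C, k, k) ordinary convolution filters
    lam:    sharpness of the local attention mask (assumed hyper-parameter)."""
    B, C, H, W = feats.shape
    k = weight.shape[-1]
    pad = k // 2

    # Unfold k x k neighbourhoods: features and embeddings per window position.
    f_patches = F.unfold(feats, k, padding=pad).view(B, C, k * k, H * W)
    e_patches = F.unfold(embeds, k, padding=pad).view(B, -1, k * k, H * W)

    # Euclidean distance from each neighbour's embedding to the centre pixel's.
    e_center = embeds.view(B, -1, 1, H * W)
    dist = (e_patches - e_center).pow(2).sum(dim=1).sqrt()   # (B, k*k, H*W)
    mask = torch.exp(-lam * dist)                            # local attention mask

    # Mask and renormalise the inputs, then apply the ordinary filter weights.
    masked = f_patches * mask.unsqueeze(1)
    masked = masked / (mask.sum(dim=1, keepdim=True).unsqueeze(1) + 1e-6)
    out = torch.einsum('bckn,ock->bon', masked,
                       weight.view(weight.shape[0], C, k * k)).view(B, -1, H, W)
    return out if bias is None else out + bias.view(1, -1, 1, 1)

# Usage sketch with random tensors.
y = segmentation_aware_conv(torch.randn(1, 8, 32, 32),   # features
                            torch.randn(1, 2, 32, 32),   # embeddings
                            torch.randn(16, 8, 3, 3))    # filters
```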

    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task in which there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al, 2003, Vision Research 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial positions of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it gives further weight to the argument that objects may be stored in and retrieved from a pre-attentional store during this task

    Personalized Automatic Estimation of Self-reported Pain Intensity from Facial Expressions

    Pain is a personal, subjective experience that is commonly evaluated through visual analog scales (VAS). While this is often convenient and useful, automatic pain detection systems can reduce pain score acquisition efforts in large-scale studies by estimating the pain score directly from the participants' facial expressions. In this paper, we propose a novel two-stage learning approach for VAS estimation: first, our algorithm employs Recurrent Neural Networks (RNNs) to automatically estimate Prkachin and Solomon Pain Intensity (PSPI) levels from face images. The estimated scores are then fed into personalized Hidden Conditional Random Fields (HCRFs), which estimate the VAS reported by each person. Personalization of the model is performed using a newly introduced facial expressiveness score, unique to each person. To the best of our knowledge, this is the first approach to automatically estimate VAS from face images. We show the benefits of the proposed personalized approach over a traditional non-personalized approach on a benchmark dataset for pain analysis from face images. Comment: Computer Vision and Pattern Recognition Conference, The 1st International Workshop on Deep Affective Learning and Context Modeling
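    As a rough illustration of the two-stage idea, a minimal PyTorch sketch follows; the GRU stands in for the paper's RNN, a plain linear regressor over sequence statistics plus a per-person expressiveness scalar is a hedged stand-in for the personalized HCRF, and all names (`FrameRNN`, `VASHead`, `expr`) are hypothetical rather than the authors' API.

```python
# Illustrative two-stage pipeline: frame-level PSPI, then sequence-level VAS.
import torch
import torch.nn as nn

class FrameRNN(nn.Module):
    """Stage 1 stand-in: per-frame PSPI estimation from face features."""
    def __init__(self, feat_dim=128, hidden=64):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, frames):                  # frames: (B, T, feat_dim)
        h, _ = self.gru(frames)                 # (B, T, hidden)
        return self.head(h).squeeze(-1)         # (B, T) predicted PSPI per frame

class VASHead(nn.Module):
    """Stage 2 stand-in: sequence-level VAS from PSPI statistics + expressiveness."""
    def __init__(self):
        super().__init__()
        self.lin = nn.Linear(4, 1)              # mean, max, std of PSPI + expressiveness

    def forward(self, pspi, expr):              # pspi: (B, T), expr: (B,)
        stats = torch.stack([pspi.mean(1), pspi.max(1).values, pspi.std(1), expr], dim=1)
        return self.lin(stats).squeeze(-1)      # (B,) predicted VAS

# Usage sketch with random data: 2 sequences of 30 frames each.
frames = torch.randn(2, 30, 128)
expr = torch.rand(2)                            # per-person expressiveness score
vas = VASHead()(FrameRNN()(frames), expr)
```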

    The influence of visual landscape on the free flight behavior of the fruit fly Drosophila melanogaster

    To study the visual cues that control steering behavior in the fruit fly Drosophila melanogaster, we reconstructed three-dimensional trajectories from images taken by stereo infrared video cameras during free flight within structured visual landscapes. Flies move through their environment using a series of straight flight segments separated by rapid turns, termed saccades, during which the fly alters course by approximately 90° in less than 100 ms. Altering the amount of background visual contrast caused significant changes in the fly’s translational velocity and saccade frequency. Between saccades, asymmetries in the estimates of optic flow induce gradual turns away from the side experiencing a greater motion stimulus, a behavior opposite to that predicted by a flight control model based upon optomotor equilibrium. To determine which features of visual motion trigger saccades, we reconstructed the visual environment from the fly’s perspective for each position in the flight trajectory. From these reconstructions, we modeled the fly’s estimation of optic flow on the basis of a two-dimensional array of Hassenstein–Reichardt elementary motion detectors and, through spatial summation, the large-field motion stimuli experienced by the fly during the course of its flight. Event-triggered averages of the large-field motion preceding each saccade suggest that image expansion is the signal that triggers each saccade. The asymmetry in output of the local motion detector array prior to each saccade influences the direction (left versus right) but not the magnitude of the rapid turn. Once initiated, visual feedback does not appear to influence saccade kinematics further. The total expansion experienced before a saccade was similar for flight within both uniform and visually textured backgrounds. In summary, our data suggest that complex behavioral patterns seen during free flight emerge from interactions between the flight control system and the visual environment
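    For concreteness, a minimal numpy sketch of a single row of Hassenstein–Reichardt elementary motion detectors follows; it is not the authors' model of the full two-dimensional array, and the parameter names (`tau`, `dt`) are illustrative. Each detector low-pass-delays one photoreceptor signal, correlates it with the undelayed signal of its neighbour, and subtracts the mirror-symmetric term, yielding a signed local motion estimate that can be spatially summed into large-field stimuli.

```python
# Minimal Hassenstein-Reichardt correlator sketch; parameters are illustrative.
import numpy as np

def hr_emd(luminance, tau=0.035, dt=0.001):
    """luminance: (T, N) time series sampled every dt seconds for N adjacent
    photoreceptors. Returns (T, N-1) signed local motion estimates."""
    T, N = luminance.shape
    # First-order low-pass filter acts as the delay stage of the correlator.
    alpha = dt / (tau + dt)
    delayed = np.zeros_like(luminance)
    for t in range(1, T):
        delayed[t] = delayed[t - 1] + alpha * (luminance[t] - delayed[t - 1])
    a, b = luminance[:, :-1], luminance[:, 1:]     # neighbouring receptors
    a_d, b_d = delayed[:, :-1], delayed[:, 1:]     # their delayed copies
    return a_d * b - b_d * a                       # opponent correlation

# Usage sketch: a rightward-drifting sine grating gives a consistently signed output.
t = np.arange(0, 1, 0.001)[:, None]
x = np.arange(20)[None, :]
grating = np.sin(2 * np.pi * (0.1 * x - 5 * t))
print(np.sign(hr_emd(grating).mean()))
```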