39 research outputs found

    Action Recognition with a Bio--Inspired Feedforward Motion Processing Model: The Richness of Center-Surround Interactions

    Get PDF
    International audienceHere we show that reproducing the functional properties of MT cells with various center--surround interactions enriches motion representation and improves the action recognition performance. To do so, we propose a simplified bio--inspired model of the motion pathway in primates: It is a feedforward model restricted to V1-MT cortical layers, cortical cells cover the visual space with a foveated structure, and more importantly, we reproduce some of the richness of center-surround interactions of MT cells. Interestingly, as observed in neurophysiology, our MT cells not only behave like simple velocity detectors, but also respond to several kinds of motion contrasts. Results show that this diversity of motion representation at the MT level is a major advantage for an action recognition task. Defining motion maps as our feature vectors, we used a standard classification method on the Weizmann database: We obtained an average recognition rate of 98.9%, which is superior to the recent results by Jhuang et al. (2007). These promising results encourage us to further develop bio--inspired models incorporating other brain mechanisms and cortical layers in order to deal with more complex videos

    Laminar cortical dynamics of visual form and motion interactions during coherent object motion perception

    No full text
    How do visual form and motion processes cooperate to compute object motion when each process separately is insufficient? Consider, for example, a deer moving behind a bush. Here the partially occluded fragments of motion signals available to an observer must be coherently grouped into the motion of a single object. A 3D FORMOTION model comprises five important functional interactions involving the brain’s form and motion systems that address such situations. Because the model’s stages are analogous to areas of the primate visual system, we refer to the stages by corresponding anatomical names. In one of these functional interactions, 3D boundary representations, in which figures are separated from their backgrounds, are formed in cortical area V2. These depth-selective V2 boundaries select motion signals at the appropriate depths in MT via V2-to-MT signals. In another, motion signals in MT disambiguate locally incomplete or ambiguous boundary signals in V2 via MT-to-V1-to-V2 feedback. The third functional property concerns resolution of the aperture problem along straight moving contours by propagating the influence of unambiguous motion signals generated at contour terminators or corners. Here, sparse “feature tracking signals” from, e.g., line ends, are amplified to overwhelm numerically superior ambiguous motion signals along line segment interiors. In the fourth, a spatially anisotropic motion grouping process takes place across perceptual space via MT-MST feedback to integrate veridical feature-tracking and ambiguous motion signals to determine a global object motion percept. The fifth property uses the MT-MST feedback loop to convey an attentional priming signal from higher brain areas back to V1 and V2. The model's use of mechanisms such as divisive normalization, endstopping, cross-orientation inhibition, and longrange cooperation is described. Simulated data include: the degree of motion coherence of rotating shapes observed through apertures, the coherent vs. element motion percepts separated in depth during the chopsticks illusion, and the rigid vs. non-rigid appearance of rotating ellipses.Air Force Office of Scientific Research (F49620-01-1-0397); National Geospatial-Intelligence Agency (NMA201-01-1-2016); National Science Foundation (BCS-0235398, SBE-0354378); Office of Naval Research (N00014-95-1-0409, N00014-01-1-0624
    corecore