A neural model for the visual tuning properties of action-selective neurons

Abstract

SUMMARY: The recognition of actions of conspecifics is crucial for survival and social interaction. Most current models of the recognition of transitive (goal-directed) actions rely on a hypothesized role of internal motor simulations. However, these models do not specify how visual information can be processed by cortical mechanisms in order to be compared with such motor representations. This raises the question of how such visual processing might be accomplished, and to what extent motor processing is actually needed to account for the visual properties of action-selective neurons.
We present a neural model for the visual processing of transitive actions that is consistent with physiological data and that accomplishes the recognition of grasping actions from real video stimuli. Shape recognition is accomplished by a view-dependent hierarchical neural architecture that retains coarse position information at the highest level, which can be exploited by subsequent stages. Additionally, simple recurrent neural circuits integrate effector information over time and realize selectivity for temporal sequences. A novel mechanism combines information about the shape and position of object and effector in an object-centered frame of reference. Action-selective model neurons defined in such a relative reference frame are tuned to learned associations between object and effector shapes, as well as their relative position and motion.
We demonstrate that this model reproduces a variety of electrophysiological findings on the visual properties of action-selective neurons in the superior temporal sulcus, and of mirror neurons in area F5. Specifically, the model accounts for the fact that a majority of mirror neurons in area F5 show view dependence. The model predicts a number of electrophysiological results, which could partly be confirmed in recent experiments.
We conclude that the visual tuning of action-selective neurons can be accounted for by well-established, predominantly visual neural processes rather than by internal motor simulations.

METHODS: Shape recognition relies on a hierarchy of feature detectors of increasing complexity and invariance [1]. The mid-level features are learned from sequences of gray-level images depicting segmented views of hand and object shapes. The highest level of the hierarchy consists of detector populations for complete shapes with a coarse spatial resolution of approximately 3.7°. Additionally, effector shapes are integrated over time by asymmetric lateral connections between shape detectors, using a neural field approach [2]. These model neurons thus encode actions such as hand opening or closing for particular grip types.
We exploit a gain field mechanism to implement the central coordinate transformation of the shape representations into an object-centered reference frame [3]. Typical effector-object interactions correspond to activity regions in this relative reference frame and are learned from training examples. Similarly, simple motion-energy detectors applied in the object-centered reference frame encode relative motion. The tuning of transitive-action neurons is modeled as a multiplicative combination of relative shape and motion detectors.
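The gain-field transformation [3] can be sketched in one dimension (the model itself operates on two-dimensional shape maps; names and parameters here are illustrative): a layer of basis-function units tuned jointly to effector and object position is pooled along diagonals of fixed positional difference, yielding a population code for effector position in an object-centered frame.

```python
import numpy as np

centers = np.arange(-10.0, 11.0)      # preferred retinal positions

def pop_code(x, sigma=1.0):
    """Gaussian population code for a position x."""
    return np.exp(-(x - centers) ** 2 / (2 * sigma ** 2))

def relative_code(effector_pos, object_pos):
    """Gain-field (basis-function) layer: units tuned jointly to effector
    and object position, pooled so that each output unit collects all
    basis units with one fixed effector-minus-object difference."""
    basis = np.outer(pop_code(effector_pos), pop_code(object_pos))
    n = len(centers)
    return np.array([np.trace(basis, offset=-k) for k in range(-n + 1, n)])

rel_centers = np.arange(-20.0, 21.0)  # relative positions coded by the output

code_a = relative_code(effector_pos=3.0, object_pos=1.0)
code_b = relative_code(effector_pos=8.0, object_pos=6.0)
# Both codes peak at the same relative position (+2), independent of
# where the hand-object pair sits in the visual field.
```

Action-selective model units then combine such relative shape codes multiplicatively with relative-motion detectors defined in the same object-centered frame.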

RESULTS: The model performance was tested on a set of 160 unsegmented sequences of hand grasping or placing actions performed on objects of different sizes, using different grip types and views. Hand actions and objects were reliably recognized despite mutual occlusion. Detectors on the highest level showed correct action tuning in more than 95% of the examples and generalized to untrained views.
Furthermore, the model replicates a number of electrophysiological and imaging results on action-selective neurons, such as their selectivity for transitive over mimicked actions, their invariance to stimulus position, and their view dependence. In particular, using the same stimulus set, the model closely fits neural data from a recent electrophysiological experiment that confirmed sequence selectivity in mirror neurons in area F5, as previously predicted by the model.

References
[1] Serre, T. et al. (2007): IEEE Trans. Pattern Anal. Mach. Intell. 29, 411-426.
[2] Giese, M.A. and Poggio, T. (2003): Nat. Rev. Neurosci. 4, 179-192.
[3] Deneve, S. and Pouget, A. (2003): Neuron 37, 347-359.