
Representing an Object by Interchanging What with Where

Abstract

Exploring representations is a fundamental step towards understanding vision. The visual system carries two types of information along separate pathways: one concerns what an object is and the other where it is. Initially, the what is represented by a pattern of activity distributed across millions of photoreceptors, whereas the where is only 'implicitly' given by their retinotopic positions. Many computational theories of object recognition rely on such pixel-based representations, but these are ill-suited to learning spatial information such as position and size precisely because the where information is encoded only implicitly.
Here we transform a retinal image of an object into an internal image by interchanging the what with the where, meaning that the patterns of intensity in the internal image describe the spatial information rather than the object information. Concretely, the retinal image of the object is deformed and inverted into a negative image, in which light areas appear dark and vice versa, and the object's spatial information is quantified by the intensity levels along the borders of that image.
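A minimal sketch of the kind of transformation described above, assuming a simple inversion and a hypothetical border-encoding scheme in which position and scale are supplied directly; the actual deformation and encoding used in the paper are not specified here, so this is illustrative only.

```python
import numpy as np

def to_internal_image(retinal_image, position, scale, border=4):
    """Illustrative what-where interchange (hypothetical scheme, not the
    paper's exact procedure): invert the object pattern into a negative
    image and write the object's spatial parameters as intensity levels
    on the image borders."""
    img = retinal_image.astype(float) / retinal_image.max()
    internal = 1.0 - img                      # negative image: light <-> dark

    # Assumed border encoding of the 'where' information:
    # top/bottom borders carry normalized (x, y) position, left/right carry scale.
    x, y = position
    internal[:border, :] = x                  # top border: horizontal position
    internal[-border:, :] = y                 # bottom border: vertical position
    internal[:, :border] = scale              # left border: scale
    internal[:, -border:] = scale             # right border: scale
    return internal

# Example: a 64x64 retinal image of an object at normalized position (0.3, 0.7), scale 0.5
retina = np.random.rand(64, 64)
internal = to_internal_image(retina, position=(0.3, 0.7), scale=0.5)
```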
Interestingly, the inner part of the internal image, excluding the borders, is invariant to position and scale. To further understand how the internal image associates the what with the where, we examined the internal image of a face that moves or is scaled on the retina, and found that the internal images form a linear vector space under object translation and scaling.
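One way to read the linear-vector-space claim, written as a hedged sketch with our own symbols (the abstract does not give the exact formulation): let $\Phi$ map a retinal image of object $o$ to its internal image, $T_{\Delta x}$ translate the object by $\Delta x$ on the retina, and $S_{\sigma}$ scale it by $\sigma$. Linearity under translation and scaling could then mean

$$\Phi\!\big(T_{\Delta x}\, S_{\sigma}\, o\big) \;\approx\; \Phi(o) \;+\; \Delta x \cdot \mathbf{b}_{x} \;+\; \log\sigma \cdot \mathbf{b}_{s},$$

i.e., translating or scaling the object moves its internal image along fixed directions $\mathbf{b}_{x}$ and $\mathbf{b}_{s}$ of a linear space, while the invariant inner part $\Phi(o)$ is left unchanged.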
In conclusion, these results suggest that what-where interchangeability may play an important role in organizing these two types of information into the brain's internal representation.
