Crowding is a visual effect suffered by humans, in which an object that can
be recognized in isolation can no longer be recognized when other objects,
called flankers, are placed close to it. In this work, we study the effect of
crowding in artificial deep neural networks for object recognition. We analyze
both standard deep convolutional neural networks (DCNNs) and a new
version of DCNNs that 1) is multi-scale and 2) uses convolution filters whose
size changes with eccentricity with respect to the center of fixation.
Such networks, which we call eccentricity-dependent, are a computational model
of the feedforward path of the primate visual cortex. Our results reveal that
the eccentricity-dependent model, trained on target objects in isolation, can
recognize such targets in the presence of flankers, if the targets are near the
center of the image, whereas DCNNs cannot. Also, for all tested networks
trained on targets in isolation, we find that recognition accuracy
decreases the closer the flankers are to the target and the more
flankers there are. We find that visual similarity between the target and
flankers also plays a role and that pooling in early layers of the network
leads to more crowding. Additionally, we show that incorporating the flankers
into the images of the training set does not improve performance with crowding.

Comment: CBMM mem