8 research outputs found
LocalNorm: Robust image classification through dynamically regularized normalization
While modern convolutional neural networks achieve outstanding accuracy on many image classification tasks, they are, compared to humans, much more sensitive to image degradation. Here, we describe a variant of Batch Normalization, LocalNorm, that regularizes the normalization layer in the spirit of Dropout while dynamically adapting to the local image intensity and contrast at test time. We show that the resulting deep neural networks are much more resistant to noise-induced image degradation, improving accuracy by up to a factor of three, while achieving the same or slightly better accuracy on non-degraded classical benchmarks. In computational terms, LocalNorm adds negligible training cost and little or no cost at inference time, and can be applied to already-trained networks in a straightforward manner.
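As an illustration only, the behavior described above can be sketched as a simple NumPy layer. Everything here is an assumption drawn from the abstract, not the paper's actual implementation: the group count, the per-group statistics during training (the Dropout-like regularization), and the use of the current batch's own statistics at test time (the adaptation to local intensity and contrast).

```python
import numpy as np

def localnorm(x, gamma=1.0, beta=0.0, num_groups=4, training=True, eps=1e-5):
    """Hypothetical LocalNorm-style normalization over x of shape
    (batch, features). Assumed behavior: group-wise batch statistics
    during training; statistics of the incoming batch itself at test
    time, so normalization tracks the test images' intensity/contrast."""
    if training:
        # Split the batch into groups and normalize each group with its
        # own mean/variance -- a stochastic, Dropout-like regularizer.
        groups = np.array_split(x, num_groups, axis=0)
        normed = [(g - g.mean(axis=0)) / np.sqrt(g.var(axis=0) + eps)
                  for g in groups]
        out = np.concatenate(normed, axis=0)
    else:
        # At test time, use the current inputs' statistics rather than
        # running averages, adapting to the images actually presented.
        out = (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)
    return gamma * out + beta
```

Because no running averages are stored, such a layer could in principle be dropped into an already-trained network, matching the abstract's claim of straightforward retrofitting.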
Arousal state affects perceptual decision-making by modulating hierarchical sensory processing in a large-scale visual system model
Arousal levels strongly affect task performance, yet the arousal level that is optimal for a task depends on its difficulty. Performance on easy tasks peaks at higher arousal levels, whereas performance on difficult tasks displays an inverted U-shaped relationship with arousal, peaking at medium arousal levels, an observation first made by Yerkes and Dodson in 1908. It is commonly proposed that the noradrenergic locus coeruleus system regulates these effects on performance through a widespread release of noradrenaline resulting in changes in cortical gain. This account, however, does not explain why performance deteriorates at high arousal levels only in difficult, but not in simple, tasks. Here, we present a mechanistic model that revisits the Yerkes-Dodson effect from a sensory perspective: a deep convolutional neural network augmented with a global gain mechanism reproduced the same interaction between arousal state and task difficulty in its performance. Investigating this model revealed that global gain states differentially modulated sensory information encoding across the processing hierarchy, which explained their differential effects on performance on simple versus difficult tasks. These findings offer a novel hierarchical sensory processing account of how, and why, arousal state affects task performance.
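As a rough illustration of the kind of mechanism described, a toy feedforward network with a single multiplicative gain shared across every layer can be sketched as follows. The function names, network shape, and ReLU activation are illustrative assumptions, not the paper's DCNN:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def forward_with_global_gain(x, weights, gain=1.0):
    """Toy sketch of a global gain state: one scalar multiplies every
    layer's pre-activation, modeling an arousal-linked modulation that
    acts uniformly across the processing hierarchy."""
    a = x
    for W in weights:
        # The same gain state is applied at each stage of the hierarchy.
        a = relu(gain * (a @ W))
    return a
```

Even in this toy, the gain compounds across layers (with ReLU, doubling the gain in a two-layer net quadruples the output), which hints at why a single global state can affect early and late stages of the hierarchy differently.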
Leveraging spiking deep neural networks to understand the neural mechanisms underlying selective attention
Spatial attention enhances sensory processing of goal-relevant information and improves perceptual sensitivity. Yet, the specific neural mechanisms underlying the effects of spatial attention on performance are still contested. Here, we examine different attention mechanisms in spiking deep convolutional neural networks. We directly contrast effects of precision (internal noise suppression) and two different gain modulation mechanisms on performance on a visual search task with complex real-world images. Unlike standard artificial neurons, biological neurons have saturating activation functions, permitting implementation of attentional gain as gain on a neuron's input or on its outgoing connection. We show that modulating the connection is most effective in selectively enhancing information processing by redistributing spiking activity and by introducing additional task-relevant information, as shown by representational similarity analyses. Precision produced only minor attentional effects on performance. Our results, which mirror empirical findings, show that it is possible to adjudicate between attention mechanisms using more biologically realistic models and natural stimuli.
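The distinction between gain on a neuron's input and gain on its outgoing connection can be illustrated with a single saturating (sigmoid) unit. This is a minimal sketch under assumed names and parameters, not the paper's spiking implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def input_gain_response(x, w_in, w_out, g):
    """Gain applied to the neuron's input: the amplification passes
    through the saturating activation, so downstream drive can never
    exceed the ceiling w_out * 1.0, however large g becomes."""
    return w_out * sigmoid(g * w_in * x)

def output_gain_response(x, w_in, w_out, g):
    """Gain applied to the outgoing connection: scaling happens after
    saturation, so downstream drive keeps growing with g."""
    return g * w_out * sigmoid(w_in * x)
```

For a strongly driven unit, input gain is swallowed by the saturation while output gain still scales the signal reaching the next layer, which is one intuition for why modulating the connection can be the more effective attentional mechanism.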
Depth in convolutional neural networks solves scene segmentation
Feedforward deep convolutional neural networks (DCNNs) are, under specific conditions, matching and even surpassing human performance in object recognition in natural scenes. This performance suggests that the analysis of a loose collection of image features could support the recognition of natural object categories, without dedicated systems to solve specific visual subtasks. Research in humans, however, suggests that while feedforward activity may suffice for sparse scenes with isolated objects, additional visual operations ('routines') that aid the recognition process (e.g. segmentation or grouping) are needed for more complex scenes. Linking human visual processing to the performance of DCNNs of increasing depth, we here explored if, how, and when object information is differentiated from the backgrounds it appears on. To this end, we controlled the information in both objects and backgrounds, as well as the relationship between them, by adding noise, manipulating background congruence, and systematically occluding parts of the image. Results indicate that with an increase in network depth, there is an increase in the distinction between object and background information. For shallower networks, results indicated a benefit of training on segmented objects. Overall, these results indicate that, de facto, scene segmentation can be performed by a network of sufficient depth. We conclude that the human brain could perform scene segmentation in the context of object identification without an explicit mechanism, by selecting or "binding" features that belong to the object and ignoring other features, in a manner similar to a very deep convolutional neural network.
Mechanisms of human dynamic object recognition revealed by sequential deep neural networks
In this project, we model human performance on a rapid serial visual presentation (RSVP) task using a variety of different computational models.