When and where do feed-forward neural networks learn localist representations?
According to parallel distributed processing (PDP) theory in psychology,
neural networks (NN) learn distributed rather than interpretable localist
representations. This view has been held so strongly that few researchers have
analysed single units to determine if this assumption is correct. However,
recent results from psychology, neuroscience and computer science have shown
the occasional existence of local codes emerging in artificial and biological
neural networks. In this paper, we undertake the first systematic survey of
when local codes emerge in a feed-forward neural network, using generated input
and output data with known qualities. We find that the number of local codes
that emerge from an NN follows a well-defined distribution across the number of
hidden layer neurons, with a peak determined by the size of input data, number
of examples presented and the sparsity of input data. Using a 1-hot output code
drastically decreases the number of local codes on the hidden layer. The number
of emergent local codes increases with the percentage of dropout applied to the
hidden layer, suggesting that localist encoding may offer resilience in noisy
networks. These data suggest that localist coding can emerge from
feed-forward PDP networks and indicate some of the conditions that may lead to
interpretable localist representations in the cortex. The findings highlight
that local codes should not be dismissed out of hand.
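The abstract does not say how a local code is identified, so the following is only a minimal sketch under an assumed criterion: a hidden unit counts as a local code for a class when its activations for that class are strictly separated from its activations for every other input. The function name `find_local_codes` and the toy activation values are hypothetical and are not taken from the paper.

```python
import numpy as np

def find_local_codes(activations, labels):
    """Return (unit, class) pairs under the assumed separation criterion.

    activations: array of shape (n_examples, n_hidden_units)
    labels:      array of shape (n_examples,) with integer class ids
    """
    activations = np.asarray(activations, dtype=float)
    labels = np.asarray(labels)
    found = []
    for c in np.unique(labels):
        in_class = activations[labels == c]
        out_class = activations[labels != c]
        if len(out_class) == 0:
            continue  # only one class present, nothing to separate from
        # Unit fires higher for class c than for any other input ...
        above = in_class.min(axis=0) > out_class.max(axis=0)
        # ... or lower for class c than for any other input.
        below = in_class.max(axis=0) < out_class.min(axis=0)
        for unit in np.flatnonzero(above | below):
            found.append((int(unit), int(c)))
    return found

# Toy hidden-layer activations (made up): 6 examples, 3 hidden units.
acts = np.array([
    [0.9, 0.2, 0.50],
    [0.8, 0.7, 0.40],
    [0.1, 0.3, 0.60],
    [0.2, 0.6, 0.50],
    [0.3, 0.5, 0.00],
    [0.1, 0.4, 0.05],
])
labels = np.array([0, 0, 1, 1, 2, 2])
codes = find_local_codes(acts, labels)
print(codes)
```

In this toy data unit 0 separates class 0 from above and unit 2 separates class 2 from below, while unit 1 responds to a mix of classes and would be treated as distributed.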
Dynamic Steerable Blocks in Deep Residual Networks
Filters in convolutional networks are typically parameterized in a pixel
basis that does not take prior knowledge about the visual world into account.
We investigate the generalized notion of frames designed with image properties
in mind, as alternatives to this parametrization. We show that frame-based
ResNets and DenseNets can improve performance on CIFAR-10+ consistently, while
having additional pleasant properties like steerability. By exploiting these
transformation properties explicitly, we arrive at dynamic steerable blocks.
They are an extension of residual blocks that are able to seamlessly transform
filters under pre-defined transformations, conditioned on the input at training
and inference time. Dynamic steerable blocks learn the degree of invariance
from data and locally adapt filters, allowing them to apply a different
geometrical variant of the same filter to each location of the feature map.
When evaluated on the Berkeley Segmentation contour detection dataset, our
approach outperforms all competing approaches that do not utilize pre-training.
Our results highlight the benefits of image-based regularization for deep
networks.
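Steerability can be illustrated with a classic textbook case rather than the paper's frame-based blocks: the first derivative of a Gaussian at any orientation is a linear combination of its x and y derivatives. The sketch below assumes that basis; the helper names, filter size and sigma are illustrative choices, not values from the paper.

```python
import numpy as np

def gaussian_derivative_basis(size=9, sigma=1.5):
    """Return the x/y first-derivative-of-Gaussian filters and their grid."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    gx = -x / sigma**2 * g  # derivative along x
    gy = -y / sigma**2 * g  # derivative along y
    return gx, gy, x, y, g

def steer(gx, gy, theta):
    """Synthesize the filter oriented at angle theta from the two basis filters."""
    return np.cos(theta) * gx + np.sin(theta) * gy

sigma = 1.5
gx, gy, x, y, g = gaussian_derivative_basis(size=9, sigma=sigma)

theta = 0.7  # arbitrary orientation in radians
steered = steer(gx, gy, theta)

# Direct construction: directional derivative of the Gaussian along
# the unit vector (cos(theta), sin(theta)).
direct = -(x * np.cos(theta) + y * np.sin(theta)) / sigma**2 * g

print(np.allclose(steered, direct))
```

Because the oriented filter is a function of `theta` alone, a network could in principle predict a different `theta` per spatial location and apply a different geometric variant of the same filter there, which is the intuition behind adapting filters locally.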