Investigating the Encoding of Words in BERT's Neurons using Feature Textualization
Pretrained language models (PLMs) form the basis of most state-of-the-art NLP
technologies. Nevertheless, they are essentially black boxes: Humans do not
have a clear understanding of what knowledge is encoded in different parts of
the models, especially in individual neurons. The situation is different in
computer vision, where feature visualization provides a decompositional
interpretability technique for neurons of vision models. Activation
maximization is used to synthesize inherently interpretable visual
representations of the information encoded in individual neurons. Our work is
inspired by this but presents a cautionary tale on the interpretability of
single neurons, based on the first large-scale attempt to adapt activation
maximization to NLP, and, more specifically, large PLMs. We propose feature
textualization, a technique to produce dense representations of neurons in the
PLM word embedding space. We apply feature textualization to the BERT model
(Devlin et al., 2019) to investigate whether the knowledge encoded in
individual neurons can be interpreted and symbolized. We find that the produced
representations can provide insights about the knowledge encoded in individual
neurons, but that individual neurons do not represent clear-cut symbolic units
of language such as words. Additionally, we use feature textualization to
investigate how many neurons are needed to encode words in BERT.

Comment: To be published in 'BlackboxNLP 2023: The 6th Workshop on Analysing and Interpreting Neural Networks for NLP'. Camera-ready version.
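The abstract does not spell out the optimization procedure, but activation maximization over a continuous input is well established in vision, and the same idea carries over to BERT's input embedding space. Below is a minimal sketch, assuming PyTorch and HuggingFace transformers: a free input embedding is optimized by gradient ascent to maximize one hidden neuron's activation, then read out as its nearest neighbours in BERT's word embedding matrix. The layer and neuron indices and all hyperparameters are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch of activation maximization in BERT's input embedding space.
# Layer/neuron choice, step count, learning rate, and the nearest-neighbour
# readout are illustrative assumptions, not the paper's exact setup.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

emb_matrix = model.embeddings.word_embeddings.weight  # (vocab_size, 768)

# One free, continuous "token" embedding, optimized instead of discrete input.
x = torch.randn(1, 1, 768, requires_grad=True)
opt = torch.optim.Adam([x], lr=0.1)

layer, neuron = 6, 300  # target hidden unit (illustrative indices)

for step in range(200):
    opt.zero_grad()
    out = model(inputs_embeds=x, output_hidden_states=True)
    act = out.hidden_states[layer][0, 0, neuron]  # target neuron's activation
    (-act).backward()  # gradient ascent on the activation
    opt.step()

# Interpret the optimized embedding via its nearest vocabulary embeddings.
with torch.no_grad():
    dists = torch.cdist(x.view(1, -1), emb_matrix)
    top5 = dists.topk(5, largest=False).indices[0]
print(tokenizer.convert_ids_to_tokens(top5.tolist()))
```

Reading the result off as nearest vocabulary neighbours is one simple way to "textualize" the optimized embedding; whether those tokens form a coherent concept is exactly the question the paper probes.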
A simple model for low variability in neural spike trains
Neural noise sets a limit to information transmission in sensory systems. In
several areas, the spiking response (to a repeated stimulus) has shown a higher
degree of regularity than predicted by a Poisson process. However, a simple
model to explain this low variability is still lacking. Here we introduce a new
model, with a correction to Poisson statistics, which can accurately predict
the regularity of neural spike trains in response to a repeated stimulus. The
model has only two parameters, but can reproduce the observed variability in
retinal recordings under various conditions. We show analytically why this
approximation can work: in a model of the spike-emitting process that assumes
a refractory period, we derive that our simple correction closely approximates
the spike train statistics over a broad range of firing rates. Our model can
easily be plugged into stimulus-processing models, such as the
linear-nonlinear model or its generalizations, to replace the commonly assumed
Poisson spike train hypothesis. It estimates the amount of information
transmitted much more accurately than Poisson models do on retinal recordings.
Thanks to its simplicity, this model has the potential to explain low
variability in other areas.
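The two-parameter correction itself is not given in the abstract, but the mechanism it appeals to, a refractory period that regularizes spiking, is easy to demonstrate numerically. A minimal NumPy sketch, with illustrative rate and refractory values: adding a fixed dead time to a Poisson spike generator pushes the Fano factor of the spike count below the Poisson value of 1.

```python
# Minimal numerical sketch: a Poisson spike generator with an absolute
# refractory period produces sub-Poisson variability (Fano factor < 1).
# Rate, refractory period, and trial count are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def spike_count(rate, t_max, refractory):
    """Count spikes in [0, t_max) for a Poisson process with dead time."""
    t, n = 0.0, 0
    while True:
        t += rng.exponential(1.0 / rate) + refractory  # ISI plus dead time
        if t >= t_max:
            return n
        n += 1

trials = 5000
counts = np.array([spike_count(rate=50.0, t_max=1.0, refractory=0.005)
                   for _ in range(trials)])
fano = counts.var() / counts.mean()
print(f"mean count = {counts.mean():.1f}, Fano factor = {fano:.2f}")  # < 1
```

A pure Poisson process would give a Fano factor of 1 regardless of rate; here the 5 ms dead time at 50 Hz yields a value well below 1, the kind of regularity the abstract reports in retinal recordings.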
Sensor Adaptation and Development in Robots by Entropy Maximization of Sensory Data
A method is presented for adapting the sensors of a robot to the statistical structure of its current environment. This enables the robot to compress incoming sensory information and to find informational relationships between sensors. The method is applied to creating sensoritopic maps of the informational relationships between the sensors of a developing robot, where the informational distance between sensors is computed using information theory and adaptive binning. The adaptive binning method continually estimates the probability distribution of the latest inputs so as to maximize the entropy of each individual sensor, while preserving the correlations between different sensors. Results from simulations and robotic experiments with visual sensors show how adaptive binning of the sensory data helps the system discover structure not found by ordinary binning. This enables the robot's developing perceptual system to become better adapted to its particular embodiment and environment.
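The abstract's exact estimator is not reproduced here, but one standard way to realize entropy-maximizing binning is equal-frequency (quantile) binning over a window of recent samples, which drives each sensor's marginal distribution toward uniform. Below is a minimal sketch under that assumption, using the information distance d(X,Y) = H(X|Y) + H(Y|X) = 2H(X,Y) - H(X) - H(Y) as the informational distance between sensors; the toy sensor data, window size, and bin count are illustrative.

```python
# Minimal sketch of entropy-maximizing adaptive binning plus an informational
# distance between two sensors. Quantile binning over a recent window and
# d(X,Y) = H(X|Y) + H(Y|X) are assumptions standing in for the paper's method.
import numpy as np

def adaptive_bin(window, values, n_bins=8):
    """Equal-frequency bin edges from recent samples -> near-uniform marginal."""
    edges = np.quantile(window, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(values, edges)  # bin index per value, 0..n_bins-1

def entropy(labels):
    p = np.bincount(labels) / len(labels)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def joint_entropy(a, b):
    return entropy(a * (b.max() + 1) + b)  # encode pairs as single labels

def info_distance(a, b):
    """d(X,Y) = H(X|Y) + H(Y|X) = 2*H(X,Y) - H(X) - H(Y)."""
    return 2 * joint_entropy(a, b) - entropy(a) - entropy(b)

# Two correlated toy "sensors": a shared signal plus independent noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=2000)
s1 = signal + 0.3 * rng.normal(size=2000)
s2 = signal + 0.3 * rng.normal(size=2000)

b1 = adaptive_bin(s1[:500], s1, n_bins=8)  # edges fit on a "recent" window
b2 = adaptive_bin(s2[:500], s2, n_bins=8)
print(f"H(s1) = {entropy(b1):.2f} bits (max for 8 bins = 3.00)")
print(f"d(s1, s2) = {info_distance(b1, b2):.2f} bits")
```

Because each sensor's marginal entropy is pushed to its maximum, differences in the distance matrix reflect genuine statistical dependencies between sensors rather than differences in dynamic range, which is what makes the resulting sensoritopic maps meaningful.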