Learning to Abstract with Nonparametric Variational Information Bottleneck
Learned representations at the level of characters, sub-words, words and
sentences have each contributed to advances in understanding different NLP
tasks and linguistic phenomena. However, learning textual embeddings is costly
as they are tokenization-specific and require different models to be trained
for each level of abstraction. We introduce a novel language representation
model which can learn to compress to different levels of abstraction at
different layers of the same model. We apply Nonparametric Variational
Information Bottleneck (NVIB) to stacked Transformer self-attention layers in
the encoder, which encourages an information-theoretic compression of the
representations through the model. We find that the layers within the model
correspond to increasing levels of abstraction and that their representations
are more linguistically informed. Finally, we show that NVIB compression
results in a model which is more robust to adversarial perturbations.
Comment: Accepted to Findings of EMNLP 202
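The abstract above describes pushing an information-theoretic bottleneck between stacked self-attention layers so that deeper layers compress to higher levels of abstraction. As a rough illustration only, the sketch below inserts a simplified Gaussian variational information bottleneck between standard Transformer encoder layers; the actual NVIB uses a nonparametric (Dirichlet-process-based) prior over the attention representations, and every module name and hyperparameter here is an assumption rather than the authors' implementation.

```python
# Minimal sketch (not the authors' code): a Transformer encoder where each
# layer's output passes through a simplified Gaussian variational bottleneck.
# The real NVIB uses a nonparametric prior over the set of attention vectors;
# this Gaussian stand-in only illustrates layer-wise information-theoretic
# compression with a KL penalty added to the task loss.
import torch
import torch.nn as nn

class GaussianBottleneck(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.to_mu = nn.Linear(d_model, d_model)
        self.to_logvar = nn.Linear(d_model, d_model)

    def forward(self, h):
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        # KL divergence to a standard normal prior, averaged over the batch
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1.0 - logvar).sum(-1).mean()
        return z, kl

class BottleneckedEncoder(nn.Module):
    def __init__(self, d_model=64, n_heads=4, n_layers=3):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            for _ in range(n_layers)
        )
        self.bottlenecks = nn.ModuleList(
            GaussianBottleneck(d_model) for _ in range(n_layers)
        )

    def forward(self, x):
        total_kl = 0.0
        for layer, bottleneck in zip(self.layers, self.bottlenecks):
            x, kl = bottleneck(layer(x))
            total_kl = total_kl + kl
        return x, total_kl  # add beta * total_kl to the task loss during training

# Example: batch of 2 sequences, 10 tokens, model dimension 64
enc = BottleneckedEncoder()
out, kl = enc(torch.randn(2, 10, 64))
print(out.shape, kl.item())
```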
Vocal-visual combinations in wild chimpanzees
Living organisms throughout the animal kingdom habitually communicate with multi-modal signals that use multiple sensory channels. Such composite signals vary in their communicative function, as well as the extent to which they are recombined freely. Humans typically display complex forms of multi-modal communication, yet the evolution of this capacity remains unknown. Chimpanzees, one of our two closest living relatives, also produce multi-modal combinations and therefore may offer a valuable window into the evolutionary roots of human communication. However, a currently neglected step in describing multi-modal systems is to disentangle non-random combinations from those that occur simply by chance. Here we aimed to provide a systematic quantification of communicative behaviour in our closest living relatives, describing non-random combinations produced across auditory and visual modalities. By recording the behaviour of wild chimpanzees from the Kibale forest, Uganda, we generated the first repertoire of non-random combined vocal and visual components. Using collocation analysis, we identified more than 100 vocal-visual combinations that occurred more frequently than expected by chance. We also probed how multi-modal production varied in the population, finding no differences in the number of visual components produced with vocalisations as a function of age, sex or rank. As expected, chimpanzees produced more visual components alongside vocalisations during longer vocalisation bouts; however, this was only the case for some vocalisation types, not others. We demonstrate that chimpanzees produce a vast array of combined vocal and visual components, exhibiting a hitherto underappreciated level of multi-modal complexity.
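The collocation analysis mentioned above boils down to comparing how often a vocal and a visual component co-occur against how often they would co-occur by chance. The sketch below illustrates that general idea using pointwise mutual information; the event names and counts are invented for illustration and are not the study's data or statistics.

```python
# Minimal sketch (illustration only): score vocal-visual pairs by how much more
# often they co-occur than expected under independence, which is the general
# idea behind a collocation analysis. Event names and counts are invented and
# do not reproduce the study's data.
from collections import Counter
from math import log2

# Each observation is one communicative event: (vocal component, visual component or None)
observations = [
    ("pant-hoot", "arm-raise"), ("pant-hoot", "arm-raise"), ("pant-hoot", None),
    ("grunt", "gaze"), ("grunt", None), ("scream", "arm-raise"), ("grunt", "gaze"),
]

pairs = [(v, g) for v, g in observations if g is not None]
n = len(pairs)
vocal_counts = Counter(v for v, _ in pairs)
visual_counts = Counter(g for _, g in pairs)
pair_counts = Counter(pairs)

# Pointwise mutual information: PMI > 0 means the pair co-occurs more often
# than it would if vocal and visual components were combined at random.
for (vocal, visual), observed in sorted(pair_counts.items()):
    expected = vocal_counts[vocal] * visual_counts[visual] / n
    pmi = log2(observed / expected)
    print(f"{vocal} + {visual}: observed={observed}, expected={expected:.2f}, PMI={pmi:+.2f}")
```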
Universal Adversarial Attacks on Text Classifiers
Despite the vast success neural networks have achieved in different application domains, they have been proven to be vulnerable to adversarial perturbations (small changes in the input), which lead them to produce the wrong output. In this paper, we propose a novel method, based on gradient projection, for generating universal adversarial perturbations for text, namely a sequence of words that can be added to any input in order to fool the classifier with high probability. We observed that text classifiers are quite vulnerable to such perturbations: inserting even a single adversarial word at the beginning of every input sequence can drop the accuracy from 93% to 50%.
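The gradient-projection idea described above can be sketched as follows: take gradient steps on a short universal trigger in embedding space, then project each trigger position back onto the nearest real word embedding so it remains an actual word sequence. The code below is a minimal, self-contained illustration with a toy classifier; the model, vocabulary, trigger length and step size are placeholders, not the paper's setup.

```python
# Minimal sketch (not the paper's implementation): find a short universal
# trigger that is prepended to every input to raise the classifier's loss.
# Each step ascends the loss in embedding space and then projects every
# trigger position back onto the nearest real word embedding. The victim
# model here is a toy linear classifier used only for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, d_emb, trigger_len = 1000, 32, 3
embedding = nn.Embedding(vocab_size, d_emb)               # victim model's embedding table
classifier = nn.Sequential(nn.Flatten(), nn.Linear((trigger_len + 8) * d_emb, 2))

def nearest_tokens(vectors):
    """Project continuous embeddings onto the closest rows of the embedding table."""
    dists = torch.cdist(vectors, embedding.weight)         # (trigger_len, vocab_size)
    return dists.argmin(dim=-1)

# Start from random trigger tokens; optimize their continuous embeddings.
init_ids = torch.randint(vocab_size, (trigger_len,))
trigger_emb = embedding.weight[init_ids].detach().clone().requires_grad_(True)
inputs = torch.randint(vocab_size, (16, 8))                # a batch of 8-token inputs
labels = torch.randint(2, (16,))

for step in range(20):
    batch_emb = embedding(inputs)                           # (16, 8, d_emb)
    trig = trigger_emb.unsqueeze(0).expand(16, -1, -1)      # same trigger prepended to every input
    logits = classifier(torch.cat([trig, batch_emb], dim=1))
    loss = F.cross_entropy(logits, labels)
    grad, = torch.autograd.grad(loss, trigger_emb)
    with torch.no_grad():
        trigger_emb += 0.5 * grad                            # gradient ascent on the loss (untargeted)
        trigger_emb.copy_(embedding.weight[nearest_tokens(trigger_emb)])  # projection step

print("universal trigger token ids:", nearest_tokens(trigger_emb).tolist())
```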