A better understanding of the emergent computation and problem-solving
capabilities of recent large language models is of paramount importance to
further improve them and broaden their applicability. This work investigates
how a language model, trained to predict the next token, can perform arithmetic
computations generalizing beyond training data. Binary addition and
multiplication constitute a good testbed for this purpose, since they require a
very small vocabulary and exhibit relevant input/output discontinuities making
smooth input interpolation ineffective for novel data. We successfully trained
a light language model to learn these tasks and ran a number of experiments to
investigate the extrapolation capabilities and internal information processing.
Our findings support the hypotheses that the language model works as an
Encoding-Regression-Decoding machine where the computation takes place in the
value space once the input token representation is mapped to an appropriate
internal representation