Some Transformer-based models can perform cross-lingual transfer learning: they
can be trained on a specific task in one language and yield relatively good
results on the same task in another language, despite having been pre-trained
on monolingual tasks only. However, there is no consensus yet on whether these
Transformer-based models learn universal patterns across languages. We propose
a word-level, task-agnostic method to evaluate the
alignment of contextualized representations built by such models. We show that
our method provides more accurate translated word pairs than previous methods
for evaluating word-level alignment. Moreover, our results show that some inner
layers of multilingual Transformer-based models outperform other explicitly
aligned representations, and even more so according to a stricter definition of
multilingual alignment.

Comment: accepted at IJCNN 202
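As a rough illustration of the general idea (not necessarily the paper's exact procedure), the sketch below probes word-level cross-lingual alignment by extracting layer-wise contextual embeddings for a translated word pair from a multilingual encoder and comparing them with cosine similarity. The model name, sentence pair, and the naive sub-token matching are illustrative assumptions.

```python
# Minimal sketch: per-layer alignment of a translated word pair in a
# multilingual Transformer. Model, sentences, and word choice are assumptions,
# not the authors' experimental setup.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-multilingual-cased"  # any multilingual encoder could be substituted
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
model.eval()

def word_vectors(sentence, word):
    """Return one vector per layer for `word`, averaging its sub-token embeddings."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).hidden_states  # tuple: (embeddings, layer 1, ..., layer N)
    # Locate the sub-tokens of `word` inside the sentence (naive sublist match).
    word_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    start = next(i for i in range(len(ids)) if ids[i:i + len(word_ids)] == word_ids)
    span = slice(start, start + len(word_ids))
    return [layer[0, span].mean(dim=0) for layer in hidden]

# A translated word pair in parallel sentences (illustrative example).
en = word_vectors("The cat sleeps on the mat.", "cat")
fr = word_vectors("Le chat dort sur le tapis.", "chat")

# Per-layer cosine similarity: higher values suggest better-aligned representations.
for layer, (e, f) in enumerate(zip(en, fr)):
    sim = torch.cosine_similarity(e, f, dim=0).item()
    print(f"layer {layer:2d}: cosine similarity = {sim:.3f}")
```

Comparing such per-layer scores across many translated word pairs is one simple way to see which inner layers of the encoder produce the most aligned contextual representations.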