Word alignment is essential for downstream cross-lingual language
understanding and generation tasks. Recently, neural word alignment models
have surpassed statistical models in performance; however, they rely heavily
on sophisticated translation models. In this study, we propose a
super lightweight unsupervised word alignment (SLUA) model, which introduces
bidirectional symmetric attention trained with a contrastive learning objective
and employs an agreement loss to bind the two attention maps,
such that the alignments follow the mirror-like symmetry hypothesis. Experimental
results on several public benchmarks demonstrate that our model achieves
competitive, if not better, performance compared to the state of the art in
word alignment while significantly reducing the training and decoding time on
average. Further ablation analysis and case studies show the superiority of our
proposed SLUA. Notably, we regard our model as a pioneering attempt to unify
bilingual word embedding and word alignments. Encouragingly, our approach
achieves a 16.4x speedup over GIZA++ and 50x parameter compression compared
with Transformer-based alignment methods. We will release our code to
facilitate the community.
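To make the two ingredients named above concrete, the following is a minimal sketch (not the authors' released code) of bidirectional symmetric attention with an agreement loss that binds the forward and backward attention maps, plus a sentence-level contrastive term. The embedding dimensions, temperature, pooling, and the InfoNCE-style formulation are illustrative assumptions.

```python
# Minimal sketch of SLUA-style objectives, assuming PyTorch.
import torch
import torch.nn.functional as F


def attention_map(q, k, temperature=0.1):
    """Row-stochastic attention map between two embedded sentences.

    q: (m, d) source-side token embeddings
    k: (n, d) target-side token embeddings
    returns: (m, n) alignment probabilities (each source row sums to 1)
    """
    scores = q @ k.t() / temperature
    return F.softmax(scores, dim=-1)


def agreement_loss(a_fwd, a_bwd):
    """Mirror-like symmetry: A_src->tgt should agree with A_tgt->src^T."""
    return F.mse_loss(a_fwd, a_bwd.t())


def contrastive_loss(src_emb, tgt_emb, temperature=0.1):
    """Sentence-level InfoNCE over pooled embeddings within a batch.

    The i-th source/target pair is the positive; all other pairings
    in the batch act as negatives.
    """
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    logits = src @ tgt.t() / temperature
    labels = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, labels)


if __name__ == "__main__":
    # Toy example with random "embeddings" standing in for encoder outputs.
    m, n, d = 5, 6, 64
    src_tokens = torch.randn(m, d)
    tgt_tokens = torch.randn(n, d)

    a_fwd = attention_map(src_tokens, tgt_tokens)   # (m, n)
    a_bwd = attention_map(tgt_tokens, src_tokens)   # (n, m)
    sym_loss = agreement_loss(a_fwd, a_bwd)

    # Batch of mean-pooled sentence embeddings for the contrastive term.
    src_sent = torch.randn(8, d)
    tgt_sent = torch.randn(8, d)
    nce_loss = contrastive_loss(src_sent, tgt_sent)

    total = nce_loss + sym_loss
    print(float(total))
```

At decoding time, alignments can be read off the (symmetrized) attention maps directly, which is what keeps the model lightweight relative to translation-model-based aligners.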