A million tweets are worth a few points: tuning transformers for customer service tasks
In online domain-specific customer service applications, many companies struggle to deploy advanced NLP models successfully, due to the limited availability of, and noise in, their datasets. While prior research demonstrated the potential of migrating large open-domain pretrained models to domain-specific tasks, the appropriate (pre)training strategies have not yet been rigorously evaluated in such social media customer service settings, especially under multilingual conditions. We address this gap by collecting a multilingual social media corpus of customer service conversations (865k tweets), comparing various pipelines of pretraining and finetuning approaches, and applying them to five different end tasks. We show that pretraining a generic multilingual transformer model on our in-domain dataset, before finetuning on specific end tasks, consistently boosts performance, especially in non-English settings.
Cross-Align: Modeling Deep Cross-lingual Interactions for Word Alignment
Word alignment, which aims to extract lexical translation equivalents between source and target sentences, serves as a fundamental tool for natural language processing. Recent studies in this area have yielded substantial improvements by generating alignments from the contextualized embeddings of pretrained multilingual language models. However, we find that the existing approaches capture few interactions between the input sentence pairs, which severely degrades word alignment quality, especially for words that are ambiguous in the monolingual context. To remedy this problem, we propose Cross-Align to model deep interactions between the input sentence pairs, in which the source and target sentences are encoded separately with shared self-attention modules in the shallow layers, while cross-lingual interactions are explicitly constructed by cross-attention modules in the upper layers. In addition, to train our model effectively, we propose a two-stage training framework, in which the model is first trained with a simple Translation Language Modeling (TLM) objective and then finetuned with a self-supervised alignment objective. Experiments show that the proposed Cross-Align achieves state-of-the-art (SOTA) performance on four out of five language pairs.

Comment: Accepted by EMNLP 2022
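The layering scheme the abstract describes — shared self-attention for each sentence in the shallow layers, then explicit cross-attention between the two sentences in the upper layers — can be sketched in a few lines. This is a toy NumPy illustration under stated assumptions, not the authors' implementation: all projection weights are random placeholders, there is a single attention head, and the pretrained multilingual backbone, TLM pretraining, and self-supervised alignment finetuning from the paper are omitted. The class name `CrossAlignSketch` and the mutual-argmax extraction in `align` are hypothetical choices for illustration.

```python
import numpy as np

def attention(query, key, value):
    """Scaled dot-product attention (single head, no masking)."""
    scores = query @ key.T / np.sqrt(query.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ value

class CrossAlignSketch:
    """Toy sketch of the Cross-Align layering idea (random weights)."""

    def __init__(self, d_model=16, n_shallow=2, n_upper=2, seed=0):
        rng = np.random.default_rng(seed)
        def proj():
            return rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)
        # each layer's Q/K/V projections are SHARED between the two languages
        self.shallow = [(proj(), proj(), proj()) for _ in range(n_shallow)]
        self.upper = [(proj(), proj(), proj()) for _ in range(n_upper)]

    def encode(self, src, tgt):
        # shallow layers: each sentence attends only to itself,
        # but through the same shared projection weights
        for wq, wk, wv in self.shallow:
            src = src + attention(src @ wq, src @ wk, src @ wv)
            tgt = tgt + attention(tgt @ wq, tgt @ wk, tgt @ wv)
        # upper layers: explicit cross-lingual interaction --
        # each sentence queries the other via cross-attention
        for wq, wk, wv in self.upper:
            src_out = src + attention(src @ wq, tgt @ wk, tgt @ wv)
            tgt_out = tgt + attention(tgt @ wq, src @ wk, src @ wv)
            src, tgt = src_out, tgt_out
        return src, tgt

    def align(self, src, tgt):
        """Greedy word alignment from encoded similarities (illustrative)."""
        s, t = self.encode(src, tgt)
        sim = s @ t.T
        # mutual argmax: keep (i, j) only if each is the other's best match
        return [(i, j) for i in range(sim.shape[0])
                for j in [int(sim[i].argmax())]
                if int(sim[:, j].argmax()) == i]
```

With trained weights, the cross-attention layers let an ambiguous source word disambiguate itself against the target sentence before similarities are computed, which is the interaction the purely self-attentive baselines lack.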