Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT
Pretrained contextual representation models (Peters et al., 2018; Devlin et
al., 2018) have pushed forward the state-of-the-art on many NLP tasks. A new
release of BERT (Devlin, 2018) includes a model simultaneously pretrained on
104 languages with impressive performance for zero-shot cross-lingual transfer
on a natural language inference task. This paper explores the broader
cross-lingual potential of mBERT (multilingual BERT) as a zero-shot language
transfer model on 5 NLP tasks covering a total of 39 languages from various
language families: NLI, document classification, NER, POS tagging, and
dependency parsing. We compare mBERT with the best-published methods for
zero-shot cross-lingual transfer and find mBERT competitive on each task.
Additionally, we investigate the most effective strategy for utilizing mBERT in
this manner, determine to what extent mBERT generalizes away from
language-specific features, and measure factors that influence cross-lingual transfer.
Comment: EMNLP 2019 Camera Ready
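The zero-shot setup the abstract describes can be made concrete with a short sketch. The snippet below is not from the paper; it assumes the Hugging Face `transformers` library and the public `bert-base-multilingual-cased` checkpoint, the fine-tuning loop is omitted, and the Spanish example pair is illustrative.

```python
# Minimal sketch of zero-shot cross-lingual transfer with mBERT.
# Assumptions: Hugging Face `transformers`, the public multilingual
# checkpoint, and an NLI-style 3-way classification head.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# One multilingual checkpoint covers ~104 languages.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased",
    num_labels=3,  # e.g. NLI: entailment / neutral / contradiction
)

# Fine-tune on English NLI pairs only (training loop omitted here), then
# evaluate directly on another language with no target-language labels.
premise = "El gato duerme en el sofá."    # Spanish premise (illustrative)
hypothesis = "El gato está despierto."    # Spanish hypothesis (illustrative)
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
model.eval()
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted label index, zero-shot
```

The key point is that a single multilingual checkpoint, fine-tuned only on source-language task data, is applied unchanged to the target language.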
AiM: Taking Answers in Mind to Correct Chinese Cloze Tests in Educational Applications
To automatically correct handwritten assignments, the traditional approach is
to use an OCR model to recognize characters and compare them to answers. The
OCR model is easily confused when recognizing handwritten Chinese characters,
and the textual information of the answers is unavailable during model
inference, even though teachers always have these answers in mind when
reviewing and correcting assignments. In this paper, we focus on Chinese cloze test
correction and propose a multimodal approach (named AiM). The encoded
representations of answers interact with the visual information of students'
handwriting. Instead of predicting 'right' or 'wrong', we perform sequence
labeling on the answer text to infer which answer characters differ from the
handwritten content in a fine-grained way. We take samples from OCR datasets as
the positive samples for this task, and develop a negative sample augmentation
method to scale up the training data. Experimental results show that AiM
outperforms OCR-based methods by a large margin. Extensive studies demonstrate
the effectiveness of our multimodal approach.
Comment: Accepted to COLING 2022
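The sequence-labeling formulation in the abstract can be sketched as follows. This is only an illustration of the idea, not the paper's architecture: the visual encoder and the multimodal fusion are AiM's contribution and are stubbed here with a generic attention layer, and all names, dimensions, and tag sets are assumptions.

```python
# Sketch of the sequence-labeling view: instead of one right/wrong
# decision per blank, each character of the answer text gets a tag
# saying whether it matches the handwritten content.
import torch
import torch.nn as nn

class ClozeCorrector(nn.Module):
    def __init__(self, vocab_size=6000, dim=256, num_tags=2):
        super().__init__()
        self.char_emb = nn.Embedding(vocab_size, dim)  # answer characters
        # Stub for multimodal fusion: answer characters attend over
        # visual features extracted from the handwriting image.
        self.fuse = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.tagger = nn.Linear(dim, num_tags)  # per-char MATCH / MISMATCH

    def forward(self, answer_ids, handwriting_feats):
        # answer_ids: (batch, ans_len); handwriting_feats: (batch, img_len, dim)
        text = self.char_emb(answer_ids)
        fused, _ = self.fuse(text, handwriting_feats, handwriting_feats)
        return self.tagger(fused)  # (batch, ans_len, num_tags)

model = ClozeCorrector()
tags = model(torch.randint(0, 6000, (1, 8)), torch.randn(1, 50, 256))
print(tags.shape)  # torch.Size([1, 8, 2]): one tag per answer character
```

Tagging each answer character, rather than classifying the whole blank, is what gives the fine-grained error localization the abstract refers to.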