Learning to Embed Words in Context for Syntactic Tasks
We present models for embedding words in the context of surrounding words.
Such models, which we refer to as token embeddings, represent the
characteristics of a word that are specific to a given context, such as word
sense, syntactic category, and semantic role. We explore simple, efficient
token embedding models based on standard neural network architectures. We learn
token embeddings on a large amount of unannotated text and evaluate them as
features for part-of-speech taggers and dependency parsers trained on much
smaller amounts of annotated data. We find that predictors endowed with token
embeddings consistently outperform baseline predictors across a range of
context window and training set sizes.
Comment: Accepted by the ACL 2017 Repl4NLP workshop.
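As a rough illustration of the idea (not the paper's implementation), the sketch below derives a context-dependent token embedding for each word with a bidirectional LSTM over word embeddings and feeds it to a linear part-of-speech classifier; the class names, dimensions, and the PyTorch framing are assumptions for the example.

```python
import torch
import torch.nn as nn

class TokenEmbedder(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden_dim=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb_dim)
        # A bidirectional LSTM over the sentence yields one context-specific
        # vector (a "token embedding") per word.
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                               bidirectional=True)

    def forward(self, token_ids):                    # (batch, seq_len)
        ctx, _ = self.encoder(self.word_emb(token_ids))
        return ctx                                   # (batch, seq_len, 2 * hidden_dim)

class POSTagger(nn.Module):
    def __init__(self, vocab_size, num_tags, hidden_dim=128):
        super().__init__()
        self.embedder = TokenEmbedder(vocab_size, hidden_dim=hidden_dim)
        self.classifier = nn.Linear(2 * hidden_dim, num_tags)

    def forward(self, token_ids):
        return self.classifier(self.embedder(token_ids))  # per-token tag scores

# Toy usage: a batch of 2 sentences, 6 tokens each, 17 UD-style tags.
tagger = POSTagger(vocab_size=10000, num_tags=17)
scores = tagger(torch.randint(0, 10000, (2, 6)))     # shape (2, 6, 17)
```

In the paper's setting, the embedder would be trained on large unannotated corpora and its outputs used as extra features for taggers and parsers trained on much smaller annotated sets.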
Prompt-Tuning Can Be Much Better Than Fine-Tuning on Cross-lingual Understanding With Multilingual Language Models
Pre-trained multilingual language models show significant performance gains
for zero-shot cross-lingual model transfer on a wide range of natural language
understanding (NLU) tasks. Previously, for zero-shot cross-lingual evaluation, pre-trained models were fine-tuned only on English data and tested on a variety of target languages. In this paper, we perform cross-lingual evaluation on various
NLU tasks (sentence classification, sequence labeling, question answering)
using prompt-tuning and compare it with fine-tuning. The results show that
prompt tuning achieves much better cross-lingual transfer than fine-tuning
across datasets, with only 0.1% to 0.3% of the parameters tuned. Additionally, our analysis shows that prompt tuning yields better cross-lingual transferability of representations on downstream tasks, with better-aligned decision boundaries.
Comment: EMNLP 2022. Code link is added.
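For intuition, here is a minimal sketch of soft prompt-tuning on a frozen multilingual encoder (illustrative, not the authors' code): only a small matrix of prompt embeddings and a classification head are trained, which keeps the tuned parameter count tiny relative to the backbone. The checkpoint name, prompt length, and pooling choice are assumptions.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class PromptTunedClassifier(nn.Module):
    def __init__(self, model_name="xlm-roberta-base", prompt_len=20, num_labels=3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        for p in self.encoder.parameters():          # freeze the multilingual backbone
            p.requires_grad = False
        hidden = self.encoder.config.hidden_size
        # Trainable soft prompts prepended to the input embeddings.
        self.prompts = nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)
        self.head = nn.Linear(hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        embeds = self.encoder.get_input_embeddings()(input_ids)
        batch = embeds.size(0)
        prompts = self.prompts.unsqueeze(0).expand(batch, -1, -1)
        embeds = torch.cat([prompts, embeds], dim=1)
        prompt_mask = torch.ones(batch, prompts.size(1),
                                 dtype=attention_mask.dtype,
                                 device=attention_mask.device)
        attention_mask = torch.cat([prompt_mask, attention_mask], dim=1)
        out = self.encoder(inputs_embeds=embeds, attention_mask=attention_mask)
        return self.head(out.last_hidden_state[:, 0])  # pool the first position

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = PromptTunedClassifier()
batch = tok(["This movie was great."], return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])
```

Because gradients flow only into `self.prompts` and `self.head`, the fraction of updated parameters stays far below that of full fine-tuning, which is the efficiency property the abstract refers to.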
Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning
Cross-lingual transfer of language models trained on high-resource languages
like English has been widely studied for many NLP tasks, but the focus on conversational tasks has been rather limited. This is partly due to the high
cost of obtaining non-English conversational data, which results in limited
coverage. In this work, we introduce XSGD, a parallel, large-scale multilingual conversation dataset for cross-lingual alignment pretraining, created by translating the English-only Schema-Guided Dialogue (SGD) dataset (Rastogi et al., 2020) into 105 other languages. XSGD contains approximately
330k utterances per language. To facilitate aligned cross-lingual
representations, we develop an efficient prompt-tuning-based method for
learning alignment prompts. We also investigate two different classifiers, NLI-based and vanilla, and test the cross-lingual capability enabled by the aligned prompts. We evaluate our model's cross-lingual generalization
capabilities on two conversation tasks: slot-filling and intent classification.
Our results demonstrate the strong and efficient modeling ability of NLI-based
classifiers and the large cross-lingual transfer improvements achieved by our
aligned prompts, particularly in few-shot settings. In addition, we highlight the favorable results of our approach compared to LLMs such as text-davinci-003 and ChatGPT in both zero-shot and few-shot settings. While LLMs exhibit impressive performance in English, their cross-lingual capabilities in other languages, particularly low-resource languages, remain limited.
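To make the NLI-based classifier concrete, the sketch below frames intent classification as entailment: each candidate intent becomes a hypothesis, and the utterance/hypothesis pair with the highest entailment score decides the intent. This is a generic zero-shot NLI setup with an assumed checkpoint and hypothesis template, not the paper's exact pipeline.

```python
from transformers import pipeline

# Generic NLI-as-classification sketch; the checkpoint below is an assumption,
# not the model used in the paper.
nli = pipeline("zero-shot-classification", model="joeddav/xlm-roberta-large-xnli")

utterance = "I need a table for two at an Italian place tonight."
intents = ["book a restaurant", "play music", "check the weather"]

# Each intent is verbalized as a hypothesis and scored for entailment
# against the utterance; the highest-scoring label is the predicted intent.
result = nli(utterance, candidate_labels=intents,
             hypothesis_template="The user wants to {}.")
print(result["labels"][0])
```

Because the classifier reasons over natural-language hypotheses rather than fixed label indices, it can be applied to new intents and new languages without retraining the label set, which is what gives NLI-style classifiers their strength in few-shot and cross-lingual settings.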