Task-oriented dialogue (ToD) systems have been mostly created for
high-resource languages, such as English and Chinese. However, there is a need
to develop ToD systems for other regional or local languages so that such
systems can comprehend dialogue contexts in a wider range of languages. This paper
introduces IndoToD, an end-to-end multi-domain ToD benchmark in Indonesian. We
extend two English ToD datasets, covering four different domains, to Indonesian,
using delexicalization to efficiently reduce the size of annotations. To
ensure high-quality data collection, we hire native speakers to manually
translate the dialogues. Along with the original English datasets, these new
Indonesian datasets serve as an effective benchmark for evaluating Indonesian
and English ToD systems as well as exploring the potential benefits of
cross-lingual and bilingual transfer learning approaches.

Comment: 2023 1st Workshop in South East Asian Language Processing (SEALP),
Co-located with AACL 202