9,988 research outputs found
XL-NBT: A Cross-lingual Neural Belief Tracking Framework
Task-oriented dialog systems are becoming pervasive, and many companies
heavily rely on them to complement human agents for customer service in call
centers. With globalization, the need for providing cross-lingual customer
support becomes more urgent than ever. However, cross-lingual support poses
great challenges---it requires a large amount of additional annotated data from
native speakers. In order to bypass the expensive human annotation and achieve
the first step towards the ultimate goal of building a universal dialog system,
we set out to build a cross-lingual state tracking framework. Specifically, we
assume that there exists a source language with dialog belief tracking
annotations while the target languages have no annotated dialog data of any
form. Then, we pre-train a state tracker for the source language as a teacher,
which is able to exploit easy-to-access parallel data. We then distill and
transfer its own knowledge to the student state tracker in target languages. We
specifically discuss two types of common parallel resources: bilingual corpus
and bilingual dictionary, and design different transfer learning strategies
accordingly. Experimentally, we successfully use English state tracker as the
teacher to transfer its knowledge to both Italian and German trackers and
achieve promising results.Comment: 13 pages, 5 figures, 3 tables, accepted to EMNLP 2018 conferenc
Embedding Web-based Statistical Translation Models in Cross-Language Information Retrieval
Although more and more language pairs are covered by machine translation
services, there are still many pairs that lack translation resources.
Cross-language information retrieval (CLIR) is an application which needs
translation functionality of a relatively low level of sophistication since
current models for information retrieval (IR) are still based on a
bag-of-words. The Web provides a vast resource for the automatic construction
of parallel corpora which can be used to train statistical translation models
automatically. The resulting translation models can be embedded in several ways
in a retrieval model. In this paper, we will investigate the problem of
automatically mining parallel texts from the Web and different ways of
integrating the translation models within the retrieval process. Our
experiments on standard test collections for CLIR show that the Web-based
translation models can surpass commercial MT systems in CLIR tasks. These
results open the perspective of constructing a fully automatic query
translation device for CLIR at a very low cost.Comment: 37 page
Information Access in a Multilingual World: Transitioning from Research to Real-World Applications
Multilingual Information Access (MLIA) is at a turning point wherein substantial real-world applications are being introduced after fifteen years of research into cross-language information retrieval, question answering, statistical machine translation and named entity recognition. Previous workshops on this topic have focused on research and small- scale applications. The focus of this workshop was on technology transfer from research to applications and on what future research needs to be done which facilitates MLIA in an increasingly connected multilingual world
- …