Dynamic topic adaptation for improved contextual modelling in statistical machine translation
In recent years there has been an increased interest in domain adaptation techniques
for statistical machine translation (SMT) to deal with the growing amount of data from
different sources. Topic modelling techniques applied to SMT are closely related to the
field of domain adaptation but more flexible in dealing with unstructured text. Topic
models can capture latent structure in texts and are therefore particularly suitable for
modelling structure in between and beyond corpus boundaries, which are often arbitrary.
In this thesis, the main focus is on dynamic translation model adaptation to texts of
unknown origin, which is a typical scenario for an online MT engine translating web
documents. We introduce a new bilingual topic model for SMT that takes the entire
document context into account and for the first time directly estimates topic-dependent
phrase translation probabilities in a Bayesian fashion. We demonstrate our model's
ability to improve over several domain adaptation baselines and further provide evidence
for the advantages of bilingual topic modelling for SMT over the more common
monolingual topic modelling. We also show improved performance when deriving further
adapted translation features from the same model which measure different aspects
of topical relatedness.
We introduce another new topic model for SMT which exploits the distributional
nature of phrase pair meaning by modelling topic distributions over phrase pairs using
their distributional profiles. Using this model, we explore combinations of local and
global contextual information and demonstrate the usefulness of different levels of contextual
information, which had not been previously examined for SMT. We also show
that combining this model with a topic model trained at the document-level further improves
performance. Our dynamic topic adaptation approach performs competitively
in comparison with two supervised domain-adapted systems.
Finally, we shed light on the relationship between domain adaptation and topic
adaptation and propose to combine multi-domain adaptation and topic adaptation in a
framework that entails automatic prediction of domain labels at the document level.
We show that while each technique provides complementary benefits to the overall
performance, there is an amount of overlap between domain and topic adaptation. This
can be exploited to build systems that require less adaptation effort at runtime.
Language technologies for a multilingual Europe
This volume of the series "Translation and Multilingual Natural Language Processing" includes most of the papers presented at the Workshop "Language Technology for a Multilingual Europe", held at the University of Hamburg on September 27, 2011 in the framework of the conference GSCL 2011 with the topic "Multilingual Resources and Multilingual Applications", along with several additional contributions. In addition to an overview article on Machine Translation and two contributions on the European initiatives META-NET and Multilingual Web, the volume includes six full research articles. Our intention with this workshop was to bring together various groups concerned with the umbrella topics of multilingualism and language technology, especially multilingual technologies. This encompassed, on the one hand, representatives from research and development in the field of language technologies, and, on the other hand, users from diverse areas such as, among others, industry, administration and funding agencies. The Workshop "Language Technology for a Multilingual Europe" was co-organised by the two GSCL working groups "Text Technology" and "Machine Translation" (http://gscl.info) as well as by META-NET (http://www.meta-net.eu).
Current Trends in Atherogenesis
This book collects the state of the art on antioxidants from clinical and experimental approaches in order to bring a better understanding of the mechanisms of, and useful therapies for, these diseases. We hope that it can indicate new "current trends" for identifying new aspects of this scientific problem, involving not only anatomical and functional but also clinical questions.
EVALITA: Evaluation of NLP and Speech Tools for Italian. Proceedings of the Final Workshop
Editor of the proceedings of EVALITA 2016
Proceedings of the 17th Annual Conference of the European Association for Machine Translation
Proceedings of the 17th Annual Conference of the European Association for Machine Translation (EAMT