100 research outputs found

Language modeling and transcription of the TED corpus lectures

Authors: Cettolo M.; Federico M.; Leeuwis E.
Publication venue: IEEE
Publication date: 01/01/2003

Transcribing lectures is a challenging task, both in acoustic and in language modeling. In this work, we present our first results on the automatic transcription of lectures from the TED corpus, recently released by ELRA and LDC. In particular, we concentrated our effort on language modeling. Baseline acoustic and language models were developed using, respectively, 8 hours of TED transcripts and various types of texts: conference proceedings, lecture transcripts, and conversational speech transcripts. Then, adaptation of the language model to single speakers was investigated by exploiting different kinds of information: automatic transcripts of the talk, the title of the talk, the abstract and, finally, the paper. In the last case, a word error rate (WER) of 39.2% was achieved.
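The WER figure quoted in this abstract is a standard metric: the word-level edit distance (substitutions, deletions, insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that general definition, not the authors' scoring tool; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly
    (no case folding or punctuation normalization).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("a b c d", "a x c")` counts one substitution and one deletion against four reference words. Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.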

Archivio della ricerca - Fondazione Bruno Kessler

University of Twente Research Information

The 1-s interpolation of breath-by-breath O2 uptake data to determine kinetic parameters: the misleading procedure

Authors: Cettolo V.; Francescato M. P.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2020

Archivio istituzionale della ricerca - Università degli Studi di Udine

Neural versus Phrase-Based Machine Translation Quality: a Case Study

Authors: Bentivogli L.; Bisazza A.; Cettolo M.; Federico M.
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2016

International Migration, Integration and Social Cohesion online publications

On correct computation of confidence intervals for kinetic parameters

Authors: Bellio R.; Cettolo V.; Francescato M. P.
Publication venue: Wiley
Publication date: 01/01/2019

Archivio istituzionale della ricerca - Università degli Studi di Udine

Comparison of different breath-by-breath gas exchange algorithms using a gas exchange simulation system

Authors: Cettolo V.; Francescato M. P.; Hoffmann U.; Thieschafer L.
Publication venue: Elsevier BV
Publication date: 01/01/2019

Archivio istituzionale della ricerca - Università degli Studi di Udine

Optimized MT Online Learning in Computer Assisted Translation

Authors: Cettolo M.; Mathur P.

In this paper we propose a cascading framework for optimizing online learning in machine translation for the computer-assisted translation scenario. Online learning introduces several hyperparameters associated with the learning algorithm, and the number of online learning iterations can also affect translation quality. We discuss these issues and propose a few approaches for optimizing the hyperparameters and for finding the number of iterations required for online learning. We show experimentally that using the optimal number of iterations in online learning is useful and yields consistent improvements over baseline results.

Archivio della ricerca - Fondazione Bruno Kessler

Adattamento al Progetto dei Modelli di Traduzione Automatica nella Traduzione Assistita [Project Adaptation of Machine Translation Models in Computer-Assisted Translation]

Authors: Bertoldi N.; Cettolo M.; Federico M.
Publication venue: Pisa University Press
Publication date: 01/01/2014

Integrating machine translation into computer-assisted translation systems is a challenge for both academic and industrial research. Indeed, professional translators regard as crucial the ability of automatic systems to adapt to their style and to their corrections. In this article we propose a scheme for adapting machine translation systems to a specific document on the basis of a limited amount of manually corrected text, equal to the amount a single translator produces in a day.

Archivio della ricerca - Fondazione Bruno Kessler

Bootstrapping Arabic-Italian SMT through Comparable Texts and Pivot Translation

Authors: M. Cettolo; M. Federico; N. Bertoldi

This paper describes efforts towards the development of an Arabic-to-Italian SMT system for the news domain. Since very little parallel data is available for this language pair, we investigated both the exploitation of comparable corpora and pivot translation. Experimental evaluation was conducted on a new benchmark developed by extending two Arabic-to-English NIST evaluation sets with Italian and French translations, produced from the source language by experts. Preliminary results show the potential of both approaches with respect to the performance achieved by a popular state-of-the-art Web-based translation service.

Archivio della ricerca - Fondazione Bruno Kessler

Project Adaptation for MT-Enhanced Computer Assisted Translation

Authors: Bertoldi N.; Cettolo M.; Federico M.

The effective integration of MT technology into CAT tools is a challenging topic for both academic research and the translation industry. In particular, professional translators consider the ability of MT systems to adapt to their feedback to be crucial. In this paper, we propose an adaptation scheme to tune a statistical MT system to a translation project using small amounts of post-edited texts. In field tests on two domains with 8 professional translators working with a CAT tool, productivity gains of up to over 20% were measured after applying MT project adaptation.

Archivio della ricerca - Fondazione Bruno Kessler

Cache-based Online Adaptation for Machine Translation Enhanced Computer Assisted Translation

Authors: Bertoldi N.; Cettolo M.; Federico M.

The integration of machine translation in the human translation workflow raises intriguing and challenging research issues. One of them, addressed in this work, is how to dynamically adapt phrase-based statistical MT from user post-editing. By casting the problem in the online machine learning paradigm, we propose a cache-based adaptation technique that dynamically stores target n-gram and phrase-pair features used by the translator. For the sake of adaptation, during decoding not only the recency of the features stored in the cache is rewarded but also their occurrence in similar, already translated sentences in the document. Our experimental results show the effectiveness of the devised method both on standard benchmarks and on documents post-edited by professional translators through the real use of the MateCat tool.
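The recency part of the idea in this abstract can be illustrated with a toy cache that remembers target n-grams from post-edited sentences and rewards them less as they age. This is only a sketch under assumptions: the class name `RecencyCache`, the exponential decay, and the `decay` and `max_age` parameters are hypothetical choices made for illustration, not the scoring function from the paper, and the similarity-based reward mentioned in the abstract is not modeled.

```python
from typing import Dict, Iterable, Sequence, Tuple

NGram = Tuple[str, ...]


class RecencyCache:
    """Toy cache of target n-grams whose reward decays with age.

    Assumptions (not from the abstract): age is counted in post-edited
    sentences, the reward is decay ** age, and entries older than
    max_age are dropped.
    """

    def __init__(self, decay: float = 0.5, max_age: int = 10) -> None:
        self.decay = decay
        self.max_age = max_age
        self._age: Dict[NGram, int] = {}

    def update(self, ngrams: Iterable[Sequence[str]]) -> None:
        """Register the n-grams of one newly post-edited sentence."""
        # Age every stored entry by one sentence and drop stale ones.
        self._age = {
            gram: age + 1
            for gram, age in self._age.items()
            if age + 1 <= self.max_age
        }
        # N-grams just used by the translator become the most recent.
        for gram in ngrams:
            self._age[tuple(gram)] = 0

    def reward(self, ngram: Sequence[str]) -> float:
        """Return a score in [0, 1]; 0.0 if the n-gram is not cached."""
        age = self._age.get(tuple(ngram))
        return 0.0 if age is None else self.decay ** age
```

In a decoder, such a reward would typically enter the model as one additional feature alongside the usual translation and language model scores; how the feature is weighted is left open here.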

Archivio della ricerca - Fondazione Bruno Kessler