Developing Deployable Spoken Language Translation Systems given Limited Resources
Approaches are presented that support the deployment of spoken language translation systems. Newly developed methods allow low-cost portability to new language pairs. Proposed translation model pruning techniques achieve high translation performance even in low-memory situations. Translation model personalization tailors named entity and specialty vocabulary coverage to the individual user, particularly on small and mobile devices.
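The pruning idea above can be illustrated as a simple per-source-phrase filter over a phrase table. This is a minimal sketch under assumed inputs (a flat list of `(source, target, probability)` entries and a top-k cutoff), not the paper's actual pruning method.

```python
# Hypothetical sketch: prune a phrase table by keeping, for each source
# phrase, only the top-k translations by probability. The table layout
# and the value of k are illustrative assumptions.
from collections import defaultdict

def prune_phrase_table(entries, k=2):
    """entries: iterable of (source_phrase, target_phrase, probability)."""
    by_source = defaultdict(list)
    for src, tgt, prob in entries:
        by_source[src].append((prob, tgt))
    pruned = []
    for src, candidates in by_source.items():
        # Keep only the k most probable translations per source phrase.
        for prob, tgt in sorted(candidates, reverse=True)[:k]:
            pruned.append((src, tgt, prob))
    return pruned

table = [
    ("hello", "hallo", 0.6),
    ("hello", "guten tag", 0.3),
    ("hello", "servus", 0.1),
]
print(len(prune_phrase_table(table, k=2)))  # 2 of the 3 entries survive
```

Cutting low-probability entries like this shrinks the memory footprint, which is the point of pruning for small and mobile devices.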
Individual and Domain Adaptation in Sentence Planning for Dialogue
One of the biggest challenges in the development and deployment of spoken
dialogue systems is the design of the spoken language generation module. This
challenge arises from the need for the generator to adapt to many features of
the dialogue domain, user population, and dialogue context. A promising
approach is trainable generation, which uses general-purpose linguistic
knowledge that is automatically adapted to the features of interest, such as
the application domain, individual user, or user group. In this paper we
present and evaluate a trainable sentence planner for providing restaurant
information in the MATCH dialogue system. We show that trainable sentence
planning can produce complex information presentations whose quality is
comparable to the output of a template-based generator tuned to this domain. We
also show that our method easily supports adapting the sentence planner to
individuals, and that the individualized sentence planners generally perform
better than models trained and tested on a population of individuals. Previous
work has documented and utilized individual preferences for content selection,
but to our knowledge, these results provide the first demonstration of
individual preferences for sentence planning operations, affecting the content
order, discourse structure and sentence structure of system responses. Finally,
we evaluate the contribution of different feature sets, and show that, in our
application, n-gram features often do as well as features based on higher-level
linguistic representations.
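As a rough illustration of the n-gram features mentioned above, the sketch below extracts word n-gram counts from a candidate system response, in the form a sentence-plan ranker might consume. The feature naming is a hypothetical choice for illustration, not the MATCH system's actual feature set.

```python
# Hypothetical sketch: word n-gram count features for a candidate
# realization; a trainable ranker could score candidates from such
# feature vectors. Feature names ("ng:...") are an assumption.
from collections import Counter

def ngram_features(sentence, n_max=2):
    tokens = sentence.lower().split()
    feats = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats["ng:" + " ".join(tokens[i:i + n])] += 1
    return feats

feats = ngram_features("Chanpen Thai has good service")
print(feats["ng:good service"])  # the bigram occurs once
```

Features this shallow need no parser or deep linguistic analysis, which is why it is notable that they often match features built on higher-level representations.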
Speech Machine Translation Quality Assessment: A Case Study of the ILA App
Machine translation (MT) is becoming qualitatively more successful and quantitatively more productive at an unprecedented pace. It is becoming a widespread solution to the constantly rising demand for quick and affordable translations of both text and speech, disrupting and reshaping translation practice and the profession while at the same time making multilingual communication easier than ever before. This paper focuses on the speech-to-speech (S2S) translation app Instant Language Assistant (ILA), which brings together state-of-the-art translation technologies: automatic speech recognition, machine translation, and text-to-speech synthesis, and allows for MT-mediated multilingual communication. The aim of the paper is to assess the quality of translations of conversational language produced by the S2S translation app ILA for the en-de and en-hr language pairs. The research includes several levels of translation quality analysis: human translation quality assessment by translation experts using the Fluency/Adequacy Metrics, light post-editing, and automated MT evaluation (BLEU). Moreover, the translation output is assessed with respect to language pair to gain insight into whether, and how, the language pair affects MT output quality. The results show a relatively high quality of translations produced by the S2S translation app ILA across all assessment models, and a correlation between human and automated assessment results.
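BLEU, the automated metric referred to above, can be illustrated with a simplified sentence-level variant: clipped n-gram precisions combined with a brevity penalty. This sketch uses only 1- and 2-grams and ad-hoc smoothing; it is a teaching approximation, not the official corpus-level, multi-reference BLEU used in such studies.

```python
# Simplified sentence-level BLEU sketch: geometric mean of clipped
# 1- and 2-gram precisions times a brevity penalty. Real BLEU uses
# up to 4-grams over a whole corpus; this is an illustration only.
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(candidate, reference, max_n=2):
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)  # smooth zero counts
    # Brevity penalty discourages overly short candidates.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

print(round(simple_bleu("the cat sat on the mat", "the cat sat on the mat"), 2))
# identical sentences score 1.0
```

Correlating such automated scores with the human Fluency/Adequacy judgments is exactly the kind of comparison the study above reports.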
GEMv2 : Multilingual NLG benchmarking in a single line of code
Evaluation in machine learning is usually informed by past choices, for example which datasets or metrics to use. This standardization enables comparison on an equal footing using leaderboards, but the evaluation choices become sub-optimal as better alternatives arise. This problem is especially pertinent in natural language generation, which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims. To make following best model evaluation practices easier, we introduce GEMv2. The new version of the Generation, Evaluation, and Metrics Benchmark introduces a modular infrastructure for dataset, model, and metric developers to benefit from each other's work. GEMv2 supports 40 documented datasets in 51 languages. Models for all datasets can be evaluated online, and our interactive data card creation and rendering tools make it easier to add new datasets to the living benchmark.
GEMv2: multilingual NLG benchmarking in a single line of code.
Evaluations in machine learning rarely use the latest metrics, datasets, or human evaluation in favor of remaining compatible with prior work. This compatibility, often facilitated through leaderboards, thus leads to outdated but standardized evaluation practices. We pose that the standardization is taking place in the wrong spot. Evaluation infrastructure should enable researchers to use the latest methods, and what should be standardized instead is how to incorporate these new evaluation advances. We introduce GEMv2, the new version of the Generation, Evaluation, and Metrics Benchmark, which uses a modular infrastructure for dataset, model, and metric developers to benefit from each other's work. GEMv2 supports 40 documented datasets in 51 languages and ongoing online evaluation for all datasets, and our interactive tools make it easier to add new datasets to the living benchmark.
An Achilles' Heel? Helping Interpreting Students Gain Greater Awareness of Literal and Idiomatic English
This research paper reports on a study involving the use of literal and non-literal or idiomatic language in a multilingual interpreter classroom. Previous research has shown that interpreters are not always able to identify and correctly interpret idiomatic language. This study first examined student interpreters' perceptions of the importance of idiomatic language, then assessed their ability to identify phrases that were literal, idiomatic, or both. Lastly, it looked at student interpreters' ability to correctly identify and explain idioms in short phrases and dialogues. Findings showed that, after this exercise, students' awareness of the difference between literal and non-literal language increased; however, their ability to correctly identify it did not. Furthermore, their previous focus on 'specialized terminology' led them to believe that language other than this was hardly worth learning. The article concludes with recommendations for incorporating the findings of this research into interpreter education.