22,981 research outputs found
Method For Providing Metrics To Determine Transcription/Translation Quality
A system and method are disclosed for determining the transcription/translation quality of web video content based on metrics derived from indirect feedback of users on existing captioning. The method may take into account how often the closed caption (CC) option was activated by a user on videos and the number of times users stayed through the whole video content using the closed captions. The system can also be used in assessing quality of manual transcription for languages that do not have automated speech recognition and to validate acoustic and language models in machine translation/transcription
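The indirect-feedback idea above can be sketched as a simple combined score. The 50/50 weighting, function name and input fields below are illustrative assumptions, not the disclosed method's actual formula:

```python
# Hypothetical caption-quality score from indirect viewer feedback.
# The weighting and inputs are assumptions for illustration only.

def caption_quality_score(cc_activations: int, views: int,
                          cc_completions: int, w: float = 0.5) -> float:
    """Combine CC activation rate and CC completion rate into one score.

    cc_activations: views where the viewer turned captions on
    views:          total views of the video
    cc_completions: caption-on views watched to the end
    w:              assumed weight between the two rates
    """
    if views == 0 or cc_activations == 0:
        return 0.0
    activation_rate = cc_activations / views          # how often CC is used
    completion_rate = cc_completions / cc_activations  # how often CC viewers stay
    return w * activation_rate + (1 - w) * completion_rate

print(caption_quality_score(cc_activations=200, views=1000,
                            cc_completions=150))  # -> 0.475
```

A higher score suggests captions that viewers both choose and tolerate to the end, which is the signal the method treats as a proxy for transcription quality.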
Assessing the Quality of Speech-to-Speech Machine Translation: A Case Study of the ILA App
Machine translation (MT) is becoming qualitatively more successful and quantitatively more productive at an unprecedented pace. It is becoming a widespread solution to the challenges of a constantly rising demand for quick and affordable translations of both text and speech, causing disruption and adjustments of the translation practice and profession, but at the same time making multilingual communication easier than ever before. This paper focuses on the speech-to-speech (S2S) translation app Instant Language Assistant (ILA), which brings together the state-of-the-art translation technology: automatic speech recognition, machine translation and text-to-speech synthesis, and allows for MT-mediated multilingual communication. The aim of the paper is to assess the quality of translations of conversational language produced by the S2S translation app ILA for en-de and en-hr language pairs. The research includes several levels of translation quality analysis: human translation quality assessment by translation experts using the Fluency/Adequacy Metrics, light post-editing, and automated MT evaluation (BLEU). Moreover, the translation output is assessed with respect to language pairs to get an insight into whether they affect the MT output quality and how. The results show a relatively high quality of translations produced by the S2S translation app ILA across all assessment models and a correlation between human and automated assessment results.
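BLEU, used in the automated evaluation stage above, can be sketched in a simplified single-reference, sentence-level form. Standard BLEU additionally handles multiple references, corpus-level statistics and smoothing; this is a minimal sketch of the core computation:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Counter of all n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of modified n-gram
    precisions times a brevity penalty (single reference, no smoothing),
    after Papineni et al. (2002)."""
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum((cand & ref).values())      # clipped n-gram matches
        total = max(sum(cand.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))  # brevity penalty
    return bp * math.exp(log_avg)

hyp = "the cat sat on the mat".split()
ref = "the cat sat on the mat".split()
print(round(bleu(hyp, ref), 3))  # identical sentences score 1.0
```

In practice, a maintained implementation such as sacreBLEU is preferable for reproducible scores.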
Example-based controlled translation
The first research on integrating controlled language data in an Example-Based Machine Translation (EBMT) system was published in [Gough & Way, 2003]. We improve on their sub-sentential alignment algorithm to populate the system's databases with more than six times as many potentially useful fragments. Together with two simple novel improvements (correcting mistranslations in the lexicon, and allowing multiple translations in the lexicon), translation quality improves considerably when target language translations are constrained. We also develop the first EBMT system which attempts to filter the source language data using controlled language specifications. We provide detailed automatic and human evaluations of a number of experiments carried out to test the quality of the system. We observe that our system outperforms Logomedia in a number of tests. Finally, despite conflicting results from different automatic evaluation metrics, we observe a preference for controlling the source data rather than the target translations
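The idea of a lexicon that allows multiple translations per entry can be illustrated with a toy greedy fragment lookup. The phrase table, language pair and ranking below are invented for illustration and are not the system's actual data or algorithm:

```python
# Toy fragment lexicon: each source fragment maps to a ranked list of
# target fragments, so an entry may carry multiple translations.
# All entries here are invented en-fr examples.
LEXICON = {
    ("press", "the", "button"): [("appuyez", "sur", "le", "bouton")],
    ("the", "button"): [("le", "bouton"), ("la", "touche")],  # two options
    ("press",): [("appuyez",)],
}

def translate(tokens):
    """Greedy longest-match lookup over the fragment lexicon."""
    out, i = [], 0
    while i < len(tokens):
        for length in range(len(tokens) - i, 0, -1):  # longest fragment first
            frag = tuple(tokens[i:i + length])
            if frag in LEXICON:
                out.extend(LEXICON[frag][0])  # pick the top-ranked option
                i += length
                break
        else:
            out.append(tokens[i])  # pass unknown words through unchanged
            i += 1
    return out

print(translate("press the button".split()))  # -> ['appuyez', 'sur', 'le', 'bouton']
```

Keeping several ranked candidates per entry, rather than a single forced translation, is what lets a downstream stage choose the option that fits the constrained target language.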
Automated Audio Captioning with Recurrent Neural Networks
We present the first approach to automated audio captioning. We employ an
encoder-decoder scheme with an alignment model in between. The input to the
encoder is a sequence of log mel-band energies calculated from an audio file,
while the output is a sequence of words, i.e. a caption. The encoder is a
multi-layered, bi-directional gated recurrent unit (GRU) and the decoder a
multi-layered GRU with a classification layer connected to the last GRU of the
decoder. The classification layer and the alignment model are fully connected
layers with shared weights between timesteps. The proposed method is evaluated
using data drawn from a commercial sound effects library, ProSound Effects. The
resulting captions were rated through metrics utilized in machine translation
and image captioning fields. Results from metrics show that the proposed method
can predict words appearing in the original caption, but not always correctly
ordered.
Comment: Presented at the 11th IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 201
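The gating computation inside a single GRU cell, the building block of both the encoder and decoder above, can be sketched for a scalar hidden state. The weights below are arbitrary, untrained values chosen only to show the update equations, not the trained captioning model:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_cell(x, h, w):
    """One GRU step for a scalar input and scalar hidden state.

    w holds the weights and biases of the update gate (z), reset gate (r)
    and candidate state; the values used below are arbitrary, untrained
    numbers chosen only to illustrate the update rule.
    """
    z = sigmoid(w["wz"] * x + w["uz"] * h + w["bz"])       # update gate
    r = sigmoid(w["wr"] * x + w["ur"] * h + w["br"])       # reset gate
    h_cand = math.tanh(w["wh"] * x + w["uh"] * (r * h) + w["bh"])
    return (1 - z) * h + z * h_cand                        # interpolate old/new

weights = dict(wz=0.5, uz=0.1, bz=0.0, wr=0.4, ur=0.2, br=0.0,
               wh=0.9, uh=0.3, bh=0.0)
h = 0.0
for x in [0.2, -0.1, 0.5]:   # stand-in for a short feature sequence
    h = gru_cell(x, h, weights)
print(-1.0 < h < 1.0)        # hidden state stays bounded by the tanh
```

The real model stacks such cells into multi-layered (and, in the encoder, bidirectional) GRUs over vector-valued log mel-band energies rather than scalars.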
Comparison and Adaptation of Automatic Evaluation Metrics for Quality Assessment of Re-Speaking
Re-speaking is a mechanism for obtaining high quality subtitles for use in
live broadcast and other public events. Because it relies on humans performing
the actual re-speaking, the task of estimating the quality of the results is
non-trivial. Most organisations rely on humans to perform the actual quality
assessment, but purely automatic methods have been developed for other similar
problems, like Machine Translation. This paper will try to compare several of
these methods: BLEU, EBLEU, NIST, METEOR, METEOR-PL, TER and RIBES. These will
then be matched to the human-derived NER metric, commonly used in re-speaking.
Comment: arXiv admin note: text overlap with arXiv:1509.0908
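The human-derived NER metric mentioned above scores re-spoken subtitles as accuracy = (N - E - R) / N * 100, where N is the number of words and E and R are the weighted edition and recognition errors. A minimal sketch:

```python
def ner_score(n_words: int, edition_errors: float,
              recognition_errors: float) -> float:
    """NER accuracy: (N - E - R) / N * 100.

    N is the word count of the subtitles, E the weighted edition errors
    and R the weighted recognition errors; 98% is the commonly cited
    threshold for acceptable live subtitling quality.
    """
    return (n_words - edition_errors - recognition_errors) / n_words * 100

print(ner_score(n_words=500, edition_errors=4, recognition_errors=6))  # -> 98.0
```

In the full NER model the error counts are assigned by human assessors with severity weights, which is precisely the manual step the automatic metrics above try to approximate.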
Intensity-based image registration using multiple distributed agents
Image registration is the process of geometrically aligning images taken from different sensors, viewpoints or instances in time. It plays a key role in the detection of defects or anomalies for automated visual inspection. A multiagent distributed blackboard system has been developed for intensity-based image registration. The images are divided into segments and allocated to agents on separate processors, allowing parallel computation of a similarity metric that measures the degree of likeness between reference and sensed images after the application of a transform. The need for a dedicated control module is removed by coordination of agents via the blackboard. Tests show that additional agents increase speed, provided the communication capacity of the blackboard is not saturated. The success of the approach in achieving registration, despite significant misalignment of the original images, is demonstrated in the detection of manufacturing defects on screen-printed plastic bottles and printed circuit boards
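The segment-wise similarity computation can be sketched with a sum-of-squared-differences metric and one worker per segment pair. The tiny integer "images", the thread-based parallelism and the SSD choice are illustrative assumptions; the paper's agents, blackboard coordination and transform search are not modelled:

```python
from concurrent.futures import ThreadPoolExecutor

def ssd(ref_seg, sensed_seg):
    """Sum of squared differences between two intensity segments."""
    return sum((a - b) ** 2 for a, b in zip(ref_seg, sensed_seg))

# Invented grayscale rows standing in for image segments; the middle
# sensed segment contains a single-pixel "defect".
reference = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
sensed    = [[1, 2, 3], [4, 5, 7], [7, 8, 9]]

with ThreadPoolExecutor() as pool:          # one "agent" per segment pair
    partial = list(pool.map(ssd, reference, sensed))

print(sum(partial))  # total dissimilarity; 0 would mean a perfect match
```

Because each partial score is independent, adding workers speeds up the metric evaluation until the shared result store (the blackboard, in the paper's architecture) becomes the bottleneck.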
Generating High-Quality Surface Realizations Using Data Augmentation and Factored Sequence Models
This work presents a new state of the art in reconstruction of surface
realizations from obfuscated text. We identify the lack of sufficient training
data as the major obstacle to training high-performing models, and solve this
issue by generating large amounts of synthetic training data. We also propose
preprocessing techniques which make the structure contained in the input
features more accessible to sequence models. Our models were ranked first on
all evaluation metrics in the English portion of the 2018 Surface Realization
shared task
- …