
    Method For Providing Metrics To Determine Transcription/Translation Quality

    A system and method are disclosed for determining the transcription/translation quality of web video content based on metrics derived from indirect user feedback on existing captioning. The method may take into account how often users activated the closed caption (CC) option on videos and how many times users watched the whole video with closed captions enabled. The system can also be used to assess the quality of manual transcription for languages that do not have automated speech recognition, and to validate acoustic and language models in machine translation/transcription.

    Procjena kvalitete strojnog prijevoda govora: studija slučaja aplikacije ILA (Quality assessment of machine translation of speech: a case study of the ILA application)

    Machine translation (MT) is becoming qualitatively more successful and quantitatively more productive at an unprecedented pace. It is becoming a widespread solution to the constantly rising demand for quick and affordable translations of both text and speech, disrupting and reshaping translation practice and the translation profession while making multilingual communication easier than ever before. This paper focuses on the speech-to-speech (S2S) translation app Instant Language Assistant (ILA), which brings together state-of-the-art translation technologies (automatic speech recognition, machine translation and text-to-speech synthesis) and allows for MT-mediated multilingual communication. The aim of the paper is to assess the quality of translations of conversational language produced by the S2S translation app ILA for the en-de and en-hr language pairs. The research includes several levels of translation quality analysis: human translation quality assessment by translation experts using the Fluency/Adequacy Metrics, light post-editing, and automated MT evaluation (BLEU). Moreover, the translation output is assessed with respect to language pairs to gain insight into whether, and how, they affect MT output quality. The results show a relatively high quality of translations produced by the S2S translation app ILA across all assessment models and a correlation between human and automated assessment results.

    Example-based controlled translation

    The first research on integrating controlled language data in an Example-Based Machine Translation (EBMT) system was published in [Gough & Way, 2003]. We improve on their sub-sentential alignment algorithm to populate the system's databases with more than six times as many potentially useful fragments. Together with two simple novel improvements (correcting mistranslations in the lexicon, and allowing multiple translations in the lexicon), translation quality improves considerably when target language translations are constrained. We also develop the first EBMT system which attempts to filter the source language data using controlled language specifications. We provide detailed automatic and human evaluations of a number of experiments carried out to test the quality of the system. We observe that our system outperforms Logomedia in a number of tests. Finally, despite conflicting results from different automatic evaluation metrics, we observe a preference for controlling the source data rather than the target translations.

    Automated Audio Captioning with Recurrent Neural Networks

    We present the first approach to automated audio captioning. We employ an encoder-decoder scheme with an alignment model in between. The input to the encoder is a sequence of log mel-band energies calculated from an audio file, while the output is a sequence of words, i.e. a caption. The encoder is a multi-layered, bi-directional gated recurrent unit (GRU) and the decoder a multi-layered GRU with a classification layer connected to the last GRU of the decoder. The classification layer and the alignment model are fully connected layers with shared weights between timesteps. The proposed method is evaluated using data drawn from a commercial sound effects library, ProSound Effects. The resulting captions were rated through metrics utilized in machine translation and image captioning fields. Results from metrics show that the proposed method can predict words appearing in the original caption, but not always correctly ordered. Comment: Presented at the 11th IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 201

    Comparison and Adaptation of Automatic Evaluation Metrics for Quality Assessment of Re-Speaking

    Re-speaking is a mechanism for obtaining high quality subtitles for use in live broadcast and other public events. Because it relies on humans performing the actual re-speaking, the task of estimating the quality of the results is non-trivial. Most organisations rely on humans to perform the actual quality assessment, but purely automatic methods have been developed for other similar problems, like Machine Translation. This paper compares several of these methods: BLEU, EBLEU, NIST, METEOR, METEOR-PL, TER and RIBES. These are then matched to the human-derived NER metric, commonly used in re-speaking. Comment: Comparison and Adaptation of Automatic Evaluation Metrics for Quality Assessment of Re-Speaking. arXiv admin note: text overlap with arXiv:1509.0908

    Intensity-based image registration using multiple distributed agents

    Image registration is the process of geometrically aligning images taken from different sensors, viewpoints or instances in time. It plays a key role in the detection of defects or anomalies for automated visual inspection. A multiagent distributed blackboard system has been developed for intensity-based image registration. The images are divided into segments and allocated to agents on separate processors, allowing parallel computation of a similarity metric that measures the degree of likeness between reference and sensed images after the application of a transform. Because the agents coordinate via the blackboard, no dedicated control module is needed. Tests show that additional agents increase speed, provided the communication capacity of the blackboard is not saturated. The success of the approach in achieving registration, despite significant misalignment of the original images, is demonstrated in the detection of manufacturing defects on screen-printed plastic bottles and printed circuit boards.

    Generating High-Quality Surface Realizations Using Data Augmentation and Factored Sequence Models

    This work presents a new state of the art in the reconstruction of surface realizations from obfuscated text. We identify the lack of sufficient training data as the major obstacle to training high-performing models, and solve this issue by generating large amounts of synthetic training data. We also propose preprocessing techniques which make the structure contained in the input features more accessible to sequence models. Our models were ranked first on all evaluation metrics in the English portion of the 2018 Surface Realization shared task.