
    Low-Resource Unsupervised NMT: Diagnosing the Problem and Providing a Linguistically Motivated Solution

    Unsupervised Machine Translation has been advancing our ability to translate without parallel data, but state-of-the-art methods assume an abundance of monolingual data. This paper investigates the scenario where monolingual data is limited as well, finding that current unsupervised methods suffer in performance under this stricter setting. We find that the performance loss originates from the poor quality of the pretrained monolingual embeddings, and we propose using linguistic information in the embedding training scheme. To support this, we look at two linguistic features that may help improve alignment quality: dependency information and sub-word information. Using dependency-based embeddings results in a complementary word representation which offers a boost in performance of around 1.5 BLEU points compared to standard WORD2VEC when monolingual data is limited to 1 million sentences per language. We also find that the inclusion of sub-word information is crucial to improving the quality of the embeddings.
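
    As a rough, hedged illustration of the sub-word idea in this abstract (not the paper's own code), the sketch below trains subword-aware embeddings with gensim's FastText on a toy corpus; the dependency-based variant described in the paper would additionally replace the linear context window with syntactic head/modifier contexts taken from a dependency parse.

    # Minimal sketch, assuming gensim is installed; the two example sentences
    # stand in for the ~1M monolingual sentences per language used in the paper.
    from gensim.models import FastText

    sentences = [
        ["unsupervised", "translation", "needs", "good", "embeddings"],
        ["subword", "units", "help", "with", "rare", "words"],
    ]

    model = FastText(
        sentences,
        vector_size=300,    # embedding dimensionality
        window=5,           # linear context window (dependency contexts would replace this)
        min_count=1,
        min_n=3, max_n=6,   # character n-gram range, i.e. the sub-word information
        epochs=10,
    )

    print(model.wv.most_similar("embeddings", topn=3))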

    MLPerf Inference Benchmark

    Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance; they range from embedded devices to data-center solutions. Fueling the hardware are a dozen or more software frameworks and libraries. The myriad combinations of ML hardware and ML software make assessing ML-system performance in an architecture-neutral, representative, and reproducible manner challenging. There is a clear need for industry-wide standard ML benchmarking and evaluation criteria. MLPerf Inference answers that call. In this paper, we present our benchmarking method for evaluating ML inference systems. Driven by more than 30 organizations as well as more than 200 ML engineers and practitioners, MLPerf prescribes a set of rules and best practices to ensure comparability across systems with wildly differing architectures. The first call for submissions garnered more than 600 reproducible inference-performance measurements from 14 organizations, representing over 30 systems that showcase a wide range of capabilities. The submissions attest to the benchmark's flexibility and adaptability. (Comment: ISCA 2020)
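
    To make the measurement problem concrete, here is a deliberately simplified single-stream timing loop; it is not the MLPerf LoadGen harness, only a sketch of the kind of query issuing, latency recording, and percentile reporting that such a benchmark standardizes.

    # Illustrative sketch only: time an arbitrary inference callable over a
    # stream of queries and report throughput plus mean and 90th-percentile
    # latency. Real MLPerf runs use the LoadGen library and strict run rules.
    import statistics
    import time

    def benchmark_single_stream(infer, queries, warmup=10):
        for q in queries[:warmup]:                 # warm-up, excluded from timing
            infer(q)
        latencies = []
        start = time.perf_counter()
        for q in queries:
            t0 = time.perf_counter()
            infer(q)
            latencies.append(time.perf_counter() - t0)
        total = time.perf_counter() - start
        latencies.sort()
        return {
            "queries_per_second": len(queries) / total,
            "mean_latency_s": statistics.mean(latencies),
            "p90_latency_s": latencies[int(0.9 * len(latencies)) - 1],
        }

    # dummy "model" standing in for a real inference system
    print(benchmark_single_stream(lambda x: sum(x), [[1, 2, 3]] * 100))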

    An investigation into the effectiveness of PDF/X premedia workflows

    PDF/X technology was introduced in the mid '90s with the intention of improving and automating prepress workflows. Standards committees such as CGATS (Committee for Graphic Arts Technical Standards) and DDAP (Digital Distribution of Advertisers for Publications) started to draft and implement a plan of action to succeed the file formats then being used in premedia workflows. File submissions handed off from advertising and design agencies had become too time-consuming to prepare for the final run. Top-tier printing and publishing companies such as RR Donnelley, Quad Graphics, Vertis, and Time Inc. started to invest in new technology integration in order to improve efficiency and automate the production workflow for publications and advertisements. Recently, the committees have finished and received approval on two-thirds of the intended outline for ISO 15930:2003, Graphic Technology - Prepress digital data exchange - for the use of PDF. PDF/X-1a and PDF/X-3 have been approved as standards and solutions for the publication and commercial printing markets. PDF/X-2 is still being finalized, but will be beneficial mostly to catalog and commercial printing. The goal of these standards is to provide an efficient vehicle for exchanging raster and vector data for print production within the graphic arts industry. The intent is for the sender and the receiver of PDF/X files to use applications that have the same level of compliance, so that digital data workflows can be processed seamlessly between these channels of communication without any surprises. This would change premedia operations and bridge the communication gap between the customer (agency, designer) and the service provider (prepress, publisher), eliminating variables and errors. The new workflow is still a mystery to many; only the larger companies that have already invested in this technology are truly knowledgeable about it. With this transition taking place, a greater percentage of medium-sized plants in the $2 million to $5 million range plan to start investing in Portable Document Format (PDF) workflow solutions than shops of other sizes. The purpose of this thesis is to measure the PostScript workflow that the majority of the premedia industry currently uses against the new PDF/X workflow solution created to reduce the steps in current premedia operations. Most importantly, this will result in a new model based on PDF/X technology for mid-sized companies to invest in, one that promises a high return on investment. Premedia operations are usually responsible for 20-25% of overall selling costs, according to the 2002 PIA Ratios. Due to the variety of difficulties encountered in current premedia workflows, this model will be developed to decrease production time and costs. In the first half of the study, data will be collected through measurement and visual assessment to evaluate the performance and functionality of both premedia operations. In the other half of the study, subjective testing will address the issues of raster image processing time and the reduction of storage requirements.
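
    As a small, hedged illustration of the compliance-level point (this is not part of the thesis), an automated preflight step could read the conformance key that PDF/X-1a and PDF/X-3 files are expected to declare; the sketch assumes the pikepdf library and the standard /GTS_PDFXVersion entry in the document information dictionary, and "advert.pdf" is a placeholder file name.

    # Sketch: report which PDF/X conformance level, if any, a file declares.
    import pikepdf

    def pdfx_version(path):
        with pikepdf.open(path) as pdf:
            version = pdf.docinfo.get("/GTS_PDFXVersion")  # e.g. "PDF/X-1a:2001"
            return str(version) if version is not None else None

    print(pdfx_version("advert.pdf") or "No PDF/X conformance declared")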

    Deep Neural Networks for Automatic Speech-To-Speech Translation of Open Educational Resources

    In recent years, deep learning has fundamentally changed the landscape of a number of areas in artificial intelligence, including computer vision, natural language processing, robotics, and game theory. In particular, the striking success of deep learning in a large variety of natural language processing (NLP) applications, including automatic speech recognition (ASR), machine translation (MT), and text-to-speech (TTS), has resulted in major accuracy improvements, thus widening the applicability of these technologies in real-life settings. At this point, it is clear that ASR and MT technologies can be utilized to produce cost-effective, high-quality multilingual subtitles for video content of different kinds. This is particularly true for the transcription and translation of video lectures and other kinds of educational materials, in which the audio recording conditions are usually favorable for the ASR task and the speech is grammatically well formed. However, although state-of-the-art neural approaches to TTS have been shown to drastically improve the naturalness and quality of synthetic speech over conventional concatenative and parametric systems, it is still unclear whether this technology is mature enough to improve accessibility and engagement in online learning, particularly in the context of higher education. Furthermore, advanced topics in TTS such as cross-lingual voice cloning, incremental TTS, and zero-shot speaker adaptation remain open challenges in the field. This thesis is about enhancing the performance and widening the applicability of modern neural TTS technologies in real-life settings, both in offline and streaming conditions, in the context of improving accessibility and engagement in online learning. Thus, particular emphasis is placed on speaker adaptation and cross-lingual voice cloning, as the input text corresponds to a translated utterance in this context. Pérez González De Martos, AM. (2022). Deep Neural Networks for Automatic Speech-To-Speech Translation of Open Educational Resources [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/184019
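
    Since the thesis builds on a cascaded pipeline (ASR, then MT, then neural TTS with speaker adaptation), a schematic sketch may help; every object and attribute below (asr, mt, tts, seg.speaker_embedding, and so on) is a hypothetical placeholder rather than an API from the thesis.

    # Hypothetical cascade: transcribe a lecture, translate each segment, and
    # synthesize it in the target language while keeping the original voice.
    from dataclasses import dataclass

    @dataclass
    class Subtitle:
        start: float  # seconds
        end: float
        text: str

    def speech_to_speech(audio, src_lang, tgt_lang, asr, mt, tts):
        segments = asr.transcribe(audio, lang=src_lang)       # hypothetical ASR call
        subtitles, synthesized = [], []
        for seg in segments:
            translation = mt.translate(seg.text, src=src_lang, tgt=tgt_lang)
            subtitles.append(Subtitle(seg.start, seg.end, translation))
            # zero-shot speaker adaptation / cross-lingual voice cloning:
            # condition TTS on an embedding of the original speaker
            wave = tts.synthesize(translation, lang=tgt_lang,
                                  speaker_embedding=seg.speaker_embedding)
            synthesized.append(wave)
        return subtitles, synthesized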

    The Use of CAT Tools in University Translation Courses: A Case Study Based on Teaching with Memsource

    Even though CAT tools are becoming increasingly important, and almost indispensable, in the translator's work routine, the academic environment is still very hesitant to implement them into translation and localisation courses. The primary objective of this master's thesis is to analyse and determine how CAT tools are used in translation courses at selected Balkan universities, which is important from the point of view of translation pedagogy as a baseline for further research. The secondary objective is, depending on the results of the primary one, to devise pedagogical practices that help universities increase the effectiveness of their current methods of teaching with CAT tools. Finally, the tertiary objective is to design an introductory course that could help interested universities implement CAT tools in their curricula. The main motivation of this thesis is that experience with CAT tools should, in the future, positively affect the employability of translation students in comparison with those who lack such experience. This motivation stems from the fact that there are very few scientific papers and reference works on how to use CAT tools in teaching at institutions of higher education, let alone works dealing with restructuring syllabi to increase student employability. The research is conducted among universities that take part in the academic programme of the Prague-based company Memsource, which is both the developer and the provider of... Department of South Slavonic and Balkan Studies, Faculty of Arts

    Multi-modal post-editing of machine translation

    As MT quality continues to improve, more and more translators switch from traditional translation from scratch to post-editing (PE) of MT output, which has been shown to save time and reduce errors. Instead of mainly generating text, translators are now asked to correct errors within otherwise helpful translation proposals, where repetitive MT errors make the process tiresome, while hard-to-spot errors make PE a cognitively demanding activity. Our contribution is three-fold: first, we explore whether interaction modalities other than mouse and keyboard can support PE well by creating and testing the MMPE translation environment. MMPE allows translators to cross out or hand-write text, drag and drop words for reordering, use spoken commands or hand gestures to manipulate text, or combine any of these input modalities. Second, our interviews revealed that translators see value in automatically receiving additional translation support when a high cognitive load (CL) is detected during PE. We therefore developed a sensor framework using a wide range of physiological and behavioral data to estimate perceived CL and tested it in three studies, showing that multi-modal eye, heart, and skin measures can be used to make translation environments cognition-aware. Third, we present two multi-encoder Transformer architectures for automatic post-editing (APE) and discuss how these can adapt MT output to a domain and thereby avoid correcting repetitive MT errors. Funding: Deutsche Forschungsgemeinschaft (DFG), Projekt MMP
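
    The multi-encoder idea mentioned above can be pictured with a short PyTorch sketch: one encoder reads the source sentence, a second reads the raw MT output, and the decoder attends to both. This is a generic illustration of a multi-encoder APE setup, not the two specific architectures of the thesis; positional encodings and the causal decoder mask are omitted for brevity.

    import torch
    import torch.nn as nn

    class MultiEncoderAPE(nn.Module):
        def __init__(self, vocab_size, d_model=256, nhead=4, num_layers=2):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model)
            enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            self.src_encoder = nn.TransformerEncoder(enc_layer, num_layers)  # reads the source
            self.mt_encoder = nn.TransformerEncoder(enc_layer, num_layers)   # reads the MT output
            dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
            self.decoder = nn.TransformerDecoder(dec_layer, num_layers)
            self.out = nn.Linear(d_model, vocab_size)

        def forward(self, src_ids, mt_ids, tgt_ids):
            src_mem = self.src_encoder(self.embed(src_ids))
            mt_mem = self.mt_encoder(self.embed(mt_ids))
            memory = torch.cat([src_mem, mt_mem], dim=1)  # decoder attends to both encoders
            return self.out(self.decoder(self.embed(tgt_ids), memory))

    model = MultiEncoderAPE(vocab_size=1000)
    src = torch.randint(0, 1000, (2, 7))   # source-language tokens
    mt = torch.randint(0, 1000, (2, 8))    # raw MT tokens to be post-edited
    tgt = torch.randint(0, 1000, (2, 6))   # post-edited tokens (teacher forcing)
    logits = model(src, mt, tgt)           # shape (2, 6, 1000)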