
    Improving English to Spanish out-of-domain translations by morphology generalization and generation

    This paper presents a detailed study of a method for morphology generalization and generation to address out-of-domain translations in English-to-Spanish phrase-based MT. The paper studies whether the morphological richness of the target language causes poor translation quality when translating out-of-domain text. In detail, this approach first translates into simplified Spanish forms and then predicts the final inflected forms through a morphology generation step based on shallow and deep-projected linguistic information available from both the source and target-language sentences. The results highlight the importance of generalization, and therefore generation, for dealing with out-of-domain data. Peer reviewed. Postprint (published version).

    Political Text Scaling Meets Computational Semantics

    During the last fifteen years, automatic text scaling has become one of the key tools of the Text as Data community in political science. Prominent text scaling algorithms, however, rely on the assumption that latent positions can be captured just by leveraging the information about word frequencies in documents under study. We challenge this traditional view and present a new, semantically aware text scaling algorithm, SemScale, which combines recent developments in the area of computational linguistics with unsupervised graph-based clustering. We conduct an extensive quantitative analysis over a collection of speeches from the European Parliament in five different languages and from two different legislative terms, and show that a scaling approach relying on semantic document representations is often better at capturing known underlying political dimensions than the established frequency-based (i.e., symbolic) scaling method. We further validate our findings through a series of experiments focused on text preprocessing and feature selection, document representation, scaling of party manifestos, and a supervised extension of our algorithm. To catalyze further research on this new branch of text scaling methods, we release a Python implementation of SemScale with all included data sets and evaluation procedures. Comment: Updated version - accepted for Transactions on Data Science (TDS)

    Automated text simplification as a preprocessing step for machine translation into an under-resourced language

    In this work, we investigate the possibility of applying a fully automatic text simplification system to the English source in machine translation (MT) in order to improve its translation into an under-resourced language. We use a state-of-the-art automatic text simplification (ATS) system to simplify source sentences lexically and syntactically; the simplified sentences are then translated with two state-of-the-art English-to-Serbian MT systems, a phrase-based MT (PBMT) system and a neural MT (NMT) system. We explore three different scenarios for using the ATS in MT: (1) using the raw output of the ATS; (2) automatically filtering out the sentences with low grammaticality and meaning preservation scores; and (3) performing a minimal manual correction of the ATS output. Our results show an improvement in translation fluency regardless of the chosen scenario, while the success of the three scenarios in improving translation fluency and post-editing effort differs depending on the MT approach used (PBMT or NMT).

    TransBooster: boosting the performance of wide-coverage machine translation systems

    We propose the design, implementation and evaluation of a novel and modular approach to boost the translation performance of existing, wide-coverage, freely available machine translation systems, based on reliable and fast automatic decomposition of the translation input and corresponding composition of the translation output. We provide details of our method, and experimental results compared to the MT systems SYSTRAN and Logomedia. While many avenues for further experimentation remain, to date we fall just behind the baseline systems on the full 800-sentence test set, but in certain cases our method improves the translation quality obtained via the MT systems.

    Audio description and plurilingual competence: new allies in language learning?

    The CEFR (Council of Europe, 2001) and its companion volumes (Council of Europe, 2018, 2020) highlight the development of plurilingual and pluricultural competence (PPC) as one of the main objectives of language teaching and learning. Within this context, the plurilingual approach in education has placed translation in a prominent position, with authors such as Cummins (2007) observing how it promotes not only the acquisition of foreign languages (FL) and the consolidation of L1s, but also biliteracy development and identity affirmation. Within translation, audiovisual translation (AVT) has proven to be particularly effective in language learning (cf. Lertola, 2019). The polysemiotic nature of audiovisual texts incorporates elements that require the activation of specific forms of mediation that cannot always be found in general translation. This article sets out to reflect on the influence that linguistic and semiotic transfer in AVT can exert on PPC (Baños, Marzà, & Torralba, 2021), drawing on the results of a quasi-experimental study undertaken within the PluriTAV project (Martínez-Sierra, 2021). This specific study aimed to assess the development of PPC through audio description (AD) in Spanish undergraduates studying English Philology, who were organised into an experimental and a control group, with only the former using AD as a didactic tool. Although results do not reveal a statistically significant improvement in PPC acquisition, they enable hypotheses to be formulated that can then be tested in further research. In addition, the experimental group showed some progress in the development of specific plurilingual and pluricultural skills, which suggests that the use of AD in the FL classroom can improve learners’ plurilingual and pluricultural repertoire.
