55 research outputs found
Machine translation for institutional academic texts: Output quality, terminology translation and post-editor trust
The present work is a feasibility study on the application of Machine Translation (MT) to institutional academic texts, specifically course catalogues, for Italian-English and German-English. The first research question of this work focuses on the feasibility of profitably applying MT to such texts. Since the benefits of a good quality MT might be counteracted by preconceptions of translators towards the output, the second research question examines translator trainees' trust towards an MT output as compared to a human translation (HT).
Training and test sets are created for both language combinations in the institutional academic domain. The MT systems used are ModernMT and Google Translate. Overall evaluations of the output quality are carried out using automatic metrics. Results show that applying neural MT to institutional academic texts can be beneficial even when bilingual data are not available. When small amounts of sentence pairs become available, MT quality improves. Then, a gold standard data set with manual annotations of terminology (MAGMATic) is created and used for an evaluation of the output focused on terminology translation. The gold standard was publicly released to stimulate research on terminology assessment. The assessment shows that domain adaptation improves the quality of term translation.
To conclude, a method to measure trust in a post-editing task is proposed and results regarding translator trainees' trust towards MT are outlined. All participants are asked to work on the same text. Half of them are told that it is an MT output to be post-edited, and the other half that it is an HT needing revision. Results show that there is no statistically significant difference between post-editing and HT revision in terms of number of edits and temporal effort. Results thus suggest that a new generation of translators who received training on MT and post-editing is not influenced by preconceptions against MT.
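The comparison between the two groups of trainees can be illustrated with a small significance test on edit counts. The abstract does not say which statistical test the study used, so the two-sided permutation test below is an assumption, and the function name `permutation_test` and all edit counts are hypothetical values invented for illustration only.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Assumption: this is an illustrative choice of test, not necessarily
    the one used in the study.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        # Randomly reassign participants to the two groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one smoothing keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical edit counts per participant (not data from the study).
told_mt = [12, 15, 9, 14, 11, 13]   # told the text was MT output
told_ht = [11, 14, 10, 16, 12, 13]  # told the text was a human translation

p_value = permutation_test(told_mt, told_ht)
print(f"p = {p_value:.3f}")
```

A large p-value here would mean the observed difference in edits is compatible with chance. The same function could be applied to temporal effort by passing editing times instead of edit counts.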
Transformer Models for Machine Translation and Streaming Automatic Speech Recognition
Natural language processing (NLP) is a set of fundamental computing problems with immense applicability, as language is the natural communication vehicle for people. NLP, along with many other computer technologies, has been revolutionized in recent years by the impact of deep learning. This thesis is centered around two keystone problems for NLP: machine translation (MT) and automatic speech recognition (ASR); and a common deep neural architecture, the Transformer, that is leveraged to improve the technical solutions for some MT and ASR applications.
ASR and MT can be utilized to produce cost-effective, high-quality multilingual texts for a wide array of media. Particular applications pursued in this thesis are news translation and automatic live captioning of television broadcasts. ASR and MT can also be combined with each other, for instance generating automatic translated subtitles from audio, or augmented with other NLP solutions: text summarization to produce a summary of a speech, or speech synthesis to create an automatic translated dubbing, for instance. These other applications fall outside the scope of this thesis, but can profit from the contributions that it contains, as they help to improve the performance of the automatic systems on which they depend.
This thesis contains an application of the Transformer architecture to MT as it was originally conceived, achieving state-of-the-art results in similar language translation. In successive chapters, this thesis covers the adaptation of the Transformer as a language model for streaming hybrid ASR systems. Afterwards, it describes how we applied the developed technology to a specific use case in television captioning by participating in a public RTVE challenge and achieving the first position by a large margin. We also show that the gains came mostly from the improvement in technology capabilities over two years, including that of the Transformer language model adapted for streaming, and that the data component was minor.
Baquero Arnal, P. (2023). Transformer Models for Machine Translation and Streaming Automatic Speech Recognition [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/19368
A continuing professional development framework for medical laboratory technologists/technicians in South Africa
Thesis (D.Tech) - Central University of Technology, Free State, 2006
Since 2002 all medical technologists and technicians have been obliged to participate in the compulsory continuing professional development (CPD) programme implemented by the Health Professions Council of South Africa (HPCSA). It was foreseen that CPD would not be equally accessible to medical technologists and technicians in urban and rural areas. The reason for this survey was to identify obstacles that might prevent medical technologists and technicians, especially those in rural areas, from participating in CPD activities and to identify ways to overcome these obstacles.
The survey was conducted in three phases. During the first phase, quantitative information concerning the profession of medical technology in South Africa and CPD in general was obtained from registered medical technologists and technicians by means of a questionnaire. Information obtained from the questionnaire, as well as that obtained from the literature, led to the second phase, in which an interview questionnaire was compiled. Structured interviews were conducted with medical technologists and technicians employed throughout South Africa, gathering mainly qualitative information regarding medical technology and CPD.
Lack of time and financial constraints and, to a lesser extent, travelling were identified as the major obstacles to participating in CPD activities. The obstacles were an even bigger problem for those employed in rural areas. It was also confirmed that everybody involved in medical technology should be positively motivated to create and participate in CPD activities. One suggested method was to practise CPD activities during working hours, which is cost-effective but restricted by the workload. In addition, medical technologists and technicians should participate in activities offered by the Society of Medical Laboratory Technologists of South Africa (SMLTSA) and attempt formal further qualifications. Being involved in research projects and identifying case studies could result in publishing in accredited journals.
During the third phase of the survey a concept CPD framework was compiled. According to the framework, all role players involved in the profession of medical technology must collaborate and contribute to making CPD activities accessible to all registered medical technologists and technicians and create a positive attitude to CPD. The role players include the HPCSA, employers and top management, the SMLTSA, medical companies, other health professionals, higher education institutions and the individual. It must be emphasised that the task of collecting CPD credits remains the responsibility of the medical technologist or medical technician. The framework offered suggestions for CPD activities whereby medical technologists and technicians could accumulate CPD credits. One major concern indicated in the framework was that CPD should not only be measured by CPD credits: the outcomes of CPD should be reflected in the profession and the workplace, and a system must be implemented to measure CPD outcomes.
The CPD framework was evaluated by a panel of experts familiar with the profession of medical technology and the CPD programme, using the Delphi technique. This final CPD framework will be referred to the HPCSA for implementation in all South African pathology laboratories and the blood transfusion services. The aim of the framework is to assist the CPD guidelines currently under revision in establishing a usable CPD programme.
Accountability and governance practices in Islamic microfinance institutions : evidence from Indonesia
This thesis presents an explanatory analysis of the accountability and governance practices within Islamic microfinance institutions in Indonesia, adopting Bourdieu's theory of practice with the triad of concepts of field, capital and habitus. This study is based on the argument that accountability and governance are socially and subjectively constructed through internal and external mechanisms. The internal mechanism includes the organisational history and its culture, while the external factor is the context in which the organisation operates.
The concept of field is a guide to achieving the first research objective in examining the historical development of the Islamic microfinance sector in Indonesia. The changes and developments within the field are the external factors that affect the construction of accountability and governance practices within selected Islamic microfinance institutions in Indonesia (IIMFIs). The historical research method is deemed the most appropriate to provide a narrative history of the development of the field. This was done by interviewing and analysing the secondary data of journal research relating to Indonesia's socio-political situation over the period, the initiative on microfinance services, the Islamic resurgence movement and the financial reforms. The findings demonstrate that the development of the Islamic microfinance sector is inseparable from the socio-economic and political situation in Indonesia. Therefore, the field of Islamic microfinance is dynamic rather than static: it changes and develops over time.
Furthermore, the concepts of capital and habitus help to achieve the second research objective, to examine the construction and re-construction of accountability and governance within selected IMFIs in Indonesia: BMT A and BTM B.
The internal factors of the growth of various types of capital (economic, cultural, social and symbolic), imbued by the organisational habitus, constructed the accountability and governance practices. Such practices are developed over the period corresponding to the different stages of organisational growth. Both institutions have evolved from small pilot projects into two of the biggest IIMFIs in Indonesia by developing their own mechanisms of accountability and governance to ensure their capability to continue their operations in the future.
This study offers a unique lens for exploring, in a specific context, the construction of accountability and governance practices, the dynamic aspects of which traditional governance theories are unable to explain.
Coherence in Machine Translation
Coherence ensures individual sentences work together to form a meaningful document. When properly translated, a coherent document in one language should result in a coherent document in another language. In Machine Translation, however, due to reasons of modeling and computational complexity, sentences are pieced together from words or phrases based on short context windows and with no access to extra-sentential context.
In this thesis I propose ways to automatically assess the coherence of machine translation output. The work is structured around three dimensions: entity-based coherence, coherence as evidenced via syntactic patterns, and coherence as evidenced via discourse relations.
For the first time, I evaluate existing monolingual coherence models on this new task, identifying issues and challenges that are specific to the machine translation setting. In order to address these issues, I adapted a state-of-the-art syntax model, which also resulted in improved performance for the monolingual task. The results clearly indicate how much more difficult the new task is than the task of detecting shuffled texts. I proposed a new coherence model, exploring the crosslingual transfer of discourse relations in machine translation. This model is novel in that it measures the correctness of the discourse relation by comparison to the source text rather than to a reference translation. I identified patterns of incoherence common across different language pairs, and created a corpus of machine translated output annotated with coherence errors for evaluation purposes. I then examined lexical coherence in a multilingual context, as a preliminary study for crosslingual transfer. Finally, I determine how the new and adapted models correlate with human judgements of translation quality and suggest that improvements in general evaluation within machine translation would benefit from having a coherence component that evaluated the translation output with respect to the source text.