Pre-editing and post-editing
This chapter provides an accessible introductory view of pre-editing and post-editing as the starting-point for research or work in the language industry. It describes source text pre-editing and machine translation post-editing from an industrial as well as academic point of view. In the last ten to fifteen years, there has been a considerable growth in the number of studies and publications dealing with pre-editing, and especially post-editing, that have helped researchers and the industry to understand the impact machine translation technology has on translators' output and their working environment. This interest is likely to continue in view of the recent developments in neural machine translation and artificial intelligence. Although the latest technology has taken a considerable leap forward, the existing body of work should not be disregarded: it has defined clear research lines and methods, and it is more necessary than ever to look at data in their appropriate context and avoid generalizing in the vast and diverse territory of human and machine translation.
Taking statistical machine translation to the student translator
Despite the growth of statistical machine translation (SMT) research and development in recent years, it remains somewhat out of reach for the translation community, where programming expertise and knowledge of statistics tend not to be commonplace. While the concept of SMT is relatively straightforward, its implementation in functioning systems remains difficult for most, regardless of expertise. More recently, however, developments such as SmartMATE have emerged which aim to assist users in creating their own customized SMT systems and thus reduce the learning curve associated with SMT. In addition to commercial uses, translator training stands to benefit from such increased levels of inclusion and access to state-of-the-art approaches to MT. In this paper we draw on experience in developing and evaluating a new syllabus in SMT for a cohort of post-graduate student translators: we identify several issues encountered in the introduction of student translators to SMT, and report on data from repeated-measures questionnaires designed to capture students' self-efficacy in the use of SMT. Overall, results show that participants report significant increases in their levels of confidence and knowledge of MT in general, and of SMT in particular. Additional benefits, such as increased technical competence and confidence, and future refinements are also discussed.
The Professional Linguist: language skills for the real world
This chapter reports on a compulsory final year employability skills module for Modern Foreign Languages (MFL) undergraduates at York St John University. The 'Professional Linguist' aims to equip students with a range of skills which they may need when entering the workplace, whilst underpinning them with theory that would benefit those wishing to continue into postgraduate study in the field. The module covers a range of skills, from IT-based ones such as use of specialist software, online dictionaries, etc., to discussion of the ethics of recent developments in translation such as fansubbing and machine translation. The module also incorporates other elements, including an introduction to interpreting, using the language features in Microsoft (MS) Word, and talks from professionals such as subtitlers and project managers, in recognition that not all graduates will go on to become translators.
Evolving Gaussian Process Kernels for Translation Editing Effort Estimation
In many Natural Language Processing problems the combination of machine learning and optimization techniques is essential. One of these problems is estimating the effort required to improve, under direct human supervision, a text that has been translated using a machine translation method. Recent developments in this area have shown that Gaussian Processes can be accurate for post-editing effort prediction. However, the Gaussian Process kernel has to be chosen in advance, and this choice influences the quality of the prediction. In this paper, we propose a Genetic Programming algorithm to evolve kernels for Gaussian Processes. We show that the combination of evolutionary optimization and Gaussian Processes removes the need for a priori specification of the kernel choice, and achieves predictions that, in many cases, outperform those obtained with fixed kernels.
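The key property this abstract relies on is that sums and products of valid kernels are themselves valid kernels, which is what lets a Genetic Programming algorithm treat kernels as expression trees and recombine them. A minimal sketch of that idea, with illustrative base kernels and function names that are assumptions rather than the paper's actual primitive set:

```python
import math
import random

# Hypothetical base kernels; the forms and the length-scale value are
# illustrative, not taken from the paper.
def rbf(x, y, ls=1.0):
    return math.exp(-((x - y) ** 2) / (2 * ls ** 2))

def linear(x, y):
    return x * y

# Closure property: the sum and product of two kernels are kernels too,
# so GP can compose them freely as expression trees.
def k_sum(k1, k2):
    return lambda x, y: k1(x, y) + k2(x, y)

def k_prod(k1, k2):
    return lambda x, y: k1(x, y) * k2(x, y)

def random_kernel(depth, rng):
    """Grow a random kernel expression tree, as GP initialization might."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice([rbf, linear])
    op = rng.choice([k_sum, k_prod])
    return op(random_kernel(depth - 1, rng), random_kernel(depth - 1, rng))

k = k_sum(rbf, linear)
print(k(1.0, 1.0))  # rbf(1,1) + linear(1,1) = 1.0 + 1.0 = 2.0
```

In the paper's setting, each candidate tree would be scored by the accuracy of the resulting Gaussian Process on held-out post-editing effort data, and the population evolved toward better-scoring kernels.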
Interoperability and machine-to-machine translation model with mappings to machine learning tasks
Modern large-scale automation systems integrate thousands to hundreds of
thousands of physical sensors and actuators. Demands for more flexible
reconfiguration of production systems and optimization across different
information models, standards and legacy systems challenge current system
interoperability concepts. Automatic semantic translation across information
models and standards is an increasingly important problem that needs to be
addressed to fulfill these demands in a cost-efficient manner under constraints
of human capacity and resources in relation to timing requirements and system
complexity. Here we define a translator-based operational interoperability
model for interacting cyber-physical systems in mathematical terms, which
includes system identification and ontology-based translation as special cases.
We present alternative mathematical definitions of the translator learning task
and mappings to similar machine learning tasks and solutions based on recent
developments in machine learning. Possibilities to learn translators between
artefacts without a common physical context, for example in simulations of
digital twins and across layers of the automation pyramid are briefly
discussed.Comment: 7 pages, 2 figures, 1 table, 1 listing. Submitted to the IEEE
International Conference on Industrial Informatics 2019, INDIN'1
Syntactically Supervised Transformers for Faster Neural Machine Translation
Standard decoders for neural machine translation autoregressively generate a
single target token per time step, which slows inference especially for long
outputs. While architectural advances such as the Transformer fully parallelize
the decoder computations at training time, inference still proceeds
sequentially. Recent developments in non- and semi-autoregressive decoding
produce multiple tokens per time step independently of one another, which
improves inference speed but deteriorates translation quality. In this work, we
propose the syntactically supervised Transformer (SynST), which first
autoregressively predicts a chunked parse tree before generating all of the
target tokens in one shot conditioned on the predicted parse. A series of
controlled experiments demonstrates that SynST decodes sentences ~ 5x faster
than the baseline autoregressive Transformer while achieving higher BLEU scores
than most competing methods on En-De and En-Fr datasets.
Comment: 9 pages, 5 figures, accepted to ACL 201
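The speedup claim above comes from counting sequential decoder steps: a standard decoder takes one step per target token, while a SynST-style decoder autoregressively predicts only a short chunked parse and then emits every token in one parallel step. A toy step-count model, where `chunk_size` is an illustrative parameter rather than a value from the paper:

```python
import math

def autoregressive_steps(n_tokens):
    # A standard decoder emits one target token per sequential step.
    return n_tokens

def synst_steps(n_tokens, chunk_size):
    # SynST-style decoding (toy model): autoregressively predict roughly
    # one chunk identifier per `chunk_size` target tokens, then generate
    # all target tokens in a single parallel step.
    parse_steps = math.ceil(n_tokens / chunk_size)
    return parse_steps + 1

n = 30
print(autoregressive_steps(n))       # 30 sequential steps
print(synst_steps(n, chunk_size=6))  # 6 sequential steps, ~5x fewer
```

This only models step counts, not translation quality; the paper's controlled experiments are what establish that the parse-conditioned one-shot generation also preserves BLEU.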
NLSC: Unrestricted Natural Language-based Service Composition through Sentence Embeddings
Current approaches for service composition (assemblies of atomic services)
require developers to use: (a) domain-specific semantics to formalize services
that restrict the vocabulary for their descriptions, and (b) translation
mechanisms for service retrieval to convert unstructured user requests to
strongly-typed semantic representations. In our work, we argue that the effort
of developing service descriptions, request translations, and matching mechanisms
could be reduced by using unrestricted natural language, allowing both: (1)
end-users to intuitively express their needs using natural language, and (2)
service developers to develop services without relying on syntactic/semantic
description languages. Although there are some natural language-based service
composition approaches, they restrict service retrieval to syntactic/semantic
matching. With recent developments in machine learning and Natural Language
Processing, we motivate the use of Sentence Embeddings by leveraging richer
semantic representations of sentences for service description, matching and
retrieval. Experimental results show that service composition development
effort may be reduced by more than 44% while keeping a high precision/recall
when matching high-level user requests with low-level service method
invocations.
Comment: This paper will appear at SCC'19 (IEEE International Conference on
Services Computing) on July 1
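The matching step this abstract describes reduces to nearest-neighbour search over sentence embeddings, typically by cosine similarity. A minimal sketch, with hand-made toy vectors standing in for the output of a real sentence encoder; the service names and values are invented for illustration:

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional embeddings of service descriptions; a real system would
# obtain these from a learned sentence encoder.
service_embeddings = {
    "send_email(to, body)": [0.90, 0.10, 0.05],
    "book_flight(origin, destination)": [0.10, 0.80, 0.30],
}
# Embedding of an unrestricted user request, e.g. "email my manager the report"
request_embedding = [0.85, 0.15, 0.05]

best_service = max(
    service_embeddings,
    key=lambda s: cosine(service_embeddings[s], request_embedding),
)
print(best_service)  # send_email(to, body)
```

Because both requests and service descriptions live in the same embedding space, no domain-specific vocabulary or strongly-typed semantic representation is needed for retrieval, which is the source of the development-effort reduction the paper reports.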