
2 research outputs found

Joint translation and unit conversion for end-to-end localization

Authors: Al-Onaizan, Yaser; Dinu, Georgiana; Federico, Marcello; Lauly, Stanislas; Mathur, Prashant
Publication date: 01/01/2020

A variety of natural language tasks require processing textual data that contains a mix of natural language and formal languages, such as mathematical expressions. In this paper, we take unit conversions as an example and propose a data augmentation technique which leads to models learning both the translation and conversion tasks, as well as how to adequately switch between them for end-to-end localization.
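The abstract does not spell out how the augmented data is built. As a rough illustration only, the sketch below assumes one plausible scheme: templated sentence pairs are filled with a source-side measurement and a target-side converted measurement, so that a translation model sees conversion examples alongside ordinary text. The `CONVERSIONS` table, the template sentences, and the function names are all hypothetical and not taken from the paper. Locale-specific number formatting is ignored here.

```python
import random

# Hypothetical conversion table: source unit -> (target unit, factor).
# The factors are standard definitions; the table itself is an assumption.
CONVERSIONS = {
    "miles": ("km", 1.609344),
    "feet": ("m", 0.3048),
    "pounds": ("kg", 0.45359237),
}

def convert(value, unit):
    """Convert a value to the paired target unit, rounded to 2 decimals."""
    tgt_unit, factor = CONVERSIONS[unit]
    return round(value * factor, 2), tgt_unit

def make_pair(src_template, tgt_template, value, unit):
    """Fill a templated sentence pair with a value and its converted form."""
    converted, tgt_unit = convert(value, unit)
    src = src_template.format(value=value, unit=unit)
    tgt = tgt_template.format(value=converted, unit=tgt_unit)
    return src, tgt

def augment(src_template, tgt_template, unit, n, seed=0):
    """Generate n synthetic pairs with random integer values (assumed range 1-100)."""
    rng = random.Random(seed)
    return [
        make_pair(src_template, tgt_template, rng.randint(1, 100), unit)
        for _ in range(n)
    ]

if __name__ == "__main__":
    for src, tgt in augment(
        "The trail is {value} {unit} long.",
        "Der Weg ist {value} {unit} lang.",
        "miles",
        n=3,
    ):
        print(src, "|||", tgt)
```

Mixing such synthetic pairs into regular parallel data is one way a single model could be exposed to both tasks. The actual procedure in the paper may differ.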

arXiv.org e-Print Archive

Crossref

Number Translation and Unit Conversion Using Machine Learning

Authors: Hong, Haijie; Sun, Lu
Publication venue: Technical Disclosure Commons
Publication date: 27/09/2021

Machine translation is widely used to translate text between different language pairs. Applications of automatic translation include content localization. Different regions of the world use different measurement units (e.g., acre vs. hectare). Correctly converting and translating measurement units is thus an important part of content localization. Current machine translation models have low accuracy when translating numbers and are unable to handle unit conversions. This disclosure describes techniques to train a machine learning model such that it can generate accurate translations of numbers, including unit conversions. A base model is trained using input text that is tokenized, including splitting numbers into individual digits. Parameters of the trained base model are used to initialize a custom model that is fine-tuned using training data that has been augmented to include annotations, e.g., different values and units for each measurement in the source text. The trained custom model can deliver correct number translations and unit conversions and can be used for content localization.
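The two preprocessing ideas named in the abstract, splitting numbers into individual digits and annotating measurements with alternative values and units, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the disclosure's implementation: the whitespace tokenizer, the bracketed annotation format, and the function names `split_digits` and `annotate_acres` are hypothetical. The acre-to-hectare factor is the standard definition of the international acre.

```python
import re

# 1 international acre = 0.40468564224 hectares (standard definition).
ACRE_TO_HECTARE = 0.40468564224

def split_digits(text):
    """Whitespace-tokenize, then split every digit into its own token.

    Runs of non-digit characters inside a token are kept together.
    """
    tokens = []
    for tok in text.split():
        tokens.extend(re.findall(r"\d|\D+", tok))
    return tokens

def annotate_acres(text):
    """Append a converted value after each 'N acres' mention.

    The '[value unit]' annotation format is an assumption made for
    illustration; the disclosure does not specify one here.
    """
    def repl(match):
        hectares = round(float(match.group(1)) * ACRE_TO_HECTARE, 2)
        return f"{match.group(0)} [{hectares} hectares]"
    return re.sub(r"(\d+(?:\.\d+)?) acres", repl, text)

if __name__ == "__main__":
    sentence = "The farm covers 125 acres"
    print(split_digits(sentence))
    print(annotate_acres(sentence))
```

Splitting digits gives the model a small, closed vocabulary for numbers rather than one token per distinct number, which is the usual motivation for this kind of tokenization.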

Technical Disclosure Commons