Instance-Based Model Adaptation For Direct Speech Translation
Despite recent technology advancements, the effectiveness of neural
approaches to end-to-end speech-to-text translation is still limited by the
paucity of publicly available training corpora. We tackle this limitation with
a method to improve data exploitation and boost the system's performance at
inference time. Our approach allows us to customize "on the fly" an existing
model to each incoming translation request. At its core, it exploits an
instance selection procedure to retrieve, from a given pool of data, a small
set of samples similar to the input query in terms of latent properties of its
audio signal. The retrieved samples are then used for an instance-specific
fine-tuning of the model. We evaluate our approach in three different
scenarios. In all data conditions (different languages, in/out-of-domain
adaptation), our instance-based adaptation yields coherent performance gains
over static models. Comment: 6 pages, under review at ICASSP 202
Ontological Engineering For Source Code Generation
Source Code Generation (SCG) is a sub-domain of Automatic Programming (AP) that helps programmers work at a high level of abstraction. Recently, researchers have investigated many techniques for SCG. The difficulty lies in choosing the appropriate technique for generating source code given its purpose and inputs. This paper presents a review and analysis of SCG techniques. Moreover, comparisons are presented for: techniques mapping, Natural Language Processing (NLP), knowledge base, ontology, Specification Configuration Template (SCT) model and deep learning
Towards a better integration of fuzzy matches in neural machine translation through data augmentation
We identify a number of aspects that can boost the performance of Neural Fuzzy Repair (NFR), an easy-to-implement method to integrate translation memory matches and neural machine translation (NMT). We explore various ways of maximising the added value of retrieved matches within the NFR paradigm for eight language combinations, using Transformer NMT systems. In particular, we test the impact of different fuzzy matching techniques, sub-word-level segmentation methods and alignment-based features on overall translation quality. Furthermore, we propose a fuzzy match combination technique that aims to maximise the coverage of source words. This is supplemented with an analysis of how translation quality is affected by input sentence length and fuzzy match score. The results show that applying a combination of the tested modifications leads to a significant increase in estimated translation quality over all baselines for all language combinations.
MORSE: Semantic-ally Drive-n MORpheme SEgment-er
We present in this paper a novel framework for morpheme segmentation which
uses the morpho-syntactic regularities preserved by word representations, in
addition to orthographic features, to segment words into morphemes. This
framework is the first to consider vocabulary-wide syntactico-semantic
information for this task. We also analyze the deficiencies of available
benchmarking datasets and introduce our own dataset that was created on the
basis of compositionality. We validate our algorithm across datasets and
present state-of-the-art results.