86 research outputs found
Context-aware graph segmentation for graph-based translation
In this paper, we present an improved
graph-based translation model which segments an input graph into node-induced
subgraphs by taking source context into
consideration. Translations are generated
by combining subgraph translations left-to-right using beam search. Experiments
on Chinese–English and German–English
demonstrate that the context-aware segmentation significantly improves the baseline
graph-based model.
Combining translation memories and syntax-based SMT: experiments with real industrial data
One major drawback of using Translation Memories (TMs) in phrase-based Machine
Translation (MT) is that only continuous phrases are considered. In contrast, syntax-based MT
allows phrasal discontinuity by learning translation rules containing non-terminals. In this paper,
we combine a TM with syntax-based MT via sparse features. These features are extracted during
decoding based on translation rules and their corresponding patterns in the TM. We have tested
this approach by carrying out experiments on real English–Spanish industrial data. Our results
show that these TM features significantly improve syntax-based MT. Our final system yields
improvements of up to +3.1 BLEU, +1.6 METEOR, and -2.6 TER when compared with a state-of-the-art phrase-based MT system.
Topic-informed neural machine translation
In recent years, neural machine translation (NMT) has demonstrated state-of-the-art machine
translation (MT) performance. It is a new approach to MT, which tries to learn a set of parameters
to maximize the conditional probability of target sentences given source sentences. In this paper,
we present a novel approach to improve the translation performance in NMT by conveying topic
knowledge during translation. The proposed topic-informed NMT can increase the likelihood of
selecting words from the same topic and domain for translation. Experimentally, we demonstrate
that topic-informed NMT can achieve a 1.15 (3.3% relative) and 1.67 (5.4% relative) absolute
improvement in BLEU score on the Chinese-to-English language pair using NIST 2004 and 2005
test sets, respectively, compared to NMT without topic information.
ProphetMT: controlled language authoring aid system description
This paper presents ProphetMT, a monolingual Controlled Language (CL) authoring tool which allows users to easily compose an
in-domain sentence with the help of tree-based SMT-driven auto-suggestions. The interface also visualizes target-language sentences
as they are built by the SMT system. When the user is finished composing, the final translation(s) are generated by a tree-based SMT
system using the text and structural information provided by the user. With this domain-specific controlled language, ProphetMT will
produce highly reliable translations. The contributions of this work are: 1) we develop a user-friendly auto-completion-based editor
which guarantees that the vocabulary and grammar chosen by a user are compatible with a tree-based SMT model; 2) by applying a
shift-reduce-like parsing feature, this editor allows users to write from left to right and generates the parsing results on the fly. Accordingly, with this in-domain composing restriction as well as the gold-standard parsing result, a highly reliable translation can be generated.
SongRewriter: A Chinese Song Rewriting System with Controllable Content and Rhyme Scheme
Although lyrics generation has achieved significant progress in recent years,
it has limited practical applications because the generated lyrics cannot be
performed without composing compatible melodies. In this work, we bridge this
practical gap by proposing a song rewriting system which rewrites the lyrics of
an existing song such that the generated lyrics are compatible with the rhythm
of the existing melody and thus singable. In particular, we propose
SongRewriter, a controllable Chinese lyric generation and editing system which
assists users without prior knowledge of melody composition. The system is
trained by a randomized multi-level masking strategy which produces a unified
model for generating entirely new lyrics or editing a few fragments. To improve
the controllability of the generation process, we further incorporate a keyword
prompt to control the lexical choices of the content and propose novel decoding
constraints and a vowel modeling task to enable flexible end and internal rhyme
schemes. While prior rhyming metrics are mainly for rap lyrics, we propose
three novel rhyming evaluation metrics for song lyrics. Both automatic and
human evaluations show that the proposed model performs better than the
state-of-the-art models in both content and rhyming quality. Our code and
models, implemented with the MindSpore Lite tool, will be made available.