Using TERp to augment the system combination for SMT

Author: Du Jinhua
Way Andy
Publication venue: Association for Machine Translation in the Americas
Publication date: 01/01/2010

TER-Plus (TERp) is an extended TER evaluation metric incorporating morphology, synonymy and paraphrases. There are three new edit operations in TERp: Stem Matches, Synonym Matches and Phrase Substitutions (Paraphrases). In this paper, we propose a TERp-based augmented system combination in terms of the backbone selection and consensus decoding network. Combining the new properties of TERp, we also propose a two-pass decoding strategy for the lattice-based phrase-level confusion network (CN) to generate the final result. The experiments conducted on the NIST2008 Chinese-to-English test set show that our TERp-based augmented system combination framework achieves significant improvements in terms of BLEU and TERp scores compared to the state-of-the-art word-level system combination framework and a TER-based combination strategy.
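As a rough illustration of the edit-rate idea that TER and TERp build on, the sketch below computes a word-level edit distance and divides it by the reference length. It is a deliberately simplified stand-in, not the metric from the paper: it omits the block shifts of TER and the stem, synonym and paraphrase operations of TERp, and the function names `word_edit_distance` and `simplified_ter` are hypothetical.

```python
def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance (insertions, deletions, substitutions).

    Simplification: no block shifts (TER) and no stem, synonym or
    paraphrase matches (TERp); words must match exactly.
    """
    prev = list(range(len(ref) + 1))  # distance from empty hypothesis
    for i, h in enumerate(hyp, start=1):
        curr = [i]  # distance from hyp[:i] to empty reference
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def simplified_ter(hypothesis, reference):
    """Edits per reference word; lower is better."""
    hyp = hypothesis.split()
    ref = reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(hyp, ref) / len(ref)


score = simplified_ter("the cat sat on the mat", "the cat sat on a mat")
```

Here `score` is the number of word edits divided by the six reference words. A TERp-style metric would additionally let a stem or synonym pair count as a cheaper match than a full substitution.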

Irish Universities

DCU Online Research Access Service

Index-free Heat Kernel Coefficients

Author: Amsterdamski P
Anton E M van de Ven
Atiyah M F
Avramidi I G
Belger M
Belkov A A
Bertlmann R A
Branson T P
Delbourgo R
DeWitt B S
Dowker J S
Fateev V A
Fliegner D
Fock V A
Fulling S A
Gilkey P B
Hadamard J
Kirsten K
Lüscher M
McLenaghan R G
Müller U
Nielsen N K
Parker L
Romanov V N
Sakai T
Schwinger J S
Shore G M
Vermaseren J A M
Willmore T J
Wolfram S
Publication venue: IOP Publishing
Publication date: 28/08/1997

Using index-free notation, we present the diagonal values of the first five heat kernel coefficients associated with a general Laplace-type operator on a compact Riemannian space without boundary. The fifth coefficient appears here for the first time. For a flat space with a gauge connection, the sixth coefficient is given too. Also provided are the leading terms for any coefficient, both in ascending and descending powers of the Yang-Mills and Riemann curvatures, to the same order as required for the fourth coefficient. These results are obtained by directly solving the relevant recursion relations, working in Fock-Schwinger gauge and Riemann normal coordinates. Our procedure is thus noncovariant, but we show that for any coefficient the 'gauged' or 'curved' version is found from the corresponding 'non-gauged' or 'flat' coefficient, respectively, by making some simple covariant substitutions. These substitutions being understood, the coefficients retain their 'flat' form and size. In this sense the fifth and sixth coefficients have only 26 and 75 terms respectively, allowing us to write them down. Using index-free notation also clarifies the general structure of the heat kernel coefficients. In particular, in flat space we find that from the fifth coefficient onward, certain scalars are absent. This may be relevant for the anomalies of quantum field theories in ten or more dimensions. Comment: 38 pages, LaTe
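For orientation, the "heat kernel coefficients" in this abstract are conventionally defined through the small-time expansion of the heat kernel on the diagonal. The form below is the standard textbook one and is offered only as a sketch: the symbols $\Delta$ (the Laplace-type operator), $d$ (the dimension) and $a_n(x)$ are assumptions of this note, and the paper's own normalization, sign conventions and numbering of the coefficients may differ.

```latex
K(t; x, x) \;=\; \langle x \,|\, e^{-t\Delta} \,|\, x \rangle
\;\sim\; \frac{1}{(4\pi t)^{d/2}} \sum_{n=0}^{\infty} a_n(x)\, t^{n},
\qquad t \to 0^{+}.
```

Substituting such an ansatz into the heat equation relates each $a_n$ to the preceding one, which is the kind of recursion relation the abstract refers to solving.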

arXiv.org e-Print Archive

Crossref

CERN Document Server

Sequence alignment, mutual information, and dissimilarity measures for constructing phylogenies

Author: A Kraskov
A Milosavljević
G Navarro
J Felsenstein
J Lake
J Rissanen
J Thompson
J Varre
Konrad Scheffler
L Allison
M Brudno
M Cao
M Li
M Mahoney
M Nei
M Steel
Maya Paczuski
N Bray
N Saitou
Orion Penner
P Buneman
P Lockhart
P Viola
Peter Grassberger
R Cilibrasi
R Durbin
S Altschul
S McGinnis
S Vinga
T Cover
T Lassmann
W Press
X Chen
Publication venue: Public Library of Science (PLoS)
Publication date: 19/08/2010

Existing sequence alignment algorithms use heuristic scoring schemes which cannot be used as objective distance metrics. Therefore one relies on measures like the p- or log-det distances, or makes explicit, and often simplistic, assumptions about sequence evolution. Information theory provides an alternative, in the form of mutual information (MI), which is, in principle, an objective and model-independent similarity measure. MI can be estimated by concatenating and zipping sequences, thereby yielding the "normalized compression distance". So far this has produced promising results, but with uncontrolled errors. We describe a simple approach to get robust estimates of MI from global pairwise alignments. Using standard alignment algorithms, this gives for animal mitochondrial DNA estimates that are strikingly close to estimates obtained from the alignment-free methods mentioned above. Our main result uses algorithmic (Kolmogorov) information theory, but we show that similar results can also be obtained from Shannon theory. Because it is not additive, normalized compression distance is not an optimal metric for phylogenetics, but we propose a simple modification that overcomes the issue of additivity. We test several versions of our MI-based distance measures on a large number of randomly chosen quartets and demonstrate that they all perform better than traditional measures like the Kimura or log-det (resp. paralinear) distances. Even a simplified version based on single-letter Shannon entropies, which can be easily incorporated in existing software packages, gave superior results throughout the entire animal kingdom. But we see the main virtue of our approach in a more general way. For example, it can also help to judge the relative merits of different alignment algorithms, by estimating the significance of specific alignments. Comment: 19 pages + 16 pages of supplementary materia
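The "concatenating and zipping" estimate mentioned in the abstract can be sketched with a general-purpose compressor. The example below uses the commonly cited definition of the normalized compression distance, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), with zlib standing in for the compressor C. The choice of zlib, the helper names and the synthetic test sequences are assumptions made for illustration; this is not the alignment-based estimator the paper proposes.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (a crude proxy for C)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # "concatenate and zip"
    return (cxy - min(cx, cy)) / max(cx, cy)


# Synthetic data (hypothetical, not real mitochondrial DNA).
rng = random.Random(0)
seq_a = bytes(rng.choice(b"ACGT") for _ in range(2000))

# seq_b: copy of seq_a with one point mutation every 100 bases.
mutated = bytearray(seq_a)
for i in range(0, len(mutated), 100):
    mutated[i] = next(b for b in b"ACGT" if b != mutated[i])
seq_b = bytes(mutated)

# seq_c: an independent random sequence of the same length.
rng_c = random.Random(1)
seq_c = bytes(rng_c.choice(b"ACGT") for _ in range(2000))

d_related = ncd(seq_a, seq_b)    # near-identical sequences
d_unrelated = ncd(seq_a, seq_c)  # unrelated sequences
```

The related pair should come out with a clearly smaller distance than the unrelated pair. The exact values depend on the compressor and its settings, which is one source of the uncontrolled errors the abstract mentions.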

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

IMT Institutional Repository