Using TERp to augment the system combination for SMT
TER-Plus (TERp) is an extended TER evaluation metric incorporating morphology, synonymy and paraphrases.
There are three new edit operations in TERp: Stem Matches, Synonym Matches and Phrase Substitutions (Paraphrases). In this paper, we propose a TERp-based augmented system combination in terms of the backbone selection and consensus decoding network. Combining the new properties of TERp, we also propose a two-pass decoding strategy for the lattice-based phrase-level confusion network (CN) to generate the final result. The experiments conducted on the NIST2008 Chinese-to-English test set show that our TERp-based augmented system combination framework achieves significant improvements in terms of BLEU and TERp scores compared to the state-of-the-art word-level system combination framework and a TER-based combination strategy.
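As a rough illustration of the edit-rate idea that TER and TERp build on, the sketch below computes a word-level edit distance normalized by the reference length. It is a simplification made for illustration only: it omits TER's block shifts and TERp's stem, synonym and paraphrase matches, and the function names `word_edit_distance` and `shiftless_edit_rate` are hypothetical rather than taken from the paper or from any TERp tool.

```python
def word_edit_distance(hyp, ref):
    """Levenshtein distance over word tokens (insert, delete, substitute)."""
    prev = list(range(len(ref) + 1))  # distance from empty hypothesis
    for i, h in enumerate(hyp, start=1):
        curr = [i]  # distance from hyp[:i] to empty reference
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def shiftless_edit_rate(hypothesis, reference):
    """Edits per reference word; no shifts, stems, synonyms or paraphrases."""
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(hyp, ref) / len(ref)


if __name__ == "__main__":
    print(shiftless_edit_rate("the cat sat on mat", "the cat sat on the mat"))
```

A full TERp scorer would additionally let a stem or synonym match count as a cheaper edit than a plain substitution, which is what makes the metric useful for aligning hypotheses from different systems.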
Index-free Heat Kernel Coefficients
Using index-free notation, we present the diagonal values of the first five
heat kernel coefficients associated with a general Laplace-type operator on a
compact Riemannian space without boundary. The fifth coefficient appears here
for the first time. For a flat space with a gauge connection, the sixth
coefficient is given too. Also provided are the leading terms for any
coefficient, both in ascending and descending powers of the Yang-Mills and
Riemann curvatures, to the same order as required for the fourth coefficient.
These results are obtained by directly solving the relevant recursion
relations, working in Fock-Schwinger gauge and Riemann normal coordinates. Our
procedure is thus noncovariant, but we show that for any coefficient the
`gauged' respectively `curved' version is found from the corresponding
`non-gauged' respectively `flat' coefficient by making some simple covariant
substitutions. These substitutions being understood, the coefficients retain
their `flat' form and size. In this sense the fifth and sixth coefficients have
only 26 and 75 terms respectively, allowing us to write them down. Using
index-free notation also clarifies the general structure of the heat kernel
coefficients. In particular, in flat space we find that from the fifth
coefficient onward, certain scalars are absent. This may be relevant for the
anomalies of quantum field theories in ten or more dimensions. Comment: 38 pages, LaTe
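For orientation, the coefficients discussed in this abstract are those of the standard small-time expansion of the heat kernel on the diagonal. The form below is the textbook expansion for a Laplace-type operator $D$ on a $d$-dimensional compact manifold without boundary; the indexing with $a_0 = 1$ is an assumption about the paper's convention, not something stated in the abstract.

```latex
K(t; x, x) \;=\; \langle x |\, e^{-tD} \,| x \rangle
\;\sim\; \frac{1}{(4\pi t)^{d/2}} \sum_{n=0}^{\infty} a_n(x)\, t^{n},
\qquad t \to 0^{+}, \qquad a_0(x) = 1 .
```

Under this indexing the $t$-independent term in $d$ dimensions is $a_{d/2}(x)$, which is consistent with the abstract linking the fifth coefficient to theories in ten dimensions.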
Sequence alignment, mutual information, and dissimilarity measures for constructing phylogenies
Existing sequence alignment algorithms use heuristic scoring schemes which
cannot be used as objective distance metrics. Therefore one relies on measures
like the p- or log-det distances, or makes explicit, and often simplistic,
assumptions about sequence evolution. Information theory provides an
alternative in the form of mutual information (MI), which is, in principle, an
objective and model-independent similarity measure. MI can be estimated by
concatenating and zipping sequences, yielding thereby the "normalized
compression distance". So far this has produced promising results, but with
uncontrolled errors. We describe a simple approach to get robust estimates of
MI from global pairwise alignments. Using standard alignment algorithms, this
gives for animal mitochondrial DNA estimates that are strikingly close to
estimates obtained from the alignment-free methods mentioned above. Our main
result uses algorithmic (Kolmogorov) information theory, but we show that
similar results can also be obtained from Shannon theory. Because it is not
additive, the normalized compression distance is not an optimal metric
for phylogenetics, but we propose a simple modification that overcomes this
lack of additivity. We test several versions of our MI-based distance measures
on a large number of randomly chosen quartets and demonstrate that they all
perform better than traditional measures like the Kimura or log-det (resp.
paralinear) distances. Even a simplified version based on single-letter Shannon
entropies, which can be easily incorporated in existing software packages, gave
superior results throughout the entire animal kingdom. But we see the main
virtue of our approach in a more general way. For example, it can also help to
judge the relative merits of different alignment algorithms, by estimating the
significance of specific alignments. Comment: 19 pages + 16 pages of supplementary materia
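The two kinds of estimate mentioned in this abstract can be sketched in a few lines. The first function computes the normalized compression distance in its usual form, (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), with `zlib` standing in for the compressor. The second estimates a single-letter Shannon mutual information from the columns of a pairwise alignment. Both are minimal sketches under stated assumptions: the choice of `zlib`, the gap symbol `-`, the skipping of gapped columns, and the function names are illustrative and are not the authors' actual procedure.

```python
import math
import zlib
from collections import Counter


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


def single_letter_mi(a: str, b: str, gap: str = "-") -> float:
    """Mutual information (bits per column) of two aligned sequences.

    Columns containing a gap in either sequence are skipped (an assumption).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(p, q) for p, q in zip(a, b) if p != gap and q != gap]
    if not pairs:
        return 0.0
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(p for p, _ in pairs)
    right = Counter(q for _, q in pairs)
    mi = 0.0
    for (p, q), c in joint.items():
        # p(p,q) * log2( p(p,q) / (p(p) * p(q)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (left[p] * right[q]))
    return mi


if __name__ == "__main__":
    print(ncd(b"ACGTACGTACGT" * 50, b"ACGTACGTACGT" * 50))
    print(single_letter_mi("ACGT-ACGT", "ACGTTACGA"))
```

A real compressor only approximates Kolmogorov complexity, so `ncd` values from this sketch depend on the compressor and on sequence length; the single-letter estimate ignores correlations between neighbouring positions.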