Machine translation evaluation resources and methods: a survey
We present a survey of Machine Translation (MT) evaluation that covers both manual and automatic evaluation methods. The traditional human evaluation criteria mainly include intelligibility, fidelity, fluency, adequacy, comprehension, and informativeness. The advanced human assessments include task-oriented measures, post-editing, segment ranking, and extended criteria. We classify the automatic evaluation methods into two categories: lexical similarity and linguistic features. The lexical similarity methods cover edit distance, precision, recall, F-measure, and word order. The linguistic features are divided into syntactic and semantic features. The syntactic features include part-of-speech tags, phrase types, and sentence structures; the semantic features include named entities, synonyms, textual entailment, paraphrase, semantic roles, and language models. Deep learning models for evaluation have been proposed only very recently. We also introduce methods for evaluating MT evaluation itself, including different correlation scores, and the recent quality estimation (QE) tasks for MT.
This paper differs from the existing works \cite{GALEprogram2009, EuroMatrixProject2007} in several respects: it introduces recent developments in MT evaluation measures, classifies the measures from manual to automatic evaluation, covers the recent QE tasks for MT, and presents the content concisely.
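As an illustration of the lexical similarity measures named in the abstract above, the following is a minimal sketch, not taken from the survey itself. It assumes whitespace tokenisation and a single reference translation; the function names `unigram_prf` and `word_edit_distance` are hypothetical.

```python
from collections import Counter


def unigram_prf(candidate, reference):
    """Unigram precision, recall and F-measure of a candidate against one reference."""
    cand = candidate.split()
    ref = reference.split()
    # Clipped overlap: each reference word can be matched at most as often as it occurs.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    precision = overlap / len(cand) if cand else 0.0
    recall = overlap / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


def word_edit_distance(candidate, reference):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    a, b = candidate.split(), reference.split()
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, 1):
        curr = [i]
        for j, word_b in enumerate(b, 1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # delete word_a
                            curr[j - 1] + 1,      # insert word_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


p, r, f = unigram_prf("the cat sat on the mat", "the cat is on the mat")
distance = word_edit_distance("the cat sat on the mat", "the cat is on the mat")
```

In this toy pair, five of six words match in each direction and one substitution separates the sentences. Real metrics typically add normalisation, multiple references, and word-order penalties on top of such counts.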
Algorithmic Aspects of a General Modular Decomposition Theory
A new general decomposition theory inspired by modular graph decomposition
is presented. This helps unify modular decomposition on different
structures, including (but not restricted to) graphs. Moreover, even in the
case of graphs, the terminology ``module'' not only captures the classical
graph modules but also allows handling 2-connected components, star-cutsets,
and other vertex subsets. The main result is that most of the nice algorithmic
tools developed for modular decomposition of graphs still apply efficiently on
our generalisation of modules. Besides, when an essential axiom is satisfied,
almost all the important properties can be retrieved. For this case, an
algorithm given by Ehrenfeucht, Gabow, McConnell and Sullivan (1994) is
generalised and yields a very efficient solution to the associated
decomposition problem.
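For readers unfamiliar with the classical notion that the abstract above generalises, the sketch below checks whether a vertex subset is a module of an undirected graph: every vertex outside the subset must be adjacent to all of its members or to none. This is only the standard textbook definition, not the paper's generalised theory or its algorithm; the adjacency-dictionary representation and the name `is_module` are assumptions made for illustration.

```python
def is_module(adj, subset):
    """Return True if `subset` is a module of the undirected graph `adj`.

    `adj` maps each vertex to the set of its neighbours.
    """
    members = set(subset)
    for vertex, neighbours in adj.items():
        if vertex in members:
            continue
        hits = len(neighbours & members)
        # An outside vertex must see either none or all of the subset.
        if hits not in (0, len(members)):
            return False
    return True


# Path a - b - c: the endpoints are indistinguishable from b's point of view.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
endpoints_form_module = is_module(path, {"a", "c"})
edge_forms_module = is_module(path, {"a", "b"})
```

Here `{"a", "c"}` is a module, while `{"a", "b"}` is not, because `c` is adjacent to `b` but not to `a`.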
Lossless Compression of Color Palette Images with One-Dimensional Techniques
Palette images are widely used on the World Wide Web (WWW) and in game-cartridge applications. Many images used on the WWW are stored and transmitted after lossless compression with the standard graphics interchange format (GIF) or portable network graphics (PNG). Well-known 2-D compression schemes, such as JPEG-LS and JPEG-2000, fail to yield better compression than GIF or PNG because the pixel values are indices that point to color values in a look-up table. To improve the compression performance of JPEG-LS and JPEG-2000, several researchers have proposed various reindexing algorithms. We investigate various compression techniques for color palette images. We propose a new technique composed of a traveling salesman problem (TSP)-based reindexing scheme, the Burrows-Wheeler transformation, and inversion ranks. We show that the proposed technique yields better compression gain on average than all the other 1-D compressors and the reindexing schemes that utilize JPEG-LS or JPEG-2000.
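The Burrows-Wheeler transformation mentioned in the abstract above can be sketched as follows. This is a naive, quadratic-space illustration of the transform and its inverse on a sequence of palette indices. It covers only that one stage, not the TSP-based reindexing or the inversion ranks, and it is not the authors' implementation. The sentinel value `-1` and the function names are assumptions; the sentinel must be smaller than every real index.

```python
SENTINEL = -1  # assumed end marker, smaller than any palette index


def bwt(indices):
    """Burrows-Wheeler transform via sorted rotations (naive, for illustration)."""
    seq = list(indices) + [SENTINEL]
    rotations = sorted(seq[i:] + seq[:i] for i in range(len(seq)))
    # The transform is the last column of the sorted rotation table.
    return [rotation[-1] for rotation in rotations]


def inverse_bwt(last_column):
    """Rebuild the original sequence by repeatedly prepending and sorting."""
    table = [[] for _ in last_column]
    for _ in last_column:
        table = sorted([symbol] + row for symbol, row in zip(last_column, table))
    # The original sequence is the rotation that ends with the sentinel.
    original = next(row for row in table if row[-1] == SENTINEL)
    return original[:-1]


pixels = [1, 0, 2, 0, 2, 0]        # made-up row of palette indices
transformed = bwt(pixels)          # equal indices tend to cluster together
restored = inverse_bwt(transformed)
```

The transform is a permutation, so no information is lost; it tends to group equal symbols, which a later ranking or entropy-coding stage can exploit. Practical implementations use suffix arrays rather than materialising every rotation.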
Similarity of Semantic Relations
There are at least two kinds of similarity. Relational similarity is
correspondence between relations, in contrast with attributional similarity,
which is correspondence between attributes. When two words have a high
degree of attributional similarity, we call them synonyms. When two pairs
of words have a high degree of relational similarity, we say that their
relations are analogous. For example, the word pair mason:stone is analogous
to the pair carpenter:wood. This paper introduces Latent Relational Analysis (LRA),
a method for measuring relational similarity. LRA has potential applications in many
areas, including information extraction, word sense disambiguation,
and information retrieval. Recently the Vector Space Model (VSM) of information
retrieval has been adapted to measuring relational similarity,
achieving a score of 47% on a collection of 374 college-level multiple-choice
word analogy questions. In the VSM approach, the relation between a pair of words is
characterized by a vector of frequencies of predefined patterns in a large corpus.
LRA extends the VSM approach in three ways: (1) the patterns are derived automatically
from the corpus, (2) the Singular Value Decomposition (SVD) is used to smooth the frequency
data, and (3) automatically generated synonyms are used to explore variations of the
word pairs. LRA achieves 56% on the 374 analogy questions, statistically equivalent to the
average human score of 57%. On the related problem of classifying semantic relations, LRA
achieves similar gains over the VSM.
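The VSM approach described in the abstract above represents each word pair as a vector of pattern frequencies. One common way to compare such vectors is the cosine of the angle between them, which the sketch below assumes; the abstract does not state the exact measure, and the frequency values here are invented for illustration rather than taken from any corpus. The sketch omits the SVD smoothing and synonym variations that LRA adds, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_analogy(stem_vector, choice_vectors):
    """Index of the choice whose relation vector is most similar to the stem's."""
    return max(range(len(choice_vectors)),
               key=lambda i: cosine(stem_vector, choice_vectors[i]))


# Made-up pattern counts for the stem pair (e.g. mason:stone) and three choices.
stem = [3, 1, 0]
choices = [[0, 0, 5], [6, 2, 0], [1, 1, 1]]
answer = best_analogy(stem, choices)
```

With these invented counts the second choice is selected, because its vector points in the same direction as the stem's even though its frequencies are twice as large.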