17 research outputs found

    Taking MT evaluation metrics to extremes: beyond correlation with human judgments

    Automatic Machine Translation (MT) evaluation is an active field of research, with a handful of new metrics devised every year. Evaluation metrics are generally benchmarked against manual assessment of translation quality, with performance measured in terms of overall correlation with human scores. Much work has been dedicated to the improvement of evaluation metrics to achieve a higher correlation with human judgments. However, little insight has been provided regarding the weaknesses and strengths of existing approaches and their behavior in different settings. In this work we conduct a broad meta-evaluation study of the performance of a wide range of evaluation metrics focusing on three major aspects. First, we analyze the performance of the metrics when faced with different levels of translation quality, proposing a local dependency measure as an alternative to the standard, global correlation coefficient. We show that metric performance varies significantly across different levels of MT quality: metrics perform poorly when faced with low-quality translations and are not able to capture nuanced quality distinctions. Interestingly, we show that evaluating low-quality translations is also more challenging for humans. Second, we show that metrics are more reliable when evaluating neural MT than the traditional statistical MT systems. Finally, we show that the difference in the evaluation accuracy for different metrics is maintained even if the gold standard scores are based on different criteria.

    Improving Statistical Machine Translation using Morpho-syntactic Information

    In the framework of statistical machine translation, correspondences between the words in the source and the target language are learned from bilingual corpora, and often little or no linguistic knowledge is used to structure the underlying models. The work presented in this thesis is motivated by the well-known observation that training data typically does not sufficiently represent the range of phenomena in natural languages. In this thesis, various methods of incorporating morphological and syntactic information into systems for statistical machine translation are proposed and systematically assessed. The overall goal is to improve translation quality and to reduce the amount of parallel text necessary to train the model parameters. The development of the suggested methods is guided by the analysis of important causes of errors.

    Towards an Automatic Sign Language Translation System

    Overview: It was not a front-page story, but it was still a rather interesting one. The newspaper article said that deaf students at Gallaudet University in Washington D.C. were celebrating the tenth anniversary of King Jordan as the president of the university. Many of the students said that Jordan was the most important president in the history of the school. Founded in 1864, Gallaudet was the first liberal arts college for deaf people. In 1989, student protests at Gallaudet shut down the university. What did the students want? After 125 years, they felt it was time that the university was run by a deaf president. After days of protest, the university agreed, and Jordan, who had been deafened at age twenty-one, was named the first deaf president in the history of the school. From the article, it was clear that deaf people did not necessarily consider themselves handicapped, but rather consider themselves to be members of a distinct culture. Rather than consider their deafness a disability, they see it as a badge of uniqueness that allows them to be in the world and perceive it in different ways than other people. This bond can be so strong that they even reject technological and medical advances that would allow them to hear. They feel that being deaf is not a disadvantage but rather should be considered another culture in our society. Causes of deafness: What causes deafness? There are several answers to this question, but one important cause is genetics. This means that someone in the family had this condition, which was passed on to the child. This condition is passed on to a child by a recessive gene that both parents must have. Because there might not have been a deaf person on either side of the family for a long time, some parents are often surprised and confused as to why their child cannot hear. In addition to genetics, there are other causes of deafness. Some of the other leading causes of deafness include disease, drugs, injury, and aging. Some illnesses that can cause deafness include meningitis, lupus, rubella, rheumatoid arthritis, and diabetes. Benign or malignant growths in various parts of the ear may also cause deafness. Some medications are also known to cause deafness. Injury, such as a blow to the head or continuous exposure to excessive noise, can cause deafness. Another leading cause of deafness is aging: older people often experience gradual hearing loss. Medical research into causes and treatments continues. The information mentioned above is only a very small part of what is available for learning. Communication: There are several different ways to communicate with someone who is deaf or hard of hearing. Within a deaf community, communication can be combined with speech-reading (lipreading), writing, signing, and/or finger spelling. Which method is used usually depends on the person’s preference and communication abilities in the conversation. The preference in communication may depend on the schooling the person has, the person’s age at the onset of the deafness, the environment (light, dark, noise, distance), and technology. Before separate schools were established for the deaf, deaf people’

    Improving SMT quality with morpho-syntactic analysis

    In the framework of statistical machine translation (SMT), correspondences between the words in the source and the target language are learned from bilingual corpora on the basis of so-called alignment models. Many of the statistical systems use little or no linguistic knowledge to structure the underlying models. In this paper we argue that training data is typically not large enough to sufficiently represent the range of different phenomena in natural languages and that SMT can take advantage of the explicit introduction of some knowledge about the languages under consideration. The improvement of the translation results is demonstrated on two different German-English corpora. 1 Introduction: In this paper, we address the question of how morphological and syntactic analysis can help statistical machine translation (SMT). In our approach, we introduce several transformations to the source string (in our experiments the source language is German) to demonstrate how linguistic knowledge can..