Parallel Text Alignment
This thesis deals with the alignment of parallel texts. The first part describes approaches to alignment and several alignment tools. Statistical alignment is described briefly first, followed by dictionary-based alignment, which is the main topic of the thesis. The next part presents the principle of dictionary-based alignment and demonstrates it on a selected data sample. The conclusion summarizes the results obtained and offers proposals for future work on the topic.
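The abstract does not say how the thesis implements dictionary-based alignment, so the following is only a minimal sketch of one common way to do it: score each sentence pair by how many source words have a dictionary translation in the target sentence, then pick the best monotonic alignment by dynamic programming. The function names, the `skip_penalty` parameter, and the toy Czech-English data are all assumptions made for illustration.

```python
def overlap(src, tgt, dictionary):
    """Fraction of source tokens with a dictionary translation present in tgt."""
    if not src:
        return 0.0
    tgt_set = set(tgt)
    hits = sum(1 for w in src if dictionary.get(w, set()) & tgt_set)
    return hits / len(src)


def align_sentences(src_sents, tgt_sents, dictionary, skip_penalty=0.2):
    """Return monotonic 1-1 sentence links as (source_index, target_index)."""
    n, m = len(src_sents), len(tgt_sents)
    neg_inf = float("-inf")
    # best[i][j]: best score after consuming i source and j target sentences
    best = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if best[i][j] == neg_inf:
                continue
            moves = []
            if i < n and j < m:  # link source i with target j
                gain = overlap(src_sents[i], tgt_sents[j], dictionary)
                moves.append((i + 1, j + 1, gain))
            if i < n:            # leave source i unaligned
                moves.append((i + 1, j, -skip_penalty))
            if j < m:            # leave target j unaligned
                moves.append((i, j + 1, -skip_penalty))
            for ni, nj, gain in moves:
                if best[i][j] + gain > best[ni][nj]:
                    best[ni][nj] = best[i][j] + gain
                    back[ni][nj] = (i, j)
    # Walk back from the end and keep only the 1-1 links.
    links, i, j = [], n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        if pi == i - 1 and pj == j - 1:
            links.append((pi, pj))
        i, j = pi, pj
    return links[::-1]


# Toy data (hypothetical): the second source sentence has no counterpart.
toy_dictionary = {"pes": {"dog"}, "pije": {"drinks"},
                  "auto": {"car"}, "jede": {"drives", "goes"}}
toy_src = [["pes", "pije"], ["ano"], ["auto", "jede"]]
toy_tgt = [["the", "dog", "drinks"], ["the", "car", "drives"]]
toy_links = align_sentences(toy_src, toy_tgt, toy_dictionary)
```

On this toy input the middle source sentence is left unaligned, because linking it to either target sentence scores lower than skipping it. A real system would likely also allow 1-2 and 2-1 links and normalize for sentence length.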
Developing Word-aligned Myanmar-English Parallel Corpus based on the IBM Models
Word alignment in bilingual corpora has been an active research
topic in the machine translation community. A corpus is a
collection of texts that is useful for Natural Language
Processing (NLP). Parallel text alignment is the identification of
corresponding sentences in a parallel text. Large
collections of parallel text are a prerequisite for many areas of
linguistic research. A parallel corpus helps in building statistical
bilingual dictionaries, in supporting statistical machine translation,
and in providing training data for word sense disambiguation
and translation disambiguation. Today the world is a global
network, and more and more people learn more than one language,
so multilingual corpora are increasingly needed. The main
purpose of this system is therefore to construct a word-aligned parallel
corpus for use in Myanmar-English machine translation. One
useful concept is identifying correspondences between words in
one language and words in the other. The proposed approach is
based on the first three IBM models and the EM algorithm. The work also
shows that the approach can be improved by using a list of
cognates and morphological analysis.
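As a rough illustration of the EM training mentioned in the abstract, here is a minimal sketch of IBM Model 1 only, the simplest of the three models. It is not the authors' system: it omits the NULL word, Models 2 and 3, the cognate list, and morphological analysis, and it uses a toy English-German corpus rather than Myanmar-English data. The function names are hypothetical.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate lexical translation probabilities t[(f, e)] = P(f | e) with EM.

    pairs is a list of (source_tokens, target_tokens) sentence pairs.
    """
    tgt_vocab = {f for _, fs in pairs for f in fs}
    uniform = 1.0 / len(tgt_vocab)
    t = defaultdict(lambda: uniform)  # uniform initialization
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (f, e) links
        total = defaultdict(float)  # expected count of e being linked
        # E-step: distribute each target word over the source words.
        for es, fs in pairs:
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        # M-step: renormalize expected counts per source word.
        t = defaultdict(float)
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t


def align_words(es, fs, t):
    """Link each target position to its most probable source position."""
    return [(j, max(range(len(es)), key=lambda i: t[(f, es[i])]))
            for j, f in enumerate(fs)]


# Toy corpus (an assumption for illustration only).
toy_corpus = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]
toy_t = train_ibm_model1(toy_corpus)
toy_alignment = align_words(["the", "house"], ["das", "Haus"], toy_t)
```

After a few iterations the co-occurrence pattern pulls probability mass toward the pairs the/das, house/Haus, book/Buch, and a/ein, even though no dictionary is supplied. Models 2 and 3 would add position and fertility parameters on top of these lexical probabilities.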
Multiple Media Correlation: Theory and Applications
This thesis introduces multiple media correlation, a new technology for the automatic alignment of multiple media objects such as text, audio, and video. This research began with the question: what can be learned when multiple multimedia components are analyzed simultaneously? Most ongoing research in computational multimedia has focused on queries, indexing, and retrieval within a single media type. Video is compressed and searched independently of audio; text is indexed without regard to temporal relationships it may have to other media data. Multiple media correlation provides a framework for locating and exploiting correlations between multiple, potentially heterogeneous, media streams. The goal is computed synchronization, the determination of temporal and spatial alignments that optimize a correlation function and indicate commonality and synchronization between media objects. The model also provides a basis for comparison of media in unrelated domains. There are many real-world applications for this technology, including speaker localization, musical score alignment, and degraded media realignment. Two applications, text-to-speech alignment and parallel text alignment, are described in detail with experimental validation. Text-to-speech alignment computes the alignment between a textual transcript and speech-based audio. The presented solutions are effective for a wide variety of content and are useful not only for retrieval of content, but also in support of automatic captioning of movies and video. Parallel text alignment provides a tool for the comparison of alternative translations of the same document that is particularly useful to the classics scholar interested in comparing translation techniques or styles.
The results presented in this thesis include (a) new media models more useful in analysis applications, (b) a theoretical model for multiple media correlation, (c) two practical application solutions that have widespread applicability, and (d) Xtrieve, a multimedia database retrieval system that demonstrates this new technology and its application to information retrieval. This thesis demonstrates that computed alignment of media objects is practical and can provide immediate solutions to many information retrieval and content presentation problems. It also introduces a new area for research in media data analysis.