5 research outputs found

    The Effects of Turkish Vowel Harmony In Word Recognition

    This thesis examines the effects of Turkish vowel harmony in visual word recognition. Turkish is one of a small number of languages with vowel harmony, a process in which words contain vowels from only one vowel category. These categories are defined by the vowel’s phonological qualities (i.e., similar mouth and lip movement in pronunciation). In Turkish, categories depend on the vowel’s roundness/flatness, its backness/frontness, and its pitch (high or low). Vowel harmony occurs naturally in language and is not taught formally. Instead, it is believed to arise because words with vowel harmony require less effort in speech production (Khalilzadeh, 2010). Vowel harmony is very common in Turkish, with over half of all Turkish words (root words and affixes) containing vowel harmony (Güngör, 2003). Turkish is particularly interesting because it contains two types of vowel harmony: primary and secondary. Primary vowel harmony depends on the frontness and backness of vowels, and secondary vowel harmony depends on whether a vowel is high or low pitch, in addition to its roundness and flatness. Although vowel harmony is very common, disharmony exists among some native Turkish words, foreign loanwords and compounds. Vowel harmony was explored in this thesis within the context of reading, with a focus on primary vowel harmony. There were two studies, the second comprising three experiments. The first study consisted of the development of a database of words in Turkish. The database includes all words from an obtained Turkish lexicon, the number of vowels in each word, the word length, whether the word has primary or secondary vowel harmony, word frequency and the syllabified version of each word. The second study consisted of three separate lexical decision task experiments, each with 30 Turkish-speaking participants.
Experiment 1 consisted of a straight lexical decision task, with a 3 (Target harmony type: front harmony, back harmony, no harmony) x 2 (Target type: word, nonword) design. Experiments 2 and 3 were masked priming studies with word (Experiment 2) and nonword (Experiment 3) primes, in a 3 (Prime harmony type: front harmony, back harmony, no harmony) x 3 (Target harmony type: front harmony, back harmony, no harmony) x 2 (Target type: word, nonword) design. As predicted for Experiment 1, words with vowel harmony had faster and more accurate responses than words without vowel harmony. Nonwords with back vowel harmony had slower and less accurate responses than nonwords without harmony, which was also in line with the prediction. For Experiments 2 and 3, it was predicted that matching harmony types (i.e., front vowel harmony prime - front vowel harmony target) would have faster and more accurate responses. Results of Experiment 2 did not support the prediction in either latency or accuracy. Results of Experiment 3 supported the prediction in both latency and accuracy. Overall, the results of these experiments suggest that primary vowel harmony facilitates word recognition. This is believed to occur due to the use of phonemic cues in word recognition. Past research has shown that both phonology and orthography are involved in word recognition, especially in languages with shallow orthography such as Turkish (Frost, 1998; Katz & Frost, 1992). In addition, it has been shown that words with harmony are easier to pronounce (Walker, 2005). Word recognition could therefore have been facilitated because words with vowel harmony form a phonological category that is easier to pronounce.
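The three-way harmony coding used in the designs above (front harmony, back harmony, no harmony) can be illustrated with a minimal sketch. It assumes the standard grouping of Turkish vowels into front (e, i, ö, ü) and back (a, ı, o, u); the function name `primary_harmony` and the returned labels are hypothetical and are not taken from the thesis or its database. Circumflexed vowels (â, î, û) are ignored for simplicity.

```python
# Standard Turkish vowel classes for primary (front/back) harmony.
FRONT_VOWELS = set("eiöü")
BACK_VOWELS = set("aıou")


def primary_harmony(word: str) -> str:
    """Label a word 'front', 'back', or 'none' (hypothetical labels)."""
    # Map the Turkish capitals explicitly before lowercasing, because
    # Python's default lower() turns "I" into "i" rather than "ı".
    lowered = word.replace("I", "ı").replace("İ", "i").lower()
    vowels = [ch for ch in lowered if ch in FRONT_VOWELS or ch in BACK_VOWELS]
    if not vowels:
        return "none"  # no recognised vowels, so nothing to harmonise
    if all(v in FRONT_VOWELS for v in vowels):
        return "front"
    if all(v in BACK_VOWELS for v in vowels):
        return "back"
    return "none"  # mixed front and back vowels: disharmonic


if __name__ == "__main__":
    for w in ["evler", "kapılar", "kitap"]:
        print(w, primary_harmony(w))
```

Under this sketch a single-vowel word is trivially harmonic, which may or may not match the coding adopted in the thesis.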

    Large vocabulary recognition for online Turkish handwriting with sublexical units

    We present a system for large vocabulary recognition of online Turkish handwriting, using hidden Markov models. While using a traditional approach for the recognizer, we have identified and developed solutions for the main problems specific to Turkish handwriting recognition. First, since large amounts of Turkish handwriting samples are not available, the system is trained and optimized using the large UNIPEN dataset of English handwriting, before extending it to Turkish using a small Turkish dataset. Second, delayed strokes, which are a significant source of variation in writing order due to the large number of diacritical marks in Turkish, are removed during preprocessing. Finally, as a solution to the high out-of-vocabulary rates encountered when using a fixed-size lexicon in general purpose recognition, a lexicon is constructed from sublexical units (stems and endings) learned from a large Turkish corpus. A statistical bigram language model learned from the same corpus is also applied during the decoding process. The system obtains a 91.7% word recognition rate when tested on a small Turkish handwritten word dataset using a medium-sized (1950 words) lexicon corresponding to the vocabulary of the test set, and 63.8% using a large, general purpose lexicon (130,000 words). However, with the proposed stem+ending lexicon (12,500 words) and bigram language model with lattice expansion, a 67.9% word recognition accuracy is obtained, surpassing the results obtained with the general purpose lexicon while using a much smaller one.

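    An optical character recognition application on mobile devices using artificial neural networks with the negative correlation learning algorithm (Negatif bağlantılı öğrenme algoritmalı yapay sinir ağları ile mobil cihazlarda optik karakter tanıma uygulaması)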

    Opened to full-text access in accordance with the “Law Amending the Higher Education Law and Certain Laws and Decree-Laws”, published in Official Gazette No. 30352 dated 06.03.2018, and the “Directive on the Collection, Organization, and Opening to Access of Graduate Theses in Electronic Form” dated 18.06.2018.

    A character recognizer for Turkish language

    This paper presents a contextual post-processing subsystem for a Turkish machine-printed character recognition system. The subsystem is based on positional binary 3-gram statistics for Turkish, an error-correcting parser, and a lexicon containing root words and their inflected forms. The error-correcting parser corrects CR alternatives using Turkish morphology.
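The abstract does not define its positional binary 3-gram statistics, so the following is only one plausible reading, sketched under stated assumptions: a table records, as a yes/no fact, whether a given three-letter sequence occurs at a given position in any lexicon word, and a recognizer alternative is rejected if it contains a positional 3-gram never seen in the lexicon. The function names and the toy lexicon are hypothetical and are not from the paper.

```python
def positional_trigrams(word: str) -> set[tuple[int, str]]:
    """Return every (start position, 3-letter sequence) pair in the word."""
    return {(i, word[i:i + 3]) for i in range(len(word) - 2)}


def build_table(lexicon: list[str]) -> set[tuple[int, str]]:
    """Collect the positional 3-grams attested anywhere in the lexicon.

    Membership in the returned set plays the role of the binary
    (seen / not seen) statistic assumed in this sketch.
    """
    table: set[tuple[int, str]] = set()
    for word in lexicon:
        table |= positional_trigrams(word)
    return table


def is_plausible(candidate: str, table: set[tuple[int, str]]) -> bool:
    """A candidate passes only if all of its positional 3-grams are attested."""
    return positional_trigrams(candidate) <= table


def filter_alternatives(alternatives: list[str], table) -> list[str]:
    """Keep the recognizer alternatives that pass the 3-gram check."""
    return [alt for alt in alternatives if is_plausible(alt, table)]


if __name__ == "__main__":
    table = build_table(["kitap", "kalem"])  # toy lexicon, illustration only
    print(filter_alternatives(["kitap", "kilap"], table))
```

Words shorter than three letters have no 3-grams and pass this check vacuously, so in practice such a filter would sit alongside the lexicon lookup and morphological parser that the abstract mentions.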