7 research outputs found

    The Caucasian language material in Evliya Çelebi’s "Travel Book" : a revision

    When Robert BLEICHSTEINER published, in 1934, the Caucasian language specimina contained in the "travel book" of the 17th century Turkish writer Evliya Çelebi, he was struck by the degree of reliability he found in Evliya’s notations: "(The language specimens) are, apart from isolated misunderstandings and once the copyists' faulty pointing and errors are discounted, rendered extraordinarily well, at times even with a certain phonetic skill, which is high testimony to Evliya's powers of perception and his diligence. One must bear in mind how hard it is for the Arabic alphabet, without the additional diacritical marks used by the Islamic peoples of the Caucasus, to render these intricate sound systems, which often comprise more than 70 different phonemes. If the decipherment of the specimens has nevertheless largely succeeded, one must pay boundless admiration to the unusual gifts of the Turkish traveller and scholar" (85). ...

    Computational Analysis of Morphosyntactic Categories in Georgian.

    This thesis describes the development of part-of-speech (POS) tagging resources for the Georgian language, consisting of (i) a new morphosyntactic language model for POS tagging; (ii) guidelines for tagging and post-editing; (iii) the KATAG tagset; and (iv) the trained parameter files that the probabilistic TreeTagger program needs to work on Georgian texts. The KATAG tagset is defined in accordance with the new morphosyntactic model of the language and with a set of design principles and tagging guidelines. Tagging is performed stochastically: TreeTagger, a probabilistic part-of-speech tagging program, has been trained on Georgian texts, and the justification for this choice is discussed. I use two tokenisation approaches in part-of-speech tagging. An accuracy of 92.41% was achieved with an enclitic tokenisation approach and 87.13% with a non-enclitic one, corroborating my hypothesis that treating enclitic elements separately from their host words results in better tagging performance. To make the tagger easily adaptable to a range of inputs (type, variety or genre of text), the performance of TreeTagger was evaluated on a test set covering five genres: academic, informal, legal, fiction and news.

    Tibetan

    Reconstructing word order in Proto-Germanic: A comparative Branching Direction Theory (BDT) analysis of Old Saxon

    286 p. The aim of this thesis is to reconstruct the word order of the common ancestor shared by all Germanic languages. To that end, the author takes an approach that is related to earlier attempts and at the same time innovative: the research is based on Branching Direction Theory (Dryer, 1992). This theory is the outcome of research on the typological universals of word order. Moreover, much of the data used is drawn from a very little-studied Germanic language, namely Old Saxon. The results improve on the findings of previous research.

    Труды (Proceedings)
