The Caucasian language material in Evliya Çelebi’s "Travel Book" : a revision
When Robert BLEICHSTEINER published the Caucasian language specimens contained in the "Travel Book" of the 17th-century Turkish writer Evliya Çelebi in 1934, he was struck by how reliable he found Evliya's notations: "(The language specimens) are, apart from individual misunderstandings and once the faulty pointing and the copyists' errors are discounted, rendered extraordinarily well, at times even with a certain phonetic skill, which is a high testament to Evliya's perceptiveness and zeal. One must bear in mind how hard it is for the Arabic alphabet, without the additional diacritical marks used by the Islamic peoples of the Caucasus, to render these intricate sound systems, which often comprise more than 70 distinct phonemes. If the decipherment of the language specimens has nevertheless largely succeeded, one must pay boundless admiration to the unusual talent of the Turkish traveller and scholar" (85).
Computational Analysis of Morphosyntactic Categories in Georgian.
This thesis describes the development of part-of-speech (POS) tagging resources for the Georgian language, consisting of (i) a new morphosyntactic language model for POS tagging purposes; (ii) guidelines for tagging and post-editing; (iii) the KATAG tagset; and (iv) the trained parameter files that the probabilistic TreeTagger program needs to work on Georgian texts.
The thesis describes a new morphosyntactic model of Georgian for part-of-speech tagging purposes. It also describes a tagset (KATAG) defined in accordance with that model, together with a set of design principles and tagging guidelines.
A stochastic methodology is used to perform tagging in Georgian: TreeTagger, a probabilistic part-of-speech tagging program, has been trained on Georgian texts. The justification for this choice is discussed. I use two tokenisation approaches in part-of-speech tagging. An accuracy of 92.41% was achieved using an enclitic tokenisation approach and 87.13% using a non-enclitic tokenisation approach, corroborating my hypothesis that treating enclitic elements separately from their host words results in better tagging performance.
To make the tagger program easily adaptable to a range of inputs (type, variety or genre of text), the performance of the probabilistic TreeTagger program was evaluated on a test set covering five genres: academic, informal, legal, fiction and news.
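The contrast between the two tokenisation approaches, and the way a token-level accuracy figure is computed, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the one-item enclitic list, the naive suffix-splitting rule, and the function names `split_enclitic`, `tokenise` and `accuracy` are hypothetical and are not taken from the thesis or from TreeTagger.

```python
from typing import List, Sequence

# Hypothetical enclitic inventory, for illustration only.
# The thesis's actual inventory and splitting rules are not given in the abstract.
ENCLITICS = ("ც",)


def split_enclitic(token: str) -> List[str]:
    """Split one trailing enclitic off its host word, if present.

    This is a naive suffix match: it would also split words that merely
    end in the same letter, so a real tokeniser needs more than this.
    """
    for enc in ENCLITICS:
        if token.endswith(enc) and len(token) > len(enc):
            return [token[: -len(enc)], enc]
    return [token]


def tokenise(words: Sequence[str], enclitic: bool = True) -> List[str]:
    """Return tokens with enclitics separated (enclitic=True) or left attached."""
    tokens: List[str] = []
    for word in words:
        tokens.extend(split_enclitic(word) if enclitic else [word])
    return tokens


def accuracy(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Token-level tagging accuracy as a percentage of matching tags."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted tag sequences differ in length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty sequence")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return 100.0 * correct / len(gold)
```

Under the enclitic approach the host word and the enclitic each receive their own tag, so the two approaches produce token sequences of different lengths and their accuracy figures are computed over different token counts.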
Reconstructing word order in Proto-Germanic: A comparative Branching Direction Theory (BDT) analysis of Old Saxon
286 p. The aim of this thesis is to reconstruct the word order of the common ancestor shared by all the Germanic languages. To that end, the author takes an approach that is related to earlier attempts and at the same time innovative: the study is based on Branching Direction Theory (Dryer, 1992). This theory is the outcome of research on the typological universals of word order. Moreover, much of the data used is drawn from a very little-studied Germanic language, namely Old Saxon. The results improve on the findings of research to date.
Nostratic Dictionary
A revised edition can be found at http://www.dspace.cam.ac.uk/handle/1810/244080.
Aharon Dolgopolsky is the leading authority on the Nostratic macrofamily. His 'Nostratic Dictionary' presented here is, of course, something very much more than a dictionary. It is the most thorough and extensive demonstration and documentation so far of what may be termed the Nostratic hypothesis: that several of the world's best-known language families are related in their origin, their grammar and their lexicon, and that they belong together in a larger unit, of earlier origin, the Nostratic macrofamily. It should at once be noted that several elements of this enterprise are controversial. For while the Nostratic hypothesis has many supporters, it has been criticized on rather fundamental grounds by a number of distinguished linguists. The matter was reviewed some years ago in a symposium held at the McDonald Institute, and positions remain very much polarized. It was a result of that meeting that the decision was taken to invite Aharon Dolgopolsky to publish his Dictionary - a much more substantial treatise than any work hitherto undertaken on the subject - at the McDonald Institute. For it became clear that the diversities of view expressed at that symposium were not likely to be resolved by further polemical exchanges. Instead, a substantial body of data was required, whose examination and evaluation could subsequently lead to more mature judgments. Those data are presented here, and that more mature evaluation can now proceed.
McDonald Institute for Archaeological Research
Alfred P. Sloan Foundation