    CroDeriV and the morphological analysis of Croatian verb

    The paper describes the construction of CroDeriV, a computational lexicon containing data on the morphological structure of almost 14,000 Croatian verbs, and the theoretical assumptions on which it rests. CroDeriV is the first morphological resource dealing with derivational phenomena in Croatian. The first part of the paper presents the motivation for building this kind of lexicon, gives a brief overview of existing morphological resources for Croatian, and describes the procedures for the morphological segmentation of verbs in CroDeriV. Each verb is segmented into lexical and derivational morphemes, and verbs with the same root are linked to one another. In the first phase of building CroDeriV the verbs were segmented automatically by means of rules; in the second phase the results of segmentation and of reduction to a common root were checked manually. This procedure enables the recognition of derivationally related families of verbs and, at the same time, the detection of the full derivational spans of particular base forms. The second part of the paper focuses on the morphological structure of Croatian verbs, based on the analysis of the almost 14,000 verbal lemmas currently included in CroDeriV. The analysis made it possible to identify a generalized morphological structure applicable to all Croatian verbs. It consists of four slots for derivational prefixes before the lexical morpheme and three slots for derivational suffixes after it, and these slots are provided for every verbal lemma in CroDeriV. This structure is compared with other approaches to the morphology of Croatian verbs. The types, function and meaning of the suffixes in the three suffixal slots are explained in more detail, since this kind of segmentation has not previously been introduced in the Croatian morphological literature. The first suffixal slot comprises suffixes with specialized meanings (e.g. diminutive, pejorative), the second slot suffixes with aspectual meaning, and the third slot suffixes denoting conjugational class. The final part of the paper presents empirical data on the attested combinations of derivational affixes in CroDeriV and on the frequency of their occurrence.
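The slot structure described in the abstract (four prefix slots, a lexical morpheme, three suffix slots) and the linking of verbs by root can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions, not CroDeriV's actual format: the class `VerbEntry`, the helpers `make_entry` and `group_by_root`, the separate `ending` field, and the sample segmentations are all hypothetical and are not taken from the lexicon itself.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Sequence, Tuple

PREFIX_SLOTS = 4  # number of prefix slots stated in the abstract
SUFFIX_SLOTS = 3  # specialized meaning, aspect, conjugational class


@dataclass(frozen=True)
class VerbEntry:
    """Hypothetical record: every slot is present, "" marks an empty one."""
    lemma: str
    prefixes: Tuple[str, ...]  # always PREFIX_SLOTS items, outermost first
    root: str                  # lexical morpheme
    suffixes: Tuple[str, ...]  # always SUFFIX_SLOTS items
    ending: str = "ti"         # infinitive ending (assumed extra field)

    def __post_init__(self) -> None:
        if len(self.prefixes) != PREFIX_SLOTS or len(self.suffixes) != SUFFIX_SLOTS:
            raise ValueError("every entry must provide all prefix and suffix slots")

    def surface(self) -> str:
        """Concatenate the filled slots back into a single string."""
        return "".join([*self.prefixes, self.root, *self.suffixes, self.ending])


def make_entry(lemma: str, root: str, prefixes: Sequence[str] = (),
               special: str = "", aspect: str = "", conj_class: str = "") -> VerbEntry:
    """Pad unused prefix slots on the left so the innermost prefix sits next to the root."""
    padding = ("",) * (PREFIX_SLOTS - len(prefixes))
    return VerbEntry(lemma, padding + tuple(prefixes), root, (special, aspect, conj_class))


def group_by_root(entries: Iterable[VerbEntry]) -> Dict[str, List[str]]:
    """Link verbs that share a root into derivational families."""
    families: Dict[str, List[str]] = defaultdict(list)
    for entry in entries:
        families[entry.root].append(entry.lemma)
    return {root: sorted(lemmas) for root, lemmas in families.items()}


# Illustrative segmentations only (assumed, simplified; not copied from CroDeriV).
ENTRIES = [
    make_entry("pisati", "pis", conj_class="a"),
    make_entry("napisati", "pis", prefixes=["na"], conj_class="a"),
    make_entry("prepisati", "pis", prefixes=["pre"], conj_class="a"),
    make_entry("nositi", "nos", conj_class="i"),
    make_entry("donositi", "nos", prefixes=["do"], conj_class="i"),
]
FAMILIES = group_by_root(ENTRIES)
```

Keeping all seven slots in every record, even when most are empty, mirrors the abstract's statement that the slots are provided for every verbal lemma; it also makes affix combinations directly comparable across entries when counting their frequency.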

    Linguographic planning and e-literacy in Croatian language

    In this dissertation I have researched orthographic standardization and the factors behind orthographic reforms in selected South Slavic and other European languages that are typologically and categorically important for Croatian. Because similar analyses of Croatian are lacking, the dissertation contains two targeted surveys: one of average orthographic literacy (526 Croatian students of technical disciplines) and one of opinions on Croatian orthographic standardization and language policy (2000 members of the Croatian academic community). The contemporary history of the Croatian orthographic standard and the current language policy situation in Croatia are described. Twenty-five orthographic misunderstandings, prejudices and misconceptions about literacy in Croatian are listed and analyzed. A bibliometric investigation of Croatian orthography manuals from 1639 to 2014 is carried out, and the phases of orthographic development are established. Linguistic e-literacy and the broader context of language and computers are described. The traditional linguistic division into language and speech is replaced by a triad of speech, writing and meaning, on the basis of which the linguographic description of Croatian is developed. Investigating the context of the Croatian language with regard to computing, linguistic e-literacy and the extent to which language is adapting to the information society, the dissertation argues the need for a green paper on the development of literacy, which would give rise to a development strategy for standard Croatian with a defined language policy and measurable success criteria. The standardization of language has to take contemporary writing and literacy into account and include computational linguistics and digital media aspects in language policy.