36 research outputs found

Croatian Language Resources for NooJ

Author: Božo Bekavac
Kristina Vučković
Marko Tadić
Publication venue: 'University of Zagreb - University Computing Centre'
Publication date: 01/01/2010

This paper presents the Croatian module for NooJ. The module includes the novel “Posljednji Stipančići” by Vjenceslav Novak as a corpus with a fully covered dictionary (i.e. zero unknowns). Examples of morphological and syntactic grammars are presented together with a few examples of dictionary entries and their inflectional and derivational paradigms.

Repozitorij Filozofskog fakulteta u Zagrebu at University of Zagreb

Hrvatski poredbeni idiomi: MWU pristup [Croatian Comparative Idioms: An MWU Approach]

Author: Kocijan Kristina
Librenjak Sara
Publication venue: Tradulex
Publication date: 01/01/2016

This article presents work on describing comparative idioms in the Croatian language for computational processing using the NooJ linguistic environment. As part of a larger project concentrated on annotating and extracting different Croatian idioms as multi-word units (MWUs), this work aims to present automated comparative idiom search in any Croatian text. Using the NooJ environment, a user can find any comparative structure in a text and use it for translation, language learning or research purposes.

Repozitorij Filozofskog fakulteta u Zagrebu at University of Zagreb

Comparative Analysis of Automatic Term and Collocation Extraction

Author: Crnec Dina
Dalbelo Bašić Bojana
Delač Davor
Seljan Sanja
Šamec-Gjurin Matija
Šnajder Jan
Publication venue: Department of Information Sciences, Faculty of Humanities and Social Sciences, University of Zagreb
Publication date: 01/11/2009

Monolingual and multilingual terminology and collocation bases that cover a specific domain, whether used independently or integrated with other resources, have become a valuable electronic resource. Building such resources can be assisted by automatic term extraction tools that combine statistical and linguistic approaches. This paper presents research on term extraction from a monolingual corpus consisting of publicly accessible English legislative documents. The results of two hybrid approaches are compared: extraction using the TermeX tool, and an automatic statistical extraction procedure followed by linguistic filtering through an open-source linguistic engineering tool. The results are evaluated through the statistical measures of precision, recall, and F-measure.

Repozitorij Filozofskog fakulteta u Zagrebu at University of Zagreb
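
The precision, recall, and F-measure evaluation named in the abstract above can be illustrated with a minimal sketch. It assumes extracted terms are scored by exact match against a manually built gold list; the function name `precision_recall_f1` and the sample terms are hypothetical and are not taken from the paper or from TermeX.

```python
def precision_recall_f1(extracted, gold):
    """Score a set of extracted terms against a gold-standard term set.

    Assumption: a term counts as correct only on an exact string match.
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)

    # Precision: share of extracted terms that are in the gold list.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of gold terms that the extractor found.
    recall = true_positives / len(gold) if gold else 0.0
    # F-measure (F1): harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical candidate terms from a legislative text.
extracted_terms = ["member state", "legal act", "european union", "the following"]
gold_terms = ["member state", "legal act", "european union",
              "court of justice", "internal market"]

p, r, f = precision_recall_f1(extracted_terms, gold_terms)
print(f"precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

Here three of the four extracted terms are in the gold list, and three of the five gold terms are found, so precision is higher than recall; the F-measure balances the two.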

Improved Parser for Simple Croatian Sentences

Author: Bekavac Božo
Dovedan Zdravko
Vučković Kristina
Publication venue: Komotini
Publication date

In this paper, the authors present the work done to improve the existing syntactic parser. This work is a continuation of the work presented at the NooJ 2009 conference. We show and explain the grammar for detecting the nominal predicate in a simple sentence. The nominal predicate in the Croatian language is made of the auxiliary verb ‘to be’ and an [...] in the nominative case. The [...] can be a complex made of a single noun and any number of adjectives, pronouns and numbers preceding that noun and agreeing with it in number, gender and case, but also a single noun, a single pronoun, a single adjective or even an adverb. The problem of coordination of two or more nodes of different gender, and its agreement with the main verb in cases where the coordination is the subject of a sentence, is discussed. The work also sheds light on and discusses other important properties of Croatian sentence complexity. At the end of the paper, the results are evaluated through precision, recall and F-measure to show the adequacy of the model.

Repozitorij Filozofskog fakulteta u Zagrebu at University of Zagreb

AmAMorph: Finite State Morphological Analyzer for Amazighe

Author: Driss Aboutajdine
Fatima Zahra Nejme
Siham Boulaknadel
Publication venue: 'Faculty of Electrical Engineering and Computing, Univ. of Zagreb'
Publication date: 01/01/2016

This paper presents AmAMorph, a morphological analyzer for the Amazighe language built on the NooJ linguistic development environment. The paper begins with the development of large-coverage formalized Amazighe lexicons. The resulting electronic lexicons, named ‘NAmLex’, ‘VAmLex’ and ‘PAmLex’, which stand for ‘Noun Amazighe Lexicon’, ‘Verb Amazighe Lexicon’ and ‘Particles Amazighe Lexicon’, link inflectional, morphological, and syntactic-semantic information to the list of lemmas. Automated inflectional and derivational routines are applied to each lemma, producing over [...] inflected forms. To our knowledge, AmAMorph is the first morphological analyzer for Amazighe. It identifies the component morphemes of the forms using large-coverage morphological grammars. Along with a description of how the analyzer is implemented, this paper gives an evaluation of the analyzer.

Directory of Open Access Journals

The Adventures of Hlapić in Burgenland Croatian

Author: Katarina Alardović Slovaček
Smiljana Narančić Kovač
Publication venue: 'University of Zadar'
Publication date: 01/01/2020

The paper presents the results of a digital comparative text analysis of the Croatian original and the Burgenland editions of a children’s classic, performed in combination with research methods of Translation Studies. The Croatian children’s novel of 1913, Čudnovate zgode šegrta Hlapića [The Strange Adventures of Hlapić the Apprentice] by Ivana Brlić-Mažuranić (1874–1938), appeared in Burgenland Croatian in 1960 and again, with minor alterations, in 2000. Burgenland Croatian is the language of the Croatian minority predominantly positioned in Austria, considered to be a regional variant of Croatian. These two languages are similar, but they still differ in structural and semantic elements as they have been developing separately since the 15th century. The similarities allowed for a digital comparative text analysis of the linguistic aspects of source and target texts, including their linguistic complexity. The results of the digital analysis demonstrate the applicability of digital linguistics methodology in analyzing translated and rewritten literary texts when source and target language idioms are similar, especially in determining the stylistic differences between source and target texts. The results of the analysis of culture-specific items rendered in the two target texts, as compared to the original, indicate that there are not many differences at the language and text levels between the analyzed source and target texts, yet some discrepancies between the two editions of the translation into Burgenland Croatian have been detected and are explained in the historical and cultural context of their appearance.

Directory of Open Access Journals

Towards Parsing Croatian Complex Sentences: Dependent Noun Clauses

Author: Dovedan Zdravko
Vučković Kristina
Štefanec Vanja
Publication venue: Komotini
Publication date: 01/01/2011

In this paper, the authors present methods for parsing Croatian complex sentences in which a dependent clause serves as a direct object of the main verb. This research is based on the resources that have already been developed for parsing simple Croatian sentences. So far, the sentences that we were able to parse using these resources have a basic structure consisting of a subject, verb, direct and indirect object, and adverbials of time and place. The methods presented in this paper extend this structure to the sentence structure [...] and, although quite rare and stylistically marked, to the structure [...]. Our primary indicator for this type of sentence is the absence of the required direct object in the main clause together with the presence of one of the subordinating conjunctions (‘da’, ‘kako’) or complementizers (relative pronoun, adverb of place, time, cause or manner). Since this type of complex sentence is very common in the Croatian language, we believe that this research will be a valuable contribution to the Croatian module for NooJ. At the end of the paper, we evaluate the adequacy of the model through precision, recall and F-measure.

Repozitorij Filozofskog fakulteta u Zagrebu at University of Zagreb