47 research outputs found

Kako bojimo svijet riječima (How We Color the World with Words)

Author: Kocijan Kristina
Publication venue: Hrvatsko filološko društvo (Croatian Philological Society)
Publication date: 01/01/2022

This paper presents a computational approach to the automatic detection of language patterns, specifically those that express colors in the Croatian language. It investigates different lexicalization patterns of color terms, mainly compounds and multiword units, in order to classify them and prepare them for use in the design of an algorithm that will automatically recognize and annotate these expressions in Croatian text. The paper also presents a comparative analysis of different classes of color terms found in a corpus built from books intended for younger (CLC) and older (ALC) populations. Finally, the research data is presented through a dictionary of three types of color terms categorized as multiword expressions. The paper gives a comprehensive overview of the different patterns used in color terminology in Croatian that have so far been described in published research in this area. The focus is on a computational approach to the automatic detection of lexical patterns. The purpose of the research is to define the existing models for building color expressions in Croatian, with particular emphasis on compounds and multiword expressions, and to implement the recognized models in computational language processing. The analysis and definition of the different models, based on the existing literature on colors in Croatian, aimed to classify them and prepare them for use in computational language processing. At this stage, four basic patterns with their subclasses were defined. The lexicalized patterns defined in this way were used within the NooJ language processing tool, where they enabled the creation of (a) a digital dictionary listing the basic colors and describing their derivations, and (b) a computational algorithm for automatically recognizing and annotating colors in Croatian together with their class labels.
The paper additionally presents a comparative analysis of the different classes of color expressions found in a corpus built from literary works intended for younger (CLC) and older (ALC) populations, in order to gain further insight into how the use of a given pattern depends on the text sample being analyzed. The research data are also given in tabular form for three types of color expressions in the class of multiword expressions. The prepared resources open the possibility of further analyses of texts from other domains and with new research interests that involve colors in computational language processing.

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Data Quality in the Context of Longitudinal Research Studies

Authors: Carić Tonko; Kocijan Kristina
Publication venue: Faculty of Humanities and Social Sciences, University of Zagreb
Publication date: 01/11/2019

This paper discusses the concept of data quality in the context of longitudinal research. By deconstructing the quality assurance process and data collection strategies through a case study of the "Croatian Birth Cohort Study", we try to define the causes and sources of poor data quality in longitudinal studies. Besides the problems discussed throughout the known literature (panel conditioning, sample attrition, recall bias, temporal and financial demands), we introduce single-source problems, multi-source problems, security problems, questionnaire design problems and QA workflow problems as important possible sources of errors. Additionally, we propose models for eliminating the errors through prevention and detection in order to improve data quality.

Open Repository of the University of Zagreb Faculty of Humanities and Social Sciences

Crossref

University of Zagreb Repository

Story of a 'Storyline Visualization' in High School Readings

Authors: Kocijan Kristina; Osmakčić Katarina
Publication venue: Croatian Society for Information and Communication Technology, Electronics and Microelectronics - MIPRO
Publication date: 01/01/2017

Storyline visualization, the process of illustrating data that has a course of events via a visual medium, has been used in film making for a very long time. Not long ago, it moved from paper to the digital world, allowing for wider usage. In this paper we propose its use as a teaching tool for literature reading in the Croatian class (primary language). We conducted preliminary research in five Croatian high schools of different profiles to see how storyline visualization, and visualization of school materials in general, affects students' understanding of the material being studied. Each school participated with two groups of students: one group was exposed to the storyline visualization of the novel Prokleta avlija by Ivo Andrić [N=103 in total] during the reading period, and the other read without the visualization [N=93 in total]. We present our results taking into account students' gender and the type of school.

Repozitorij Filozofskog fakulteta u Zagrebu at University of Zagreb

Digitalni arhiv Filozofskog fakulteta u Zagrebu

Hrvatski poredbeni idiomi: MWU pristup (Croatian Comparative Idioms: An MWU Approach)

Authors: Kocijan Kristina; Librenjak Sara
Publication venue: Tradulex
Publication date: 01/01/2016

This article presents work that aims to describe comparative idioms in the Croatian language for computational processing using the NooJ linguistic environment. As part of a larger project concentrated on annotating and extracting different Croatian idioms as multi-word units (MWUs), this work presents an automated search for comparative idioms in any Croatian text. Using the NooJ environment, a user can find any comparative structure in a text and use it for translation, language learning or research purposes.

Repozitorij Filozofskog fakulteta u Zagrebu at University of Zagreb

Digitalni arhiv Filozofskog fakulteta u Zagrebu

Big Data: how we got to the BigData and where are they taking us

Author: Kocijan Kristina
Publication venue: Zavod za informacijske studije
Publication date: 01/01/2014

The amount of information produced over a span of roughly 1200 years, from the founding of Constantinople to the invention of Gutenberg's printing press, doubled only after 50 years. Today we double the existing amount of information every 3 years, so we already measure it in exabytes. Such large quantities of data have changed the way we use data as well as the way we process it. We can say with certainty that we are in the midst of a new great revolution, one with its own fitting name: Big Data (Veliki podatci). Although the term was coined by scientists in fields such as astronomy and genomics, Big Data is everywhere. It is at once a resource and a tool whose main task is to inform. But however much it can help us better understand the world around us, depending on how it is managed and who manages it, it can also lead us in a different direction. Although the figures associated with Big Data may seem enormous at the moment, we must be aware that the amount we can collect and process will always be only a fraction of the information that actually exists in the world (and around it). Still, we have to start somewhere.

Repozitorij Filozofskog fakulteta u Zagrebu at University of Zagreb

Digitalni arhiv Filozofskog fakulteta u Zagrebu

Invasive species of algae in the Adriatic sea

Author: Kocijan Kristina
Publication venue: University of Zagreb. Faculty of Science. Department of Biology.
Publication date: 01/01/2014

Over the last few decades, the Mediterranean Sea, and with it our Adriatic Sea, has been threatened by the arrival of new invasive species. After the Suez Canal was dug, many organisms gained a route to the Mediterranean, and they took the opportunity to populate new habitats and find new food sources. This paper covers three species of algae that are spreading rapidly across the seabed of the Adriatic Sea: Caulerpa taxifolia, Caulerpa racemosa and Womersleyella setacea. For each alga, it describes the habitus, the mode of reproduction, the areas where it can be found and its effect on other organisms. It also presents the idea of biological control, which scientists have been studying for the past few years and which will take about as long again to realize. All Mediterranean countries need to join forces to stop the negative impact of the algae and to find an effective way to reduce their spread. In this way, the native communities of the Mediterranean Sea, and of the Adriatic, will be preserved.

Repository of Faculty of Science, University of Zagreb

University of Zagreb Repository

Croatian Digital Thesis Repository

Molecular phylogenetic and phylogeographic analysis of Ancylus fluviatilis O. F. Müller, 1774 (Gastropoda: Planorbidae) in Croatia

Author: Kocijan Kristina
Publication venue: University of Zagreb. Faculty of Science. Department of Biology.
Publication date: 09/11/2016

Ancylus fluviatilis O. F. Müller, 1774 is a freshwater snail of the family Planorbidae that is widespread in Croatia. A. fluviatilis most likely represents a complex of four genetically and reproductively isolated cryptic species that have not yet been formally described. The aim of this study was to determine, based on the analysis of two genetic markers, the mitochondrial genes for COI and 16S rRNA, which cryptic species of the A. fluviatilis complex are present in Croatia and how they are distributed in Croatian watercourses, and to establish the molecular phylogenetic relationships and genetic distances between populations as well as the phylogeographic pattern of the complex in the area. Croatia was found to be inhabited by at least three species of the A. fluviatilis complex: the widespread Ancylus sp. B, and the locally present Ancylus sp. C and A. fluviatilis sensu stricto, whose distribution agrees with the general phylogeographic picture of the complex. The phylogeographic pattern of Ancylus sp. B in Croatia mostly cannot be explained by natural dispersal and geographical barriers to gene flow, but rather by passive transport and the climatic features of the areas it inhabits.

Repository of Faculty of Science, University of Zagreb

University of Zagreb Repository

Croatian Digital Thesis Repository

Scholarly reference trees

Authors: Kocijan Kristina; Poljak Dario; Požega Marko
Publication venue: University of Zadar
Publication date: 01/01/2016

In this paper, we propose, explain and implement a bibliometric data analysis and visualization model in a web environment. We use NLP syntactic grammars for pattern recognition of references used in scholarly publications. The extracted information is used to visualize author-egocentric data via a tree-like structure. The ultimate goal of this work is to use the egocentric trees to compare two authors and to build networks or forests of different trees depending on the forest's attributes. We encountered many different problems, ranging from exceptions in citation style structures to optimization of the visualization model for an optimal user experience. We summarize the restrictions of our grammars and provide some ideas for possible future work that could improve the overall user experience. The proposed trees can function by themselves, or they can be implemented in digital repositories of libraries and different types of citation databases.

Repozitorij Filozofskog fakulteta u Zagrebu at University of Zagreb

Crossref

Directory of Open Access Journals

Digitalni arhiv Filozofskog fakulteta u Zagrebu

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Building Scholarly Data Forest

Authors: Kocijan Kristina; Poljak Dario; Požega Marko
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

In this paper, we demonstrate syntactic analysis and visualization of scientific data, namely references from scientific papers. Our main goal is to build a parser that can extract references from scientific papers, convert them to XML format, send them to a custom visualization algorithm and present them in a web interface as a ReferenceTree for a single author. For this process, we use several different technologies: the NLP software NooJ and the programming languages PHP and JavaScript in combination with HTML5. Our main problem was the dissimilarity in reference styles between articles. Thus, our parser was designed to recognize different reference sources (book, paper, web page) in the APA, MLA and Chicago reference styles. As for the visualization, we chose to present an author as a tree, the publication years as the main branches, the articles and books as twigs, and the references used in each article or book as the leaves. The books are grouped on the left side of the tree, while the articles are grouped on the right side. In the final output, every processed author should have a unique tree (preferences of references) that can be compared with the rest of the scientific forest.

Repozitorij Filozofskog fakulteta u Zagrebu at University of Zagreb

Crossref

Digitalni arhiv Filozofskog fakulteta u Zagrebu

Improving Students' Language Performance Through Consistent Use of E-Learning: An Empirical Study in Japanese, Korean, Hindi and Sanskrit

Authors: Janjić Marijana; Kocijan Kristina; Librenjak Sara
Publication venue: University of Ljubljana
Publication date: 01/01/2016

This paper describes the backing theories, methodology, and results of a two-semester case study of the application of technology in teaching four Asian languages (Japanese, Korean, Hindi, and Sanskrit) to Croatian students. We developed e-learning materials that follow the curriculum in Croatia and deployed them in Asian language classrooms. Students who agreed to participate in the study were tested before using the materials and after each semester, and their progress was surveyed. In the case of the students of Japanese (N=53), we thoroughly monitored their usage and compared the progress of students who diligently studied vocabulary and grammar using our materials on Memrise with that of students who neglected their studies. Diligence was measured through their Memrise scores, which reflect the user's activity. Their progress was also measured using standardized tests designed to resemble the Japanese Language Proficiency Test. We found that frequent users progressed by an average of 20.3% after each semester, while non-frequent users progressed by only 11.6%, showing that the effectiveness of this method is related to stable and constant use of the e-materials.

Repozitorij Filozofskog fakulteta u Zagrebu at University of Zagreb

Crossref

Directory of Open Access Journals

Journals of Faculty of Arts, University of Ljubljana

York St John University Institutional Repository

Digitalni arhiv Filozofskog fakulteta u Zagrebu