
70 research outputs found

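Dictionary of Modern Slovene: From the Slovene Lexical Database to the Digital Dictionary Database (Rječnik suvremenoga slovenskog jezika: od slovenske leksičke baze do digitalne rječničke baze)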

Author: Polona Gantar
Publication venue: 'Institute of Croatian Language and Linguistics'
Publication date: 01/01/2020

The ability to process language data has become fundamental to the development of technologies in various areas of human life in the digital world. The development of digitally readable linguistic resources, methods, and tools is, therefore, also a key challenge for the contemporary Slovene language. This challenge has been recognized in the Slovene language community at both the professional and the state level and has been the subject of many activities over the past ten years, which are presented in this paper. The idea of a comprehensive dictionary database covering all levels of linguistic description in modern Slovene, from the morphological and lexical levels to the syntactic level, was already formulated within the framework of the European Social Fund’s Communication in Slovene (2008-2013) project; the Slovene Lexical Database was also created within the framework of this project. Two goals were pursued in designing the Slovene Lexical Database (SLD): creating linguistic descriptions of Slovene intended for human users, and making those descriptions useful for the machine processing of Slovene. Ever since the construction of the first Slovene corpus, it has been evident that there is a need for a description of modern Slovene based on real language data, and that it is necessary to understand the needs of language users in order to create useful language reference works. It also became apparent that only the digital medium enables a comprehensive language description and that the design of the database must be adapted to it from the start. In addition, the description must follow best practices as closely as possible in terms of formats and international standards, as this enables the inclusion of Slovene in a wider network of resources, such as Linked Open Data, BabelNet and ELEXIS.
Due to time pressures and trends in lexicography, procedures to automate the extraction of linguistic data from corpora and the inclusion of crowdsourcing in the lexicographic process were taken into consideration. Following the essential idea of creating an all-inclusive digital dictionary database for Slovene, a few independent databases have been created over the past two years: the Collocations Dictionary of Modern Slovene and the automatically generated Thesaurus of Modern Slovene, both of which also exist as independent online dictionary portals. One of the novelties that we put forward together with both dictionaries is the ‘responsive dictionary’ concept, which includes crowdsourcing methods. Ultimately, the Digital Dictionary Database provides all (other) levels of linguistic description: the morphological level with the Sloleks database upgrade, the phraseological level with the construction of a multi-word expressions lexicon, and the syntactic level with the formalization of Slovene verb valency patterns. Each of these databases contains its specific language data that will ultimately be included in the comprehensive Slovene Digital Dictionary Database, which will represent basic linguistic descriptions of Slovene for both human and machine users.
Summary (translated from Croatian): The idea of a comprehensive dictionary database covering all levels of linguistic description of modern Slovene, from the morphological and lexical to the syntactic, was first formulated within the Communication in Slovene project (2008-2013). To realize the idea of a comprehensive digital dictionary database, two independent databases were created: the Collocations Dictionary of Modern Slovene and the automatically generated Thesaurus of Modern Slovene. One novelty in both dictionaries is the concept of the responsive dictionary, which includes crowdsourcing. The Digital Dictionary Database contains all levels of linguistic description: the morphological level, upgraded with Sloleks; the phraseological level, with a description of multi-word unit constructions; and the syntactic level, with the formalization of verb valency patterns. Each of the existing databases contains specific language data that will be included in the comprehensive Slovene Digital Dictionary Database, which will contain a basic linguistic description of Slovene for both human and machine users.

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

The Lexical Database for Slovene: For Whom, Why, and How (to Proceed)? (Leksikalna baza za slovenščino: komu, zakaj in kako (naprej)?)

Author: Polona Gantar
Publication venue: 'The Research Center of the Slovenian Academy of Sciences and Arts / Znanstvenoraziskovalni center Slovenske akademije znanosti in umetnosti (ZRC SAZU)'
Publication date: 28/07/2015

This article describes the guidelines for designing the Slovenian lexical database, in particular the question of its various users and the types and ways of structuring lexical and grammatical information in it. Special emphasis is placed on dilemmas concerning the scope and selection of lexical units and the arrangement of lexical and grammatical information, on the premise that the information in the lexical database is primarily intended for web applications and modern electronic media.

ZRC SAZU Publishing (Znanstvenoraziskovalni center - Slovenske akademije znanosti in umetnosti)

Dictionary of Modern Slovene: Lexicographic Tradition and/or Innovation (Slovar sodobnega slovenskega jezika: leksikografska tradicija in/ali inovacija)

Author: Polona Gantar
Publication venue: 'University of Ljubljana'
Publication date: 01/12/2014

When the Proposal for the Compilation of the Dictionary of Modern Slovene (Predlog za izdelavo Slovarja sodobnega slovenskega jezika) was published at the end of May 2013, a debate developed both in professional forums and in the media about whether the new dictionary of Slovene should follow the lexicographic tradition established by the Slovar slovenskega knjižnega jezika (SSKJ) or depart from it. Because differing views emerged on how to understand the dictionary tradition and on the inclusion of modern dictionary practices, this paper aims to establish, on the basis of an analysis of the design of SSKJ and SNB and of contributions that relate in any way to the concept of the future dictionary of Slovene, which elements of lexicographic theory and practice can be regarded as traditional and which are the proposed innovations in Slovene lexicography. In parallel, we propose a design for the new dictionary in its key segments, i.e. from the perspective of the user, the medium, and the use of language-technology knowledge, that would yield a description of modern Slovene meeting the needs of the language community as fully as possible in the present time and circumstances.

Directory of Open Access Journals

Journals of Faculty of Arts, University of Ljubljana

Basic Elements of the Design of a Phraseological Dictionary (Temeljne prvine zasnove frazeološkega slovarja)

Author: Polona Gantar
Publication venue: Slavistično društvo Slovenije
Publication date: 01/01/2002

Using an analytic-synthetic method of comparing the solutions adopted in phraseological dictionaries, it is possible to isolate the elements of dictionary design that a comprehensive dictionary description of a phraseological unit requires. The specific design of a phraseological dictionary, as presented in the article, takes into account the interconnection of the phraseological and phraseographic systems. The overview of design elements within individual segments of the dictionary description also suggests possible solutions for a phraseological dictionary of Slovene.

Directory of Open Access Journals

Fixed Word Combinations in Slovene (Stalne besedne zveze v slovenščini)

Author: Polona Gantar
Publication venue: 'The Research Center of the Slovenian Academy of Sciences and Arts / Znanstvenoraziskovalni center Slovenske akademije znanosti in umetnosti (ZRC SAZU)'
Publication date: 01/04/2022

The central object of study in the book Stalne besedne zveze v slovenščini: korpusni pristop (Fixed Word Combinations in Slovene: A Corpus Approach) is lexical units that as a rule consist of more than one word. Special emphasis is placed on situating them within the modern Slovene lexical stock on the basis of an empirical analysis of language data obtained from the Slovene reference electronic text corpora FIDA and FidaPLUS. This approach observes language exclusively on the basis of real texts that make up the universe of discourse and are captured in a concrete text corpus. An approach to the lexical description of Slovene on this basis offers Slovene linguistics a new vantage point, both in the quality and quantity of language data and in the methodology of linguistic analysis. An essential consequence of this approach is the blurring of boundaries between single-word and multi-word lexical units and the extension of phraseological issues not only to lexicology but also to syntax and text linguistics.

Directory of Open Access Books (DOAB)

Editorial

Author: Iztok Kosem, Polona Gantar
Publication venue: 'University of Ljubljana'
Publication date: 01/08/2020

Directory of Open Access Journals

Journals of Faculty of Arts, University of Ljubljana

Editorial (Uvodnik)

Author: Nataša Logar, Polona Gantar
Publication venue: 'University of Ljubljana'
Publication date: 01/12/2014

With the first issue of its second volume, the journal Slovenščina 2.0: empirične, aplikativne in interdisciplinarne raziskave (Slovenščina 2.0: Empirical, Applied and Interdisciplinary Research), which those of us already familiar with it call SLO 2.0 for short, consolidates its central role in presenting the results of research on Slovene and other languages that combines an empirical and interdisciplinary, and above all language-technology, approach with an applied orientation. With the publication of issue 1 (2014), we are also bringing another novelty to Slovene scholarly periodicals: continuous publication

Directory of Open Access Journals

Journals of Faculty of Arts, University of Ljubljana

Editorial (Uvodnik)

Author: Nataša Logar Berginc, Polona Gantar
Publication venue: 'University of Ljubljana'
Publication date: 01/12/2013

Digitized language resources, natural language processing, corpus analyses of grammatical and other linguistic phenomena, text mining, taggers, extractors, lexicographic tools, speech synthesis, machine translation, avatar interlocutors, smart homes ... The common denominator: language

Directory of Open Access Journals

Journals of Faculty of Arts, University of Ljubljana

Editorial (Uvodnik)

Author: Nataša Logar Berginc, Polona Gantar
Publication venue: Trojina, Institute for Applied Slovene Studies
Publication date: 01/05/2013

Directory of Open Access Journals

Defining collocation for Slovenian lexical resources

Author: Iztok Kosem, Polona Gantar, Simon Krek
Publication venue: 'University of Ljubljana'
Publication date: 01/08/2020

In this paper, we define the notion of collocation for the purpose of its use in machine-readable language resources, which will be used in the creation of electronic dictionaries and language applications for Slovene. Based on theoretical and lexicographically driven studies, we define collocation as a lexical phenomenon characterized by three key aspects: statistical, syntactic, and semantic. We take lexicographic relevance as the point of departure for defining collocations within the typology of word combinations, as well as for distinguishing them from free combinations. Free combinations are (frequent) syntactically valid word combinations without lexicographic value, and consequently there is no need to describe their meaning or syntactic role. Next, we distinguish collocations from all multiword lexical units (compounds, phraseological units and lexico-grammatical units) using the lexicographic view that multiword lexical units, whose meaning is not the sum of their parts, require a description of their meaning whereas collocations do not. In the final part, we return to the three aspects of collocation and their role in the automatic extraction of collocational information from corpora. The semantic criterion, or dictionary relevance, of extracted collocations has particularly exposed the problem of semantically broad collocates, such as certain types of adverbs, adjectives and verbs, and of words which feature in different syntactic roles (e.g. pronouns and adjuncts). We also discuss a particular issue of collocations related to proper names and the decisions about their inclusion in the dictionary, based on the evaluation of lexicographers.
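The statistical aspect of collocation mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not name an association measure, so the choice of logDice here is an assumption, as are the toy corpus, the function names, and the simplification of treating adjacent word pairs as candidates instead of syntactically parsed relations.

```python
import math
from collections import Counter


def log_dice(pair_freq, freq_x, freq_y):
    """logDice association score: 14 + log2(2 * f_xy / (f_x + f_y))."""
    return 14 + math.log2(2 * pair_freq / (freq_x + freq_y))


def score_adjacent_pairs(tokens, min_pair_freq=2):
    """Rank adjacent word pairs by logDice (hypothetical helper).

    Adjacent pairs stand in for syntactic relations here; a real
    extraction pipeline would work on parsed corpus data.
    """
    word_freq = Counter(tokens)
    pair_freq = Counter(zip(tokens, tokens[1:]))
    scores = {
        pair: log_dice(freq, word_freq[pair[0]], word_freq[pair[1]])
        for pair, freq in pair_freq.items()
        if freq >= min_pair_freq  # drop pairs seen too rarely to score
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented toy corpus, for illustration only.
toy_corpus = (
    "strong tea and strong coffee with strong tea "
    "and the tea with the coffee"
).split()

ranked = score_adjacent_pairs(toy_corpus)
for pair, score in ranked:
    print(pair, round(score, 3))
```

In this toy data the pair ("tea", "and") outranks ("strong", "tea") on the statistical score alone, which hints at why the abstract treats the syntactic and semantic criteria as separate filters on top of the statistical one.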

Directory of Open Access Journals

Journals of Faculty of Arts, University of Ljubljana