16 research outputs found

    Hitzak sarean. Pello Salabururi esker onez

    168 pp. Here, reader, is the book of thanks that we offer to the great master Pello Salaburu. We have gathered eleven works from several fields: Basque lexicography, terminology, history, grammar, corpus research, and also the processing of Basque and Spanish. The rich harvest that Salaburu has left us is, without doubt, a varied one; he has been a pioneer in linguistics and in Basque phonology, grammar and lexicography, first in the field of generative grammar and later by creating digital corpora and dictionaries, building the 21st-century tools we need to better know, care for and use our language, and spreading them through society. Salaburu's harvest extends beyond research and academia, to several of the challenges our society has faced. Pello has never shrunk back when reflection was called for; whether on the path of peacemaking or on the future of the university, he has always stepped into the public square without fear, speaking clearly and directly, never hiding himself in the curls of words.

    Dynamic verbs in the Wordnet of Polish

    The paper presents patterns of co-occurrence of wordnet relations involving verb lexical units in plWordNet, a large wordnet of Polish. The discovered patterns reveal tendencies of selected synset and lexical relations to form regular circular structures with clear semantic meanings. They involve several types of relations, e.g., presupposition, cause, processuality and antonymy. The patterns are not obligatory (there are exceptions), but they can be used in wordnet diagnostics and in guidelines for wordnet editors. The analysis is illustrated with numerous positive and negative examples, as well as statistics for verb relations in plWordNet 4.0 emo. Some attempts at a more general, linguistic explanation of the observed phenomena are also made. As background, the linguistically motivated plWordNet model is briefly recalled, with special attention given to the verb part. In addition, the description of dynamic verbs by relations and features is discussed in detail, including relation definitions and substitution tests.
    Polish abstract (Czasowniki dynamiczne w Słowosieci - wordnecie języka polskiego), in translation: The article presents patterns of co-occurrence of lexico-semantic relations involving verb lexical units in Słowosieć, a large relational dictionary of Polish and the wordnet of the Polish language. The background for the observations is Słowosieć 4.0 emo, whose system of verb relations is briefly discussed together with statistics. The authors pay particular attention to dynamic verbs and their typical relations, for which they present substitution tests from the guidelines for the relational description of verbs, defined by linguists for the purposes of editing Słowosieć. The co-occurrence patterns described in the article show the tendency of some relations between synsets (i.e. sets of synonyms) and between lexical units (including presupposition, causation, processuality and antonymy) to form regular structures that specify the meaning of all units and synsets connected by those relations. Co-occurrence of relations according to the patterns is not obligatory, so the article presents both positive and negative examples of units and synsets linked by co-occurring relations, as well as some general remarks pointing to the linguistic character of the observed phenomenon. Besides its cognitive value, related to the interdependencies that hold within the language system, the description of these regularities also has practical value: it can be used in wordnet diagnostics and in guidelines for linguists.

    Contrastive terminography

    Contrastive methods have long been employed in lexicography, in particular in bi- and multilingual dictionary projects. The main rationale for this is the necessity to comprehensively study, i.e. compare and contrast, two or more linguistic systems that are to be presented in one way or another in the respective dictionaries. Similarly, the contrastive approach is of paramount importance in terminographic undertakings, on account of the need to draw a distinction between terminological (conceptual) systems existing in various languages and across cultures. It must be emphasised, however, that the contrastive element is not only a part of terminographic practice, but also of the theory of terminography. This article aims to present the role of contrastive research across various spheres of specialised (=LSP) lexicography.
    Polish abstract (Terminografia kontrastywna), in translation: Contrastive methods have long been used in dictionary making, in particular in bi- and multilingual lexicographic projects, because of the need to carry out comparative analyses of two or more linguistic systems whose elements are to be presented or juxtaposed in a specific way in the dictionary. Contrastive research plays an equally important role in terminographic work, above all because of the need to compare the terminological (conceptual) systems that function in different languages and cultures. It should be emphasised that elements of contrastive analysis are not solely the domain of practice; the theory of terminography draws on them as well. The article presents the role of contrastive research at various levels of terminographic activity.

    Semantic relations between verbs in Polish WordNet 2.0

    Nouns dominate wordnets. The lexical semantics of verbs is usually under-represented, even though it is essential in any semantic analysis that goes beyond statistical methods. We present our attempt to remedy the imbalance; it begins by designing a sufficiently rich set of wordnet relations for verbs. We discuss and show in detail such a relation set in the largest Polish wordnet. Our design decisions, while as general and language-independent as possible, are mainly informed by our desire to capture the nature and peculiarities of the verb system in Polish.

    Word Sense Disambiguation Based on Large Scale Polish CLARIN Heterogeneous Lexical Resources

    Lexical resources can be applied in many different Natural Language Engineering tasks, but the most fundamental task is the recognition of word senses used in text contexts. The problem is difficult and not yet fully solved, and different lexical resources provide varied support for it. Polish CLARIN lexical semantic resources are based on plWordNet, a very large wordnet for Polish, as a central structure that serves as the basis for linking together several resources of different types. In this paper, several Word Sense Disambiguation (henceforth WSD) methods developed for Polish that utilise plWordNet are discussed. Textual sense descriptions in a traditional lexicon can be compared with text contexts using Lesk's algorithm in order to find the best-matching senses. In the case of a wordnet, lexico-semantic relations provide the main description of word senses. Thus, first, we adapted and applied to Polish a WSD method based on PageRank. In this method, text words are mapped onto their senses in the plWordNet graph and the PageRank algorithm is run to find the senses with the highest scores. The method yields results lower than, but comparable to, those reported for English. The error analysis showed that the main problems are fine-grained sense distinctions in plWordNet and the limited number of connections between words of different parts of speech. In the second approach, plWordNet expanded with a mapping onto SUMO ontology concepts was used. Two scenarios for WSD were investigated: two-step disambiguation and disambiguation based on a combined network of plWordNet and SUMO. In the former scenario, words are first assigned SUMO concepts and then plWordNet senses are disambiguated. In the latter, plWordNet and SUMO are combined into one large network that is then used for the disambiguation of senses. The additional knowledge sources used in WSD improved the performance. The obtained results and potential further lines of development are discussed.
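
    The graph-based step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sense identifiers, the toy relation graph and the function names below are all hypothetical, relations are treated as undirected links, and PageRank is run by plain power iteration with teleportation restricted to the candidate senses of the context words.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power-iteration PageRank whose teleport mass goes only to `seeds`."""
    neighbours = defaultdict(set)
    for a, b in edges:  # assumption: every relation is an undirected link
        neighbours[a].add(b)
        neighbours[b].add(a)
    nodes = set(neighbours) | set(seeds)
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbour shares its current rank equally among its links.
            incoming = sum(rank[m] / len(neighbours[m]) for m in neighbours[n])
            new_rank[n] = (1 - damping) * teleport[n] + damping * incoming
        rank = new_rank
    return rank


def disambiguate(candidates, edges):
    """Pick, for each context word, its candidate sense with the top rank."""
    seeds = {sense for senses in candidates.values() for sense in senses}
    rank = personalized_pagerank(edges, seeds)
    return {
        word: max(senses, key=lambda sense: rank[sense])
        for word, senses in candidates.items()
    }


# Hypothetical toy data: "zamek" (castle / lock) and "klucz" (key / flock).
CANDIDATES = {
    "zamek": ["zamek#castle", "zamek#lock"],
    "klucz": ["klucz#key", "klucz#flock"],
}
EDGES = [
    ("zamek#lock", "klucz#key"),
    ("zamek#lock", "drzwi#door"),
    ("klucz#key", "drzwi#door"),
    ("zamek#castle", "baszta#tower"),
    ("klucz#flock", "ptak#bird"),
]

print(disambiguate(CANDIDATES, EDGES))
```

    In this toy graph the "lock" and "key" senses reinforce each other through their shared links, so they outrank the isolated "castle" and "flock" senses. A sparse graph would give such reinforcement little to work with, which is consistent with the abstract's remark about the limited number of cross-part-of-speech connections.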

    Tune your Brown clustering, please

    Brown clustering, an unsupervised hierarchical clustering technique based on n-gram mutual information, has proven useful in many NLP applications. However, most uses of Brown clustering employ the same default configuration, and the appropriateness of this configuration has gone largely unexplored. Accordingly, we present information for practitioners on the behaviour of Brown clustering in order to assist hyperparameter tuning, in the form of a theoretical model of Brown clustering utility. This model is then evaluated empirically in two sequence labelling tasks over two text types. We explore the interplay between the input corpus size, the chosen number of classes, and the quality of the resulting clusters, which has an impact on any approach using Brown clustering. In every scenario that we examine, our results reveal that the values most commonly used for the clustering are sub-optimal.
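
    For readers unfamiliar with the technique, the following sketch shows the quantity that Brown clustering optimises: the average mutual information (AMI) between the classes of adjacent tokens. It also shows one greedy merge step chosen by exhaustive search. It is an illustration under simplifying assumptions, not the paper's code: the function names and the toy corpus are hypothetical, and practical implementations restrict the search to a window of frequent words rather than trying every pair.

```python
import math
from collections import Counter
from itertools import combinations


def average_mutual_information(tokens, word_to_class):
    """AMI (in bits) between the classes of adjacent tokens."""
    classes = [word_to_class[t] for t in tokens]
    bigrams = Counter(zip(classes, classes[1:]))
    total = sum(bigrams.values())
    left, right = Counter(), Counter()
    for (a, b), n in bigrams.items():
        left[a] += n   # how often class a opens a bigram
        right[b] += n  # how often class b closes a bigram
    return sum(
        (n / total) * math.log2(n * total / (left[a] * right[b]))
        for (a, b), n in bigrams.items()
    )


def best_merge(tokens, word_to_class):
    """Try every pair of classes; return (ami, kept, absorbed) for the best."""
    best = None
    for a, b in combinations(sorted(set(word_to_class.values())), 2):
        merged = {w: (a if c == b else c) for w, c in word_to_class.items()}
        score = average_mutual_information(tokens, merged)
        if best is None or score > best[0]:
            best = (score, a, b)
    return best


# Hypothetical toy corpus; every word starts in its own class.
TOKENS = "the cat sat on the mat the dog sat on the rug".split()
INITIAL = {word: word for word in set(TOKENS)}

print(average_mutual_information(TOKENS, INITIAL))
print(best_merge(TOKENS, INITIAL))
```

    Repeating the merge step until a target number of classes remains gives the hierarchy; that target is one of the settings whose usual default values the abstract questions.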

    The System of Register Labels in plWordNet

    Stylistic registers influence word usage. Both traditional dictionaries and wordnets assign lexical units to registers, and there is a wide range of solutions. A system of register labels can be flat or hierarchical, with few labels or many, homogeneous or decomposed into sets of elementary features. We review the register label systems in lexicography, and then discuss our model, designed for plWordNet, a large wordnet for Polish. There follows a detailed comparative analysis of several register systems in Polish lexical resources. We also present the practical effect of the adoption of our flat, small and homogeneous system: a relatively high consistency of register assignment in plWordNet, as measured by inter-annotator agreement on a manageable sample. Large-scale conclusions for the whole plWordNet remain to be made once the annotation has been completed, but the experience half-way through this labour-intensive exercise is very encouraging.

    Semantic relations among adjectives in Polish WordNet 2.0: a new relation set, discussion and evaluation

    Adjectives in wordnets are often neglected: there are far fewer of them than nouns, and relations among them are sometimes not as varied as those among nouns or verbs. Polish WordNet 1.0 was no exception. Version 2.0 aims to correct that. We present an overview of a much larger set of lexical-semantic relations which connect adjectives to the other parts of the network. Our choice of relations has been motivated by linguistic considerations, especially the concerns of Polish lexical semantics, and by pragmatic reasons. The discussion includes detailed substitution tests, meant to ensure consistency among wordnet editors.
