3 research outputs found

    Vartosenos modelių analizė mokomojoje leksikografijoje: žvalgomasis tyrimas lietuvių kalbos veiksmažodžių pavyzdžiu [Corpus Pattern Analysis in learner lexicography: a pilot study based on Lithuanian verbs]

    The aim of this paper is to present a pilot study that applies the framework of Corpus Pattern Analysis (CPA, Hanks 2004) to some Lithuanian verbs belonging to the basic vocabulary. CPA draws on corpus-driven language analysis and on a contextual and functional theory of meaning: the meaning of a word is associated with a specific lexical and grammatical environment, that is, with corpus patterns that represent an interconnection of lexical and grammatical elements. CPA is one of several corpus-driven methods. It differs from pattern grammar (Hunston, Francis 2000) in that it not only uses typical grammatical categories (e.g. word classes) but also introduces semantic values (e.g. semantic types) to distinguish the senses of a word. Semantic types are often the main separator of meanings, especially when two verb senses are associated with the same grammatical pattern. In learners' dictionaries, CPA could provide more detailed usage data and so lead to a better understanding of meaning differences, which matters for both language reception and language production. After introducing the CPA methodology, we present the CPA analysis of two Lithuanian verbs, namely the inductive procedure followed to observe and define meaning-related patterning. We also discuss the problems in applying CPA that were identified in this study and mentioned by other CPA practitioners. First, observing and defining corpus patterns is a challenging task for lexicographers, especially because of the division between pattern and meaning and the generalizations related to semantic types. Second, pattern recognition is difficult to automate. Third, foreign language learners are the target group, so the meaning-related patterning observed in the data has to be presented in a learner dictionary in a user-friendly way.

    Lithuanian summary (translated): The aim of this article is to present a pilot study that tested the method of usage pattern analysis (Corpus Pattern Analysis, Hanks 2004) on two frequent Lithuanian verbs from the basic vocabulary. The method rests on the principles of corpus-driven linguistics, in which usage is treated as an inseparable part of meaning and the lexical and grammatical regularities found in usage are the basis for distinguishing senses. A learner of Lithuanian who discovers these regularities, or is shown them, could better understand differences in meaning, which matters for both comprehending and producing the language. After setting out the principles of the method, the article describes the study of two Lithuanian verbs and shows how the method was applied; it also shows how data gathered in this way could be used in learner dictionaries. The problematic aspects are discussed as well: recognising, describing and presenting usage patterns in learner dictionaries.

    Linguistically-motivated automatic classification of Lithuanian texts for didactic purposes

    This paper presents an effort to provide a level-appropriate study corpus for Lithuanian language learners. The collected corpus includes levelled texts from study books and unlevelled texts from other sources. The main goal is to assign level-appropriate labels (A1, A2, B1, B2) to the texts from other sources. For automatic classification we use preselected surface features, based on text readability research, and shallow linguistic features. First, we train the model on levelled texts from study books; second, we apply the learned model to classify the other texts. The best classification results are achieved with the Logistic Regression method.
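
    As an illustration of what "surface features" for readability might look like, here is a minimal Python sketch. The abstract does not list the features actually used, so the three computed here (average sentence length, average word length, type-token ratio) and the function name `surface_features` are assumptions chosen for illustration, not the authors' feature set.

```python
import re


def surface_features(text: str) -> dict[str, float]:
    """Compute a few readability-style surface features for one text.

    Hypothetical feature set: the source abstract does not name its features.
    """
    # Rough sentence split on ., ! and ? (a heuristic, not a real sentence tokenizer).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # \w+ is Unicode-aware in Python 3, so letters such as ž, ė or č count as word characters.
    words = re.findall(r"\w+", text.lower())
    if not sentences or not words:
        return {"avg_sentence_len": 0.0, "avg_word_len": 0.0, "type_token_ratio": 0.0}
    return {
        # Mean number of words per sentence.
        "avg_sentence_len": len(words) / len(sentences),
        # Mean number of characters per word.
        "avg_word_len": sum(len(w) for w in words) / len(words),
        # Distinct word forms divided by total word count.
        "type_token_ratio": len(set(words)) / len(words),
    }
```

    Feature dictionaries of this kind, computed for the levelled textbook texts, could serve as training input for a classifier such as logistic regression, which would then be applied to the unlevelled texts.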

    Linguistically-motivated automatic classification of Lithuanian texts for didactic purposes

    Book ISBN 978-1-61499-912-6 (online).
    This paper presents an effort to provide a level-appropriate study corpus for Lithuanian language learners. The collected corpus includes levelled texts from study books and unlevelled texts from other sources. The main goal is to assign level-appropriate labels (A1, A2, B1, B2) to the texts from other sources. For automatic classification we use preselected surface features, based on text readability research, and shallow linguistic features. First, we train the model on levelled texts from study books; second, we apply the learned model to classify the other texts. The best classification results are achieved with the Logistic Regression method.
    Affiliations: Department of Lithuanian Studies (Lituanistikos katedra); Department of Foreign Languages, Literature and Translation Studies (Užsienio kalbų, lit. ir vert. s. katedra); Vytautas Magnus University (Vytauto Didžiojo universitetas)