Variability of Czech Phraseme Usage
The paper addresses the variability of Czech phrasemes, i.e. semantically non-compositional multiword units, in current usage as represented in corpora; this variability results from the linguistic creativity of text authors. It also asks what, in fact, identifies a phraseme. A basic, original phraseme has a meaning that cannot be inferred from the meanings of its components. If it is modified to make it more topical and up-to-date, either the original meaning is entirely or partially preserved, or the modified phraseme acquires a totally new meaning. Some phrasemes allow for multiple modifications, while others are more rigid. The article examines different types of lexical, syntactic, morphological and semantic alteration of basic phrasemes. In addition to lexical variations, the focus is mainly on syntactic and morphological changes, and on the question of whether the syntactic means chosen to express semantic shifts affect the potential for creative treatment of the phraseme. In order to identify the variants of a phraseme with the phraseme itself, we introduce the term phraseme nucleus and outline a partial solution to the phraseme variability problem: designing a lexical database of multiword units (including phrasemes) whose entries are flexible enough to capture the variability of phrasemes at least partially.
Slovnědruhová a morfologická homonymie, homofonie a homografie v současné češtině : Part-of-Speech and Morphological Ambiguity, Homophony and Homography in Contemporary Czech
The paper presents a classification of the types of morphological ambiguity and the types of homophony and homography in contemporary Czech occurring in the material of the SYN and SYN2013PUB corpora of the Czech National Corpus. The classification of homonymy and homography constitutes a database for the rule-based automatic morphological disambiguation of written Czech performed in the Institute of Theoretical and Computational Linguistics at the Faculty of Arts, Charles University. As for homophony, the types presented in the paper, and mainly the sets of word forms associated with these types, can be used for the disambiguation of spoken texts.
A Grammar Checker for Czech
The paper presents a detailed description of the linguistic software system Kontrola české gramatiky (Grammar Checker for Czech), which checks the grammatical, orthographic and stylistic correctness of a text written in Czech, highlights the errors encountered and offers the user corrections for them. The Grammar Checker for Czech has been integrated in the Microsoft Word software product within the system Microsoft Office™ version 2003 (since 2005) and subsequently 2007, 2010 and 2013. The paper focuses on the history of the project, presents the ideas on which the Checker is based, describes how the Checker can be used, including interactive error correction, and gives a detailed description of the errors detected and of the algorithm by which the Checker processes an input text. The paper depicts not only the external aspects of the grammar checking system, i.e. how the Checker presents itself to the user and how the user should use it, but also the whole conceptual basis of the system: what components it consists of and how they cooperate. The core of the Checker consists of formalized grammatical (especially syntactic), orthographic and stylistic rules that detect mainly grammatical, and also some orthographic and stylistic, errors in Czech texts. The rules rely primarily on the results of automatic part-of-speech and morphological analysis and morphological disambiguation of Czech texts, and also on morphological synthesis. An error is conceived of as a violation of the grammar and orthography of contemporary standard written Czech, the grammar being expressed by the rule-based system. The Checker deals with spelling errors only in special grammatically motivated cases and in cases where people usually make mistakes; a full-fledged spell checker had already been integrated in the Microsoft Office package before the emergence of the Checker, so a new one was not needed.
In the conclusion, the success rate of the Checker is briefly compared to that of the Grammaticon system, a survey of the positive and negative aspects of the Checker is presented, and directions for its potential further development are outlined.
Part-of-Speech and Morphological Ambiguity, Homophony and Homography in Contemporary Czech
The paper presents a classification of the types of morphological ambiguity and the types of homophony and homography in contemporary Czech occurring in the material of the SYN and SYN2013PUB corpora of the Czech National Corpus. The classification of homonymy and homography constitutes a database for the rule-based automatic morphological disambiguation of written Czech performed in the Institute of Theoretical and Computational Linguistics at the Faculty of Arts, Charles University. As for homophony, the types presented in the paper, and mainly the sets of word forms associated with these types, can be used for the disambiguation of spoken texts.