203 research outputs found

    Extending an Event-type Ontology: Adding Verbs and Classes Using Fine-tuned LLMs Suggestions

    In this project, we investigated the use of advanced machine learning methods, specifically fine-tuned large language models, for pre-annotating data for a lexical extension task: adding descriptive words (verbs) to an existing but still incomplete ontology of event types. We addressed several research questions, ranging from possible heuristics that give annotators at least hints about which verbs to include and which lie outside the current version of the ontology, to the use of automatic scores to help annotators more efficiently find a threshold for identifying verbs that cannot be assigned to any existing class and are therefore to be used as seeds for a new class. We also carefully examined the correlation of the automatic scores with the human annotation. While the correlation turned out to be strong, its influence on the annotation proper is modest due to its near linearity, even though the mere fact of such pre-annotation leads to relatively short annotation times.
    Comment: Accepted to LAW-XVII @ ACL 202

    Důležitá slova. Podklady ke kolokačnímu švédsko-českému slovníku základních sloves (Important Words: Materials for a Swedish-Czech Collocation Dictionary of Basic Verbs)

    Basic verbs, i.e. very common verbs that typically denote physical movements, locations, states or actions, undergo various semantic shifts and acquire different secondary uses. In extreme cases, the distribution of secondary uses grows so general that they are regarded as auxiliary verbs (go and to be going to), phase verbs (turn, grow), etc. These uses are usually well documented by grammars and language textbooks, and so are idiomatic expressions (phraseologisms) in dictionaries. There is, however, a grey area in between, which is extremely difficult for non-native speakers to learn. This consists of secondary uses with limited collocability, in particular light verb constructions, and secondary meanings that only get activated under particular morphosyntactic conditions. The basic-verb secondary uses and constructions are usually semantically transparent, so they do not pose understanding problems, but they are generally unpredictable and language-specific, so they easily become an issue in non-native text production. In this thesis, Swedish basic verbs are approached from the contrastive point of view of an advanced Czech learner of Swedish. A selection of Swedish constructions with basic verbs is explored. The observations result in a proposal for the structure of a machine-readable Swedish-Czech...
    Basic verbs, i.e. frequent lexical verbs that typically describe physical movement, location, state, or action, undergo a series of semantic shifts through which they come to express secondary, figurative meanings. In extreme cases the verb becomes an auxiliary, modal, or phase verb, and the collocational restrictions that apply to the verb in its primary (i.e. literal) meaning cease to hold. These uses are mostly well documented in grammars and textbooks, just as good dictionaries give detailed information on the use of these verbs in fixed phraseological expressions.
    Between fully grammaticalized use on the one hand and idiomatic, phraseological use on the other, however, lies a whole range of uses of basic verbs in figurative meanings that is considerably difficult for a non-native speaker to master: figurative uses with limited collocability. These are chiefly verbo-nominal constructions, sometimes called analytic predicates (light verb constructions), but also uses that, under certain restricted morphosyntactic conditions (e.g. only under negation), activate abstract semantic features in other predicates, e.g. intensify the meaning or imply that the event has been going on for a long time, and so on. These secondary uses of lexical verbs...
    Institute of Germanic Studies, Faculty of Arts

    Grammar and Corpora 2016

    In recent years, the availability of large annotated corpora, together with a new interest in the empirical foundation and validation of linguistic theory and description, has sparked a surge of novel work using corpus methods to study the grammar of natural languages. This volume presents recent developments and advances, firstly, in corpus-oriented grammar research with a special focus on Germanic, Slavic, and Romance languages and, secondly, in corpus linguistic methodology as well as the application of corpus methods to grammar-related fields. The volume results from the sixth international conference Grammar and Corpora (GaC 2016), which took place at the Institute for the German Language (IDS) in Mannheim, Germany, in November 2016

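    Formalizovaný kontrastivní popis lexikálních jednotek: deskriptivní rámec pro dvojjazyčné slovníky (A Formalized Contrastive Description of Lexical Units: A Descriptive Framework for Bilingual Dictionaries)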

    Institute of the Czech National Corpus, Faculty of Arts

    English Index

    No abstract

    Design of a Controlled Language for Critical Infrastructures Protection

    We describe a project for the construction of a controlled language for critical infrastructures protection (CIP). This project originates from the need to coordinate and categorize communications on CIP at the European level. These communications can take the form of official documents, incident reports, informal communications and plain e-mail. To achieve this goal, we explore the application of traditional library science tools to the construction of controlled languages. Our starting point is an analogous effort carried out during the sixties in the field of nuclear science, known as the Euratom Thesaurus.
    JRC.G.6-Security technology assessmen

    A computational approach to Latin verbs: new resources and methods

    This thesis presents the application of computational methods to the study of Latin verbs. In particular, we show the creation of a subcategorization lexicon automatically extracted from annotated corpora; we also present a probabilistic model for the acquisition of selectional preferences from annotated corpora and an ontology (Latin WordNet). Finally, we describe the results of a diachronic, quantitative study of Latin spatial preverbs.

    Evolutionary algorithms for definition extraction

    Books and other text-based learning material contain implicit information which can aid the learner but which usually can only be accessed through a semantic analysis of the text. Definitions of new concepts appearing in the text are one such instance. If extracted and presented to the learner in the form of a glossary, they can provide an excellent reference for the study of the main text. One way of extracting definitions is to read through the text and annotate definitions manually, a tedious and boring job. In this paper, we explore the use of machine learning to extract definitions from non-technical texts, reducing human expert input to a minimum. We report on experiments on the use of genetic programming to learn the typical linguistic forms of definitions, and of a genetic algorithm to learn the relative importance of these forms. Results are very positive, showing that further exploration of these techniques for definition extraction is feasible. The genetic program is able to learn rules similar to those derived by a human linguistic expert, and the genetic algorithm is able to rank candidate definitions in order of confidence.
    peer-reviewe

    Předložková fráze s předložkou at jakožto valenční komplement substantiv (The Prepositional Phrase with the Preposition at as a Valency Complement of Nouns)

    The thesis deals with the valency of nouns, its relation to reference, and the factors conditioning the realization of the valency potential of nouns. The theoretical part addresses both valency in general and, more specifically, the valency of nouns. The more general sections define the basic terms and concepts used in valency descriptions of various orientations. The sections devoted to noun valency delineate, among other things, some differences between the valency of nouns and of verbs, and justify the exclusion of constructions of the type make an attempt from the data described. In addition, attention is drawn to the relation between valency and word-formation and to the relation between noun valency and reference, or contextual boundness. The empirical part of the thesis is divided into several sections, all based on data from the British National Corpus. The quantitative part of the analysis shows that the nouns attempt and ability require an obligatorily expressed complement when they are determined by an indefinite article expressing contextual non-boundness. This potentially calls into question both the claim, common in the literature, that the expression of the valency potential of nouns is never obligatory, and the claim that nouns (or some of the nouns described) have no valency at all. The qualitative part of the analysis describes the possible expressions of the first argument of the nouns attempt, ability and failure, but...
    The present thesis deals with noun valency, its relation to reference, and factors underlying the realization of the valency potential of nouns. The theoretical part examines valency in general, delineating the basic terminology and concepts usually employed in the descriptions of valency couched within various linguistic frameworks. The theoretical part subsequently focuses more specifically on the valency of nouns, pointing out in what ways it differs from the valency of verbs. The support verb construction is introduced, and it is explained why the construction is not examined in the present thesis. Two interfaces are introduced, viz. that of valency and word-formation, and that of valency and reference, or contextual boundness. The empirical part of the thesis is divided into several parts, all relying on data from the British National Corpus. The quantitative part of the analysis shows that the nouns attempt and ability obligatorily take an explicit complement when they are immediately preceded by an indefinite article marking their newness in discourse. This could possibly challenge both the widespread claim that the expression of the valency potential of a noun is never obligatory and the claim that (these) nouns are avalent. The qualitative part of the analysis examines the expression of...
    Department of the English Language and ELT Methodology, Faculty of Arts