203 research outputs found
Extending an Event-type Ontology: Adding Verbs and Classes Using Fine-tuned LLMs Suggestions
In this project, we have investigated the use of advanced machine learning
methods, specifically fine-tuned large language models, for pre-annotating data
for a lexical extension task, namely adding descriptive words (verbs) to an
existing (but as yet incomplete) ontology of event types. Several research
questions have been addressed, from the investigation of possible heuristics
that give annotators at least hints as to which verbs to include and which are
outside the current version of the ontology, to the possible use of the
automatic scores to help annotators be more efficient in finding a
threshold for identifying verbs that cannot be assigned to any existing class
and are therefore to be used as seeds for a new class. We have also
carefully examined the correlation of the automatic scores with the human
annotation. While the correlation turned out to be strong, its influence on the
annotation proper is modest due to its near linearity, even though the mere
fact of such pre-annotation leads to relatively short annotation times.
Comment: Accepted to LAW-XVII @ ACL 202
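The thresholding idea mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' method: the function name `split_by_threshold`, the score dictionary layout, the class names and the threshold value are all hypothetical, and the only assumption taken from the abstract is that each verb receives automatic scores against existing classes and that low-scoring verbs become seeds for a new class.

```python
def split_by_threshold(scores, threshold):
    """Split verbs into class assignments and new-class seeds.

    scores: {verb: {class_name: automatic_score}} (hypothetical layout).
    A verb whose best score is below `threshold` is treated as a seed
    for a new class instead of being assigned to an existing one.
    """
    assigned = {}
    seeds = []
    for verb, class_scores in scores.items():
        if not class_scores:
            # No candidate class at all: the verb can only be a seed.
            seeds.append(verb)
            continue
        best_class, best_score = max(class_scores.items(), key=lambda kv: kv[1])
        if best_score >= threshold:
            assigned[verb] = best_class
        else:
            seeds.append(verb)
    return assigned, seeds


# Toy scores, invented for illustration only.
TOY_SCORES = {
    "run": {"MOVE": 0.9, "SAY": 0.1},
    "ponder": {"MOVE": 0.2, "SAY": 0.3},
}

if __name__ == "__main__":
    print(split_by_threshold(TOY_SCORES, threshold=0.5))
```

In practice the threshold would be chosen by annotators after inspecting the scores, which is the step the abstract says the automatic scores are meant to support.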
Important words. Materials for a Swedish-Czech collocation dictionary of basic verbs (Důležitá slova. Podklady ke kolokačnímu švédsko-českému slovníku základních sloves)
Basic verbs, i.e. very common verbs that typically denote physical movements, locations, states or actions, undergo various semantic shifts and acquire different secondary uses. In extreme cases, the distribution of secondary uses grows so general that they are regarded as auxiliary verbs (go and to be going to), phase verbs (turn, grow), etc. These uses are usually well-documented by grammars and language textbooks, and so are idiomatic expressions (phraseologisms) in dictionaries. There is, however, a grey area in between, which is extremely difficult to learn for non-native speakers. This consists of secondary uses with limited collocability, in particular light verb constructions, and secondary meanings that only get activated under particular morphosyntactic conditions. The basic-verb secondary uses and constructions are usually semantically transparent, such that they do not pose understanding problems, but they are generally unpredictable and language-specific, such that they easily become an issue in non-native text production. In this thesis, Swedish basic verbs are approached from the contrastive point of view of an advanced Czech learner of Swedish. A selection of Swedish constructions with basic verbs is explored. The observations result in a proposal for the structure of a machine-readable Swedish-Czech...
Basic verbs, i.e. frequent lexical verbs that typically describe physical movement, location, state or action, undergo a range of semantic shifts through which they come to express secondary, figurative meanings. In extreme cases the verb becomes an auxiliary, modal or phase verb and is no longer subject to the collocational restrictions that apply to the verb in its primary (i.e. literal) meaning. These uses are mostly well documented in grammars and textbooks, just as good dictionaries give detailed information on the use of these verbs in fixed phraseological expressions.
Between fully grammaticalized use on the one hand and idiomatic, phraseological use on the other, however, there is a whole range of uses of basic verbs in figurative meanings that is very difficult for a non-native speaker to master: figurative uses with limited collocability. These are above all verbo-nominal constructions, sometimes called analytic predicates (light verb constructions), but also uses that, under certain restricted morphosyntactic conditions (e.g. only under negation), activate abstract semantic features in other predicates, e.g. intensify the meaning or imply that the event has already lasted a long time, and so on. These secondary uses of lexical verbs...
Institute of Germanic Studies, Faculty of Arts
Grammar and Corpora 2016
In recent years, the availability of large annotated corpora, together with a new interest in the empirical foundation and validation of linguistic theory and description, has sparked a surge of novel work using corpus methods to study the grammar of natural languages. This volume presents recent developments and advances, firstly, in corpus-oriented grammar research with a special focus on Germanic, Slavic, and Romance languages and, secondly, in corpus linguistic methodology as well as the application of corpus methods to grammar-related fields. The volume results from the sixth international conference Grammar and Corpora (GaC 2016), which took place at the Institute for the German Language (IDS) in Mannheim, Germany, in November 2016
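Formalized contrastive description of lexical units: a descriptive framework for bilingual dictionaries (Formalizovaný kontrastivní popis lexikálních jednotek: deskriptivní rámec pro dvojjazyčné slovníky)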
Institute of the Czech National Corpus, Faculty of Arts
Design of a Controlled Language for Critical Infrastructures Protection
We describe a project for the construction of a controlled language for critical infrastructures protection (CIP). This project originates
from the need to coordinate and categorize communications on CIP at the European level. These communications can take the form of
official documents, incident reports, informal communications and plain e-mail. We explore the application of
traditional library science tools for the construction of controlled languages in order to achieve our goal. Our starting point is an
analogous work done during the sixties in the field of nuclear science, known as the Euratom Thesaurus.
JRC.G.6-Security technology assessment
A computational approach to Latin verbs: new resources and methods
This thesis presents the application of computational methods to the study of Latin verbs. In particular, we show the creation of a subcategorization lexicon automatically extracted from annotated corpora; we also present a probabilistic model for the acquisition of selectional preferences from annotated corpora and an ontology (Latin WordNet). Finally, we describe the results of a diachronic and quantitative study of Latin spatial preverbs
Evolutionary algorithms for definition extraction
Books and other text-based learning material contain implicit information which can aid the learner but which usually can only be accessed through a semantic analysis of the text. Definitions of new concepts appearing in the text are one such instance. If extracted and presented to the learner in the form of a glossary, they can provide an excellent reference for the study of the main text. One way of extracting definitions is by reading through the text and annotating definitions manually, which is a tedious and boring job. In this paper, we explore the use of machine learning to extract definitions from non-technical texts, reducing human expert input to a minimum. We report on experiments we have conducted on the use of genetic programming to learn the typical linguistic forms of definitions and a genetic algorithm to learn the relative importance of these forms. Results are very positive, showing the feasibility of exploring further the use of these techniques in definition extraction. The genetic program is able to learn rules similar to those derived by a human linguistic expert, and the genetic algorithm is able to rank candidate definitions in order of confidence.
peer-reviewed
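The second half of the approach described above, a genetic algorithm that learns the relative importance of linguistic forms and ranks candidate definitions, might be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the paper's implementation: the binary feature encoding, the pairwise ranking fitness, the operators (tournament selection, uniform crossover, Gaussian mutation, elitism), all parameter values and the toy data are hypothetical.

```python
import random


def score(weights, feats):
    """Weighted sum of the linguistic-form indicators for one candidate."""
    return sum(w * f for w, f in zip(weights, feats))


def ranking_fitness(weights, candidates):
    """Fraction of (definition, non-definition) pairs ranked in the right order."""
    pos = [score(weights, f) for f, label in candidates if label == 1]
    neg = [score(weights, f) for f, label in candidates if label == 0]
    if not pos or not neg:
        return 0.0
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evolve_weights(candidates, n_features, pop_size=30, generations=40,
                   mutation_sd=0.2, seed=0):
    """Evolve one weight per linguistic form (assumed GA configuration)."""
    rng = random.Random(seed)

    def fitness(individual):
        return ranking_fitness(individual, candidates)

    # Start from the uniform-weight baseline plus random individuals.
    population = [[1.0] * n_features] + [
        [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
        for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        next_gen = population[:2]  # elitism: the two best survive unchanged
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            a = max(rng.sample(population, 3), key=fitness)
            b = max(rng.sample(population, 3), key=fitness)
            # Uniform crossover followed by Gaussian mutation.
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            child = [g + rng.gauss(0.0, mutation_sd) for g in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=fitness)


def rank_candidates(weights, candidates):
    """Return candidate indices ordered from most to least definition-like."""
    return sorted(range(len(candidates)),
                  key=lambda i: score(weights, candidates[i][0]),
                  reverse=True)


# Toy data, invented for illustration: each candidate sentence is encoded as
# three hypothetical form indicators plus a label (1 = definition).
TOY_CANDIDATES = [
    ([1, 0, 0], 1), ([0, 1, 0], 1), ([1, 1, 0], 1),
    ([1, 0, 1], 0), ([0, 0, 1], 0), ([0, 0, 0], 0),
]

if __name__ == "__main__":
    best = evolve_weights(TOY_CANDIDATES, n_features=3)
    print(best, ranking_fitness(best, TOY_CANDIDATES))
    print(rank_candidates(best, TOY_CANDIDATES))
```

Because the uniform-weight vector is placed in the initial population and the best individuals are carried over unchanged, the evolved weights can never score worse on the training data than treating all forms as equally important.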
The prepositional phrase with the preposition at as a valency complement of nouns (Předložková fráze s předložkou at jakožto valenční komplement substantiv)
This diploma thesis deals with noun valency, its relation to reference, and the factors conditioning the realization of the valency potential of nouns. The theoretical part addresses both valency in general and, more specifically, the valency of nouns. The more general sections define the basic terms and concepts used in valency descriptions of various orientations. The sections devoted to noun valency define, among other things, some differences between the valency of nouns and that of verbs, and justify the exclusion of constructions of the type make an attempt from the data described. In addition, attention is drawn to the relation between valency and word-formation and to the relation between noun valency and reference, or contextual boundness. The empirical part is divided into several sections, all of which draw on data from the British National Corpus. The quantitative part of the analysis shows that the nouns attempt and ability require an obligatorily expressed complement when they are determined by an indefinite article expressing contextual non-boundness. This potentially calls into question both the claim, common in the literature, that the expression of a noun's valency potential is never obligatory, and the claim that nouns (or some of the nouns described) have no valency at all. The qualitative part of the analysis describes the possible expressions of the first argument of the nouns attempt, ability and failure, but...
The present thesis deals with noun valency, its relation to reference, and factors underlying the realization of the valency potential of nouns. The theoretical part examines valency in general, delineating the basic terminology and concepts usually employed in the descriptions of valency couched within various linguistic frameworks. The theoretical part subsequently focuses more specifically on the valency of nouns, pointing out in what ways it differs from the valency of verbs. The support verb construction is introduced, and it is explained why the construction is not examined in the present thesis. Two interfaces are introduced, viz. that of valency and word-formation, and that of valency and reference, or contextual boundness. The empirical part of the thesis is divided into several parts, all relying on data from the British National Corpus. The quantitative part of the analysis shows that the nouns attempt and ability obligatorily take an explicit complement when they are immediately preceded by an indefinite article marking their newness in discourse. This could possibly challenge both the widespread claim that the expression of the valency potential of a noun is never obligatory and the claim that (these) nouns are avalent. The qualitative part of the analysis examines the expression of...
Department of the English Language and ELT Methodology, Faculty of Arts