
    Using Parallel Texts and Lexicons for Verbal Word Sense Disambiguation

    We present a system for verbal Word Sense Disambiguation (WSD) that exploits additional information from parallel texts and lexicons. It extends our previous WSD method, which gave promising results but used only monolingual features. In the follow-up work described here, we explore two additional ideas: using English-Czech bilingual resources (as features only; the task itself remains a monolingual WSD task), and using a 'hybrid' approach that adds features extracted both from a parallel corpus and from manually aligned bilingual valency lexicon entries, which contain subcategorization information. Although not all types of features proved useful, both additions led to significant improvements for both languages explored.

    Automatic Identification of Aspectual Classes across Verbal Readings

    The automatic prediction of aspectual classes is very challenging for verbs whose aspectual value varies across readings, which are the rule rather than the exception. This paper offers a new perspective on this problem by using a machine learning approach and a rich morpho-syntactic and semantic valency lexicon. In contrast to previous work, where the aspectual value of corpus clauses is determined on the basis of features retrieved from the corpus, we use features extracted from the lexicon and aim to predict the aspectual value of verbal readings rather than verbs. Studying the performance of the classifiers on a set of manually annotated verbal readings, we found that our lexicon provided enough information to reliably predict the aspectual value of verbs across their readings. We additionally tested our predictions for unseen predicates through a task-based evaluation, by using them in the automatic detection of temporal relation types in TempEval 2007 tasks for French. These experiments also confirmed the reliability of our aspectual predictions, even for unseen verbs.

    Large and noisy vs small and reliable: combining 2 types of corpora for adjective valence extraction

    This work investigates the possibility of combining two different types of corpora to build a valence lexicon for French adjectives. We complete adjectival frames extracted from a treebank with statistical cues computed from a large automatically parsed corpus. This experiment shows how linguistic knowledge and large amounts of annotated data can be used in a complementary manner.

    On Singles, Couples and Extended Families. Measuring Overlapping between Latin Vallex and Latin WordNet

    Different lexical resources may pursue different views on lexical meaning. However, all of them deal with lexical items as common basic components, which are described according to criteria that may vary from one resource to another. In this paper, we present a method for measuring the degree of similarity between a valency-based lexical resource and a WordNet. This is motivated by both theoretical and practical reasons. As for the former, we ask whether there are lexical classes that "impose" themselves regardless of whether they are explicitly recorded as such in the source lexical resources. As for the latter, our work aims to contribute to research on merging lexical resources. In order to apply and evaluate our method, we propose a normalized coefficient of overlapping that measures the overlapping rate between a valency lexicon and a WordNet. In particular, in the context of the exploitation of the linguistic resources for ancient languages built over the last decade, we compute and evaluate the overlapping between a selection of homogeneous lexical subsets extracted from two lexical resources for Latin.

    Předložková fráze s předložkou at jakožto valenční komplement substantiv (The prepositional phrase with the preposition at as a valency complement of nouns)

    The present thesis deals with noun valency, its relation to reference, and factors underlying the realization of the valency potential of nouns. The theoretical part examines valency in general, delineating the basic terminology and concepts usually employed in descriptions of valency couched within various linguistic frameworks. It then focuses more specifically on the valency of nouns, pointing out the ways in which it differs from the valency of verbs. The support verb construction (of the type make an attempt) is introduced, and it is explained why the construction is not examined in the present thesis. Two interfaces are introduced, viz. that of valency and word-formation, and that of valency and reference, or contextual boundness. The empirical part of the thesis is divided into several parts, all relying on data from the British National Corpus. The quantitative part of the analysis shows that the nouns attempt and ability obligatorily take an explicit complement when they are immediately preceded by an indefinite article marking their newness in discourse. This potentially challenges both the widespread claim that the expression of the valency potential of a noun is never obligatory and the claim that (these) nouns are avalent. The qualitative part of the analysis describes the possible expressions of the first argument of the nouns attempt, ability and failure, but...
    Department of the English Language and ELT Methodology, Faculty of Arts


    Proceedings of the NODALIDA 2011 Workshop Constraint Grammar Applications. Editors: Eckhard Bick, Kristin Hagen, Kaili Müürisep, Trond Trosterud. NEALT Proceedings Series, Vol. 14 (2011), vi+69 pp. © 2011 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/19231

    Proceedings of the Seventh International Conference Formal Approaches to South Slavic and Balkan Languages

    Proceedings of the Seventh International Conference Formal Approaches to South Slavic and Balkan Languages publishes 17 papers that were presented at the conference held in Dubrovnik, Croatia, 4-6 October 2010.

    Grammar and Corpora 2016

    In recent years, the availability of large annotated corpora, together with a new interest in the empirical foundation and validation of linguistic theory and description, has sparked a surge of novel work using corpus methods to study the grammar of natural languages. This volume presents recent developments and advances, firstly, in corpus-oriented grammar research with a special focus on Germanic, Slavic, and Romance languages and, secondly, in corpus linguistic methodology as well as the application of corpus methods to grammar-related fields. The volume results from the sixth international conference Grammar and Corpora (GaC 2016), which took place at the Institute for the German Language (IDS) in Mannheim, Germany, in November 2016

    Improving Automated Alignment in Multilingual Corpora
