35 research outputs found

    Electronic dictionary of English verb conjugation: its structure, contents and parametrization

    The article addresses the main issues in creating an electronic dictionary of English verb conjugation using the theory of lexicographic systems (L-systems). In particular, the formal structure of the English verb is defined, together with its possible quasi-stems and quasi-flexions. To support the inflection dictionary in a digital environment, an L-system model of the dictionary is built, and the lexicographic structures to be made accessible to the user are determined. Lexicographic parameters for representing the verb's inflectional paradigm are specified. The educational and research tasks that the dictionary under development can address are outlined.

    Dictionaries for language processing. Readability and organization of information

    What makes a dictionary exploitable in Natural Language Processing (NLP)? We examine two requirements, readability of information and general architecture, and we focus on the human tasks involving NLP dictionaries: construction, updating, checking, and correction. We illustrate our points with real cases from projects on morpho-syntactic and syntactic-semantic dictionaries.

    Cross-Lingual Induction and Transfer of Verb Classes Based on Word Vector Space Specialisation

    Existing approaches to automatic VerbNet-style verb classification are heavily dependent on feature engineering and therefore limited to languages with mature NLP pipelines. In this work, we propose a novel cross-lingual transfer method for inducing VerbNets for multiple languages. To the best of our knowledge, this is the first study which demonstrates how the architectures for learning word embeddings can be applied to this challenging syntactic-semantic task. Our method uses cross-lingual translation pairs to tie each of the six target languages into a bilingual vector space with English, jointly specialising the representations to encode the relational information from English VerbNet. A standard clustering algorithm is then run on top of the VerbNet-specialised representations, using vector dimensions as features for learning verb classes. Our results show that the proposed cross-lingual transfer approach sets new state-of-the-art verb classification performance across all six target languages explored in this work. Comment: EMNLP 2017 (long paper)

    Improvement of VerbNet-like resources by frame typing

    Verbenet is a French lexicon developed by "translation" of its English counterpart, VerbNet (Kipper-Schuler, 2005), and by treatment of the specificities of French syntax (Pradet et al., 2014; Danlos et al., 2016). One difficulty encountered in its development springs from the fact that the list of (potentially numerous) frames has no internal organization. This paper proposes a type system for frames that shows whether two frames are variants of a given alternation. Frame typing facilitates coherence checking of the resource in a "virtuous circle". We present the principles underlying a program we developed and used to automatically type frames in Verbenet. We also show that our system is portable to other languages.

    A Dataset for Movie Description

    Descriptive video service (DVS) provides linguistic descriptions of movies and allows visually impaired people to follow a movie along with their peers. Such descriptions are by design mainly visual and thus naturally form an interesting data source for computer vision and computational linguistics. In this work we propose a novel dataset of transcribed DVS that is temporally aligned to full-length HD movies. In addition, we collected the aligned movie scripts that have been used in prior work, and we compare the two sources of descriptions. In total, the Movie Description dataset contains a parallel corpus of over 54,000 sentences and video snippets from 72 HD movies. We characterize the dataset by benchmarking different approaches for generating video descriptions. Comparing DVS to scripts, we find that DVS is far more visual and describes precisely what is shown rather than what should happen according to the scripts created prior to movie production.

    Italian VerbNet: A Construction based Approach to Italian Verb Classification

    This paper proposes a new method for Italian verb classification (and a preliminary example of the resulting classes) inspired by Levin (1993) and VerbNet (Kipper-Schuler, 2005), yet partially independent from these resources; we achieved this result by integrating Levin and VerbNet’s models of classification with other theoretical frameworks and resources. The classification is rooted in the constructionist framework (Goldberg, 1995; 2006) and is distribution-based. It is also semantically characterized by a link to FrameNet’s semantic frames to represent the event expressed by a class. However, the new Italian classes maintain the hierarchic “tree” structure and monotonic nature of VerbNet’s classes and, where possible, the original names (e.g., Verbs of Killing, Verbs of Putting, etc.). We therefore propose here a taxonomy compatible with VerbNet but at the same time adapted to Italian syntax and semantics. It also addresses a number of problems intrinsic to the original classifications, such as the role of argument alternations, here regarded simply as epiphenomena, consistently with the constructionist approach.

    A study of subcategorization frames in English-language patent texts on medical subjects

    We examined the subcategorization frames of verbs and nouns occurring in English-language patent texts on medical subjects. Based on their frequency of occurrence, we compiled a lexicon of subcategorization frames specific to medical patent texts, which can be used in building syntactic and semantic parsers for texts on similar subjects.