34 research outputs found

    The Lefff, a freely available and large-coverage morphological and syntactic lexicon for French

    In this paper, we introduce the Lefff, a freely available, accurate and large-coverage morphological and syntactic lexicon for French, used in many NLP tools such as large-coverage parsers. We first describe Alexina, the lexical framework in which the Lefff is developed, as well as the linguistic notions and formalisms it is based on. Next, we describe the various sources of lexical data we used for building the Lefff, in particular semi-automatic lexical development techniques and the conversion and merging of existing resources. Finally, we illustrate the coverage and precision of the resource by comparing it with other resources and by assessing its impact in various NLP tools.

    Adding frequencies to the LGLex lexicon with IRASUBCAT

    We present a method for enlarging a lexicon with frequency information, which is useful for parsing and other NLP applications. As an example, we enlarge the verbal LGLex lexicon of French [8], using several corpora extracted from Passage [5], the evaluation campaign for French parsers. To do so, we feed the results of the frmg parser [7] to IRASubcat, a tool that automatically acquires subcategorization frames from a corpus in any language and can also be used to complete an existing lexicon. We obtain the frequencies of occurrence for each entry and each subcategorization frame for 14,068 distinct lemmas. Sociedad Argentina de Informática e Investigación Operativ


    Conversion of Lexicon-Grammar tables to LMF. Application to French

    We describe the first experiment in converting Lexicon-Grammar tables for French verbs into the Lexical Markup Framework (LMF) format. The Lexicon-Grammar of the French language is currently one of the major sources of lexical and syntactic information for French. Its conversion into an interoperable representation format according to the LMF standard makes it usable in different contexts, thus contributing to the standardization and interoperability of natural language processing dictionaries. We briefly introduce the Lexicon-Grammar and the derived dictionaries; we analyse the main difficulties faced during the conversion; and we describe the resulting resource.

    Le DM, a French Dictionary for NooJ

    This paper presents the DM, a new dictionary for French. Freely available resources are selectively used to obtain lexical lemmas, from which morphological grammars generate about 538,000 baseforms. Evaluation of the DM on a corpus shows that it stands comparison with the previous NooJ delaf dictionary.

    Mapping the Lexique des Verbes du Français (Lexicon of French Verbs) to a NLP Lexicon using Examples

    This article presents experiments aiming at mapping the Lexique des Verbes du Français (Lexicon of French Verbs) to FRILEX, a Natural Language Processing (NLP) lexicon based on DICOVALENCE. The two resources (Lexicon of French Verbs and DICOVALENCE) were built by linguists, based on very different theories, which makes a direct mapping nearly impossible. We chose to use the examples provided in one of the resources to find implicit links between the two and make them explicit.

    Encoding a syntactic dictionary into a super granular unification grammar

    We show how to turn a large-scale syntactic dictionary into a dependency-based unification grammar where each piece of lexical information calls a separate rule, yielding a super granular grammar. Subcategorization, raising and control verbs, auxiliaries and copula, passivization, and tough-movement are discussed. We focus on the semantics-syntax interface and offer a new perspective on syntactic structure.
