41 research outputs found

    TEI and LMF crosswalks

    The present paper explores various arguments in favour of making the Text Encoding Initiative (TEI) guidelines an appropriate serialisation for ISO standard 24613:2008 (LMF, Lexical Markup Framework). It also identifies the issues that would have to be resolved in order to reach an appropriate implementation of these ideas, in particular in terms of informational coverage. We show how the customisation facilities offered by the TEI guidelines can provide an adequate background, not only to cover missing components within the current Dictionary chapter of the TEI guidelines, but also to allow specific lexical projects to deal with local constraints. We expect this proposal to be a basis for a future ISO project in the context of the ongoing revision of LMF.

    The Grande Dicionário Houaiss da Língua Portuguesa Dictionary as a Use Case

    UIDB/00749/2020, UIDP/00749/2020. In this article, we will introduce two of the new parts of the new multi-part version of the Lexical Markup Framework (LMF) ISO standard, namely Part 3 of the standard (ISO 24613-3), which deals with etymological and diachronic data, and Part 4 (ISO 24613-4), which consists of a TEI serialisation of all of the prior parts of the model. We will demonstrate the use of both standards by describing the LMF encoding of a small number of examples taken from a sample conversion of the reference Portuguese dictionary Grande Dicionário Houaiss da Língua Portuguesa, part of a broader experiment comprising the analysis of different, heterogeneously encoded, Portuguese lexical resources. We present the examples in the Unified Modelling Language (UML) and also, in a couple of cases, in TEI.

    from TEI Lex-0 to Ontolex-Lemon

    UIDB/03213/2020, UIDP/03213/2020. This paper describes ongoing work in the modelling of usage information in the context of the MORDigital project. The latter is based on the encoding and publication as linked data of Diccionario da Lingua Portugueza, a Portuguese legacy dictionary authored by António de Morais Silva, whose first edition was published in 1789. In this paper, we will focus on the modelling of domain labels in Ontolex-Lemon, based on a previous encoding of the dictionary’s entries in TEI Lex-0. This approach should be reusable for other projects involving the linked data publication of legacy dictionaries.

    Méthodes pour la représentation informatisée de données lexicales / Methoden der Speicherung lexikalischer Daten

    In recent years, new developments in the area of lexicography have not only altered the management, processing and publishing of lexicographical data, but also created new types of products such as electronic dictionaries and thesauri. These expand the range of possible uses of lexical data and support users with more flexibility, for instance in assisting human translation. In this article, we give a short and easy-to-understand introduction to the problematic nature of the storage, display and interpretation of lexical data. We then describe the main methods and specifications used to build and represent lexical data. This contribution presents two accounts of the storage of lexical data in two different languages. Although the texts cover the same topics in a parallel structure, they are not direct translations of each other. The chapter addresses several audiences: besides linguists and lexicographers, it is also aimed at computer scientists and computational linguists who want to learn more about the fundamentals of modelling and representing digital dictionaries. We regard this chapter as a possible starting point for those who want to begin lexicographic projects, and we argue for a thorough understanding of the problems involved in storing lexical data.

    LMF Reloaded

    Lexical Markup Framework (LMF) or ISO 24613 [1] is a de jure standard that provides a framework for modelling and encoding lexical information in retrodigitised print dictionaries and NLP lexical databases. An in-depth review is currently underway within the standardisation subcommittee, ISO-TC37/SC4/WG4, to find a more modular, flexible and durable follow-up to the original LMF standard published in 2008. In this paper we will present some of the major improvements which have so far been implemented in the new version of LMF.

    Data models and the (blind ?) query of lexical resources

    Presentation of several issues related to the query of standardized lexical data.

    A Linked Coptic Dictionary Online

    We describe a new project publishing a freely available online dictionary for Coptic. The dictionary encompasses comprehensive cross-referencing mechanisms, including linking entries to an online scanned edition of Crum’s Coptic Dictionary, internal cross-references and etymological information, translated searchable definitions in English, French and German, and linked corpus data which provides frequencies and corpus look-up for headwords and multiword expressions. Headwords are available for linking in external projects using a REST API. We describe the challenges in encoding our dictionary using TEI XML and implementing linking mechanisms to construct a Web interface querying frequency information, which draw on NLP tools to recognize inflected forms in context. We evaluate our dictionary’s coverage using digital corpora of Coptic available online.

    football terms encoded in TEI Lex-0

    UIDB/00749/2020, UIDP/00749/2020. Terms are a significant part of lexicographical nomenclatures in general language dictionaries. In this paper, we focus on how football terms are treated in three Academy Dictionaries – Portuguese, French, and Spanish – and draw some conclusions about the lexicographical decisions taken in the three languages. After identifying every position football players can have on the field, we verify whether the dictionaries above include these terms. We propose the TEI encoding of the term “defesa” (defence), which designates a position occupied by football players on the field. Bearing in mind concepts such as reusability and interoperability, we intend to present: 1) a comparison of football terms in the three dictionaries; 2) TEI Lex-0 dictionary encoding, a streamlined standard to facilitate interoperability; 3) a consistent TEI modelling and description of the microstructural elements of lexicographical entries. In the end, we draw some conclusions.