17 research outputs found

    Editorial of the special issue on latest advancements in linguistic linked data

    Since the inception of the Open Linguistics Working Group in 2010, there have been numerous efforts to transform language resources into Linked Data. The research field of Linguistic Linked Data (LLD) has gained in importance, visibility and impact, with the Linguistic Linked Open Data (LLOD) cloud now gathering over 200 resources. With this growth, new challenges have emerged concerning particular domain and task applications, quality dimensions, and linguistic features to take into account. This special issue aims to review and summarize the progress and status of LLD research in recent years, as well as to offer an understanding of the challenges ahead of the field for the years to come. The papers in this issue indicate that there are still aspects to address for a wider community adoption of LLD, as well as a lack of resources for specific tasks and (interdisciplinary) domains. Likewise, the integration of LLD resources into Natural Language Processing (NLP) architectures and the search for long-term infrastructure solutions to host LLD resources remain essential points to attend to in the foreseeable future of this line of research.

    Challenges for the representation of morphology in ontology lexicons

    Recent years have seen a growing trend in the publication of language resources as Linguistic Linked Data (LLD) to enhance their discovery and reuse and the interoperability of tools that consume language data. To this end, the OntoLex-lemon model has emerged as a de facto standard to represent lexical data on the Web. However, traditional dictionaries contain a considerable amount of morphological information which is not straightforwardly representable as LLD within the current model. In order to fill this gap, a new Morphology Module of OntoLex-lemon is currently being developed. This paper presents the results of this model as on-going work, as well as the underlying challenges that emerged during the module development. Based on the MMoOn Core ontology, the module aims to account for a wide range of morphological information, from endings used to derive whole paradigms to the decomposition and generation of lexical entries. It complies with other OntoLex-lemon modules and facilitates the encoding of complex morphological data in ontology lexicons.

    Towards the Integration of Multilingual Terminologies: an Example of a Linked Data Prototype

    Many language resources are nowadays available in machine-readable formats, but are still contained in isolated silos. Current Semantic Web-based techniques enable the transformation and linking of those resources into a navigable graph of linked language resources, which can be directly consumed by third-party applications. The prototype we have developed builds on a web user interface and SPARQL endpoint initially developed to query a single terminological database (Terminesp), now extended to navigate a set of multilingual terminologies. The vocabulary used to represent these terminologies in linked data format is lemon-ontolex, a de facto standard for representing lexical information relative to ontologies and for linking lexicons and machine-readable dictionaries to the Semantic Web.

    Cross-Lingual Link Discovery for Under-Resourced Languages

    CC BY-NC 4.0. In this paper, we provide an overview of current technologies for cross-lingual link discovery, and we discuss challenges, experiences and prospects of their application to under-resourced languages. We first introduce the goals of cross-lingual linking and associated technologies, and in particular, the role that the Linked Data paradigm (Bizer et al., 2011) applied to language data can play in this context. We define under-resourced languages with a specific focus on languages actively used on the internet, i.e., languages with a digitally versatile speaker community, but limited support in terms of language technology. We argue that, for languages for which considerable amounts of textual data and (at least) a bilingual word list are available, techniques for cross-lingual linking can be readily applied, and that these enable the implementation of downstream applications for under-resourced languages via the localisation and adaptation of existing technologies and resources.

    Linguistic Linked Data for Lexicography

    Nowadays, the number of resources that provide lexical data keeps increasing significantly as outcomes of projects in linguistics, lexicography and language technologies. However, this data is scattered throughout the Web, isolated, and often comes in a vast number of different formats and languages. To address this landscape of heterogeneous and isolated language resources, experts working in the domain of the Semantic Web have adopted approaches to linguistic data representation based on the Linked Data (LD) paradigm, giving birth to the Linguistic Linked Data (LLD) line of research. Although LLD is focused on the representation, publication and sharing of language resources, there exists no previous wide-scope exploration and assessment of the impact of the application of LLD to lexicography as a discipline: the requirements and process this involves, its practical and theoretical benefits, the challenges it raises, and the open problems on the way. Furthermore, as a required ingredient towards this exploration, guidelines to represent a wide range of lexicographic resources (as outcomes of a lexicographic compilation process) by following this new paradigm are lacking as well. In this thesis we address the application of LLD to lexicography from the perspective of the lexicographer, the user who consults lexicographic works, or the linguist interested in lexical semantics who needs lexicographic content for their work. We detect and resolve obstacles to LLD adoption in lexicography regarding the representation requirements of lexicographic works through the definition of application profiles and extensions of the de facto standards for LLD representation.
On the basis of a set of representative resources that we convert to the Resource Description Framework (RDF), we analyse and showcase the benefits and implications of LLD for dictionary representation, both as the target format of a conversion and as a potential native format for lexicographic projects in the future.
----------SUMMARY----------
With the incessant growth of lexical resources emerging from numerous projects in linguistics, lexicography and language technologies, lexical data today is found in different formats, scattered and isolated from one another on the Web. Linguistic Linked Data (LLD) is a line of research developed by experts in the field of the Semantic Web that responds to the need for standardisation in the representation of linguistic data and that is based on the Linked Data (LD) paradigm. Although the LLD line focuses on the representation, publication and dissemination of linguistic resources, to date there exists neither a broad study nor an assessment of the impact that its application would have on lexicography as a discipline: what requirements must be met when representing lexicographic resources as LLD, what processes would have to be carried out, what the practical and theoretical advantages of this type of representation would be, what challenges it would give rise to, and what problems might have to be faced. Likewise, as necessary pieces of that study, guidelines for representing a wide range of lexicographic resources in this new paradigm are also lacking. This doctoral thesis investigates the application of LLD to lexicography from the perspective of the lexicographer, the user of lexicographic resources, or the linguist interested in lexical semantics who needs to access lexicographic content for their work.
This thesis identifies and resolves a series of modelling problems that arise when representing lexicographic content in RDF (Resource Description Framework). Through the definition of application profiles and extensions for the most widely used de facto standard in LLD, this work presents a series of lexicographic resources in RDF that serve to analyse and demonstrate the advantages of this paradigm for encoding lexicographic information, both as the final format of a resource after a conversion and as a native format for the creation of new lexicographic works.

    Editorial of the Special Issue on Latest Advancements in Linguistic Linked Data

    Bosque-Gil J, Cimiano P, Dojchinovski M. Editorial of the Special Issue on Latest Advancements in Linguistic Linked Data. Semantic Web. 2022;13(6):911-916.

    Terminesp RDF

    This is the RDF OntoLex version of the Terminesp dataset. OntoLex is on-going work and has not been released yet. For more information about OntoLex, please visit http://www.w3.org/community/ontolex/wiki/Final_Model_Specification. For more information about Terminesp, please visit http://www.wikilengua.org/index.php/Wikilengua:Terminesp

    Linked data in lexicography

    Modeling lexical information as a graph is not a novel notion coming from L

    Front Matter, Table of Contents, Preface, Conference Organization

    Front Matter, Table of Contents, Preface, Conference Organization