135 research outputs found

    RuThes cloud: Towards a multilevel linguistic linked open data resource for Russian

    © 2017, Springer International Publishing AG. In this paper we present a new multi-level Linguistic Linked Open Data resource for Russian. It covers four linguistic levels: semantic, lexical, morphological and syntactic. The resource has been constructed on the basis of the well-known RuThes thesaurus and the original, hitherto unpublished Extended Zaliznyak grammatical dictionary. The resource is represented in terms of the SKOS, Lemon, and LexInfo ontologies and a new custom ontology. In building the resource, we automatically completed the following tasks: merging the source resources on common lexical entries, decomposing complex lexical entries, and publishing the constructed resource as an LLOD-compatible dataset. We demonstrate a use case in which the developed resource is exploited in an IR task. We hope that our work can serve as a crystallization point of the LLOD cloud in Russian.

    The Open Linguistics Working Group: developing the Linguistic Linked Open Data cloud

    The Open Linguistics Working Group (OWLG) brings together researchers from various fields of linguistics, natural language processing, and information technology to present and discuss principles, case studies, and best practices for representing, publishing and linking linguistic data collections. A major outcome of our work is the Linguistic Linked Open Data (LLOD) cloud, an LOD (sub-)cloud of linguistic resources, which covers various linguistic databases, lexicons, corpora, terminologies, and metadata repositories. We present and summarize five years of progress on the development of the cloud and of advancements in open data in linguistics, and we describe recent community activities. The paper aims to serve as a guideline to orient and involve researchers with the community and/or Linguistic Linked Open Data.

    On the linguistic linked open data infrastructure

    In this paper we describe the current state of development of the Linguistic Linked Open Data (LLOD) infrastructure, an LOD (sub-)cloud of linguistic resources, which covers various linguistic databases, lexicons, corpora, terminology and metadata repositories. We give a fairly detailed overview of the contributions made by the European H2020 projects “Prêt-à-LLOD” (‘Ready-to-use Multilingual Linked Language Data for Knowledge Services across Sectors’) and “ELEXIS” (‘European Lexicographic Infrastructure’) to the further development of the LLOD.

    Recent developments for the linguistic linked open data infrastructure

    In this paper we describe the contributions made by the European H2020 project “Pret-a-LLOD” (‘Ready-to-use Multilingual Linked Language Data for Knowledge Services across Sectors’) to the further development of the Linguistic Linked Open Data (LLOD) infrastructure. Pret-a-LLOD aims to develop a new methodology for building data value chains that are applicable to a wide range of sectors and applications and are based around language resources and language technologies that can be integrated by means of semantic technologies. We describe the methods implemented for increasing the number of language data sets in the LLOD. We also present the approach for ensuring interoperability and for porting LLOD data sets and services to other infrastructures, as well as the contribution of the projects to existing standards.

    Interoperability of language-related information: mapping the BLL Thesaurus to Lexvo and Glottolog

    Since 2013, the thesaurus of the Bibliography of Linguistic Literature (BLL Thesaurus) has been applied in the context of the Linguistik portal, a hub for linguistically relevant information. Several consecutive projects focus on modeling the BLL Thesaurus as an ontology and linking it to terminological repositories in the Linguistic Linked Open Data (LLOD) cloud. Those mappings facilitate the connection between the Linguistik portal and the cloud. In the paper, we describe the current efforts to establish interoperability between the language-related index terms and repositories providing language identifiers for the web of Linked Data. After an introduction to Lexvo and Glottolog, we outline the scope, the structure, and the peculiarities of the BLL Thesaurus. We discuss the challenges in designing a scientifically plausible language classification and in linking divergent classifications. We describe the prototype of the linking model and propose pragmatic solutions for structural or conceptual conflicts. Additionally, we describe the benefits of the envisaged interoperability, both for the Linguistik portal and for the Linked Open Data community in general.

    Lin|gu|is|tik: building the linguist's pathway to bibliographies, libraries, language resources and linked open data

    This paper introduces a novel research tool for the field of linguistics: the Lin|gu|is|tik web portal provides a virtual library which offers scientific information on every linguistic subject. It comprises selected internet sources and databases as well as catalogues for linguistic literature, and addresses an interdisciplinary audience. The virtual library is the most recent outcome of the Special Subject Collection Linguistics of the German Research Foundation (DFG), and also integrates the knowledge accumulated in the Bibliography of Linguistic Literature. In addition to the portal, we describe long-term goals and prospects with a special focus on ongoing efforts regarding an extension towards integrating language resources and Linguistic Linked Open Data.

    Acción COST “Red europea para la ciencia de datos lingüísticos centrada en la web” (NexusLinguarum)

    We present the current state of the large “European network for Web-centred linguistic data science”. In its first phase, the network has put in place several working groups to deal with specific topics. The network has also already implemented a first round of Short Term Scientific Missions (STSM). Work presented here was supported in part by the COST Action CA18209 – NexusLinguarum “European network for Web-centred linguistic data science”, the project Prêt-à-LLOD, under grant agreement no. 825182, and the ELEXIS project, under grant agreement no. 731015.

    When linguistics meets web technologies. Recent advances in modelling linguistic linked data

    This article provides an up-to-date and comprehensive survey of models (including vocabularies, taxonomies and ontologies) used for representing linguistic linked data (LLD). It focuses on the latest developments in the area and both builds upon and complements previous works covering similar territory. The article begins with an overview of recent trends which have had an impact on linked data models and vocabularies, such as the growing influence of the FAIR guidelines, the funding of several major projects in which LLD is a key component, and the increasing importance of the relationship of the digital humanities with LLD. Next, we give an overview of some of the most well-known vocabularies and models in LLD. After this we look at some of the latest developments in community standards and initiatives such as OntoLex-Lemon, as well as recent work which has been carried out on corpora and annotation and LLD, including a discussion of the LLD metadata vocabularies META-SHARE and lime and of language identifiers. In the following part of the paper we look at work which has been realised in a number of recent projects and which has a significant impact on LLD vocabularies and models.