108 research outputs found

    Development of linguistic linked open data resources for collaborative data-intensive research in the language sciences

    Making diverse data in linguistics and the language sciences open, distributed, and accessible: perspectives from language/language acquisition researchers and technical LOD (linked open data) researchers. This volume examines the challenges inherent in making diverse data in linguistics and the language sciences open, distributed, integrated, and accessible, thus fostering wide data sharing and collaboration. It is unique in integrating the perspectives of language researchers and technical LOD (linked open data) researchers. Reporting on both active research needs in the field of language acquisition and technical advances in the development of data interoperability, the book demonstrates the advantages of an international infrastructure for scholarship in the field of language sciences. With contributions by researchers who produce complex data content and scholars involved in both the technology and the conceptual foundations of LLOD (linguistics linked open data), the book focuses on the area of language acquisition because it involves complex and diverse data sets, cross-linguistic analyses, and urgent collaborative research. The contributors discuss a variety of research methods, resources, and infrastructures. Contributors: Isabelle Barrière, Nan Bernstein Ratner, Steven Bird, Maria Blume, Ted Caldwell, Christian Chiarcos, Cristina Dye, Suzanne Flynn, Claire Foley, Nancy Ide, Carissa Kang, D. Terence Langendoen, Barbara Lust, Brian MacWhinney, Jonathan Masci, Steven Moran, Antonio Pareja-Lora, Jim Reidy, Oya Y. Rieger, Gary F. Simons, Thorsten Trippel, Kara Warburton, Sue Ellen Wright, Claus Zin

    Development of Linguistic Linked Open Data Resources for Collaborative Data-Intensive Research in the Language Sciences

    This book is the product of an international workshop dedicated to addressing data accessibility in the linguistics field. It is therefore vital to the book’s mission that its content be open access. Linguistics as a field lags behind many others in data management and accessibility strategies. The problem is particularly acute in the subfield of language acquisition, where international linguistic sound files are needed for reference. Linguists' concerns are very much tied to the amount of information accumulated by individual researchers over the years that remains fragmented and inaccessible to the larger community. These concerns are shared by other fields, but linguistics to date has seen few efforts at addressing them. This collection, undertaken by a range of leading experts in the field, represents a big step forward. Its international scope and interdisciplinary combination of scholars, librarians, and data consultants will provide an important contribution to the field.

    When linguistics meets web technologies. Recent advances in modelling linguistic linked data

    This article provides an up-to-date and comprehensive survey of models (including vocabularies, taxonomies and ontologies) used for representing linguistic linked data (LLD). It focuses on the latest developments in the area and both builds upon and complements previous works covering similar territory. The article begins with an overview of recent trends which have had an impact on linked data models and vocabularies, such as the growing influence of the FAIR guidelines, the funding of several major projects in which LLD is a key component, and the increasing importance of the relationship of the digital humanities with LLD. Next, we give an overview of some of the best-known vocabularies and models in LLD. After this, we look at some of the latest developments in community standards and initiatives such as OntoLex-Lemon, as well as recent work which has been carried out in corpora and annotation and LLD, including a discussion of the LLD metadata vocabularies META-SHARE and lime, and of language identifiers. In the following part of the paper, we look at work which has been realised in a number of recent projects and which has a significant impact on LLD vocabularies and models.

    Map4Scrutiny – a linked open data solution for politicians interest registers

    Master's dissertation in Information Systems. The work developed in this dissertation describes the process of sourcing, standardizing, and transforming text and tabular (CSV) open data into linked open data. Specifically, it covers data on Portuguese parliamentarians' interest registers and public procurement, linked by the organisations mentioned in both. The state of the art includes a background analysis of the concepts of corruption, transparency, open data, and linked open data, and an analysis of relevant open data and linked open data projects. The research followed Hevner's three-cycle design science research approach, which led to the definition of the data scope (with respect to relevant dataset topics and the public interest), the design of the proposed solution, and the selection of tools, methods, and processes. The implementation starts by scraping the data from the sources with Python libraries and generating tabular (CSV) outputs. These are cleaned and standardized in OpenRefine, which is also used to map the tabular data into triples and export them as Turtle files. The mapping was designed in an application profile that also served as the basis for writing the shapes (in ShExC) used to validate the exported Turtle files. This validation ensures that the data conforms to the application profile. To describe the data in triples, the external vocabularies used were complemented, where necessary, with new classes, properties, and values. This process is also described in detail, and the vocabularies are available for consultation and reuse. Finally, sample SPARQL queries were written to showcase the difference between the sourced data and the resulting dataset. The goal is to contribute to the fields of linked open data and open data for transparency and public scrutiny. The main contributions to the former are a new data schema and the description of every step in the transformation process; the contribution to the latter is a further implementation showcasing the potential of data scrutiny to improve transparency, by comparing the queries possible over the final dataset with those possible over the originals. Every step taken is documented below and the resulting outputs of the different stages are available for consultation.

    A Survey of the First 20 Years of Research on Semantic Web and Linked Data

    This paper is a survey of the research topics in the field of Semantic Web, Linked Data and Web of Data. This study looks at the contributions of this research community over its first twenty years of existence. Compiling several bibliographical sources and bibliometric indicators, we identify the main research trends and we reference some of their major publications to provide an overview of that initial period. We conclude with some perspectives on future research challenges.

    A Comprehensive Study for building Resource Information Infrastructure oriented to Digital Archives

    Grants-in-Aid for Scientific Research (KAKENHI) research results report: Grant-in-Aid for Scientific Research (A), 2010-2012. Project number: 2224002

    Provenance: from long-term preservation to query federation and grid reasoning


    Maintaining and Publishing Metadata Application Profiles with Extensible Authoring Format

    Thesis (Master of Science in Library and Information Studies)--University of Tsukuba, no. 41490, 2019.9.2