Search CORE

37 research outputs found

Annotation interoperability for the post-ISOCat era

Author: Abromeit Frank
Chiarcos Christian
Fäth Christian
Publication venue
Publication date: 24/04/2023
Field of study

With this paper, we provide an overview over ISOCat successor solutions and annotation standardization efforts since 2010, and we describe the low-cost harmonization of post-ISOCat vocabularies by means of modular, linked ontologies: The CLARIN Concept Registry, LexInfo, Universal Parts of Speech, Universal Dependencies and UniMorph are linked with the Ontologies of Linguistic Annotation and through it with ISOCat, the GOLD ontology, the Typological Database Systems ontology and a large number of annotation schemes

OPUS Augsburg

Development of linguistic linked open data resources for collaborative data-intensive research in the language sciences

Author: Blume Maria
Chiarcos Christian
Lust Barbara C.
Pareja-Lora Antonio
Publication venue: 'MIT Press - Journals'
Publication date: 27/04/2023
Field of study

Making diverse data in linguistics and the language sciences open, distributed, and accessible: perspectives from language/language acquistiion researchers and technical LOD (linked open data) researchers. This volume examines the challenges inherent in making diverse data in linguistics and the language sciences open, distributed, integrated, and accessible, thus fostering wide data sharing and collaboration. It is unique in integrating the perspectives of language researchers and technical LOD (linked open data) researchers. Reporting on both active research needs in the field of language acquisition and technical advances in the development of data interoperability, the book demonstrates the advantages of an international infrastructure for scholarship in the field of language sciences. With contributions by researchers who produce complex data content and scholars involved in both the technology and the conceptual foundations of LLOD (linguistics linked open data), the book focuses on the area of language acquisition because it involves complex and diverse data sets, cross-linguistic analyses, and urgent collaborative research. The contributors discuss a variety of research methods, resources, and infrastructures. Contributors Isabelle Barrière, Nan Bernstein Ratner, Steven Bird, Maria Blume, Ted Caldwell, Christian Chiarcos, Cristina Dye, Suzanne Flynn, Claire Foley, Nancy Ide, Carissa Kang, D. Terence Langendoen, Barbara Lust, Brian MacWhinney, Jonathan Masci, Steven Moran, Antonio Pareja-Lora, Jim Reidy, Oya Y. Rieger, Gary F. Simons, Thorsten Trippel, Kara Warburton, Sue Ellen Wright, Claus Zin

OPUS Augsburg

Development of Linguistic Linked Open Data Resources for Collaborative Data-Intensive Research in the Language Sciences

Author
Publication venue
Publication date
Field of study

This book is the product of an international workshop dedicated to addressing data accessibility in the linguistics field. It is therefore vital to the book’s mission that its content be open access. Linguistics as a field remains behind many others as far as data management and accessibility strategies. The problem is particularly acute in the subfield of language acquisition, where international linguistic sound files are needed for reference. Linguists' concerns are very much tied to amount of information accumulated by individual researchers over the years that remains fragmented and inaccessible to the larger community. These concerns are shared by other fields, but linguistics to date has seen few efforts at addressing them. This collection, undertaken by a range of leading experts in the field, represents a big step forward. Its international scope and interdisciplinary combination of scholars/librarians/data consultants will provide an important contribution to the field

OAPEN Library

Challenges for the development of linked open data for research in multilingualism

Author: Barriere I
Blume M
Dye CD
Kang C
Publication venue: 'MIT Press - Journals'
Publication date
Field of study

Newcastle University E-Prints

When linguistics meets web technologies. Recent advances in modelling linguistic linked data

Author: Chiarcos Christian
Declerck Thierry
García Elena González-Blanco
Gifu Daniela
Gracia Jorge
Ionov Maxim
Khan Anas Fahad
Labropoulou Penny
Mambrini Francesco (ORCID:0000-0003-0834-7562)
McCrae John P.
Muñoz Salvador Ros
Pagé-Perron Émilie
Passarotti Marco (ORCID:0000-0002-9806-7187)
Truică Ciprian-Octavian
Publication venue: 'IOS Press'
Publication date: 01/01/2022
Field of study

This article provides an up-to-date and comprehensive survey of models (including vocabularies, taxonomies and ontologies) used for representing linguistic linked data (LLD). It focuses on the latest developments in the area and both builds upon and complements previous works covering similar territory. The article begins with an overview of recent trends which have had an impact on linked data models and vocabularies, such as the growing influence of the FAIR guidelines, the funding of several major projects in which LLD is a key component, and the increasing importance of the relationship of the digital humanities with LLD. Next, we give an overview of some of the most well known vocabularies and models in LLD. After this we look at some of the latest developments in community standards and initiatives such as OntoLex-Lemon as well as recent work which has been in carried out in corpora and annotation and LLD including a discussion of the LLD metadata vocabularies META-SHARE and lime and language identifiers. In the following part of the paper we look at work which has been realised in a number of recent projects and which has a significant impact on LLD vocabularies and models

OPUS Augsburg

PubliCatt

Repositorio Universidad de Zaragoza

Linking Discourse Marker Inventories

Author: Chiarcos Christian
Ionov Maxim
Publication venue: OASIcs - OpenAccess Series in Informatics. 3rd Conference on Language, Data and Knowledge (LDK 2021)
Publication date: 01/01/2021
Field of study

The paper describes the first comprehensive edition of machine-readable discourse marker lexicons. Discourse markers such as and, because, but, though or thereafter are essential communicative signals in human conversation, as they indicate how an utterance relates to its communicative context. As much of this information is implicit or expressed differently in different languages, discourse parsing, context-adequate natural language generation and machine translation are considered particularly challenging aspects of Natural Language Processing. Providing this data in machine-readable, standard-compliant form will thus facilitate such technical tasks, and moreover, allow to explore techniques for translation inference to be applied to this particular group of lexical resources that was previously largely neglected in the context of Linguistic Linked (Open) Data

Dagstuhl Research Online Publication Server

Final FLaReNet deliverable: Language Resources for the Future - The Future of Language Resources

Author: Bel N.
Calzolari N.
Choukri Khalid
LS OZ Taal en spraaktechnologie
Mariani J.
Monachini M.
Odijk J.E.J.M.
Piperidis S
Quochi V.
Soria C.
UiL OTS LLI
Publication venue
Publication date: 01/01/2011
Field of study

Language Technologies (LT), together with their backbone, Language Resources (LR), provide an essential support to the challenge of Multilingualism and ICT of the future. The main task of language technologies is to bridge language barriers and to help creating a new environment where information flows smoothly across frontiers and languages, no matter the country, and the language, of origin. To achieve this goal, all players involved need to act as a community able to join forces on a set of shared priorities. However, until now the field of Language Resources and Technology has long suffered from an excess of individuality and fragmentation, with a lack of coherence concerning the priorities for the field, the direction to move, not to mention a common timeframe. The context encountered by the FLaReNet project was thus represented by an active field needing a coherence that can only be given by sharing common priorities and endeavours. FLaReNet has contributed to the creation of this coherence by gathering a wide community of experts and making them participate in the definition of an exhaustive set of recommendations

PUblication MAnagement

Utrecht University Repository

Linking discourse marker inventories

Author: Chiarcos Christian
Ionov Maxim
Publication venue
Publication date: 24/04/2023
Field of study

OPUS Augsburg

Towards an interoperable ecosystem of AI and LT platforms: a roadmap for the implementation of different levels of interoperability

OPUS Augsburg