39 research outputs found
Challenges for the Multilingual Web of Data
The Web has witnessed an enormous growth in the amount of semantic information published in recent years. This growth has been stimulated to a large extent by the emergence of Linked Data. Although this brings us a big step closer to the vision of a Semantic Web, it also raises new issues such as the need for dealing with information expressed in different natural languages. Indeed, although the Web of Data can contain any kind of information in any language, it still lacks explicit mechanisms to automatically reconcile such information when it is expressed in ifferent languages. This leads to situations in which data expressed in a certain language is not easily accessible to speakers of other languages.
The Web of Data shows the potential for being extended to a truly multilingual web as vocabularies and data can be published in a language-independent fashion, while associated language-dependent (linguistic) information supporting the access across languages can be stored separately. In this sense, the multilingual Web of Data can be realized in our view as a layer of services and resources on top of the existing Linked Data infrastructure adding i) linguistic information for data and vocabularies in different languages, ii) mappings between data with labels in different languages, and iii) services to dynamically access and traverse Linked Data across different languages.
In this article we present this vision of a multilingual Web of Data. We discuss challenges that need to be addressed to make this vision come true and discuss the role that techniques such as ontology localization, ontology mapping, and cross-lingual ontology-based information access and presentation will play in achieving this. Further, we propose an initial architecture and describe a roadmap that can provide a basis for the implementation of this vision
Challenges for the multilingual Web of Data
Garcia J, Montiel-Ponsoda E, Cimiano P, Gómez-Pérez A, Buitelaar P, McCrae J. Challenges for the multilingual Web of Data. Journal of Web Semantics: Science, Services and Agents on the World Wide Web. 2012;11:63-71
What is the current state of the Multilingual Web of Data?
The Semantic Web is growing at a fast pace, recently boosted by the creation of the Linked Data initiative and principles. Methods, standards, techniques and the state of technology are becoming more mature and therefore are easing the task of publication and consumption of semantic information on the Web
Cross-lingual Linking on the Multilingual Web of Data (position statement)
Recently, the Semantic Web has experienced signi�cant advancements in standards and techniques, as well as in the amount of semantic information available online. Even so, mechanisms are still needed to automatically reconcile semantic information when it is expressed in di�erent natural languages, so that access to Web information across language barriers can be improved. That requires developing techniques for discovering and representing cross-lingual links on the Web of Data. In this paper we explore the different dimensions of such a problem and reflect on possible avenues of research on that topic
Using Cross-Lingual Explicit Semantic Analysis for Improving Ontology Translation
Semantic Web aims to allow machines to make inferences using the explicit conceptualisations contained in ontologies. By pointing to ontologies, Semantic Web-based applications are able to inter-operate and share common information easily. Nevertheless, multilingual semantic applications are still rare, owing to the fact that most online ontologies are monolingual in English. In order to solve this issue, techniques for ontology localisation and translation are needed. However, traditional machine translation is difficult to apply to ontologies, owing to the fact that ontology labels tend to be quite short in length and linguistically different from the free text paradigm. In this paper, we propose an approach to enhance machine translation of ontologies based on exploiting the well-structured concept descriptions contained in the ontology. In particular, our approach leverages the semantics contained in the ontology by using Cross Lingual Explicit Semantic Analysis (CLESA) for context-based disambiguation in phrase-based Statistical Machine Translation (SMT). The presented work is novel in the sense that application of CLESA in SMT has not been performed earlier to the best of our knowledge
Cross-lingual ontology matching as a challenge for the Multilingual Semantic Web
Recently, the Semantic Web has experienced significant advancements in standards and techniques, as
well as in the amount of semantic information available online. Nevertheless, mechanisms are still
needed to automatically reconcile information when it is expressed in different natural languages on the
Web of Data, in order to improve the access to semantic information across language barriers. In this
context several challenges arise [1], such as: (i) ontology translation/localization, (ii) cross-lingual
ontology mappings, (iii) representation of multilingual lexical information, and (iv) cross-lingual access
and querying of linked data.
In the following we will focus on the second challenge, which is the necessity of establishing,
representing and storing cross-lingual links among semantic information on the Web. In fact, in a
“truly” multilingual Semantic Web, semantic data with lexical representations in one natural language
would be mapped to equivalent or related information in other languages, thus making navigation across
multilingual information possible for software agents
Git4Voc: Git-based Versioning for Collaborative Vocabulary Development
Collaborative vocabulary development in the context of data integration is
the process of finding consensus between the experts of the different systems
and domains. The complexity of this process is increased with the number of
involved people, the variety of the systems to be integrated and the dynamics
of their domain. In this paper we advocate that the realization of a powerful
version control system is the heart of the problem. Driven by this idea and the
success of Git in the context of software development, we investigate the
applicability of Git for collaborative vocabulary development. Even though
vocabulary development and software development have much more similarities
than differences there are still important differences. These need to be
considered within the development of a successful versioning and collaboration
system for vocabulary development. Therefore, this paper starts by presenting
the challenges we were faced with during the creation of vocabularies
collaboratively and discusses its distinction to software development. Based on
these insights we propose Git4Voc which comprises guidelines how Git can be
adopted to vocabulary development. Finally, we demonstrate how Git hooks can be
implemented to go beyond the plain functionality of Git by realizing
vocabulary-specific features like syntactic validation and semantic diffs
Cross-lingual Linking on the Multilingual Web of Data (position statement)
Recently, the Semantic Web has experienced signi�cant advancements in standards and techniques, as well as in the amount of semantic information available online. Even so, mechanisms are still needed to automatically reconcile semantic information when it is expressed in di�erent natural languages, so that access to Web information across language barriers can be improved. That requires developing techniques for discovering and representing cross-lingual links on the Web of Data. In this paper we explore the different dimensions of such a problem and reflect on possible avenues of research on that topic