45 research outputs found

    What is the current state of the Multilingual Web of Data?

    The Semantic Web is growing at a fast pace, recently boosted by the creation of the Linked Data initiative and principles. Methods, standards, techniques and the state of technology are maturing, which eases the task of publishing and consuming semantic information on the Web.

    Social semantic search: a case study on web 2.0 for science

    When researchers formulate search queries to find relevant content on the Web, those queries typically consist of keywords that can only be matched in the content or its metadata. The Web of Data extends this functionality by bringing structure and well-defined meaning to the content, and it enables humans and machines to work together using controlled vocabularies. Due to the high degree of mismatch between the structure of the content and the vocabularies in different sources, searching over multiple heterogeneous repositories of structured data is considered challenging. Therefore, the authors present a semantic search engine for researchers that facilitates search in research-related Linked Data. To facilitate high-precision interactive search, they annotated and interlinked structured research data with ontologies from various repositories in an effective semantic model. Furthermore, the authors' system is adaptive, as researchers can synchronize using new social media accounts and efficiently explore new datasets.

    Why reinvent the wheel: Let's build question answering systems together

    Modern question answering (QA) systems need to flexibly integrate a number of components specialised to fulfil specific tasks in a QA pipeline. Key QA tasks include Named Entity Recognition and Disambiguation, Relation Extraction, and Query Building. Since a number of different software components exist that implement different strategies for each of these tasks, it is a major challenge to select and combine the most suitable components into a QA system, given the characteristics of a question. We study this optimisation problem and train classifiers, which take features of a question as input and have the goal of optimising the selection of QA components based on those features. We then devise a greedy algorithm to identify the pipelines that include the suitable components and can effectively answer the given question. We implement this model within Frankenstein, a QA framework able to select QA components and compose QA pipelines. We evaluate the effectiveness of the pipelines generated by Frankenstein using the QALD and LC-QuAD benchmarks. These results suggest not only that Frankenstein precisely solves the QA optimisation problem but also that it enables the automatic composition of optimised QA pipelines, which outperform the static Baseline QA pipeline. Thanks to this flexible and fully automated pipeline generation process, new QA components can be easily included in Frankenstein, thus improving the performance of the generated pipelines.
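
    The greedy selection step described in this abstract can be pictured roughly as follows. This is a minimal sketch and not Frankenstein's actual implementation: the task keys, the component names, and the score table (which stands in for the output of the trained classifiers) are all hypothetical, and the sketch assumes that choosing the best-scored component per task independently is an acceptable approximation.

```python
from typing import Dict, List

# Hypothetical task keys, loosely following the three QA tasks named in the abstract.
TASKS: List[str] = ["entity_linking", "relation_extraction", "query_building"]


def greedy_pipeline(predicted: Dict[str, Dict[str, float]],
                    task_order: List[str]) -> List[str]:
    """Pick, for each task in order, the component with the highest predicted score.

    `predicted` maps task -> {component name -> score}. In a real system the
    scores would come from classifiers trained on question features; here they
    are supplied by hand (assumption).
    """
    pipeline = []
    for task in task_order:
        candidates = predicted[task]
        # Greedy choice: best local score, no look-ahead to later tasks.
        best = max(candidates, key=candidates.get)
        pipeline.append(best)
    return pipeline


# Hypothetical scores for one question.
scores = {
    "entity_linking": {"linker_a": 0.62, "linker_b": 0.81},
    "relation_extraction": {"rel_a": 0.55, "rel_b": 0.40},
    "query_building": {"builder_a": 0.70},
}

selected = greedy_pipeline(scores, TASKS)
print(selected)
```

    A greedy choice keeps the search linear in the number of components, at the cost of ignoring interactions between components in different stages.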

    The Landscape of Ontology Reuse in Linked Data

    Abstract. The uptake of Linked Data (LD) has promoted the proliferation of datasets and their associated ontologies for describing different domains. According to LD principles, developers should reuse as many available terms as possible to describe their data. Importing ontologies or referring to their terms' URIs are the two main ways to reuse knowledge from available ontologies. In this paper, we have analyzed 18589 terms appearing within 196 ontologies included in the Linked Open Vocabularies (LOV) registry with the aim of understanding the current state of ontology reuse in the LD context. In order to characterize the landscape of ontology reuse in this context, we have extracted statistics about currently reused elements, calculated ratios for reuse, and drawn graphs about imports and references between ontologies. Keywords: ontology, vocabulary, reuse, linked data, ontology impor

    Weaving the Web(VTT) of Data

    Video has become a first-class citizen on the Web with broad support in all common Web browsers. Where with structured mark-up on webpages we have made the vision of the Web of Data a reality, in this paper, we propose a new vision that we name the Web(VTT) of Data, along with concrete steps to realize this vision. It is based on the evolving standards WebVTT for adding timed text tracks to videos and JSON-LD, a JSON-based format to serialize Linked Data. Just like the Web of Data that is based on the relationships among structured data, the Web(VTT) of Data is based on relationships among videos based on WebVTT files, which we use as Web-native spatiotemporal Linked Data containers with JSON-LD payloads. In a first step, we provide necessary background information on the technologies we use. In a second step, we perform a large-scale analysis of the 148-terabyte Common Crawl corpus in order to get a better understanding of the status quo of Web video deployment and address the challenge of integrating the detected videos in the Common Crawl corpus into the Web(VTT) of Data. In a third step, we open-source an online video annotation creation and consumption tool, targeted at videos not contained in the Common Crawl corpus and for integrating future video creations, allowing for weaving the Web(VTT) of Data tighter, video by video.
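
    To illustrate the idea of a WebVTT file acting as a container for JSON-LD payloads, here is a minimal sketch that writes one metadata cue. It is an assumption-laden illustration and not the authors' tool: the helper names, the cue timing, and the example payload (including the use of the schema.org context) are all hypothetical.

```python
import json


def format_timestamp(seconds: float) -> str:
    """Render seconds as a WebVTT timestamp of the form hh:mm:ss.ttt."""
    millis = round(seconds * 1000)
    hours, rem = divmod(millis, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def make_cue(start: float, end: float, payload: dict) -> str:
    """Build one cue whose text is a JSON-LD object serialized on a single line."""
    # json.dumps emits no newlines by default, so the payload cannot
    # accidentally contain the blank line that would terminate the cue.
    body = json.dumps(payload)
    if "-->" in body:
        raise ValueError("cue text must not contain the '-->' separator")
    return f"{format_timestamp(start)} --> {format_timestamp(end)}\n{body}"


def make_webvtt(cues: list) -> str:
    """Join cues into a WebVTT document: header, then blank-line-separated cues."""
    return "WEBVTT\n\n" + "\n\n".join(cues) + "\n"


# Hypothetical annotation: a person is visible during the first 1.5 seconds.
annotation = {"@context": "https://schema.org", "@type": "Person", "name": "Example"}
document = make_webvtt([make_cue(0, 1.5, annotation)])
print(document)
```

    A consumer could read such a file as a metadata text track and parse each cue's text with a JSON parser while the video plays.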

    Programming the Web of Data with a "Wiki-based IDE"

    Session 4: Semantic Web. WikiNEXT is a wiki at the crossroads of semantic wikis and recent online development tools ("web-based IDEs"). By letting developers code applications that exploit the Web of Data and manipulate semantic data directly in the browser, WikiNEXT extends the concept of the semantic wiki and addresses the question "how can programming and learning to build applications for the Semantic Web be made easier?". WikiNEXT targets several user profiles; this article, however, focuses on the "programmable wiki" aspects, which mainly concern web developers. Current semantic wikis allow dynamic content to be inserted into pages but do not share this approach; we show how WikiNEXT improves on the state of the art in this area. The tool is online at http://wikinext.gexsoft.com and contains many interactive tutorials.

    Guidelines for multilingual linked data

    In this article, we argue that there is a growing number of linked datasets in different natural languages, and that there is a need for guidelines and mechanisms to ensure the quality and organic growth of this emerging multilingual data network. However, we have little knowledge regarding the actual state of this data network, its current practices, and the open challenges that it poses. Questions regarding the distribution of natural languages, the links that are established across data in different languages, or how linguistic features are represented, remain mostly unanswered. Addressing these and other language-related issues can help to identify existing problems, propose new mechanisms and guidelines or adapt the ones in use for publishing linked data including language-related features, and, ultimately, provide metrics to evaluate quality aspects. In this article we review, discuss, and extend current guidelines for publishing linked data by focusing on those methods, techniques and tools that can help RDF publishers to cope with language barriers. Whenever possible, we will illustrate and discuss each of these guidelines, methods, and tools on the basis of practical examples that we have encountered in the publication of the datos.bne.es dataset

    Exploring semantic relationships in the web of data
