44 research outputs found

    Materializing the editing history of Wikipedia as linked data in DBpedia

    We describe a DBpedia extractor materializing the editing history of Wikipedia pages as linked data to support queries and indicators on the history.

    Developing a Benchmark Suite for Semantic Web Data from Existing Workflows

    This paper presents work in progress towards developing a new benchmark for federated query processing systems. Unlike other popular benchmarks, our query set is not driven by technical evaluation, but is derived from workflows established by the pharmacology community. The value of this query set is that it is realistic while also comprising complex queries that test all features of modern query processing systems.

    Combining Flexible Queries and Knowledge Anchors to facilitate the exploration of Knowledge Graphs

    Semantic web and information extraction technologies are enabling the creation of vast information and knowledge repositories, particularly in the form of knowledge graphs comprising entities and the relationships between them. Users are often unfamiliar with the complex structure and vast content of such graphs. Hence, users need to be assisted by tools that support interactive exploration and flexible querying. In this paper we draw on recent work in flexible querying for graph-structured data and in identifying good anchors for knowledge graph exploration to demonstrate how users can be supported in incrementally querying, exploring and learning from large, complex knowledge graphs. We demonstrate our techniques through a case study in the domain of lifelong learning and career guidance.

    Community detection applied on big linked data

    The Linked Open Data (LOD) Cloud has more than tripled its sources in just six years (from 295 sources in 2011 to 1163 datasets in 2017). The current Web of Data contains more than 150 billion triples. We are witnessing staggering growth in the production and consumption of LOD and the generation of increasingly large datasets. In this scenario, providing researchers, domain experts, but also businessmen and citizens with visual representations and intuitive interactions can significantly aid the exploration and understanding of the domains and knowledge represented by Linked Data. Various tools and web applications have been developed to enable navigation and browsing of the Web of Data. However, these tools fall short in producing high-level representations of large datasets and in supporting users in the exploration and querying of these big sources. To address this gap, we devised a new method and a tool called H-BOLD (High level visualizations on Big Open Linked Data). H-BOLD enables exploratory search and multilevel analysis of Linked Open Data. It offers different levels of abstraction on Big Linked Data. Through user interaction and the dynamic adaptation of the graph representing the dataset, it is possible to explore the dataset effectively, starting from a small set of classes and adding new ones. Performance and portability of H-BOLD have been evaluated on the SPARQL endpoints listed on SPARQL ENDPOINT STATUS. The effectiveness of H-BOLD as a visualization tool is described through a user study.

    Adding value to Linked Open Data using a multidimensional model approach based on the RDF Data Cube vocabulary

    Most organisations using Open Data currently focus on data processing and analysis. However, although Open Data may be available online, these data are generally of poor quality, thus discouraging others from contributing to and reusing them. This paper describes an approach to publishing statistical data from public repositories by using Semantic Web standards published by the W3C, such as RDF and SPARQL, in order to facilitate the analysis of multidimensional models. We have defined a framework based on the entire lifecycle of data publication, including a novel step of Linked Open Data assessment and the use of external repositories as a knowledge base for data enrichment. As a result, users are able to interact with the data generated according to the RDF Data Cube vocabulary, which makes it possible for general users to avoid the complexity of SPARQL when analysing data. The use case was applied to the Barcelona Open Data platform and revealed the benefits of our approach, such as helping in the decision-making process. This work was supported in part by the Spanish Ministry of Science, Innovation and Universities through the Project ECLIPSE-UA under grant RTI2018-094283-B-C32.

    Protocol conformance of collaborative SPARQL using multiparty session types

    Decentralised linked data gives users rights over their data while being accessible to other domains. RDF (Resource Description Framework) and SPARQL have been the standard specifications for managing linked data for several years. Recent research and development have introduced scalable centralised and distributed RDF store engines that support SPARQL. However, writing SPARQL federated queries may grow more complex as the number of domain participants increases, presenting challenges such as source discovery, completeness and performance. This paper presents a SPARQL Query Template (SQT) that applies Multiparty Session Types (MPST) to determine the order of federated queries. We also guarantee protocol conformance between MPST and SPARQL relational algebra.

    GeoTriples: Transforming geospatial data into RDF graphs using R2RML and RML mappings

    A lot of geospatial data has recently become available at no charge in many countries. Geospatial data that is currently made available by government agencies usually does not follow the linked data paradigm. In the few cases where government agencies do follow the linked data paradigm (e.g., Ordnance Survey in the United Kingdom), specialized scripts have been used for transforming geospatial data into RDF. In this paper we present the open source tool GeoTriples, which generates and processes extended R2RML and RML mappings that transform geospatial data from many input formats into RDF. GeoTriples allows the transformation of geospatial data stored in raw files (shapefiles, CSV, KML, XML, GML and GeoJSON) and spatially-enabled RDBMS (PostGIS and MonetDB) into RDF graphs using well-known vocabularies like GeoSPARQL and stSPARQL, but without being tightly coupled to a specific vocabulary. GeoTriples has been developed in the European projects LEO and Melodies and has been used to transform many geospatial data sources into linked data. We study the performance of GeoTriples experimentally using large publicly available geospatial datasets, and show that GeoTriples is very efficient and scalable, especially when its mapping processor is implemented using Apache Hadoop.

    Optimizing Analytical Queries over Semantic Web Sources
