21 research outputs found

    XLWrap – Querying and Integrating Arbitrary Spreadsheets with SPARQL

    A Language for the Specification of the Schema of Spreadsheets for the Materialization of Ontologies

    Ontology-based Data Access (OBDA) is concerned with providing end-users and applications with a way to query legacy databases through a high-level ontology that models both the business logic and the underlying data sources, accessed by mappings that define how to express records of the database as ontological assertions. In this research, we are concerned with providing tools for performing OBDA with relational and non-relational data sources. We developed an OBDA tool that is able to access H2 databases and CSV files, allowing the user to explicitly formulate mappings and to populate an ontology that can be saved for later querying. In this paper, we present an extension of our previous work: a language for specifying the schema of the data in a spreadsheet data application. This specification is then used to access the contents of a set of Excel books and express them as a relational database, with the ultimate goal of materializing its data as an OWL/RDF ontology. We characterize the syntax and semantics of the language, present a prototypical implementation, and report on performance tests showing that our implementation can handle a workload of Excel tables on the order of ten thousand records. Workshop: WISS – Innovación en Sistemas de Software. Red de Universidades con Carreras en Informátic
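    The materialization step described in this abstract (tabular records expressed as ontological assertions) can be illustrated with a minimal sketch. It is not the authors' language or tool: the `BASE` namespace, the function name `rows_to_ntriples`, and the convention of one class per table and one property per column are assumptions made only for illustration.

```python
import csv
import io

BASE = "http://example.org/"  # hypothetical namespace, not taken from the paper
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def rows_to_ntriples(csv_text, class_name, key_column):
    """Express each CSV record as N-Triples assertions.

    Assumed convention: the table maps to one class, the key column
    identifies the individual, and every other non-empty cell becomes
    a literal-valued property assertion.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{class_name}/{row[key_column]}>"
        triples.append(f"{subject} <{RDF_TYPE}> <{BASE}{class_name}> .")
        for column, value in row.items():
            if column == key_column or value == "":
                continue  # skip the key itself and empty cells
            literal = value.replace("\\", "\\\\").replace('"', '\\"')
            triples.append(f'{subject} <{BASE}{column}> "{literal}" .')
    return triples


if __name__ == "__main__":
    for line in rows_to_ntriples("id,name\n1,Ana\n2,\n", "Person", "id"):
        print(line)
```

    A real OBDA mapping would also handle datatypes, IRI escaping, and foreign keys between sheets; the sketch covers only the record-to-assertion idea.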

    Interpreting environmental computational spreadsheets

    Abstract. Environmental computational spreadsheets are important tools in supporting decision making. However, as the underlying concepts and relations are not made explicit, the transparency and re-use of these spreadsheets is severely limited. The goal of this project is to provide a semi-automatic methodology for constructing the underlying knowledge level model of environmental computational spreadsheets. We develop and test this methodology in a limited number of case studies. Our methodology combines heuristics on spreadsheet layout and formulas, with existing methods from computer science. We evaluate our constructed model with both the original developers and their peers. 1 Problem Statement: Current environmental issues, like climate change and biodiversity loss, are universal in their scale and long-term in their impact, their mechanisms are complex, and empirical data are scarce [1–3]. In addition there is an urgent need to find strategies to cope with these issues, and political pressure on the research com

    Assessing and refining mappings to RDF to improve dataset quality

    RDF dataset quality assessment is currently performed primarily after data is published. However, there is no systematic way to incorporate either its results into the dataset or the assessment itself into the publishing workflow. Adjustments are applied manually, and rarely. Moreover, the root of the violations, which often derive from the mappings that specify how the RDF dataset will be generated, is not identified. We suggest an incremental, iterative and uniform validation workflow for RDF datasets stemming originally from (semi-)structured data (e.g., CSV, XML, JSON). In this work, we focus on assessing and improving their mappings. We incorporate (i) a test-driven approach for assessing the mappings instead of the RDF dataset itself, as mappings reflect how the dataset will be formed when generated; and (ii) semi-automatic mapping refinements based on the results of the quality assessment. The proposed workflow is applied to diverse cases, e.g., large, crowdsourced datasets such as DBpedia, or newly generated ones, such as iLastic. Our evaluation indicates the efficiency of our workflow, as it significantly improves the overall quality of an RDF dataset in the observed cases.
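    The idea of testing the mappings rather than the generated dataset, then refining them from the test results, can be sketched as follows. This is a toy illustration under stated assumptions, not the workflow or tooling from the paper: the rule dictionaries, the `EXPECTED_RANGE` table, and the single datatype check are all hypothetical.

```python
# Hypothetical, simplified mapping rules: each describes how one predicate
# of the future RDF dataset is generated from a source field.
MAPPINGS = [
    {"predicate": "ex:birthDate", "source": "birth", "datatype": "xsd:date"},
    {"predicate": "ex:height", "source": "height_cm", "datatype": None},
]

# Hypothetical schema expectations: the datatype each predicate should yield.
EXPECTED_RANGE = {"ex:birthDate": "xsd:date", "ex:height": "xsd:decimal"}


def assess(mappings, expected_range):
    """Test the mapping rules themselves; no RDF needs to be generated."""
    violations = []
    for rule in mappings:
        expected = expected_range.get(rule["predicate"])
        if expected is not None and rule["datatype"] != expected:
            violations.append((rule["predicate"], rule["datatype"], expected))
    return violations


def refine(mappings, violations):
    """Apply the fix suggested by each violation, returning new rules."""
    fixes = {predicate: expected for predicate, _, expected in violations}
    return [
        dict(rule, datatype=fixes.get(rule["predicate"], rule["datatype"]))
        for rule in mappings
    ]


if __name__ == "__main__":
    found = assess(MAPPINGS, EXPECTED_RANGE)
    print(found)
    print(assess(refine(MAPPINGS, found), EXPECTED_RANGE))
```

    Because a single mapping rule produces many triples, one fix at this level removes every violation that rule would otherwise cause in the generated data.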

    OBDI System for Fuzzy Web Data Table Integration Using an Ontological and Terminological Resource

    When developing new product innovations or filing new patents, inventors need to retrieve all the relevant pre-existing know-how and to exploit and enforce patents in the technological area. Since the OTR is at the heart of the semantic ontology system, this team works on ontology construction and evolution. The authors present a system architecture that relies on an Ontological and Terminological Resource (OTR), which is made up of two parts: on the one hand, a generic set of concepts dedicated to the data integration task; on the other hand, a specific set of concepts and terminology dedicated to a given application domain. The main objective of the semantic annotation method is to identify which relations of the OTR are represented in a data table, based on the simple target concepts recognized in it. In order to annotate a column with a simple target concept, a score is computed for each simple target concept of the OTR, using a generic OTR expressed in OWL. The system allows XML data tables taken from Web documents to be annotated with fuzzy RDF descriptions and to be flexibly queried through an ontology search engine. The ontology search engine retrieves not only exact answers to the selection criteria but also semantically close answers, and compares the selection criteria, expressed as fuzzy sets representing preferences, with the fuzzy annotations of the data. DOI: 10.17762/ijritcc2321-8169.15072

    Towards Customizable Chart Visualizations of Tabular Data Using Knowledge Graphs

    Scientific articles are typically published as PDF documents, thus rendering the extraction and analysis of results a cumbersome, error-prone, and often manual effort. New initiatives, such as ORKG, focus on transforming the content and results of scientific articles into structured, machine-readable representations using Semantic Web technologies. In this article, we focus on tabular data of scientific articles, which provide an organized and compressed representation of information. However, chart visualizations can additionally facilitate their comprehension. We present an approach that employs a human-in-the-loop paradigm during the data acquisition phase to define additional semantics for tabular data. The additional semantics guide the creation of chart visualizations for meaningful representations of tabular data. Our approach organizes tabular data into different information groups which are analyzed for the selection of suitable visualizations. The set of suitable visualizations serves as a user-driven selection of visual representations. Additionally, customization for visual representations provides the means for facilitating the understanding and sense-making of information

    Translation of Heterogeneous Databases into RDF, and Application to the Construction of a SKOS Taxonomical Reference

    While the data deluge accelerates, most of the data produced remains locked in deep Web databases. For the linked open data to benefit from the potential represented by this huge amount of data, it is crucial to come up with solutions to expose heterogeneous databases as linked data. The xR2RML mapping language is an endeavor towards this goal: it is designed to map various types of databases to RDF, by flexibly adapting to heterogeneous query languages and data models while remaining free from any specific language. It extends R2RML, the W3C recommendation for the mapping of relational databases to RDF, and relies on RML for the handling of various data formats. In this paper we present xR2RML, we analyse data models of several modern databases as well as the format in which query results are returned, and we show how xR2RML translates any result data element into RDF, relying on existing languages such as XPath and JSONPath when necessary. We illustrate some features of xR2RML such as the generation of RDF collections and containers, and the ability to deal with mixed data formats. We also describe a real-world use case in which we applied xR2RML to build a SKOS thesaurus aimed at supporting studies on History of Zoology, Archaeozoology and Conservation Biology.

    Translation of Relational and Non-Relational Databases into RDF with xR2RML

    With the growing amount of data being continuously produced, it is crucial to come up with solutions to expose data from ever more heterogeneous databases (e.g. NoSQL systems) as linked data. In this paper we present xR2RML, a language designed to describe the mapping of various types of databases to RDF. xR2RML flexibly adapts to heterogeneous query languages and data models while remaining free from any specific language or syntax. It extends R2RML, the W3C recommendation for the mapping of relational databases to RDF, and relies on RML for the handling of various data representation formats. We analyse data models of several modern databases as well as the format in which query results are returned, and we show that xR2RML can translate any data element within such results into RDF, relying on existing languages such as XPath and JSONPath if needed. We illustrate some features of xR2RML such as the generation of RDF collections and containers, and the ability to deal with mixed content.
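    Translating data elements of a query result into RDF can be pictured with a small sketch. It is not xR2RML syntax and does not use a JSONPath library: `select` is a hypothetical, deliberately tiny dotted-path evaluator standing in for JSONPath, and the subject IRI template and the `skos:altLabel` predicate are assumptions chosen only for illustration.

```python
import json


def select(document, path):
    """Evaluate a dotted path such as 'names.label' against parsed JSON.

    Lists met along the way are traversed element by element, so every
    matching value is returned. A stand-in for JSONPath, not JSONPath.
    """
    current = [document]
    for key in path.split("."):
        next_level = []
        for node in current:
            candidates = node if isinstance(node, list) else [node]
            for item in candidates:
                if isinstance(item, dict) and key in item:
                    next_level.append(item[key])
        current = next_level
    flat = []
    for value in current:
        flat.extend(value if isinstance(value, list) else [value])
    return flat


def to_triples(json_text, id_path, predicate_paths):
    """Produce one (subject, predicate, value) triple per selected element."""
    doc = json.loads(json_text)
    # Hypothetical IRI template; a real mapping would declare its own.
    subject = f"<http://example.org/taxon/{select(doc, id_path)[0]}>"
    return [
        (subject, predicate, value)
        for predicate, path in predicate_paths.items()
        for value in select(doc, path)
    ]


if __name__ == "__main__":
    result = '{"id": "t1", "names": [{"label": "Canis lupus"}, {"label": "wolf"}]}'
    for triple in to_triples(result, "id", {"skos:altLabel": "names.label"}):
        print(triple)
```

    Each element of a nested array yields its own triple here; grouping such values into an RDF collection or container, as the abstract mentions, would be a further step.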