14,095 research outputs found

    Reconciliation of RDF* and Property Graphs

    Full text link
    Both the notion of Property Graphs (PG) and the Resource Description Framework (RDF) are commonly used models for representing graph-shaped data. While there exist some system-specific solutions to convert data from one model to the other, these solutions are not entirely compatible with one another and none of them appears to be based on a formal foundation. In fact, for the PG model, there does not even exist a commonly agreed-upon formal definition. The aim of this document is to reconcile both models formally. To this end, the document proposes a formalization of the PG model and introduces well-defined transformations between PGs and RDF. As a result, the document provides a basis for the following two innovations: On one hand, by implementing the RDF-to-PG transformations defined in this document, PG-based systems can enable their users to load RDF data and make it accessible in a compatible, system-independent manner using, e.g., the graph traversal language Gremlin or the declarative graph query language Cypher. On the other hand, the PG-to-RDF transformation in this document enables RDF data management systems to support compatible, system-independent queries over the content of Property Graphs by using the standard RDF query language SPARQL. Additionally, this document represents a foundation for systematic research on relationships between the two models and between their query languages.Comment: slightly changed the definition of PGs and added the notion of property uniquenes

    Fintan - Flexible, Integrated Transformation and Annotation eNgineering

    Get PDF
    We introduce the Flexible and Integrated Transformation and Annotation eNgeneering (Fintan) platform for converting heterogeneous linguistic resources to RDF. With its modular architecture, workflow management and visualization features, Fintan facilitates the development of complex transformation pipelines by integrating generic RDF converters and augmenting them with extended graph processing capabilities: Existing converters can be easily deployed to the system by means of an ontological data structure which renders their properties and the dependencies between transformation steps. Development of subsequent graph transformation steps for resource transformation, annotation engineering or entity linking is further facilitated by a novel visual rendering of SPARQL queries. A graphical workflow manager allows to easily manage the converter modules and combine them to new transformation pipelines. Employing the stream-based graph processing approach first implemented with CoNLL-RDF, we address common challenges and scalability issues when transforming resources and showcase the performance of Fintan by means of a purely graph-based transformation of the Universal Morphology data to RDF

    Towards Efficient Path Query on Social Network with Hybrid RDF Management

    Full text link
    The scalability and exibility of Resource Description Framework(RDF) model make it ideally suited for representing online social networks(OSN). One basic operation in OSN is to find chains of relations,such as k-Hop friends. Property path query in SPARQL can express this type of operation, but its implementation suffers from performance problem considering the ever growing data size and complexity of OSN.In this paper, we present a main memory/disk based hybrid RDF data management framework for efficient property path query. In this hybrid framework, we realize an efficient in-memory algebra operator for property path query using graph traversal, and estimate the cost of this operator to cooperate with existing cost-based optimization. Experiments on benchmark and real dataset demonstrated that our approach can achieve a good tradeoff between data load expense and online query performance

    Formalizing Gremlin pattern matching traversals in an integrated graph Algebra

    Get PDF
    Graph data management (also called NoSQL) has revealed beneficial characteristics in terms of flexibility and scalability by differ-ently balancing between query expressivity and schema flexibility. This peculiar advantage has resulted into an unforeseen race of developing new task-specific graph systems, query languages and data models, such as property graphs, key-value, wide column, resource description framework (RDF), etc. Present-day graph query languages are focused towards flex-ible graph pattern matching (aka sub-graph matching), whereas graph computing frameworks aim towards providing fast parallel (distributed) execution of instructions. The consequence of this rapid growth in the variety of graph-based data management systems has resulted in a lack of standardization. Gremlin, a graph traversal language, and machine provide a common platform for supporting any graph computing sys-tem (such as an OLTP graph database or OLAP graph processors). In this extended report, we present a formalization of graph pattern match-ing for Gremlin queries. We also study, discuss and consolidate various existing graph algebra operators into an integrated graph algebra

    Graph-based data management system for efficient information storage, retrieval and processing

    Get PDF
    Data management systems rely on a correct design of data representation and software components. The data representation scheme plays a vital role in how the data are stored, which influences the efficiency of its processing and retrieval. The system components design realizes software engineering concepts to enable performance metrics such as scalability, efficiency, flexibility, maintainability, and extendibility. This paper presents a data management system that uses a graph-based data representation scheme to achieve an efficient data retrieval when using graph-based databases. Input data are transformed into vertices, edges, and labels while inserting them into the database. The proposed system consists of three layers which are: system beans layer, data access layer, and the database engine. Healthcare data are used to evaluate the system in comparison with resource description framework (RDF) semantics. Extensive experiments are conducted to compare different scenarios of data storage and retrieval using Neo4J, OrientDB, and RDF4J. Experimental results show that the performance of the proposed graph-based approach outperforms RDF4J framework in terms of insertion and retrieval time
    corecore