24,358 research outputs found

    Querying Best Paths in Graph Databases

    Querying graph databases has recently received much attention. We propose a new approach to this problem, which balances the competing goals of expressive power, language clarity and computational complexity. A distinctive feature of our approach is the ability to express properties of minimal (e.g. shortest) and maximal (e.g. most valuable) paths satisfying given criteria. To express complex properties in a modular way, we introduce labelling-generating ontologies. The resulting formalism is computationally attractive: queries can be answered in non-deterministic logarithmic space in the size of the database.
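
As a rough illustration of what "shortest path satisfying given criteria" can mean, the sketch below runs a breadth-first search over pairs of (graph node, automaton state). It is not the formalism proposed in the paper: the criterion is assumed to be a regular constraint on edge labels given as a hypothetical DFA transition table, and the function name, graph, and labels are invented for the example.

```python
from collections import deque

def shortest_matching_path(edges, dfa, start_state, accepting, source, target):
    """Return a shortest node path from source to target whose edge labels
    are accepted by the DFA, or None if no such path exists.

    edges: iterable of (source, label, target) triples.
    dfa: dict mapping (state, label) to the next state (hypothetical encoding).
    """
    adjacency = {}
    for u, label, v in edges:
        adjacency.setdefault(u, []).append((label, v))

    start = (source, start_state)
    parent = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        node, state = queue.popleft()
        if node == target and state in accepting:
            # Walk parent links back to the start to rebuild the path.
            path = []
            current = (node, state)
            while current is not None:
                path.append(current[0])
                current = parent[current]
            return path[::-1]
        for label, next_node in adjacency.get(node, ()):
            next_state = dfa.get((state, label))
            if next_state is None:
                continue            # label not allowed in this state
            pair = (next_node, next_state)
            if pair not in parent:
                parent[pair] = (node, state)
                queue.append(pair)
    return None

# Criterion: any number of "road" edges followed by exactly one "rail" edge.
example_edges = [("a", "road", "d"), ("a", "road", "b"), ("b", "rail", "d")]
example_dfa = {(0, "road"): 0, (0, "rail"): 1}
print(shortest_matching_path(example_edges, example_dfa, 0, {1}, "a", "d"))
```

The direct edge from `a` to `d` is shorter but ends on a `road` label, so the search returns the two-edge path through `b`.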

    Context-Free Path Querying with Structural Representation of Result

    Graph data models and graph databases are widely used in areas such as bioinformatics, the semantic web, and social networks. One specific problem in this area is path querying with constraints formulated in terms of formal grammars. In this approach the query is written as a grammar, and path querying becomes graph parsing with respect to that grammar. Several solutions exist, but how to provide a structural representation of the query result that is practical for answer processing and debugging is still an open problem. In this paper we propose a graph parsing technique that builds such a representation with respect to a given grammar in polynomial time and space for an arbitrary context-free grammar and graph. The proposed algorithm is based on the generalized LL parsing algorithm, whereas previous solutions are based mostly on the CYK or Earley algorithms; this reduces time complexity in some cases.
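
To make "path querying as graph parsing" concrete, here is a minimal sketch of context-free reachability on an edge-labelled graph. It uses a plain worklist closure in the spirit of the CYK-style solutions the abstract mentions, not the generalized LL algorithm the paper proposes, and it returns only reachable node pairs rather than a structural representation of the result. The grammar encoding (Chomsky normal form split into two dictionaries) and all names are assumptions made for the example.

```python
from collections import deque

def cfpq(edges, terminal_rules, binary_rules):
    """Return all facts (nonterminal, source, target) derivable on the graph.

    edges: iterable of (source, label, target) triples.
    terminal_rules: dict mapping an edge label to nonterminals A with A -> label.
    binary_rules: dict mapping a pair (B, C) to nonterminals A with A -> B C.
    """
    facts = set()
    worklist = deque()

    def add(fact):
        if fact not in facts:
            facts.add(fact)
            worklist.append(fact)

    # Seed with facts produced directly by single edges.
    for source, label, target in edges:
        for head in terminal_rules.get(label, ()):
            add((head, source, target))

    # Combine each new fact with every known fact on either side.
    while worklist:
        nt, u, v = worklist.popleft()
        for other, x, y in list(facts):
            if x == v:   # nt spans u..v, other spans v..y
                for head in binary_rules.get((nt, other), ()):
                    add((head, u, y))
            if y == u:   # other spans x..u, nt spans u..v
                for head in binary_rules.get((other, nt), ()):
                    add((head, x, v))
    return facts

# Grammar for a^n b^n (n >= 1) in normal form:
#   S -> A B | A S1,  S1 -> S B,  A -> a,  B -> b
example_terminals = {"a": {"A"}, "b": {"B"}}
example_binaries = {("A", "B"): {"S"}, ("A", "S1"): {"S"}, ("S", "B"): {"S1"}}
example_graph = [(0, "a", 1), (1, "a", 2), (2, "b", 3), (3, "b", 4)]

s_pairs = {(u, v) for nt, u, v in cfpq(example_graph, example_terminals, example_binaries) if nt == "S"}
print(sorted(s_pairs))
```

On this chain graph the query reports the pairs (1, 3) and (0, 4), the endpoints of the paths labelled `ab` and `aabb`.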

    Querying Graph Databases.

    Real-life data can often be modeled as graphs, in which nodes represent objects and edges indicate their relationships. Large graph datasets are common in many emerging applications. To fully exploit the wealth of information encoded in graphs, systems for managing and analyzing graph data are critical. To address the need for complex analysis of graph data, this thesis presents a graph querying toolkit called Periscope/GQ. This toolkit is built on top of a commodity RDBMS. It provides a uniform schema for storing graphs and supports various graph query operations. Users can easily combine several operations to perform complex analysis on graphs. The key feature of Periscope/GQ is its support for sophisticated graph query operations beyond simple ones such as node/edge selection and path search. In particular, this thesis focuses on two classes of sophisticated queries: graph matching and graph summarization. The database community has largely focused on exact graph matching problems. However, due to the noisy and incomplete nature of real graph datasets, approximate rather than exact graph matching is required. This thesis presents a novel approximate graph matching technique called SAGA. SAGA employs a flexible graph similarity model and uses an index-based matching algorithm to evaluate matching queries efficiently. SAGA is effective and efficient for small query graphs (with tens of nodes and edges), but is expensive when applied to large query graphs (with hundreds to thousands of nodes and edges). To handle large query graphs, TALE is proposed. TALE employs a novel indexing technique that achieves high pruning power and scales linearly with database size. The matching algorithm uses the index to first match the important nodes in the query, and then extends them to produce large graph matches. Graph summarization techniques are useful for understanding the underlying characteristics of graphs. To summarize large graphs, this thesis introduces an aggregation method. This method produces summary graphs by grouping nodes based on user-selected node attributes and relationships. It further allows users to control the resolution of summaries, and provides "drill-down" and "roll-up" abilities to navigate through summaries with different resolutions.
    Ph.D., Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/61640/1/ytian_1.pd
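
The aggregation idea in the abstract above, grouping nodes by user-selected attributes, can be sketched as follows. This is a simplified illustration rather than the thesis's method: it groups by node attributes only and ignores the relationship-based grouping the abstract also mentions. The function name and the sample data are hypothetical.

```python
from collections import Counter

def summarize(nodes, edges, group_by):
    """Build a summary graph by grouping nodes on the selected attributes.

    nodes: dict mapping node id to a dict of attributes.
    edges: iterable of (source, target) pairs.
    group_by: tuple of attribute names that defines the groups.
    Returns (group_sizes, group_edges): node counts per group and
    edge counts per ordered pair of groups.
    """
    group_of = {
        node: tuple(attrs[name] for name in group_by)
        for node, attrs in nodes.items()
    }
    group_sizes = Counter(group_of.values())
    group_edges = Counter((group_of[s], group_of[t]) for s, t in edges)
    return group_sizes, group_edges

example_nodes = {
    1: {"dept": "CS", "role": "student"},
    2: {"dept": "CS", "role": "faculty"},
    3: {"dept": "EE", "role": "student"},
}
example_links = [(1, 2), (3, 2), (1, 3)]

# Coarse summary ("roll-up"): one group per department.
coarse_sizes, coarse_edges = summarize(example_nodes, example_links, ("dept",))
# Finer summary ("drill-down"): one group per department and role.
fine_sizes, fine_edges = summarize(example_nodes, example_links, ("dept", "role"))
print(dict(coarse_sizes), dict(coarse_edges))
```

Changing `group_by` is what moves between resolutions here: fewer attributes give a coarser summary, more attributes a finer one.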

    Representing and querying disease networks using graph databases

    BACKGROUND: Systems biology experiments generate large volumes of data of multiple modalities and this information presents a challenge for integration due to a mix of complexity together with rich semantics. Here, we describe how graph databases provide a powerful framework for storage, querying and envisioning of biological data. RESULTS: We show how graph databases are well suited for the representation of biological information, which is typically highly connected, semi-structured and unpredictable. We outline an application case that uses the Neo4j graph database for building and querying a prototype network to provide biological context to asthma related genes. CONCLUSIONS: Our study suggests that graph databases provide a flexible solution for the integration of multiple types of biological data and facilitate exploratory data mining to support hypothesis generation. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13040-016-0102-8) contains supplementary material, which is available to authorized users

    A New Life for SQL SELECT Statement

    A large share of information systems use databases, and in the majority of cases such systems are developed with object-oriented programming languages. From this point of view, the database querying features are a key aspect. The authors have observed a major gap between the querying features of persistence mechanisms and the requirements for developing truly object-oriented software applications. Consequently, the authors propose a new syntax for the SQL SELECT statement that allows client applications to retrieve object graphs. Keywords: object-oriented database, query, objects graph, SQL, syntax.

    A Researcher’s Digest of GQL

    GQL (Graph Query Language) is being developed as a new ISO standard for graph query languages to play the same role for graph databases as SQL plays for relational. In parallel, an extension of SQL for querying property graphs, SQL/PGQ, is added to the SQL standard; it shares the graph pattern matching functionality with GQL. Both standards (not yet published) are hard-to-understand specifications of hundreds of pages. The goal of this paper is to present a digest of the language that is easy for the research community to understand, and thus to initiate research on these future standards for querying graphs. The paper concentrates on pattern matching features shared by GQL and SQL/PGQ, as well as querying facilities of GQL.

    Distributed Graph Storage And Querying System

    Graph databases offer an efficient way to store and access interconnected data. However, to query large graphs that no longer fit in memory, it becomes necessary to make multiple trips to the storage device to filter and gather data based on the query. I/O accesses are expensive operations: they greatly slow down query response time and prevent us from fully exploiting the graph-specific benefits that graph databases offer. The storage models of most existing graph database systems view graphs as indivisible structures and hence do not allow a hierarchical layering of the graph. This adversely affects query performance for large graphs, as there is no way to filter the graph at a higher level without actually accessing the entire information from the disk. Distributing the storage and processing is one way to extract better performance. But current distributed solutions to this problem are not entirely effective, again due to the indivisible representation of graphs adopted in the storage format. This causes unnecessary latency due to increased inter-processor communication. In this dissertation, we propose an optimized distributed graph storage system for scalable and faster querying of big graph data. We start with our unique physical storage model, in which the graph is decomposed into three different levels of abstraction, each with a different storage hierarchy. We use a hybrid storage model to store the most critical component and restrict I/O trips to cases where they are absolutely necessary. This lets us actively make use of multi-level filters while querying, without the need for comprehensive indexes. Our results show that our system outperforms established graph databases for several classes of queries. We show that this separation also eases the difficulties in distributing graph data, and go on to propose a more efficient distributed model for querying general-purpose graph data using the Spark framework.