59 research outputs found

    Systematic Literature Review: Current Products, Topic, and Implementation of Graph Database

    Get PDF
    Planning, developing, and updating software cannot be separated from the role of the database. From various types of databases, graph databases are considered to have various advantages over their predecessor, relational databases. Graph databases then become the latest trend in the software and data science industry, apart from the development of graph theory itself. The proliferation of research on GDB in the last decade raises questions about what topics are associated with GDB, what industries use GDB in its data processing, what the GDB models are, and what types of GDB have been used most frequently by users in the last few years. This article aims to answer these questions through a Literature Review, which is carried out by determining objectives, determining the limits of review coverage, determining inclusion and exclusion criteria for data retrieval, data extraction, and quality assessment. Based on a review of 60 studies, several research topics related to GDB are Semantic Web, Big Data, and Parallel computing. A total of 19 (30%) studies used Neo4j as their database. Apart from Social Networks, the industries that implement GDB the most are the Transportation sector, Scientific Article Networks, and general sectors such as Enterprise Data, Biological data, and History data. This Literature Review concludes that research on the topic of the Graph Database is still developing in the future. This is shown by the breadth of application and the variety of new derivatives of GDB products offered by researchers to address existing problems

    Visões em bancos de dados de grafos : uma abordagem multifoco para dados heterogêneos

    Get PDF
    Orientador: Claudia Maria Bauzer MedeirosTese (doutorado) - Universidade Estadual de Campinas, Instituto de ComputaçãoResumo: A pesquisa científica tornou-se cada vez mais dependente de dados. Esse novo paradigma de pesquisa demanda técnicas e tecnologias computacionais sofisticadas para apoiar tanto o ciclo de vida dos dados científicos como a colaboração entre cientistas de diferentes áreas. Uma demanda recorrente em equipes multidisciplinares é a construção de múltiplas perspectivas sobre um mesmo conjunto de dados. Soluções atuais cobrem vários aspectos, desde o projeto de padrões de interoperabilidade ao uso de sistemas de gerenciamento de bancos de dados não-relacionais. Entretanto, nenhum desses esforços atende de forma adequada a necessidade de múltiplas perspectivas, denominadas focos nesta tese. Em termos gerais, um foco é projetado e construído para atender um determinado grupo de pesquisa (mesmo no escopo de um único projeto) que necessita manipular um subconjunto de dados de interesse em múltiplos níveis de agregação/generalização. A definição e criação de um foco são tarefas complexas que demandam mecanismos capazes de manipular múltiplas representações de um mesmo fenômeno do mundo real. O objetivo desta tese é prover múltiplos focos sobre dados heterogêneos. Para atingir esse objetivo, esta pesquisa se concentrou em quatro principais problemas. Os problemas inicialmente abordados foram: (1) escolher um paradigma de gerenciamento de dados adequado e (2) elencar os principais requisitos de pesquisas multifoco. Nossos resultados nos direcionaram para a adoção de bancos de dados de grafos como solução para o problema (1) e a utilização do conceito de visões, de bancos de dados relacionais, para o problema (2). Entretanto, não há consenso sobre um modelo de dados para bancos de dados de grafos e o conceito de visões é pouco explorado nesse contexto. Com isso, os demais problemas tratados por esta pesquisa são: (3) a especificação de um modelo de dados de grafos e (4) a definição de um framework para manipular visões em bancos de dados de grafos. Nossa pesquisa nesses quatro problemas resultaram nas contribuições principais desta tese: (i) apontar o uso de bancos de dados de grafos como camada de persistência em pesquisas multifoco - um tipo de banco de dados de esquema flexível e orientado a relacionamentos que provê uma ampla compreensão sobre as relações entre os dados; (ii) definir visões para bancos de dados de grafos como mecanismo para manipular múltiplos focos, considerando operações de manipulação de dados em grafos, travessias e algoritmos de grafos; (iii) propor um modelo de dados para grafos - baseado em grafos de propriedade - para lidar com a ausência de um modelo de dados pleno para grafos; (iv) especificar e implementar um framework, denominado Graph-Kaleidoscope, para prover o uso de visões em bancos de dados de grafos e (v) validar nosso framework com dados reais em aplicações distintas - em biodiversidade e em recursos naturais - dois típicos exemplos de pesquisas multidisciplinares que envolvem a análise de interações de fenômenos a partir de dados heterogêneosAbstract: Scientific research has become data-intensive and data-dependent. This new research paradigm requires sophisticated computer science techniques and technologies to support the life cycle of scientific data and collaboration among scientists from distinct areas. A major requirement is that researchers working in data-intensive interdisciplinary teams demand construction of multiple perspectives of the world, built over the same datasets. Present solutions cover a wide range of aspects, from the design of interoperability standards to the use of non-relational database management systems. None of these efforts, however, adequately meet the needs of multiple perspectives, which are called foci in the thesis. Basically, a focus is designed/built to cater to a research group (even within a single project) that needs to deal with a subset of data of interest, under multiple ggregation/generalization levels. The definition and creation of a focus are complex tasks that require mechanisms and engines to manipulate multiple representations of the same real world phenomenon. This PhD research aims to provide multiple foci over heterogeneous data. To meet this challenge, we deal with four research problems. The first two were (1) choosing an appropriate data management paradigm; and (2) eliciting multifocus requirements. Our work towards solving these problems made as choose graph databases to answer (1) and the concept of views in relational databases for (2). However, there is no consensual data model for graph databases and views are seldom discussed in this context. Thus, research problems (3) and (4) are: (3) specifying an adequate graph data model and (4) defining a framework to handle views on graph databases. Our research in these problems results in the main contributions of this thesis: (i) to present the case for the use of graph databases in multifocus research as persistence layer - a schemaless and relationship driven type of database that provides a full understanding of data connections; (ii) to define views for graph databases to support the need for multiple foci, considering graph data manipulation, graph algorithms and traversal tasks; (iii) to propose a property graph data model (PGDM) to fill the gap of absence of a full-fledged data model for graphs; (iv) to specify and implement a framework, named Graph-Kaleidoscope, that supports views over graph databases and (v) to validate our framework for real world applications in two domains - biodiversity and environmental resources - typical examples of multidisciplinary research that involve the analysis of interactions of phenomena using heterogeneous dataDoutoradoCiência da ComputaçãoDoutora em Ciência da Computaçã

    Systematic Literature Review: Current Products, Topic, and Implementation of Graph Database

    Get PDF
    Planning, developing, and updating software cannot be separated from the role of the database. From various types of databases, graph databases are considered to have various advantages over their predecessor, relational databases. Graph databases then become the latest trend in the software and data science industry, apart from the development of graph theory itself. The proliferation of research on GDB in the last decade raises questions about what topics are associated with GDB, what industries use GDB in its data processing, what the GDB models are, and what types of GDB have been used most frequently by users in the last few years. This article aims to answer these questions through a Literature Review, which is carried out by determining objectives, determining the limits of review coverage, determining inclusion and exclusion criteria for data retrieval, data extraction, and quality assessment. Based on a review of 60 studies, several research topics related to GDB are Semantic Web, Big Data, and Parallel computing. A total of 19 (30%) studies used Neo4j as their database. Apart from Social Networks, the industries that implement GDB the most are the Transportation sector, Scientific Article Networks, and general sectors such as Enterprise Data, Biological data, and History data. This Literature Review concludes that research on the topic of the Graph Database is still developing in the future. This is shown by the breadth of application and the variety of new derivatives of GDB products offered by researchers to address existing problems

    A SEMANTIC GRAPH DATABASE FOR BIM-GIS INTEGRATED INFORMATION MODEL FOR AN INTELLIGENT URBAN MOBILITY WEB APPLICATION

    Get PDF
    Over the recent years, the usage of semantic web technologies and Resources Description Framework (RDF) data models have been notably increased in many fields. Multiple systems are using RDF data to describe information resources and semantic associations. RDF data plays a very important role in advanced information retrieval, and graphs are efficient ways to visualize and represent real world data by providing solutions to many real-time scenarios that can be simulated and implemented using graph databases, and efficiently query graphs with multiple attributes representing different domains of knowledge. Given that graph databases are schema less with efficient storage for semi-structured data, they can provide fast and deep traversals instead of slow RDBMS SQL based joins allowing Atomicity, Consistency, Isolation and durability (ACID) transactions with rollback support, and by utilizing mathematics of graph they can enormous potential for fast data extraction and storage of information in the form of nodes and relationships. In this paper, we are presenting an architectural design with complete implementation of BIM-GIS integrated RDF graph database. The proposed integration approach is composed of four main phases: ontological BIM and GIS model’s construction, mapping and semantic integration using interoperable data formats, then an import into a graph database with querying and filtering capabilities. The workflows and transformations of IFC and CityGML schemas into object graph databases model are developed and applied to an intelligent urban mobility web application on a game engine platform validate the integration methodology

    Comparative analysis of PropertyFirst vs. EntityFirst modeling approaches in graph databases

    Get PDF
    While relational databases still hold the primary position in the database technology domain, and have been for the longest time of any Computer Science technology has since its inception, for the first time the relational databases now have valid and worthy opponent in the NoSQL database movement. NoSQL databases, even though not many people have heard of them, with a significant number of Computer Science people included, have spread rapidly in many shapes and forms and have done so in quite a chaotic fashion. Similarly to the way they appeared and spread, design and modeling for them have been undertaken in an unstructured manner. Currently they are subcategorized in 4 main groups as: Key-value stores, Column Family stores, Document stores and Graph databases. In this thesis, different modeling approaches for graph databases, applied to the same domain are analyzed and compared, especially from a design perspective. The database selected here as the implemented technology is Neo4J by Neo Technologies and is a directed property graph database, which means that relationships between its data entities must have a starting and ending (or source and destination) node. This research provides an overview of two competing modeling approaches and evaluates them in a context of a real world example. The work done here shows that both of these modeling approaches are valid and that it is possible to fully develop a data model based on the same domain data with both approaches and that both can be used later to support application access in a similar fashion. One of the models provides for faster access to data, but at a cost of higher maintenance and increased complexity
    corecore