272 research outputs found

A Survey on Graph Database Management Techniques for Huge Unstructured Data

Author: N. P. Kiran
N. S. Patil
P Kiran
Patel K. M. Naresh
Publication venue: Institute of Advanced Engineering and Science
Publication date: 01/04/2018

Over the last decade, data analysis, data management, and big data have played a major role from both social and business perspectives. The graph database is currently one of the most active research topics. A graph database is preferred for handling the dynamic and complex relationships in connected data, and it offers better results. Every data element is represented as a node. For example, on a social media site, a person is represented as a node with properties such as name, age, likes, and dislikes, and nodes are connected by relationships expressed as edges. Graph databases are expected to benefit businesses and social networking sites that generate huge volumes of unstructured data, since such Big Data requires proper and efficient computational techniques to handle. This paper reviews existing graph data computational techniques and related research in order to outline future research directions in graph database management
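
The node-and-edge model described in this abstract can be illustrated with a minimal in-memory sketch. It is not taken from the surveyed paper: the class names `Node` and `PropertyGraph`, the `FOLLOWS` label, and the sample people are all hypothetical, chosen only to mirror the social media example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A data element with free-form properties (illustrative names only)."""
    node_id: str
    properties: dict = field(default_factory=dict)


@dataclass
class PropertyGraph:
    """Minimal in-memory property graph: nodes plus labeled, directed edges."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (source_id, label, target_id)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = Node(node_id, properties)

    def add_edge(self, source_id, label, target_id):
        # Both endpoints must already exist as nodes.
        if source_id not in self.nodes or target_id not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append((source_id, label, target_id))

    def neighbors(self, source_id, label):
        """Return ids reachable from source_id over one edge with this label."""
        return [t for s, lbl, t in self.edges if s == source_id and lbl == label]


# Hypothetical social media data: people are nodes, relationships are edges.
graph = PropertyGraph()
graph.add_node("alice", name="Alice", age=30, likes=["hiking"])
graph.add_node("bob", name="Bob", age=27, dislikes=["spam"])
graph.add_edge("alice", "FOLLOWS", "bob")
followed_by_alice = graph.neighbors("alice", "FOLLOWS")
```

A production graph database would index edges by endpoint instead of scanning a list; the linear scan here only keeps the sketch short.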

IAES journal

Crossref

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Institute of Advanced Engineering and Science

An introduction to Graph Data Management

Author: A Dries
A Gutiérrez
A Iosup
A Morari
A Poulovassilis
AD Zhu
AO Mendelzon
B Amann
B Elser
C Berge
C Vicknair
C Watters
C Weiss
CS Chang
D Conte
D Dominguez-Sal
D Theodoratos
DC Faye
DW Shipman
EF Codd
FW Tompa
G Malewicz
GM Kuper
H He
HS Kunii
IF Cruz
J Hidders
J Paredaens
J Peckham
J. Hidders
Jonathan Hayes
K Zeng
L Kowalik
L Zou
M Atre
M Ciglan
M Consens
M Gemis
M Gyssens
M Han
M Levene
M Mainguenaud
M Schmidt
M Yannakakis
MA Bornea
MA Rodriguez
Marc Andries
MP Consens
N Kiesel
N Roussopoulos
O Erling
P Barceló Baeza
P Buneman
P Yuan
Philippe Cudré-Mauroux
PPS Chen
PT Wood
R Agrawal
R Angles
R Brijder
R Ronen
RH Güting
RS Xin
S Abiteboul
T Neumann
W Fan
W Kim
Y Guo
Y Low
Y Papakonstantinou
Y Tian
Y Zhao
YA Liu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/12/2017

A graph database is a database where the data structures for the schema and/or instances are modeled as a (labeled)(directed) graph or generalizations of it, and where querying is expressed by graph-oriented operations and type constructors. In this article we present the basic notions of graph databases, give an historical overview of its main development, and study the main current systems that implement them

arXiv.org e-Print Archive

Crossref

Comparison of Graph Databases and Relational Databases When Handling Large-Scale Social Data

Author: Chen Yaowen 1989-
Publication venue: 'University of Saskatchewan Library'
Publication date: 20/09/2016

Over the past few years, with the rapid development of mobile technology, more people have come to use mobile social applications, such as Facebook, Twitter and Weibo, in their daily lives, and the amount of social data keeps increasing. Finding a suitable approach to storing and processing social data, especially large-scale social data, is therefore important for social network companies. Traditionally, relational databases, which represent data in terms of tables, have been widely used in legacy applications. However, graph databases, which are a kind of NoSQL database, are developing rapidly to handle the growing amount of unstructured or semi-structured data. The two storage approaches each have their own advantages. For example, a relational database is a more mature storage approach, while a graph database can handle graph-like data more easily. This research compares the capabilities of relational databases and graph databases for storing and processing large-scale social data. Two kinds of analysis are used to evaluate their performance: a quantitative analysis of storage cost and execution time, and a qualitative analysis of five criteria, namely maturity, ease of programming, flexibility, security and data visualization. A simple mobile social application is also developed for the experiments. The comparison is used to determine which kind of database is more suitable for handling large-scale social data, and future research can extend it to more graph database models with real-world social data sets

eCommons@USASK

University of Saskatchewan Research Archive

Data management in cloud environments: NoSQL and NewSQL data stores

Author: Capretz Miriam AM
Grolinger Katarina
Higashino Wilson A
Tiwari Abhinav
Publication venue: Scholarship@Western
Publication date: 01/01/2013

Advances in Web technology and the proliferation of mobile devices and sensors connected to the Internet have resulted in immense processing and storage requirements. Cloud computing has emerged as a paradigm that promises to meet these requirements. This work focuses on the storage aspect of cloud computing, specifically on data management in cloud environments. Traditional relational databases were designed in a different hardware and software era and are facing challenges in meeting the performance and scale requirements of Big Data. NoSQL and NewSQL data stores present themselves as alternatives that can handle huge volumes of data. Because of the large number and diversity of existing NoSQL and NewSQL solutions, it is difficult to comprehend the domain and even more challenging to choose an appropriate solution for a specific task. Therefore, this paper reviews NoSQL and NewSQL solutions with the objective of: (1) providing a perspective in the field, (2) providing guidance to practitioners and researchers to choose the appropriate data store, and (3) identifying challenges and opportunities in the field. Specifically, the most prominent solutions are compared focusing on data models, querying, scaling, and security related capabilities. Features driving the ability to scale read requests and write requests, or scaling data storage are investigated, in particular partitioning, replication, consistency, and concurrency control. Furthermore, use cases and scenarios in which NoSQL and NewSQL data stores have been used are discussed and the suitability of various solutions for different sets of applications is examined. Consequently, this study has identified challenges in the field, including the immense diversity and inconsistency of terminologies, limited documentation, sparse comparison and benchmarking criteria, and nonexistence of standardized query languages

Scholarship@Western

Springer - Publisher Connector

The Future is Big Graphs! A Community View on Graph Processing Systems

Author: Ammar Khaled
Angles Renzo
Aref Walid
Arenas Marcelo
Besta Maciej
Boncz Peter A.
Bonifati Angela
Daudjee Khuzaima
Della Valle Emanuele
Dumbrava Stefania
Hartig Olaf
Haslhofer Bernhard
Hegeman Tim
Hidders Jan
Hose Katja
Iamnitchi Adriana
Iosup Alexandru
Kalavri Vasiliki
Kapp Hugo
Martens Wim
Peukert Eric
Plantikow Stefan
Ragab Mohamed
Ripeanu Matei R.
Sakr Sherif
Salihoglu Semih
Schulz Christian
Selmer Petra
Sequeda Juan F.
Shinavier Joshua
Szárnyas Gábor
Tommasini Riccardo
Tumeo Antonino
Uta Alexandru
Varbanescu Ana Lucia
Voigt Hannes
Wu Hsiang-Yun
Yakovets Nikolay
Yan Da
Yoneki Eiko
Özsu M. Tamer
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/11/2020

Graphs are by nature unifying abstractions that can leverage interconnectedness to represent, explore, predict, and explain real- and digital-world phenomena. Although real users and consumers of graph instances and graph workloads understand these abstractions, future problems will require new abstractions and systems. What needs to happen in the next decade for big graph processing to continue to succeed?
Comment: 12 pages, 3 figures, collaboration between the large-scale systems and data management communities, work started at the Dagstuhl Seminar 19491 on Big Graph Processing Systems, to be published in the Communications of the AC

arXiv.org e-Print Archive

Repository for Publications and Research Data

Archivio istituzionale della ricerca - Politecnico di Milano

VU Research Portal

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL

VBN

Birkbeck Institutional Research Online

Hal-Diderot

Graphical Database Architecture For Clinical Trials

Author: Baset Aiman G.
Publication venue: Aggie Digital Collections and Scholarship
Publication date: 01/01/2015

The general area of the research is Health Informatics. The research focuses on creating an innovative and novel solution to manage and analyze clinical trials data. It constructs a Graphical Database Architecture (GDA) for Clinical Trials (CT) using New Technology for Java (Neo4j) as a robust, scalable and high-performance database. The purpose of the research project is to develop concepts and techniques based on this architecture to accelerate clinical data navigation at lower cost. The research design uses a positivist approach to empirical research. The research is significant because it proposes a new approach to clinical trials through graph theory and designs a responsive structure for clinical data that can be deployed across the health informatics landscape. It contributes to the scholarly literature on Not only SQL (NoSQL) graph databases, mainly Neo4j in CT, for future research in clinical informatics. A prototype is created and examined to validate the concepts, taking advantage of Neo4j's high availability, scalability, and powerful graph query language (Cypher). This research study finds that integrating search methodologies and information retrieval with the graphical database provides a solid starting point to manage, query, and analyze clinical trials data. Furthermore, the design and development of a prototype demonstrate the conceptual model of this study. Likewise, the proposed clinical trials ontology (CTO) incorporates all data elements of a standard clinical study, which facilitates a heuristic overview of the treatments, interventions, and outcome results of these studies

North Carolina Agricultural and Technical State University: NC A&T SU Bluford Library's Aggie Digital Collections and Scholarship

GLL-based Context-Free Path Querying for Neo4j

Author: Abzalov Vadim
Grigorev Semyon
Kutuev Vladimir
Pogozhelskaya Vlada
Publication date: 19/12/2023

We propose a GLL-based context-free path querying algorithm that handles queries in Extended Backus-Naur Form (EBNF) using Recursive State Machines (RSM). Using EBNF allows one to combine traditional regular expressions and mutually recursive patterns in constraints natively. The proposed algorithm solves both the reachability-only and the all-paths problems for the all-pairs and the multiple-sources cases. The evaluation on real-world graphs demonstrates that using RSMs increases the performance of query evaluation. Implemented as a stored procedure for Neo4j, our solution demonstrates better performance than a similar solution for RedisGraph. The performance of our solution on regular path queries is comparable with that of the native Neo4j solution, and in some cases our solution requires significantly less memory
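
To make the problem statement concrete, the sketch below computes all-pairs, reachability-only answers to a context-free path query. It is not the GLL/RSM algorithm of the paper: it is a much simpler worklist closure in the style of Hellings' algorithm, and it assumes the grammar is given in a binary normal form rather than EBNF. The function name, the rule encoding, and the sample graph are hypothetical.

```python
from collections import deque


def cfl_reachability(edges, terminal_rules, binary_rules):
    """Return all (nonterminal, source, target) facts derivable on the graph.

    edges: iterable of (source, label, target)
    terminal_rules: dict label -> set of nonterminals A with rule A -> label
    binary_rules: dict (B, C) -> set of nonterminals A with rule A -> B C
    """
    facts = set()
    worklist = deque()

    def add(fact):
        if fact not in facts:
            facts.add(fact)
            worklist.append(fact)

    # Seed with one fact per edge and matching terminal rule.
    for source, label, target in edges:
        for head in terminal_rules.get(label, ()):
            add((head, source, target))

    # Close under binary rules: join each new fact with every known fact.
    while worklist:
        nt, u, v = worklist.popleft()
        for other_nt, x, y in list(facts):
            if x == v:  # (nt, u, v) followed by (other_nt, v, y)
                for head in binary_rules.get((nt, other_nt), ()):
                    add((head, u, y))
            if y == u:  # (other_nt, x, u) followed by (nt, u, v)
                for head in binary_rules.get((other_nt, nt), ()):
                    add((head, x, v))
    return facts


# Query: paths labeled a^n b^n (n >= 1), via S -> A B | A S1, S1 -> S B.
edges = [(0, "a", 1), (1, "a", 2), (2, "b", 3), (3, "b", 4)]
terminal_rules = {"a": {"A"}, "b": {"B"}}
binary_rules = {("A", "B"): {"S"}, ("A", "S1"): {"S"}, ("S", "B"): {"S1"}}
facts = cfl_reachability(edges, terminal_rules, binary_rules)
s_pairs = {(u, v) for nt, u, v in facts if nt == "S"}
```

On this chain graph, `s_pairs` contains the vertex pairs connected by a path whose labels form a balanced run of `a` followed by `b`, a pattern that a regular path query cannot express.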

arXiv.org e-Print Archive

Multi-threaded execution of Cypher queries

Author: Mellbin Ragnar
Åkerlund Felix
Publication venue: Lunds universitet/Institutionen för datavetenskap
Publication date: 01/01/2017

In this report we investigate parallel execution of queries in graph databases. We analyse different methods of parallelization, how to introduce query parallelization to a graph database, which query operations are suitable for parallelization, and whether we can improve the execution time of a single query. We do this by designing and implementing a parallel runtime for the Cypher query language in the graph database Neo4j, but many of the design ideas and operators investigated are applicable to any graph database. We focus on increasing performance for a select few operators, while still being fully integrated with Neo4j. We take much inspiration from a design called morsel-driven parallelism. This means that we strive to split the workload into many small pieces, “morsels”, and then hand these morsels to the threads executing the query. This is in contrast to a more classical parallelization approach, where you split the workload into a few big parts of equal size. We conclude that the operators best suited for parallelization are the operators that can be split into several smaller parts, where each part can be computed independently. We successfully introduce parallel execution of Cypher queries to Neo4j and by doing so we increase the performance of a single query by up to 15 times under certain conditions
Graph databases are becoming increasingly common, while the number of processors in modern computers keeps growing. In this work we look at how parallelized querying can lead to performance gains in the popular graph database management system Neo4j. To find out whether a single query in a graph database can be parallelized, and how much impact this then has on response times, we created our own modified version of Neo4j. We began by identifying which parts of the software were best suited to parallelization, taking into account how often they occurred in queries and how heavy a load they placed on the processor. After selecting a number of these, we went on to develop methods for splitting them into smaller tasks that could run in different parts of the processor simultaneously, and finally introduced these changes into Neo4j. The result is a version of Neo4j that, under the right conditions, answers individual queries up to 15 times faster
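
The morsel idea described in this abstract, many small work pieces pulled by threads on demand, can be sketched as follows. This is not the Neo4j runtime from the report: the function `run_in_morsels`, its parameters, and the toy operator are hypothetical, and the sketch shows only the scheduling pattern.

```python
import queue
import threading


def run_in_morsels(items, operator, morsel_size=4, num_threads=3):
    """Apply `operator` to every item by handing small morsels to threads."""
    # Split the input into many small, independent pieces ("morsels").
    morsels = queue.Queue()
    for start in range(0, len(items), morsel_size):
        morsels.put((start, items[start:start + morsel_size]))

    results = [None] * len(items)

    def worker():
        # Each thread keeps pulling the next morsel until none remain,
        # so a thread that finishes early simply takes on more work.
        while True:
            try:
                start, morsel = morsels.get_nowait()
            except queue.Empty:
                return
            for offset, item in enumerate(morsel):
                results[start + offset] = operator(item)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


doubled = run_in_morsels(list(range(10)), lambda x: x * 2)
```

Because each morsel writes to its own slice of `results`, the workers need no locking beyond the queue itself. In standard CPython builds the global interpreter lock prevents CPU-bound Python code from running in parallel across threads, so this sketch demonstrates the work distribution rather than a speedup.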

EpiGeNet : A graph database of interdependencies between genetic and epigenetic events in colorectal cancer

Author: Auffray C.
Balaur I.
Barat A.
Lysenko A.
Mazein A.
Rawlings C. J.
Ruskin H. J.
Saqi M.
Publication venue: 'Mary Ann Liebert Inc'
Publication date: 01/01/2017

The development of colorectal cancer (CRC)—the third most common cancer type—has been associated with deregulations of cellular mechanisms stimulated by both genetic and epigenetic events. StatEpigen is a manually curated and annotated database, containing information on interdependencies between genetic and epigenetic signals, and specialized currently for CRC research. Although StatEpigen provides a well-developed graphical user interface for information retrieval, advanced queries involving associations between multiple concepts can benefit from more detailed graph representation of the integrated data. This can be achieved by using a graph database (NoSQL) approach. Data were extracted from StatEpigen and imported to our newly developed EpiGeNet, a graph database for storage and querying of conditional relationships between molecular (genetic and epigenetic) events observed at different stages of colorectal oncogenesis. We illustrate the enhanced capability of EpiGeNet for exploration of different queries related to colorectal tumor progression; specifically, we demonstrate the query process for (i) stage-specific molecular events, (ii) most frequently observed genetic and epigenetic interdependencies in colon adenoma, and (iii) paths connecting key genes reported in CRC and associated events. The EpiGeNet framework offers improved capability for management and visualization of data on molecular events specific to CRC initiation and progression

Crossref

Rothamsted Repository