23 research outputs found
The HyperBagGraph DataEdron: An Enriched Browsing Experience of Multimedia Datasets
Traditional verbatim browsers give back information in a linear way according
to a ranking performed by a search engine that may not be optimal for the
surfer. The latter may need to assess the pertinence of the information
retrieved, particularly when she wants to explore other facets of a
multi-faceted information space. For instance, in a multimedia dataset
different facets such as keywords, authors, publication category, organisations
and figures can be of interest. Visualising the facets simultaneously can help
to gain insights into the information retrieved and prompt further searches.
Facets are co-occurrence networks, modeled by HyperBag-Graphs -- families of
multisets -- and are in fact linked not only to the publication itself, but to
any chosen reference. These references make it possible to navigate inside the
dataset and perform visual queries. We explore here the case of scientific
publications based on arXiv searches.
Comment: Extension of the hypergraph framework briefly presented in
arXiv:1809.00164 (possible small overlaps); uses the theoretical framework of
hb-graphs presented in arXiv:1809.0019
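As a rough illustration of the "families of multisets" idea in the abstract above, the following sketch builds one multiset of facet values per reference and aggregates multiplicities across them. The record layout, the function names `build_hb_edges` and `total_multiplicity`, and the sample data are all hypothetical; this is not the authors' implementation.

```python
from collections import Counter

def build_hb_edges(records, facet):
    """Map each reference id to a multiset (Counter) of its facet values.

    Hypothetical layout: records = {ref_id: {facet_name: [values, ...]}}.
    """
    return {ref: Counter(fields.get(facet, [])) for ref, fields in records.items()}

def total_multiplicity(hb_edges):
    """Sum the multiplicity of every facet value over all multisets."""
    total = Counter()
    for edge in hb_edges.values():
        total.update(edge)
    return total

# Made-up sample data: two publications used as references, keywords as the facet.
records = {
    "paper-1": {"keywords": ["graph", "hypergraph", "graph"]},
    "paper-2": {"keywords": ["graph", "visualisation"]},
}
edges = build_hb_edges(records, "keywords")
totals = total_multiplicity(edges)
```

Using a multiset rather than a plain set keeps repeated occurrences of a value within one reference, which a set-based hypergraph edge would lose.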
Gunrock: A High-Performance Graph Processing Library on the GPU
For large-scale graph analytics on the GPU, the irregularity of data access
and control flow, and the complexity of programming GPUs have been two
significant challenges for developing a programmable high-performance graph
library. "Gunrock", our graph-processing system designed specifically for the
GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on
operations on a vertex or edge frontier. Gunrock achieves a balance between
performance and expressiveness by coupling high-performance GPU computing
primitives and optimization strategies with a high-level programming model that
allows programmers to quickly develop new graph primitives with small code size
and minimal GPU programming knowledge. We evaluate Gunrock on five key graph
primitives and show that Gunrock has on average at least an order of magnitude
speedup over Boost and PowerGraph, comparable performance to the fastest GPU
hardwired primitives, and better performance than any other GPU high-level
graph library.
Comment: 14 pages, accepted by PPoPP'16 (removed the text repetition in the
previous version v5)
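To make the frontier-centric, bulk-synchronous idea in the abstract above more concrete, here is a minimal sequential sketch of breadth-first search written as repeated whole-frontier steps. It is plain Python and does not use Gunrock or a GPU; the function name `bfs_frontier` and the step labels "expand" and "filter" are chosen for illustration only.

```python
def bfs_frontier(adj, source):
    """Breadth-first search expressed as operations on a vertex frontier.

    adj maps each vertex to a list of its out-neighbours.
    Returns a dict of vertex -> depth for every reachable vertex.
    """
    depth = {source: 0}
    frontier = [source]
    level = 0
    while frontier:
        level += 1
        # Expand step: gather every neighbour of the current frontier at once.
        candidates = [v for u in frontier for v in adj.get(u, [])]
        # Filter step: keep only vertices not seen before (also removes duplicates).
        next_frontier = []
        for v in candidates:
            if v not in depth:
                depth[v] = level
                next_frontier.append(v)
        # The whole frontier is replaced before the next iteration starts.
        frontier = next_frontier
    return depth

# Small made-up graph for illustration.
example_adj = {0: [1, 2], 1: [3], 2: [3], 3: []}
example_depth = bfs_frontier(example_adj, 0)
```

Each loop iteration processes the entire frontier before moving on, which is the property that lets a parallel system distribute the expand and filter work across many threads.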
Performance Comparison Analysis of ArangoDB, MySQL, and Neo4j: An Experimental Study of Querying Connected Data
Choosing and developing performant database solutions helps organizations
optimize their operational practices and decision-making. Since graph data is
becoming more common, it is crucial to develop and use such solutions for big
data with complex relationships, with high and consistent performance. However,
legacy database technologies such as MySQL are tailored to relational data
and need more complex queries to retrieve graph data. Previous
research has dealt with performance aspects such as CPU and memory usage. In
contrast, studies of energy usage and server temperature are lacking. Thus, this
paper evaluates and compares state-of-the-art graph and relational databases
on these performance aspects to allow a more informed selection of
technologies. Graph-based big data applications benefit from an informed
selection of database technologies for data retrieval and analytics problems.
The results show that Neo4j performs faster in querying connected data than
MySQL and ArangoDB, and energy, CPU, and memory usage performances are reported
in this paper.
Comment: https://hdl.handle.net/10125/10731
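The remark above that relational engines need more complex queries for connected data can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module as a stand-in (SQLite, not MySQL, and not the paper's benchmark setup); the `edges` table, the function name `reachable`, and the sample data are assumptions made for illustration.

```python
import sqlite3

def reachable(conn, start, max_hops):
    """Return {node: fewest hops} for nodes within max_hops of start.

    Multi-hop traversal over an edge table needs a recursive query with a
    self-join, which is the kind of extra complexity the abstract refers to.
    """
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node, hops) AS (
            SELECT ?, 0
            UNION
            SELECT e.dst, r.hops + 1
            FROM edges AS e JOIN reach AS r ON e.src = r.node
            WHERE r.hops < ?
        )
        SELECT node, MIN(hops) FROM reach GROUP BY node
        """,
        (start, max_hops),
    ).fetchall()
    return dict(rows)

# Made-up sample data: a simple chain a -> b -> c -> d.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT)")
conn.executemany("INSERT INTO edges VALUES (?, ?)",
                 [("a", "b"), ("b", "c"), ("c", "d")])
within_two = reachable(conn, "a", 2)
```

The hop limit keeps the recursion finite even when the edge table contains cycles.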
Computational Modelling for Bankruptcy Prediction: Semantic data Analysis Integrating Graph Database and Financial Ontology
In this paper, we propose a novel intelligent methodology to construct a Bankruptcy Prediction Computation Model, which aims to analyse a company's financial status accurately. Based on semantic data analysis and management, our methodology takes a Semantic Database System as the core of the system. It comprises three layers: an Ontology of Bankruptcy Prediction, a Semantic Search Engine, and a Semantic Analysis Graph Database system.
The Ontological layer defines the basic concepts of the financial risk management as well as the objects that serve as sources of knowledge for predicting a company's bankruptcy. The Graph Database layer utilises a powerful semantic data technology, which serves as a semantic data repository for our model.
The article provides a detailed description of the construction of the Ontology and its informal conceptual representation. We also present a working prototype of the Graph Database system, constructed using the Neo4j application, and show the connection between well-known financial ratios.
We argue that this methodology, which utilises state-of-the-art semantic data management mechanisms, enables data processing and relevant computations in a more efficient way than approaches using a traditional relational database. This gives us solid grounds to build a system that is capable of tackling data of any complexity level.
Systematic Literature Review: Current Products, Topic, and Implementation of Graph Database
Planning, developing, and updating software cannot be separated from the role of the database. Among the various types of databases, graph databases (GDB) are considered to have various advantages over their predecessor, relational databases. Graph databases have thus become the latest trend in the software and data science industry, apart from the development of graph theory itself. The proliferation of research on GDB in the last decade raises questions about what topics are associated with GDB, which industries use GDB in their data processing, what the GDB models are, and which types of GDB have been used most frequently by users in the last few years. This article aims to answer these questions through a Literature Review, which is carried out by determining objectives, determining the limits of review coverage, determining inclusion and exclusion criteria for data retrieval, data extraction, and quality assessment. Based on a review of 60 studies, several research topics related to GDB are Semantic Web, Big Data, and Parallel computing. A total of 19 (30%) studies used Neo4j as their database. Apart from Social Networks, the industries that implement GDB the most are the Transportation sector, Scientific Article Networks, and general sectors such as Enterprise Data, Biological data, and History data. This Literature Review concludes that research on the topic of the Graph Database is still developing and will continue to do so. This is shown by the breadth of application and the variety of new derivatives of GDB products offered by researchers to address existing problems.
Evaluation and Benchmark for Graph-Oriented Databases
Graph-oriented databases (GDB) have gained popularity in big data analysis because they provide better performance than a relational database in scenarios where high connectivity between data becomes the main component for interpreting data and obtaining information from it for decision-making. This degree project aims to develop a methodology for fairly comparing and evaluating different engines that specialise in graph management. First, studies related to this topic are analysed; they serve as support for our research and help identify the points to improve in order to create an evaluation process and benchmark graph-oriented databases, both those that are popular because of how long they have been developed and used in industry and academia, and others that have emerged recently to address the limitations in the market. The main component of the work is to create the steps required to run tests with different types of datasets and algorithms on a group of GDB. Subsequently, a case study is carried out in which a homogeneous validation environment is defined for every system, with hardware and software specifications under which all GDB can run without restrictions. Likewise, the aspects to evaluate are defined, which include, but are not limited to, the capabilities of the query language they provide, integration with other platforms or systems, and the support each one provides for running graph algorithms. The selection of the data to be loaded into the different platforms must ensure that the data is in a format each GDB supports and, if it is not, analyse whether it can be converted for use in the database.
Finally, the graph databases GraphDB, JanusGraph, Neo4j, and TigerGraph are taken as test cases. The case study uses the methodology developed in this work to evaluate the databases.
ITESO, A. C.
Consejo Nacional de Ciencia y Tecnologí