23 research outputs found

    The HyperBagGraph DataEdron: An Enriched Browsing Experience of Multimedia Datasets

    Full text link
    Traditional verbatim browsers give back information in a linear way according to a ranking performed by a search engine that may not be optimal for the surfer. The latter may need to assess the pertinence of the information retrieved, particularly when s\cdothe wants to explore other facets of a multi-facetted information space. For instance, in a multimedia dataset different facets such as keywords, authors, publication category, organisations and figures can be of interest. The facet simultaneous visualisation can help to gain insights on the information retrieved and call for further searches. Facets are co-occurence networks, modeled by HyperBag-Graphs -- families of multisets -- and are in fact linked not only to the publication itself, but to any chosen reference. These references allow to navigate inside the dataset and perform visual queries. We explore here the case of scientific publications based on Arxiv searches.Comment: Extension of the hypergraph framework shortly presented in arXiv:1809.00164 (possible small overlaps); use the theoretical framework of hb-graphs presented in arXiv:1809.0019

    Gunrock: A High-Performance Graph Processing Library on the GPU

    Full text link
    For large-scale graph analytics on the GPU, the irregularity of data access and control flow, and the complexity of programming GPUs have been two significant challenges for developing a programmable high-performance graph library. "Gunrock", our graph-processing system designed specifically for the GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on operations on a vertex or edge frontier. Gunrock achieves a balance between performance and expressiveness by coupling high performance GPU computing primitives and optimization strategies with a high-level programming model that allows programmers to quickly develop new graph primitives with small code size and minimal GPU programming knowledge. We evaluate Gunrock on five key graph primitives and show that Gunrock has on average at least an order of magnitude speedup over Boost and PowerGraph, comparable performance to the fastest GPU hardwired primitives, and better performance than any other GPU high-level graph library.Comment: 14 pages, accepted by PPoPP'16 (removed the text repetition in the previous version v5

    Performance Comparison Analysis of ArangoDB, MySQL, and Neo4j: An Experimental Study of Querying Connected Data

    Full text link
    Choosing and developing performant database solutions helps organizations optimize their operational practices and decision-making. Since graph data is becoming more common, it is crucial to develop and use them in big data with complex relationships with high and consistent performance. However, legacy database technologies such as MySQL are tailored to store relational databases and need to perform more complex queries to retrieve graph data. Previous research has dealt with performance aspects such as CPU and memory usage. In contrast, energy usage and temperature of the servers are lacking. Thus, this paper evaluates and compares state-of-the-art graphs and relational databases from the performance aspects to allow a more informed selection of technologies. Graph-based big data applications benefit from informed selection database technologies for data retrieval and analytics problems. The results show that Neo4j performs faster in querying connected data than MySQL and ArangoDB, and energy, CPU, and memory usage performances are reported in this paper.Comment: https://hdl.handle.net/10125/10731

    Computational Modelling for Bankruptcy Prediction: Semantic data Analysis Integrating Graph Database and Financial Ontology

    Get PDF
    In this paper, we propose a novel intelligent methodology to construct a Bankruptcy Prediction Computation Model, which is aimed to execute a company's financial status analysis accurately. Based on the semantic data analysis and management, our methodology considers Semantic Database System as the core of the system. It comprises three layers: an Ontology of Bankruptcy Prediction, Semantic Search Engine, and a Semantic Analysis Graph Database system. The Ontological layer defines the basic concepts of the financial risk management as well as the objects that serve as sources of knowledge for predicting a company's bankruptcy. The Graph Database layer utilises a powerful semantic data technology, which serves as a semantic data repository for our model. The article provides a detailed description of the construction of the Ontology and its informal conceptual representation. We also present a working prototype of the Graph Database system, constructed using the Neo4j application, and show the connection between well-known financial ratios. We argue that this methodology which utilises state of the art semantic data management mechanisms enables data processing and relevant computations in a more efficient way than approaches using the traditional relational database. These give us solid grounds to build a system that is capable of tackling the data of any complexity level

    Systematic Literature Review: Current Products, Topic, and Implementation of Graph Database

    Get PDF
    Planning, developing, and updating software cannot be separated from the role of the database. From various types of databases, graph databases are considered to have various advantages over their predecessor, relational databases. Graph databases then become the latest trend in the software and data science industry, apart from the development of graph theory itself. The proliferation of research on GDB in the last decade raises questions about what topics are associated with GDB, what industries use GDB in its data processing, what the GDB models are, and what types of GDB have been used most frequently by users in the last few years. This article aims to answer these questions through a Literature Review, which is carried out by determining objectives, determining the limits of review coverage, determining inclusion and exclusion criteria for data retrieval, data extraction, and quality assessment. Based on a review of 60 studies, several research topics related to GDB are Semantic Web, Big Data, and Parallel computing. A total of 19 (30%) studies used Neo4j as their database. Apart from Social Networks, the industries that implement GDB the most are the Transportation sector, Scientific Article Networks, and general sectors such as Enterprise Data, Biological data, and History data. This Literature Review concludes that research on the topic of the Graph Database is still developing in the future. This is shown by the breadth of application and the variety of new derivatives of GDB products offered by researchers to address existing problems

    Evaluación y punto de referencia para bases de datos orientadas a grafos

    Get PDF
    Las bases de datos orientadas a grafos (BDG) han adquirido popularidad dentro del análisis de datos masivos ya que proveen un rendimiento superior a la que se obtiene a través de una base de datos relacional en los escenarios en donde la alta conectividad entre datos se convierte en el principal componente para interpretar datos y obtener información de ellos para la toma de decisiones. Este trabajo de obtención de grado (TOG) tiene como objetivo desarrollar una metodología para comparar y evaluar de forma equitativa a distintos motores que se especializan en el manejo de grafos. En primera instancia se analizan los estudios relacionados a este tema que fungirán como soporte para nuestra investigación e identificar los puntos a mejorar para crear un proceso de evaluación y realizar el punto de referencia de bases de datos orientadas a grafos que son populares por el tiempo que llevan desarrollándose y siendo utilizadas en la industria y la academia, y otros que han surgido recientemente para mejorar las limitantes que hay en el mercado. El principal componente del trabajo es crear los pasos requeridos para ejecutar pruebas con distintos tipos de conjuntos de datos y algoritmos a un grupo de BDG. De forma posterior se ejecuta un caso de estudio en el cual se define un ambiente de validación homogéneo para todo sistema, donde se puedan tener las especificaciones de hardware y software para que todas las BDG puedan correr sin restricciones. De la misma manera se definen los aspectos a evaluar, que incluyen, pero no están limitados a las capacidades del lenguaje de consultas que proveen, la integración con otras plataformas o sistemas, y el soporte que cada una provee para la ejecución de algoritmos sobre grafos. La selección de los datos que se cargarán en las distintas plataformas debe considerar que tengan el formato adecuado que cada BDG soporta, y en caso contrario, analizar si los datos pueden ser convertidos para ser utilizados en la base de datos. Finalmente, se toman como caso de prueba las bases de datos basadas en grafos GraphDB, JanusGraph, Neo4j, y TigerGraph. El caso de estudio utiliza la metodología desarrollada a través de este trabajo para evaluar las bases de datos.ITESO, A. C.Consejo Nacional de Ciencia y Tecnologí
    corecore