99,430 research outputs found

    A Comparative Analysis of Graph Vs Relational Database For Instructional Module Development System

    Get PDF
    abstract: In today's data-driven world, every datum is connected to a large amount of data. Relational databases have been proving itself a pioneer in the field of data storage and manipulation since 1970s. But more recently they have been challenged by NoSQL graph databases in handling data models which have an inherent graphical representation. Graph databases with the ability to store physical relationships between two nodes and native graph processing technique have been doing exceptionally well in graph data storage and management for applications like recommendation engines, biological modeling, network modeling, social media applications, etc. Instructional Module Development System (IMODS) is a web-based software system that guides STEM instructors through the complex task of curriculum design, ensures tight alignment between various components of a course (i.e., learning objectives, content, assessments), and provides relevant information about research-based pedagogical and assessment strategies. The data model of IMODS is highly connected and has an inherent graphical representation between all its entities with numerous relationships between them. This thesis focuses on developing an algorithm to determine completeness of course design developed using IMODS. As part of this research objective, the study also analyzes the data model for best fit database to run these algorithms. As part of this thesis, two separate applications abstracting the data model of IMODS have been developed - one with Neo4j (graph database) and another with PostgreSQL (relational database). The research objectives of the thesis are as follows: (i) evaluate the performance of Neo4j and PostgreSQL in handling complex queries that will be fired throughout the life cycle of the course design process; (ii) devise an algorithm to determine the completeness of a course design developed using IMODS. This thesis presents the process of creating data model for PostgreSQL and converting it into a graph data model to be abstracted by Neo4j, creating SQL and CYPHER scripts for undertaking experiments on both platforms, testing and elaborate analysis of the results and evaluation of the databases in the context of IMODS.Dissertation/ThesisMasters Thesis Computer Science 201

    Extreme Acceleration of Graph Neural Network-based Prediction Models for Quantum Chemistry

    Full text link
    Molecular property calculations are the bedrock of chemical physics. High-fidelity \textit{ab initio} modeling techniques for computing the molecular properties can be prohibitively expensive, and motivate the development of machine-learning models that make the same predictions more efficiently. Training graph neural networks over large molecular databases introduces unique computational challenges such as the need to process millions of small graphs with variable size and support communication patterns that are distinct from learning over large graphs such as social networks. This paper demonstrates a novel hardware-software co-design approach to scale up the training of graph neural networks for molecular property prediction. We introduce an algorithm to coalesce the batches of molecular graphs into fixed size packs to eliminate redundant computation and memory associated with alternative padding techniques and improve throughput via minimizing communication. We demonstrate the effectiveness of our co-design approach by providing an implementation of a well-established molecular property prediction model on the Graphcore Intelligence Processing Units (IPU). We evaluate the training performance on multiple molecular graph databases with varying degrees of graph counts, sizes and sparsity. We demonstrate that such a co-design approach can reduce the training time of such molecular property prediction models from days to less than two hours, opening new possibilities for AI-driven scientific discovery

    Feature-Based Classification of Bidirectional Transformation Approaches

    Get PDF
    International audienceBidirectional model transformation is a key technology in model-driven engineering (MDE), when two models that can change over time have to be kept constantly consistent with each other. While several model transformation tools include at least a partial support to bidirectionality, it is not clear how these bidirectional capabilities relate to each other and to similar classical problems in computer science, from the view update problem in databases to bidirectional graph transformations. This paper tries to clarify and visualize the space of design choices for bidirectional transformations from an MDE point of view, in the form of a feature model. The selected list of existing approaches are characterized by mapping them to the feature model. Then, the feature model is used to highlight some unexplored research lines in bidirectional transformations

    The Train Benchmark: cross-technology performance evaluation of continuous model queries

    Get PDF
    In model-driven development of safety-critical systems (like automotive, avionics or railways), well- formedness of models is repeatedly validated in order to detect design flaws as early as possible. In many indus- trial tools, validation rules are still often implemented by a large amount of imperative model traversal code which makes those rule implementations complicated and hard to maintain. Additionally, as models are rapidly increas- ing in size and complexity, efficient execution of validation rules is challenging for the currently available tools. Checking well-formedness constraints can be captured by declarative queries over graph models, while model update operations can be specified as model transformations. This paper presents a benchmark for systematically assessing the scalability of validating and revalidating well-formedness constraints over large graph models. The benchmark defines well-formedness validation scenarios in the railway domain: a metamodel, an instance model generator and a set of well- formedness constraints captured by queries, fault injection and repair operations (imitating the work of systems engi- neers by model transformations). The benchmark focuses on the performance of query evaluation, i.e. its execution time and memory consumption, with a particular empha- sis on reevaluation. We demonstrate that the benchmark can be adopted to various technologies and query engines, including modeling tools; relational, graph and semantic databases. The Train Benchmark is available as an open- source project with continuous builds from https://github. com/FTSRG/trainbenchmark

    Using Ontologies for the Design of Data Warehouses

    Get PDF
    Obtaining an implementation of a data warehouse is a complex task that forces designers to acquire wide knowledge of the domain, thus requiring a high level of expertise and becoming it a prone-to-fail task. Based on our experience, we have detected a set of situations we have faced up with in real-world projects in which we believe that the use of ontologies will improve several aspects of the design of data warehouses. The aim of this article is to describe several shortcomings of current data warehouse design approaches and discuss the benefit of using ontologies to overcome them. This work is a starting point for discussing the convenience of using ontologies in data warehouse design.Comment: 15 pages, 2 figure

    gMark: Schema-Driven Generation of Graphs and Queries

    Full text link
    Massive graph data sets are pervasive in contemporary application domains. Hence, graph database systems are becoming increasingly important. In the experimental study of these systems, it is vital that the research community has shared solutions for the generation of database instances and query workloads having predictable and controllable properties. In this paper, we present the design and engineering principles of gMark, a domain- and query language-independent graph instance and query workload generator. A core contribution of gMark is its ability to target and control the diversity of properties of both the generated instances and the generated workloads coupled to these instances. Further novelties include support for regular path queries, a fundamental graph query paradigm, and schema-driven selectivity estimation of queries, a key feature in controlling workload chokepoints. We illustrate the flexibility and practical usability of gMark by showcasing the framework's capabilities in generating high quality graphs and workloads, and its ability to encode user-defined schemas across a variety of application domains.Comment: Accepted in November 2016. URL: http://ieeexplore.ieee.org/document/7762945/. in IEEE Transactions on Knowledge and Data Engineering 201
    • …
    corecore