10 research outputs found

    Comparative analysis of PropertyFirst vs. EntityFirst modeling approaches in graph databases

    Relational databases still hold the primary position in database technology, and have held it for longer than any other technology in computer science, but for the first time they face a valid and worthy opponent in the NoSQL movement. NoSQL databases, though still unfamiliar to many people, including a significant number of computer scientists, have spread rapidly in many shapes and forms, and in a rather chaotic fashion. Design and modeling for them have likewise been undertaken in an unstructured manner. They are currently subcategorized into four main groups: key-value stores, column-family stores, document stores, and graph databases. This thesis analyzes and compares different modeling approaches for graph databases applied to the same domain, especially from a design perspective. The database selected as the implementation technology is Neo4j by Neo Technology, a directed property graph database, which means that every relationship between its data entities must have a start and an end (source and destination) node. The research provides an overview of two competing modeling approaches and evaluates them in the context of a real-world example. The work shows that both modeling approaches are valid: a data model for the same domain data can be fully developed with either approach, and both can later support application access in a similar fashion. One of the models provides faster access to data, but at the cost of higher maintenance and increased complexity.
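The constraint described in this abstract, that every relationship in a directed property graph has both a start and an end node, can be illustrated with a minimal in-memory sketch. The class and field names below (`Node`, `Relationship`, `rel_type`) are hypothetical and chosen for illustration only; this is not Neo4j's API and not the thesis's data model.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A graph node with labels and arbitrary key-value properties."""
    node_id: int
    labels: frozenset = frozenset()
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed edge; both endpoints are mandatory."""
    rel_type: str
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)

    def __post_init__(self):
        # Enforce the directed-property-graph rule: no dangling relationships.
        if self.start is None or self.end is None:
            raise ValueError("a relationship needs both a start and an end node")


# Illustrative data only.
alice = Node(1, frozenset({"Person"}), {"name": "Alice"})
acme = Node(2, frozenset({"Company"}), {"name": "Acme"})
works_at = Relationship("WORKS_AT", alice, acme, {"since": 2012})
```

Properties can sit on nodes and on relationships alike, which is what leaves room for more than one way to model the same domain.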

    An exploration of graph algorithms and graph databases

    As data grows in quantity, the need for efficient algorithms to solve computationally complex problems has become greater. In this thesis we evaluate a selection of graph algorithms; we provide a novel algorithm for solving and approximating the Longest Simple Cycle problem, as well as novel implementations of other graph algorithms in graph database systems. The first area of exploration is the problem of finding the Longest Simple Cycle in a graph. We propose two methods. The first is an exact approach based on a flow-based Integer Linear Program. The second is a multi-start local search heuristic which uses a simple depth-first search as the basis for a cycle and improves it with four perturbation operators. Secondly, we focus on implementing the Minimum Dominating Set problem in graph database systems. An unoptimised greedy heuristic solution to the Minimum Dominating Set problem is implemented in a client-server system using a declarative query language, in an embedded database system using an imperative query language, and in a high-level language as a direct comparison. The performance of the graph back-end of the database systems is evaluated. The expressiveness of the query languages is also explored. We identify limitations of the query methods of the database systems, and propose a function that increases the functionality of the queries.
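As a rough illustration of what an unoptimised greedy heuristic for Minimum Dominating Set can look like in a high-level language, the sketch below repeatedly picks the vertex that dominates the most still-undominated vertices. It assumes an undirected graph given as an adjacency dictionary; the function name and the tie-breaking rule are assumptions, and this is not the thesis's implementation.

```python
def greedy_dominating_set(adj):
    """Return a dominating set of an undirected graph.

    adj maps each vertex to the set of its neighbours.
    """
    undominated = set(adj)
    chosen = set()
    while undominated:
        # A vertex dominates itself and its neighbours; pick the vertex
        # that covers the most vertices not yet dominated.
        best = max(adj, key=lambda v: len(({v} | adj[v]) & undominated))
        chosen.add(best)
        undominated -= {best} | adj[best]
    return chosen


# A path 1-2-3-4-5 as illustrative input.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
result = greedy_dominating_set(path)
```

The greedy choice gives no optimality guarantee in general; it only ensures that every vertex ends up in the set or adjacent to a member of it.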

    Linguaggi di interrogazione per basi di dati semistrutturati a grafo.

    In recent years, thanks in part to the advent of Web 2.0, large sources of semistructured data have become available. This has sparked interest in the research community in database models suited to better representing and more efficiently querying this kind of information: a well-known example proposed by the W3C is the "Resource Description Framework". This thesis analyzes several graph models, for semistructured data and otherwise, and compares the associated query languages in terms of their expressive power.

    Detection of potential misuse in information systems based on temporal graph anomalies

    Users of complex information systems can have various roles, which define their permissions. By acting in a coordinated manner, users can carry out complex misuse without overstepping their permissions, and cause damage or gain illegal benefits. This kind of internal threat, in which multiple authorized users act in coordination and do not overstep their permissions, has not been sufficiently researched. This thesis proposes a general method for identifying potential misuse that is independent of data semantics and of familiarity with the system's business processes. The method relies on the existence of the information system's relational database audit trail. Implementation and testing indicate that the method recognizes potential misuse. The proposed model of a completely-timed graph, the algorithms for converting relational and temporal relational data to graphs, the frequent completely-timed subgraph mining algorithm, and the completely-timed graph comparison algorithm can all be used for general purposes. Scientific contributions: 1) a relational database to graph database conversion algorithm, with special emphasis on converting temporal relational data to completely-timed graphs; 2) a frequent completely-timed subgraph mining algorithm; 3) a frequent completely-timed subgraph anomaly detection algorithm; 4) a method for detecting potential information system misuse based on frequent completely-timed subgraph anomalies.

    ScaleSem (model checking et web sémantique)

    The increasing development of networks, and of the Internet in particular, has considerably widened the gap between heterogeneous information systems. A review of studies on the interoperability of heterogeneous information systems shows that all the work in this area tends toward solving the problems of semantic heterogeneity. The W3C (World Wide Web Consortium) proposes standards for representing semantics through ontologies. Ontology is becoming an indispensable support for the interoperability of information systems, particularly with respect to semantics. The structure of an ontology is a combination of concepts, properties and relations. This combination is also called a semantic graph. Several languages have been developed in the context of the Semantic Web, and most of them use XML (eXtensible Markup Language) syntax. OWL (Web Ontology Language) and RDF (Resource Description Framework) are the most important languages of the Semantic Web, and are based on XML. RDF is the first W3C standard for enriching resources on the Web with detailed descriptions, and it makes automatic processing of Web resources easier. Descriptions may be characteristics of resources, such as the author or the content of a website. These descriptions are metadata. Enriching the Web with metadata allows the development of the so-called Semantic Web. RDF is also used to represent semantic graphs corresponding to a specific knowledge model. RDF files are typically stored in a relational database and manipulated using SQL, or derived languages such as SPARQL. This solution is well suited to small RDF graphs, but unfortunately not to large ones. These graphs evolve rapidly, and adapting them to change may introduce inconsistencies. Applying changes while maintaining the consistency of a semantic graph is a crucial task, and costly in terms of time and complexity. An automated process is therefore essential. For these large RDF graphs, we propose a new approach using formal verification, namely model checking. Model checking is a verification technique that explores all possible states of a system. In this way, one can show that a model of a given system satisfies a given property. This thesis provides a new method for verifying and querying semantic graphs. We propose an approach called ScaleSem, which transforms semantic graphs into graphs understood by the model checker (the verification tool of the model checking method). Software tools are needed to translate a graph described in one formalism into the same graph (or an adaptation of it) described in another formalism.
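The idea that model checking explores all possible states of a system to establish a property can be sketched as an exhaustive breadth-first search over reachable states. This is a generic explicit-state invariant check under assumed names (`check_invariant`, `successors`, `holds`); it is not the ScaleSem transformation and not the model checker used in the thesis.

```python
from collections import deque


def check_invariant(initial, successors, holds):
    """Explore every state reachable from `initial`.

    Returns None if `holds` is true in all reachable states,
    otherwise the first violating state found (a counterexample).
    """
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not holds(state):
            return state
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None


# Illustrative system: a counter that cycles through 0, 1, 2, 3.
def step(s):
    return [(s + 1) % 4]


ok = check_invariant(0, step, lambda s: s < 4)
bad = check_invariant(0, step, lambda s: s != 3)
```

The search terminates only when the reachable state space is finite, which is one reason large graphs make this kind of verification costly.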

    An Object-Oriented Pattern Matching Language

    No full text
    A graphical model for describing schemes and instances of object databases, and a graphical data manipulation language based on pattern matching, called PaMaL, are introduced. The operations of PaMaL (addition and deletion of nodes and edges) use patterns to indicate the parts of the instance that are affected by the operation. We give the syntax and semantics of the operations and the programming constructs (loop, procedure and program) of PaMaL. We add a reduce operation to work with a special group of instances, the reduced instances. 1 Introduction One of the first visual or graphical interfaces for databases was QBE [Zlo77]. It introduced a new way of user-database interaction by providing the user with tools to interact directly with the database and its structure. Since then, research on visual interfaces has evolved in two directions. One group develops specialized interfaces for geographical or pictorial information (some relevant information can be found in [vl, Coo93])..

    An object-oriented pattern matching language

    No full text