8 research outputs found

    Backlogs and Interval Timestamps: Building Blocks for Supporting Temporal Queries in Graph Databases (work in progress paper)

    The analysis of networks, either at a single point in time or through their evolution, is an increasingly important task in modern data management. Graph databases are uniquely suited to improve static network analysis. However, there is still no consensus on how best to model data evolution with these databases. In our work we propose an elementary concept to support temporal analysis with property graph databases, using a single-graph model limited to structural changes. We manage the temporal aspects of items with interval timestamps and backlogs. To include backlogs in the model we examine two alternatives: (1) global indexes, and (2) using the graph as an index by resorting to timestamp denormalization. We evaluate density calculation and time slice retrieval over successive days from a SNAP dataset, on an Apache Titan prototype of our model, observing response time gains of 2x to 100x when comparing differential with snapshot methods, and no conclusive difference between the backlog alternatives.
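As a rough illustration of the ideas named in this abstract, the sketch below shows interval timestamps on edges, time slice retrieval, density calculation, and a differential update driven by a backlog of changes. It is not the authors' Titan prototype: the names `TimedEdge`, `time_slice`, `density`, `backlog`, and `advance` are hypothetical, validity intervals are assumed to be half-open, and nodes are assumed to exist only while they have at least one live edge.

```python
from dataclasses import dataclass
from math import inf


@dataclass(frozen=True)
class TimedEdge:
    """Edge with an interval timestamp (hypothetical type, half-open [start, end))."""
    src: str
    dst: str
    start: float
    end: float = inf  # inf means the edge has not been removed

    def alive_at(self, t: float) -> bool:
        return self.start <= t < self.end


def time_slice(edges, t):
    """Snapshot method: scan all edges and keep those valid at time t."""
    return [e for e in edges if e.alive_at(t)]


def density(edges, t):
    """Directed density |E| / (|V| * (|V| - 1)) of the slice at time t."""
    live = time_slice(edges, t)
    nodes = {e.src for e in live} | {e.dst for e in live}
    n = len(nodes)
    return len(live) / (n * (n - 1)) if n > 1 else 0.0


def backlog(edges):
    """Chronological log of structural changes: (time, kind, edge)."""
    events = [(e.start, "add", e) for e in edges]
    events += [(e.end, "remove", e) for e in edges if e.end != inf]
    return sorted(events, key=lambda ev: ev[0])


def advance(live, log, t_from, t_to):
    """Differential method: derive the slice at t_to from the slice at t_from."""
    live = set(live)
    for time, kind, edge in log:
        if t_from < time <= t_to:
            if kind == "add":
                live.add(edge)
            else:
                live.discard(edge)
    return live


edges = [
    TimedEdge("a", "b", start=1),
    TimedEdge("b", "c", start=2, end=4),
    TimedEdge("a", "c", start=3),
]
```

The differential path touches only the changes between two time points, which is the kind of saving the abstract attributes to differential over snapshot methods. This toy version still scans the whole log, so a real backlog would need an index on time.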

    Spatial Queries for Indoor Location-based Services

    Indoor Location-based Services (LBS) assist people in indoor scenarios such as airports, train stations, shopping malls, and office buildings. Indoor spatial queries are the foundation for supporting indoor LBS. However, existing techniques for indoor spatial queries offer limited support for more advanced queries that consider semantic information, temporal variations, and crowd influence. This work studies indoor spatial queries for indoor LBS. Some typical proposals for indoor spatial queries are compared theoretically and experimentally. It then studies three advanced indoor spatial queries: a) the Indoor Keyword-aware Routing Query, b) the Indoor Temporal-variation aware Routing Query, and c) the Indoor Crowd-aware Routing Query. A series of techniques is proposed to solve these problems.

    Nutzung mobiler Endgeräte zur Analyse von dynamischen Graphen im Raum vor einer Displaywall (Using Mobile Devices to Analyze Dynamic Graphs in the Space in Front of a Display Wall)

    Graph visualization of data is used in various domains and facilitates gaining insights about relationships. Many datasets are dynamic in nature and change over time. Dynamic graphs encode the temporal changes of objects and their relations. The advantage of a high-resolution display wall is that its enlarged display space provides a good overview of the structural relationships. The use of today's widespread mobile devices provides a local, user-specific view for the exploration and manipulation of the graphs, including their dynamic component. While many studies focus on the analysis of static structures, the exploration and visualization of dynamic graphs is addressed by comparatively few works. This thesis investigates the basic tasks for dynamic graphs and the state of research on the interaction space of combined display wall and mobile device applications. On this basis, interaction concepts for the exploration of dynamic graphs and the visualization of changes are developed. The interaction space provides the spatial degrees of freedom for the design of interactions. The resulting set of concepts finally serves as the basis for a prototypical implementation in an existing project for the exploration of graphs using mobile devices. Contents: Introduction; Foundations and Related Work; Concept; Implementation; Summary, Evaluation, and Outlook; Bibliography.

    Prediction of user behaviour on the web

    The Web has become a ubiquitous environment for human interaction, communication, and data sharing. As a result, large amounts of data are produced. This data can be utilised by building predictive models of user behaviour in order to support business decisions. However, the fast pace of modern business puts pressure on industry to provide faster and better decisions. This thesis addresses this challenge by proposing a novel methodology for efficient prediction of user behaviour. The problems concerned are: (i) modelling user behaviour on the Web, (ii) choosing and extracting features from data generated by user behaviour, and (iii) choosing a Machine Learning (ML) set-up for efficient prediction. First, a novel Time-Varying Attributed Graph (TVAG) is introduced and then a TVAG-based model for modelling user behaviour on the Web is proposed. TVAGs capture temporal properties of user behaviour through the time-varying component of the features of graph nodes and edges. Second, the proposed model allows features to be extracted for further ML predictions. However, extracting the features and building the model may be an unacceptably hard and long process. Thus, a guideline for efficient feature extraction from the TVAG-based model is proposed. Third, a method for choosing an ML set-up to build an accurate and fast predictive model is proposed and evaluated. Finally, a deep learning architecture for predicting user behaviour on the Web is proposed and evaluated. To sum up, the main contribution to knowledge of this work is the development of a methodology for fast and efficient predictions of user behaviour on the Web. The methodology is evaluated on datasets from a few Web platforms, namely Stack Exchange, Twitter, and Facebook.
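To make the notion of time-varying node and edge features more concrete, here is a minimal sketch of a graph whose node attributes are time series and whose edges record interaction times, together with two simple feature lookups. This is an assumption-laden illustration and not the thesis's TVAG definition: the class name `TVAG` is borrowed from the abstract, but every method (`set_node_attr`, `add_interaction`, `node_attr_at`, `out_degree_in_window`) and the example attribute `reputation` are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


class TVAG:
    """Toy time-varying attributed graph (illustrative only)."""

    def __init__(self):
        # (node, attribute) -> list of (time, value), kept sorted by time
        self._node_attrs = defaultdict(list)
        # (src, dst) -> sorted list of interaction times
        self._edges = defaultdict(list)

    def set_node_attr(self, node, attr, time, value):
        series = self._node_attrs[(node, attr)]
        series.append((time, value))
        series.sort(key=lambda tv: tv[0])

    def add_interaction(self, src, dst, time):
        times = self._edges[(src, dst)]
        times.append(time)
        times.sort()

    def node_attr_at(self, node, attr, time, default=None):
        """Latest value recorded at or before `time`, else `default`."""
        series = self._node_attrs.get((node, attr), [])
        times = [t for t, _ in series]
        i = bisect_right(times, time)
        return series[i - 1][1] if i else default

    def out_degree_in_window(self, node, t_start, t_end):
        """Number of interactions started by `node` in [t_start, t_end)."""
        return sum(
            1
            for (src, _), times in self._edges.items()
            if src == node
            for t in times
            if t_start <= t < t_end
        )


g = TVAG()
g.set_node_attr("u1", "reputation", 1, 10)
g.set_node_attr("u1", "reputation", 5, 42)
g.add_interaction("u1", "u2", 2)
g.add_interaction("u1", "u3", 6)
```

Lookups such as `g.node_attr_at("u1", "reputation", 3)` and `g.out_degree_in_window("u1", 0, 5)` are the kind of per-user, per-time-window values that could feed a feature vector for an ML model.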

    Detection of potential misuse in information systems based on temporal graph anomalies

    Users of complex information systems can have various roles, which define their permissions. By acting in a coordinated manner, users can carry out complex misuse without overstepping their permissions, and cause damage or gain illegal benefits. This kind of internal threat, in which multiple authorized users act in coordination without overstepping their permissions, has not been sufficiently researched. This thesis proposes a general method for identifying potential misuse that is independent of data semantics and of familiarity with the system's business processes. The method relies on the existence of the information system's data history (a relational database audit trail). Implementation and testing show that the method recognizes potential misuse. The proposed model of a completely-timed graph, the algorithms for converting relational and temporal relational data to graphs, the frequent completely-timed subgraph mining algorithm, and the completely-timed graph comparison algorithm can be used for general purposes. Scientific contributions: 1) an algorithm for converting relational databases to graph databases, with special emphasis on converting temporal relational data to completely-timed graphs; 2) a frequent completely-timed subgraph mining algorithm; 3) an algorithm for detecting deviations from frequent completely-timed subgraphs; 4) a method for detecting potential information system misuse based on deviations from frequent completely-timed subgraphs.


    Graph database management systems: storage, management and query processing

    The proliferation of graph data, generated from diverse sources, has given rise to many research efforts concerning graph analysis. Interactions in social networks, publication networks, protein networks, software code dependencies and transportation systems are all examples of graph-structured data originating from a variety of application domains and demonstrating different characteristics. In recent years, graph database management systems (GDBMS) have been introduced for the management and analysis of graph data. Motivated by the growing number of real-life applications making use of graph database systems, this thesis focuses on the effectiveness and efficiency aspects of such systems. Specifically, we study the following topics relevant to graph database systems: (i) modeling large-scale applications in GDBMS; (ii) storage and indexing issues in GDBMS; and (iii) efficient query processing in GDBMS. In this thesis, we adopt two different application scenarios to examine how graph database systems can model complex features and perform relevant queries on each of them. Motivated by the popular application of social network analytics, we selected Twitter, a microblogging platform, to conduct our detailed analysis. Addressing limitations of existing models, we propose a data model for the Twittersphere that proactively captures Twitter-specific interactions. We examine the feasibility of running analytical queries on GDBMS and offer empirical analysis of the performance of the proposed approach. Next, we consider a use case of modeling software code dependencies in a graph database system, and investigate how these systems can support capturing the evolution of a codebase over time. We study a code comprehension tool that extracts software dependencies and stores them in a graph database. On a versioned graph built using a very large codebase, we demonstrate how existing code comprehension queries can be efficiently processed and also show the benefit of running queries across multiple versions. Another important aspect of this thesis is the study of storage aspects of graph systems. Throughput of many graph queries can be significantly affected by disk I/O performance; therefore graph database systems need to focus on effective graph storage for optimising disk operations. We observe that the locality of edges plays an important role, and we address the edge-labeling problem, which aims to label both incoming and outgoing edges of a graph so as to maximize the 'edge-consecutiveness' metric. By achieving a better layout and locality of edges on disk, we show that our proposed algorithms result in significantly improved disk I/O performance, leading to faster execution of neighbourhood queries. Some applications require the integrated processing of queries from the graph and textual domains within a graph database system. Aggregation of these dimensions facilitates gaining key insights in several application scenarios. For example, in a social network setting, one may want to find the closest k users in the network (graph traversal) who talk about a particular topic A (textual search). Motivated by such practical use cases, in this thesis we study the top-k social-textual ranking query, which essentially requires an efficient combination of a keyword search query with a graph traversal. We propose algorithms that leverage graph partitioning techniques, based on the premise that socially close users will be placed within the same partition, allowing more localised computations. We show that our proposed approaches achieve significantly better results than standard baselines and demonstrate robust behaviour under changing parameters.
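The top-k social-textual query described in this abstract (the closest k users who talk about a topic) can be sketched with a plain breadth-first search combined with a keyword filter. This is only a naive baseline under stated assumptions, not the partitioning-based algorithms the thesis proposes: the function name `top_k_social_textual`, the adjacency-dict graph, the `posts` mapping, the sample data, and the case-insensitive substring match are all hypothetical, and social closeness is assumed to mean hop count.

```python
from collections import deque


def top_k_social_textual(graph, posts, source, keyword, k):
    """Return up to k (user, hops) pairs nearest to `source` whose posts mention `keyword`.

    graph: dict mapping user -> list of neighbouring users
    posts: dict mapping user -> list of post texts
    """
    keyword = keyword.lower()
    seen = {source}
    queue = deque([(source, 0)])
    results = []
    # BFS visits users in non-decreasing hop distance, so the first k
    # matching users encountered are also the k closest ones.
    while queue and len(results) < k:
        user, hops = queue.popleft()
        if user != source and any(keyword in p.lower() for p in posts.get(user, [])):
            results.append((user, hops))
        for neighbour in graph.get(user, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return results


graph = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dan"],
    "cat": ["ann"],
    "dan": ["bob", "eve"],
    "eve": ["dan"],
}
posts = {
    "bob": ["I like football"],
    "cat": ["Graph databases are fun"],
    "dan": ["graph partitioning notes"],
    "eve": ["more graph stuff"],
}
```

Because the traversal stops as soon as k matches are found, its cost depends on how far from the source the matching users are. Cases where matches are far away are where more localised computation, such as the partition-based approach mentioned above, would be expected to matter.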