
182 research outputs found

An introduction to Graph Data Management

Author: A Dries
A Gutiérrez
A Iosup
A Morari
A Poulovassilis
AD Zhu
AO Mendelzon
B Amann
B Elser
C Berge
C Vicknair
C Watters
C Weiss
CS Chang
D Conte
D Dominguez-Sal
D Theodoratos
DC Faye
DW Shipman
EF Codd
FW Tompa
G Malewicz
GM Kuper
H He
HS Kunii
IF Cruz
IF Cruz
J Hidders
J Paredaens
J Peckham
J. Hidders
Jonathan Hayes
K Zeng
L Kowalik
L Zou
M Atre
M Ciglan
M Consens
M Gemis
M Gyssens
M Han
M Levene
M Levene
M Levene
M Mainguenaud
M Schmidt
M Yannakakis
MA Bornea
MA Rodriguez
MA Rodriguez
Marc Andries
MP Consens
MP Consens
N Kiesel
N Roussopoulos
O Erling
P Barceló Baeza
P Buneman
P Yuan
Philippe Cudré-Mauroux
PPS Chen
PT Wood
PT Wood
R Agrawal
R Angles
R Angles
R Brijder
R Ronen
RH Güting
RS Xin
S Abiteboul
S Abiteboul
T Neumann
W Fan
W Kim
Y Guo
Y Low
Y Papakonstantinou
Y Tian
Y Zhao
YA Liu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/12/2017

A graph database is a database where the data structures for the schema and/or instances are modeled as a (labeled) (directed) graph or generalizations of it, and where querying is expressed by graph-oriented operations and type constructors. In this article we present the basic notions of graph databases, give a historical overview of their main developments, and study the main current systems that implement them.

arXiv.org e-Print Archive

Crossref

Tight Bounds for Maximal Identifiability of Failure Nodes in Boolean Network Tomography

Author: Galesi Nicola
Ranjbar Fariba
Publication date: 01/01/2018

We study maximal identifiability, a measure recently introduced in Boolean Network Tomography to characterize networks' capability to localize failure nodes in end-to-end path measurements. We prove tight upper and lower bounds on the maximal identifiability of failure nodes for specific classes of network topologies, such as trees and $d$-dimensional grids, in both directed and undirected cases. We prove that directed $d$-dimensional grids with support $n$ have maximal identifiability $d$ using $2d(n-1)+2$ monitors; and in the undirected case we show that $2d$ monitors suffice to get identifiability of $d-1$. We then study identifiability under embeddings: we establish relations between maximal identifiability, embeddability and graph dimension when network topologies are modeled as DAGs. Our results suggest the design of networks over $N$ nodes with maximal identifiability $\Omega(\log N)$ using $O(\log N)$ monitors and a heuristic to boost maximal identifiability on a given network by simulating $d$-dimensional grids. We provide positive evidence of this heuristic through data extracted by exact computation of maximal identifiability on examples of small real networks.

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Automata with Nested Pebbles Capture First-Order Logic with Transitive Closure

Author: Hendrik Jan Hoogeboom
Joost Engelfriet
Wolfgang Thomas
Publication venue: 'Logical Methods in Computer Science e.V.'
Publication date: 01/01/2007

String languages recognizable in (deterministic) log-space are characterized either by two-way (deterministic) multi-head automata or, following Immerman, by first-order logic with (deterministic) transitive closure. Here we elaborate this result, and match the number of heads to the arity of the transitive closure. More precisely, first-order logic with k-ary deterministic transitive closure has the same power as deterministic automata walking on their input with k heads, additionally using a finite set of nested pebbles. This result is valid for strings, ordered trees, and in general for families of graphs having a fixed automaton that can be used to traverse the nodes of each of the graphs in the family. Other examples of such families are grids, toruses, and rectangular mazes. For nondeterministic automata, the logic is restricted to positive occurrences of transitive closure. The special case of k=1 for trees shows that single-head deterministic tree-walking automata with nested pebbles are characterized by first-order logic with unary deterministic transitive closure. This refines our earlier result that placed these automata between first-order and monadic second-order logic on trees. Comment: Paper for Logical Methods in Computer Science, 27 pages, 1 figure.

arXiv.org e-Print Archive

CiteSeerX

Crossref

Leiden University Scholary Publications

Path constraints in semistructured data

Author: André Yves
Caron Anne-Cécile
Debarbieux Denis
Roos Yves
Tison Sophie
Publication venue: 'Elsevier BV'
Publication date: 15/10/2007

We consider semistructured data as multirooted edge-labelled directed graphs, and path inclusion constraints on these graphs. A path inclusion constraint $p \preceq q$ is satisfied by semistructured data if any node reached by the regular query $p$ is also reached by the regular query $q$. In this paper, two problems are mainly studied: the implication problem and the problem of the existence of a finite exact model. (i) We give a new decision algorithm for the implication problem of a constraint $p \preceq q$ by a set of bounded path constraints $p_i \preceq u_i$, where $p$, $q$, and the $p_i$'s are regular path expressions and the $u_i$'s are words, improving, in this particular case, on the more general algorithms of S. Abiteboul and V. Vianu, and N. Alechina et al. In the case of a set of word equalities $u_i \equiv v_i$, we provide a more efficient decision algorithm for the implication of a word equality $u \equiv v$, improving on the more general algorithm of P. Buneman et al. We prove that, in this case, implication for nondeterministic models is equivalent to implication for (complete) deterministic ones. (ii) We introduce the notion of exact model: an exact model of a set of path constraints satisfies the constraint $p \preceq q$ if and only if this constraint is implied by that set. We prove that any set of constraints has an exact model and we give a decidable characterization of data which are exact models of bounded path inclusion constraint sets.
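The satisfaction condition for a path inclusion constraint can be illustrated with a minimal sketch. It assumes the special case where both paths are plain words (sequences of single-character labels) rather than general regular expressions. The names `GRAPH`, `ROOTS`, `reach`, and `satisfies` are hypothetical and do not come from the paper.

```python
# Hypothetical multirooted edge-labelled graph: node -> [(label, target)].
GRAPH = {"r": [("a", 1), ("a", 2), ("b", 1)]}
ROOTS = {"r"}

def reach(graph, roots, word):
    """Return the nodes reached from any root by following the labels of `word` in order."""
    current = set(roots)
    for label in word:
        current = {
            target
            for node in current
            for (edge_label, target) in graph.get(node, [])
            if edge_label == label
        }
    return current

def satisfies(graph, roots, p, q):
    """Check the inclusion constraint: every node reached by p is also reached by q."""
    return reach(graph, roots, p) <= reach(graph, roots, q)

print(satisfies(GRAPH, ROOTS, "b", "a"))  # nodes reached by "b" versus nodes reached by "a"
print(satisfies(GRAPH, ROOTS, "a", "b"))
```

The sketch only checks one constraint on one concrete graph. The implication and exact-model problems studied in the paper quantify over all models and are not addressed here.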

HAL - Lille 3

Elsevier - Publisher Connector

INRIA a CCSD electronic archive server

A Trichotomy for Regular Simple Path Queries on Graphs

Author: Bagan Guillaume
Bonifati Angela
Groz Benoit
Publication date: 01/01/2012

Regular path queries (RPQs) select nodes connected by some path in a graph. The edge labels of such a path have to form a word that matches a given regular expression. We investigate the evaluation of RPQs with an additional constraint that prevents multiple traversals of the same nodes. Those regular simple path queries (RSPQs) find several applications in practice, yet they quickly become intractable, even for basic languages such as (aa)* or a*ba*. In this paper, we establish a comprehensive classification of regular languages with respect to the complexity of the corresponding regular simple path query problem. More precisely, we identify the fragment that is maximal in the following sense: regular simple path queries can be evaluated in polynomial time for every regular language L that belongs to this fragment, and evaluation is NP-complete for languages outside this fragment. We thus fully characterize the frontier between tractability and intractability for RSPQs, and we refine our results to show the following trichotomy: evaluation of RSPQs is either in AC0, NL-complete or NP-complete in data complexity, depending on the regular language L. The fragment identified also admits a simple characterization in terms of regular expressions. Finally, we also discuss the complexity of the following decision problem: decide, given a language L, whether finding a regular simple path for L is tractable. We consider several alternative representations of L: DFAs, NFAs or regular expressions, and prove that this problem is NL-complete for the first representation and PSPACE-complete for the other two. As a conclusion we extend our results from edge-labeled graphs to vertex-labeled graphs and vertex-edge labeled graphs. Comment: 15 pages, conference submission.
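The difference between arbitrary-path and simple-path semantics can be seen in a small sketch for the language (aa)*. It assumes a hand-written DFA and a three-node cycle; `DFA`, `GRAPH`, `rpq`, and `rspq` are hypothetical names, and neither function is the algorithm from the paper. The arbitrary-path version is a breadth-first search over (node, state) pairs, which is polynomial. The simple-path version is a naive backtracking search that may take exponential time.

```python
from collections import deque

# Hypothetical DFA for (aa)*: state 0 = even number of a's (accepting), state 1 = odd.
DFA = {"start": 0, "accept": {0}, "delta": {(0, "a"): 1, (1, "a"): 0}}

# Edge-labelled directed cycle 0 -> 1 -> 2 -> 0, every edge labelled "a".
GRAPH = {0: [("a", 1)], 1: [("a", 2)], 2: [("a", 0)]}

def rpq(graph, dfa, source, target):
    """Arbitrary-path semantics: BFS over the product of graph nodes and DFA states."""
    start = (source, dfa["start"])
    seen, queue = {start}, deque([start])
    while queue:
        node, state = queue.popleft()
        if node == target and state in dfa["accept"]:
            return True
        for label, nxt in graph.get(node, []):
            nstate = dfa["delta"].get((state, label))
            if nstate is not None and (nxt, nstate) not in seen:
                seen.add((nxt, nstate))
                queue.append((nxt, nstate))
    return False

def rspq(graph, dfa, source, target):
    """Simple-path semantics: backtracking DFS that never revisits a node."""
    def dfs(node, state, visited):
        if node == target and state in dfa["accept"]:
            return True
        for label, nxt in graph.get(node, []):
            nstate = dfa["delta"].get((state, label))
            if nstate is not None and nxt not in visited:
                if dfs(nxt, nstate, visited | {nxt}):
                    return True
        return False
    return dfs(source, dfa["start"], {source})

print(rpq(GRAPH, DFA, 0, 1), rspq(GRAPH, DFA, 0, 1))
```

On this cycle, the only even-length route from node 0 to node 1 goes around the cycle once and revisits nodes. The pair (0, 1) therefore matches under arbitrary-path semantics but not under simple-path semantics.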

arXiv.org e-Print Archive

HAL - Lille 3

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

Graph Pattern Matching in GQL and SQL/PGQ

Author: Deutsch Alin
Francis Nadime
Green Alastair
Hare Keith
Li Bei
Libkin Leonid
Lindaaker Tobias
Marsault Victor
Martens Wim
Michels Jan
Murlak Filip
Plantikow Stefan
Selmer Petra
van Rest Oskar
Voigt Hannes
Vrgoc Domagoj
Wu Mingxi
Zemke Fred
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 12/12/2021

As graph databases become widespread, JTC1 -- the committee in joint charge of information technology standards for the International Organization for Standardization (ISO), and International Electrotechnical Commission (IEC) -- has approved a project to create GQL, a standard property graph query language. This complements a project to extend SQL with a new part, SQL/PGQ, which specifies how to define graph views over an SQL tabular schema, and to run read-only queries against them. Both projects have been assigned to the ISO/IEC JTC1 SC32 working group for Database Languages, WG3, which continues to maintain and enhance SQL as a whole. This common responsibility helps enforce a policy that the identical core of both PGQ and GQL is a graph pattern matching sub-language, here termed GPML. The WG3 design process is also analyzed by an academic working group, part of the Linked Data Benchmark Council (LDBC), whose task is to produce a formal semantics of these graph data languages, which complements their standard specifications. This paper, written by members of WG3 and LDBC, presents the key elements of the GPML of SQL/PGQ and GQL in advance of the publication of these new standards

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Edinburgh Research Explorer

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

MULTIHIERARCHICAL DOCUMENTS AND FINE-GRAINED ACCESS CONTROL

Author: Moore Neil
Publication venue: UKnowledge
Publication date: 01/01/2012

This work presents new models and algorithms for creating, modifying, and controlling access to complex text. The digitization of texts opens new opportunities for preservation, access, and analysis, but at the same time raises questions regarding how to represent and collaboratively edit such texts. Two issues of particular interest are modelling the relationships of markup (annotations) in complex texts, and controlling the creation and modification of those texts. This work addresses and connects these issues, with emphasis on data modelling, algorithms, and computational complexity; and contributes new results in these areas of research. Although hierarchical models of text and markup are common, complex texts often exhibit layers of overlapping structure that are best described by multihierarchical markup. We develop a new model of multihierarchical markup, the globally ordered GODDAG, that combines features of both graph- and range-based models of markup, allowing documents to be unambiguously serialized. We describe extensions to the XPath query language to support globally ordered GODDAGs, provide semantics for a set of update operations on this structure, and provide algorithms for converting between two different representations of the globally ordered GODDAG. Managing the collaborative editing of documents can require restricting the types of changes different editors may make, while not altogether restricting their access to the document. Fine-grained access control allows precisely these kinds of restrictions on the operations that a user is or is not permitted to perform on a document. We describe a rule-based model of fine-grained access control for updates of hierarchical documents, and in this context analyze the document generation problem: determining whether a document could have been created without violating a particular access control policy. 
We show that this problem is undecidable in the general case and provide computational complexity bounds for a number of restricted variants of the problem. Finally, we extend our fine-grained access control model from hierarchical to multihierarchical documents. We provide semantics for fine-grained access control policies that control splice-in, splice-out, and rename operations on globally ordered GODDAGs, and show that the multihierarchical version of the document generation problem remains undecidable

University of Kentucky

I/O efficient bisimulation partitioning on very large directed acyclic graphs

Author: Fletcher George H. L.
Haverkort Herman
Hellings Jelle
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2011

In this paper we introduce the first efficient external-memory algorithm to compute the bisimilarity equivalence classes of a directed acyclic graph (DAG). DAGs are commonly used to model data in a wide variety of practical applications, ranging from XML documents and data provenance models, to web taxonomies and scientific workflows. In the study of efficient reasoning over massive graphs, the notion of node bisimilarity plays a central role. For example, grouping together bisimilar nodes in an XML data set is the first step in many sophisticated approaches to building indexing data structures for efficient XPath query evaluation. To date, however, only internal-memory bisimulation algorithms have been investigated. As the size of real-world DAG data sets often exceeds available main memory, storage in external memory becomes necessary. Hence, there is a practical need for an efficient approach to computing bisimulation in external memory. Our general algorithm has a worst-case IO-complexity of O(Sort(|N| + |E|)), where |N| and |E| are the numbers of nodes and edges, resp., in the data graph and Sort(n) is the number of accesses to external memory needed to sort an input of size n. We also study specializations of this algorithm to common variations of bisimulation for tree-structured XML data sets. We empirically verify efficient performance of the algorithms on graphs and XML documents having billions of nodes and edges, and find that the algorithms can process such graphs efficiently even when very limited internal memory is available. The proposed algorithms are simple enough for practical implementation and use, and open the door for further study of external-memory bisimulation algorithms. To this end, the full open-source C++ implementation has been made freely available
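For orientation, here is a minimal in-memory sketch of bisimulation partitioning on a node-labelled DAG. It is not the external-memory algorithm of the paper. It assumes the common definition in which two nodes are bisimilar when they carry the same label and their children fall into the same set of classes. The names `LABELS`, `CHILDREN`, and `bisimulation_classes` are hypothetical.

```python
# Hypothetical node-labelled DAG: a small tree with one extra childless "a" node.
LABELS = {"r": "root", "a1": "a", "a2": "a", "a3": "a", "b1": "b", "b2": "b"}
CHILDREN = {"r": ["a1", "a2", "a3"], "a1": ["b1"], "a2": ["b2"]}

def bisimulation_classes(labels, children):
    """Map each node to a class id; the graph must be acyclic.

    Children are classified before their parents, so a node's class is fully
    determined by its label and the set of classes of its children.
    """
    class_of = {}       # node -> class id
    signature_ids = {}  # (label, frozenset of child class ids) -> class id

    def visit(node):
        if node in class_of:
            return class_of[node]
        signature = (
            labels[node],
            frozenset(visit(child) for child in children.get(node, [])),
        )
        class_of[node] = signature_ids.setdefault(signature, len(signature_ids))
        return class_of[node]

    for node in labels:
        visit(node)
    return class_of

classes = bisimulation_classes(LABELS, CHILDREN)
print(classes["a1"] == classes["a2"], classes["a1"] == classes["a3"])
```

The recursion keeps the whole graph and signature table in main memory and is limited by Python's recursion depth. That is the setting the abstract describes as insufficient for very large DAGs.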

arXiv.org e-Print Archive

Document similarity

Author: Huitfeldt Claus
Sperberg-McQueen C. Michael
Publication venue: 'Mulberry Technologies, Inc.'
Publication date: 01/01/2020

In recent years, development of tools and methods for measuring document similarity has become a thriving field in informatics, computer science, and digital humanities. Historically, questions of document similarity have been (and still are) important or even crucial in a large variety of situations. Typically, similarity is judged by criteria which depend on context. The move from traditional to digital text technology has not only provided new possibilities for discovery and measurement of document similarity, it has also posed new challenges. Some of these challenges are technical, others conceptual. This paper argues that a particular, well-established, traditional way of starting with an arbitrary document and constructing a document similar to it, namely transcription, may fruitfully be brought to bear on questions concerning similarity criteria for digital documents. Some simple similarity measures are presented and their application to marked-up documents is discussed. We conclude that when documents are encoded in the same vocabulary, n-grams constructed to include markup can be used to recognize structural similarities between documents.
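The idea of n-grams that include markup can be sketched as follows. The abstract does not name the measures used in the paper, so the tokenizer and the choice of Jaccard overlap are assumptions made here for illustration. `tokens`, `ngrams`, and `jaccard` are hypothetical names.

```python
import re

def tokens(doc):
    """Split a marked-up string into tokens; tags such as <p> and </p> count as tokens too."""
    return re.findall(r"<[^<>]+>|[^\s<>]+", doc)

def ngrams(seq, n):
    """Return the set of contiguous n-token windows of seq."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}

def jaccard(doc_a, doc_b, n=2):
    """Jaccard overlap of the two documents' n-gram sets (assumed measure, see lead-in)."""
    grams_a, grams_b = ngrams(tokens(doc_a), n), ngrams(tokens(doc_b), n)
    if not grams_a and not grams_b:
        return 1.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)

# Same text, different element names: only the bigram ("a", "b") is shared.
print(jaccard("<p>a b</p>", "<q>a b</q>"))
```

Because the tags are part of the token stream, two documents with identical text but different markup score below 1. A plain-text comparison would treat them as equal.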

University of Bergen

NORA - Norwegian Open Research Archives