
104 research outputs found

Snapshot Semantics for Temporal Multiset Relations (Extended Version)

Author: Böhlen Michael
Dignös Anton
Gamper Johann
Glavic Boris
Niu Xing
Publication date: 06/02/2019

Snapshot semantics is widely used for evaluating queries over temporal data: temporal relations are seen as sequences of snapshot relations, and queries are evaluated at each snapshot. In this work, we demonstrate that current approaches for snapshot semantics over interval-timestamped multiset relations are subject to two bugs regarding snapshot aggregation and bag difference. We introduce a novel temporal data model based on K-relations that overcomes these bugs and prove it to correctly encode snapshot semantics. Furthermore, we present an efficient implementation of our model as a database middleware and demonstrate experimentally that our approach is competitive with native implementations and significantly outperforms such implementations on queries that involve aggregation. Comment: extended version of PVLDB paper
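As a rough illustration of the point-wise definition in this abstract, the sketch below evaluates a count at every snapshot of a small interval-timestamped multiset relation. The relation `works`, the half-open interval convention, and the helper names are assumptions made for this example; this is not the paper's K-relation model or its middleware.

```python
from collections import Counter

# Hypothetical interval-timestamped multiset relation.
# Each tuple is (value, start, end), valid over the half-open period [start, end).
works = [
    ("SP", 3, 10),
    ("SP", 8, 16),
    ("NS", 18, 20),
]

def snapshot(relation, t):
    """Return the multiset (bag) of values that are valid at time point t."""
    return Counter(value for value, start, end in relation if start <= t < end)

def snapshot_count(relation, value, domain):
    """Evaluate a count of `value` independently at every snapshot in `domain`."""
    return {t: snapshot(relation, t)[value] for t in domain}

counts = snapshot_count(works, "SP", range(0, 20))
```

Because the count is computed per time point, snapshots in which no matching tuple is valid still produce a result of 0, and overlapping periods contribute their multiplicities.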

arXiv.org e-Print Archive

ZORA

Finding k-Dissimilar Paths with Minimum Collective Length

Author: Blumenthal David B.
Bouros Panagiotis
Chondrogiannis Theodoros
Gamper Johann
Leser Ulf
Publication venue: Association for Computing Machinery (ACM)
Publication date: 24/10/2018

Shortest path computation is a fundamental problem in road networks. However, in many real-world scenarios, determining solely the shortest path is not enough. In this paper, we study the problem of finding k-Dissimilar Paths with Minimum Collective Length (kDPwML), which aims at computing a set of paths from a source s to a target t such that all paths are pairwise dissimilar by at least θ and the sum of the path lengths is minimal. We introduce an exact algorithm for the kDPwML problem, which iterates over all possible s-t paths while employing two pruning techniques to reduce the prohibitively expensive computational cost. To achieve scalability, we also define the much smaller set of the simple single-via paths, and we adapt two algorithms for kDPwML queries to iterate over this set. Our experimental analysis on real road networks shows that iterating over all paths is impractical, while iterating over the set of simple single-via paths can lead to scalable solutions with only a small trade-off in the quality of the results. Comment: Extended version of the SIGSPATIAL'18 paper under the same title

arXiv.org e-Print Archive

Crossref

GEDLIB: A C++ Library for Graph Edit Distance Computation

Author: Blumenthal David
Bougleux Sébastien
Brun Luc
Gamper Johann
Publication venue: HAL CCSD
Publication date: 19/06/2019

The graph edit distance (GED) is a flexible graph dissimilarity measure widely used within the structural pattern recognition field. In this paper, we present GEDLIB, a C++ library for exactly or approximately computing GED. Many existing algorithms for GED are already implemented in GEDLIB. Moreover, GEDLIB is designed to be easily extensible: for implementing new edit cost functions and GED algorithms, it suffices to implement abstract classes contained in the library. For implementing these extensions, the user has access to a wide range of utilities, such as deep neural networks, support vector machines, mixed integer linear programming solvers, a blackbox optimizer, and solvers for the linear sum assignment problem with and without error-correction.

Upper Bounding the Graph Edit Distance Based on Rings and Machine Learning

Author: Blumenthal David B.
Bougleux Sébastien
Brun Luc
Gamper Johann
Publication venue: World Scientific Pub Co Pte Lt
Publication date: 02/01/2021

The graph edit distance (GED) is a flexible distance measure which is widely used for inexact graph matching. Since its exact computation is NP-hard, heuristics are used in practice. A popular approach is to obtain upper bounds for GED via transformations to the linear sum assignment problem with error-correction (LSAPE). Typically, local structures and distances between them are employed for carrying out this transformation, but recently, machine learning techniques have also been used. In this paper, we formally define a unifying framework LSAPE-GED for transformations from GED to LSAPE. We also introduce rings, a new kind of local structure designed for graphs where most information resides in the topology rather than in the node labels. Furthermore, we propose two new ring-based heuristics, RING and RING-ML, which instantiate LSAPE-GED using the traditional and the machine-learning-based approach for transforming GED to LSAPE, respectively. Extensive experiments show that using rings for upper bounding GED significantly improves the state of the art on datasets where most information resides in the graphs' topologies. This closes the gap between fast but rather inaccurate LSAPE-based heuristics and more accurate but significantly slower GED algorithms based on local search.

arXiv.org e-Print Archive

HAL - Normandie Université

Database Technology for Processing Temporal Data

Author: Böhlen Michael Hanspeter
Dignös Anton
Gamper Johann
Jensen Christian Søndergaard
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik GmbH
Publication date: 01/10/2018

VBN

Leveraging range joins for the computation of overlap joins

Author: Böhlen Michael H.
Dignös Anton
Gamper Johann
Jensen Christian S.
Moser Peter
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2022

Joins are essential and potentially expensive operations in database management systems. When data is associated with time periods, joins commonly include predicates that require pairs of argument tuples to overlap in order to qualify for the result. Our goal is to enable built-in systems support for such joins. In particular, we present an approach where overlap joins are formulated as unions of range joins, which are more general-purpose than overlap joins (i.e., they are useful in their own right) and are supported well by B+-trees. The approach is sufficiently flexible that it also supports joins with additional equality predicates, as well as open, closed, and half-open time periods over discrete and continuous domains, thus offering both generality and simplicity, which is important in a system setting. We provide both a stand-alone solution that performs on par with the state of the art and a DBMS-embedded solution that is able to exploit standard indexing and clearly outperforms existing DBMS solutions that depend on specialized indexing techniques. We offer both analytical and empirical evaluations of the proposals. The empirical study includes comparisons with pertinent existing proposals and offers detailed insight into the performance characteristics of the proposals.
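The idea of formulating an overlap join as a union of range joins can be sketched as follows. This is only one possible decomposition, assuming non-empty half-open periods [start, end); it is not necessarily the formulation used in the paper. Sorted lists with binary search stand in for B+-tree range scans, and the function and variable names are hypothetical.

```python
from bisect import bisect_left, bisect_right

def overlap_join(R, S):
    """Join non-empty half-open periods (start, end) from R and S that overlap.

    Two periods overlap iff exactly one of these holds:
      1. r.start lies in [s.start, s.end)   (r starts inside s)
      2. s.start lies in (r.start, r.end)   (s starts strictly inside r)
    Each case is a range join on a start point, so the overlap join is
    their disjoint union.
    """
    R_sorted = sorted(R)
    S_sorted = sorted(S)
    r_starts = [start for start, _ in R_sorted]
    s_starts = [start for start, _ in S_sorted]
    result = []

    # Range join 1: for each s, find all r with s.start <= r.start < s.end.
    for s in S_sorted:
        lo = bisect_left(r_starts, s[0])
        hi = bisect_left(r_starts, s[1])
        result.extend((r, s) for r in R_sorted[lo:hi])

    # Range join 2: for each r, find all s with r.start < s.start < r.end.
    for r in R_sorted:
        lo = bisect_right(s_starts, r[0])
        hi = bisect_left(s_starts, r[1])
        result.extend((r, s) for s in S_sorted[lo:hi])

    return result

def naive_overlap_join(R, S):
    """Reference nested-loop join using the direct overlap predicate."""
    return [(r, s) for r in R for s in S if r[0] < s[1] and s[0] < r[1]]
```

The two cases are mutually exclusive (the first requires r.start >= s.start, the second r.start < s.start), so no pair is reported twice and no duplicate elimination is needed.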

VBN

ZORA

Leveraging range joins for the computation of overlap joins

Author: Böhlen Michael Hanspeter
Dignös Anton
Gamper Johann
Jensen Christian S
Moser Peter
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2022

ZORA

Efficient haplotype block recognition of very long and dense genetic sequences

Author: A Christoforou
A Petersen
AJ Lorenz
AL Price
C Dering
C Pattaro
C Song
C Zapata
Cristian Pattaro
DA Tregouet
Daniel Taliun
DE Reich
EC Anderson
H Shim
J Gibson
J Park
JC Barrett
JC Lambert
JD Wall
Johann Gamper
K Wang
K Zhang
MJ Daly
N Patil
O Delaneau
P Flicek
R Mourad
RC Lewontin
S Gu
S Purcell
SB Gabriel
The 1000 Genomes Project Consortium
The International HapMap Consortium
W Hill
WJ Kent
Publication venue: Springer Science and Business Media LLC

Crossref