208 research outputs found
Context-Free Path Queries on RDF Graphs
Navigational graph queries are an important class of queries that can extract
implicit binary relations over the nodes of input graphs. Most of the
navigational query languages used in the RDF community, e.g. property paths in
W3C SPARQL 1.1 and nested regular expressions in nSPARQL, are based on
regular expressions. It is known that regular expressions have limited
expressivity; for instance, some natural queries, like same-generation queries,
are not expressible with regular expressions. To overcome this limitation, in
this paper, we present cfSPARQL, an extension of SPARQL query language equipped
with context-free grammars. The cfSPARQL language is strictly more expressive
than property paths and nested expressions. The additional expressivity can be
used for modelling graph similarities, graph summarization and ontology
alignment. Despite the increased expressivity, we show that cfSPARQL still
enjoys low computational complexity and can be evaluated efficiently.
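The key idea — replacing regular expressions over edge labels with a context-free grammar — can be illustrated with a minimal sketch. The grammar, graph, and naive fixpoint algorithm below are our own illustration of context-free path querying in general, not the cfSPARQL implementation; the "same generation" query is exactly the kind of query the abstract notes is inexpressible with regular expressions.

```python
# Minimal context-free path query evaluation by fixpoint iteration.
# Illustrative sketch only (not cfSPARQL): a same-generation-style
# grammar, S -> up down | up S down, split into binary rules.
grammar = [
    ("S", ("up", "down")),  # S -> up down
    ("S", ("up", "T")),     # S -> up T
    ("T", ("S", "down")),   # T -> S down
]

def cfpq(edges, grammar):
    """Return all (source, target, label) triples derivable from the
    labeled edges under the grammar (naive fixpoint, for exposition)."""
    rel = {(u, v, lbl) for (u, lbl, v) in edges}
    changed = True
    while changed:
        changed = False
        new = set()
        for (u, v, a) in rel:
            for (w, x, b) in rel:
                if v == w:  # paths compose: u -a-> v -b-> x
                    for head, (left, right) in grammar:
                        if (a, b) == (left, right) and (u, x, head) not in rel:
                            new.add((u, x, head))
        if new:
            rel |= new
            changed = True
    return rel

# A small tree: node 1 is the parent of 2 and 3; "up" goes child -> parent.
edges = [(2, "up", 1), (3, "up", 1), (1, "down", 2), (1, "down", 3)]
result = cfpq(edges, grammar)
# Siblings 2 and 3 are in the same generation:
assert (2, 3, "S") in result and (3, 2, "S") in result
```

No regular expression over {up, down} can express this relation, since it requires matching equal numbers of up- and down-edges.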
Transformers over Directed Acyclic Graphs
Transformer models have recently gained popularity in graph representation
learning as they have the potential to learn complex relationships beyond the
ones captured by regular graph neural networks. The main research question is
how to inject the structural bias of graphs into the transformer architecture,
and several proposals have been made for undirected molecular graphs and,
recently, also for larger network graphs. In this paper, we study transformers
over directed acyclic graphs (DAGs) and propose architecture adaptations
tailored to DAGs: (1) An attention mechanism that is considerably more
efficient than the regular quadratic complexity of transformers and at the same
time faithfully captures the DAG structure, and (2) a positional encoding of
the DAG's partial order, complementing the former. We rigorously evaluate our
approach over various types of tasks, ranging from classifying source code
graphs to nodes in citation networks, and show that it is effective in two
important aspects: in making graph transformers generally outperform graph
neural networks tailored to DAGs and in improving SOTA graph transformer
performance in terms of both quality and efficiency.
On Quasi-Interpretations, Blind Abstractions and Implicit Complexity
Quasi-interpretations are a technique to guarantee complexity bounds on
first-order functional programs: with termination orderings they give in
particular a sufficient condition for a program to be executable in polynomial
time, called here the P-criterion. We study properties of the programs
satisfying the P-criterion, in order to better understand its intensional
expressive power. Given a program on binary lists, its blind abstraction is the
nondeterministic program obtained by replacing lists by their lengths (natural
numbers). A program is blindly polynomial if its blind abstraction terminates
in polynomial time. We show that all programs satisfying a variant of the
P-criterion are in fact blindly polynomial. Then we give two extensions of the
P-criterion: one by relaxing the termination ordering condition, and the other
one (the bounded value property) giving a necessary and sufficient condition
for a program to be polynomial-time executable, with memoisation.
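The notion of blind abstraction can be made concrete with a small example of our own (not taken from the paper): a program branching on the heads of a binary list, and its abstraction where the list is replaced by its length, so the branch becomes a nondeterministic choice over both outcomes.

```python
# Illustrative sketch of "blind abstraction" (the example program is
# ours, not from the paper). The concrete program inspects list heads;
# the blind abstraction only sees the list's length, so each head test
# turns into nondeterminism over both branch results.

def count_ones(bits):
    """Concrete first-order program on a binary list."""
    if not bits:
        return 0
    head, tail = bits[0], bits[1:]
    return (1 if head == 1 else 0) + count_ones(tail)

def count_ones_blind(n):
    """Blind abstraction: the input is replaced by its length n.
    The head test is no longer decidable, so we collect the set of
    all possible results."""
    if n == 0:
        return {0}
    rec = count_ones_blind(n - 1)
    return rec | {r + 1 for r in rec}

assert count_ones([1, 0, 1, 1]) == 3
assert count_ones_blind(4) == {0, 1, 2, 3, 4}
```

Here the abstraction terminates after linearly many recursive calls, so the program is blindly polynomial in the abstract's sense.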
Unification and Matching on Compressed Terms
Term unification plays an important role in many areas of computer science,
especially in those related to logic. The universal mechanism of grammar-based
compression for terms, in particular the so-called Singleton Tree Grammars
(STG), have recently drawn considerable attention. Using STGs, terms of
exponential size and height can be represented in linear space. Furthermore,
the term representation by directed acyclic graphs (dags) can be efficiently
simulated. The present paper is the result of an investigation on term
unification and matching when the terms given as input are represented using
different compression mechanisms for terms such as dags and Singleton Tree
Grammars. We describe a polynomial time algorithm for context matching with
dags, when the number of different context variables is fixed for the problem.
For the same problem, NP-completeness is obtained when the terms are
represented using the more general formalism of Singleton Tree Grammars. For
first-order unification and matching polynomial time algorithms are presented,
each of them improving previous results for those problems.
Comment: This paper is posted at the Computing Research Repository (CoRR) as
part of the process of submission to the journal ACM Transactions on
Computational Logic (TOCL).
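For readers unfamiliar with the underlying problem, the following sketch shows textbook Robinson-style first-order unification on terms where shared subterms form a dag. This is an illustration of the problem setting only — it is not the paper's algorithm, and it omits both STG compression and the occurs check.

```python
# Hedged sketch of first-order unification on dag-shaped terms (shared
# subterms are the same Python object). Textbook Robinson unification,
# occurs check omitted for brevity; not the paper's algorithm.

def walk(term, subst):
    """Follow variable bindings to the representative term."""
    while isinstance(term, str) and term.startswith("?") and term in subst:
        term = subst[term]
    return term

def unify(s, t, subst):
    """Unify terms s, t under substitution subst (var -> term).
    Terms: strings starting with '?' are variables; otherwise tuples
    (symbol, arg1, ..., argk). Returns a substitution or None."""
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return subst
    if isinstance(s, str) and s.startswith("?"):
        return {**subst, s: t}
    if isinstance(t, str) and t.startswith("?"):
        return {**subst, t: s}
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        for a, b in zip(s[1:], t[1:]):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None  # symbol clash

# Shared subterm g(a) used twice -> a dag rather than a tree.
ga = ("g", ("a",))
result = unify(("f", "?x", ga), ("f", ga, "?y"), {})
assert result == {"?x": ga, "?y": ga}
```

The point of the compressed representations studied in the paper is that such shared (or grammar-compressed) inputs can be exponentially smaller than the unfolded trees, so the algorithms must work without decompressing them.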
Probabilistic Constraint Logic Programming
This paper addresses two central problems for probabilistic processing
models: parameter estimation from incomplete data and efficient retrieval of
most probable analyses. These questions have been answered satisfactorily only
for probabilistic regular and context-free models. We address these problems
for a more expressive probabilistic constraint logic programming model. We
present a log-linear probability model for probabilistic constraint logic
programming. On top of this model we define an algorithm to estimate the
parameters and to select the properties of log-linear models from incomplete
data. This algorithm is an extension of the improved iterative scaling
algorithm of Della Pietra, Della Pietra, and Lafferty (1995). Our algorithm
applies to log-linear models in general and is accompanied with suitable
approximation methods when applied to large data spaces. Furthermore, we
present an approach for searching for most probable analyses of the
probabilistic constraint logic programming model. This method can be applied to
the ambiguity resolution problem in natural language processing applications.
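The log-linear model the paper builds on assigns each analysis x a probability proportional to exp(Σᵢ λᵢ·fᵢ(x)). The sketch below shows only this probability model over a finite candidate set; the feature functions and weights are made up for illustration, and the paper's actual contribution — iterative-scaling parameter estimation from incomplete data — is not shown.

```python
# Minimal sketch of a log-linear probability model:
#   p(x) = exp(sum_i lambda_i * f_i(x)) / Z
# Features and weights below are hypothetical; parameter estimation
# (improved iterative scaling) is out of scope for this sketch.
import math

def log_linear_probs(analyses, features, weights):
    """Normalized probabilities over a finite set of analyses."""
    scores = [math.exp(sum(w * f(x) for w, f in zip(weights, features)))
              for x in analyses]
    z = sum(scores)  # partition function
    return [s / z for s in scores]

# Two hypothetical feature functions over candidate analyses (strings):
features = [lambda x: len(x), lambda x: x.count("a")]
weights = [0.1, 1.0]
probs = log_linear_probs(["abba", "bbbb"], features, weights)
assert abs(sum(probs) - 1.0) < 1e-9
assert probs[0] > probs[1]  # more "a"s -> higher score under these weights
```

Finding the most probable analysis then amounts to maximizing the unnormalized score, which is why efficient retrieval over large candidate spaces is the second problem the paper addresses.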
An annotation database for multimodal scientific data
Cristina Bogdanschi, Simone Santini, "An annotation database for multimodal scientific data", Proc. SPIE 7255, Multimedia Content Access: Algorithms and Systems III, 72550G (2009).
In many collaborative research environments, novel tools and techniques allow researchers to generate data from experiments and observations at a staggering rate. Researchers in these areas now face a strong need to query, share and exchange these data in a uniform and transparent fashion. However, due to the heterogeneous nature of the various types of data and the lack of local and global database structures, standard data integration approaches fail or are not applicable. A viable solution to this problem is the extensive use of metadata. In this paper we present the model of an annotation management system suitable for such research environments, and discuss some aspects of its implementation. Annotations provide a rich linkage structure between data and between themselves, which translates into a complex graph structure of which annotations and data are the nodes. We show how annotations are managed and used for data retrieval, and outline some of the query techniques used in the system.
Learning Scheduling Algorithms for Data Processing Clusters
Efficiently scheduling data processing jobs on distributed compute clusters
requires complex algorithms. Current systems, however, use simple generalized
heuristics and ignore workload characteristics, since developing and tuning a
scheduling policy for each workload is infeasible. In this paper, we show that
modern machine learning techniques can generate highly-efficient policies
automatically. Decima uses reinforcement learning (RL) and neural networks to
learn workload-specific scheduling algorithms without any human instruction
beyond a high-level objective such as minimizing average job completion time.
Off-the-shelf RL techniques, however, cannot handle the complexity and scale of
the scheduling problem. To build Decima, we had to develop new representations
for jobs' dependency graphs, design scalable RL models, and invent RL training
methods for dealing with continuous stochastic job arrivals. Our prototype
integration with Spark on a 25-node cluster shows that Decima improves the
average job completion time over hand-tuned scheduling heuristics by at least
21%, achieving up to a 2x improvement during periods of high cluster load.
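The objective Decima optimizes — average job completion time (JCT) — is highly sensitive to scheduling order, which is what a learned policy can exploit. The toy single-machine example below is our own illustration of the metric, not anything from Decima:

```python
# Toy illustration (ours, not Decima) of average job completion time:
# on one machine, running short jobs first sharply reduces average JCT
# compared to arrival (FIFO) order.

def average_jct(durations):
    """Run jobs back to back in the given order; return the mean
    completion time."""
    t, total = 0, 0
    for d in durations:
        t += d        # this job finishes at time t
        total += t
    return total / len(durations)

jobs = [10, 1, 2]
fifo = average_jct(jobs)         # completions 10, 11, 13 -> avg 34/3
sjf = average_jct(sorted(jobs))  # completions 1, 3, 13  -> avg 17/3
assert sjf < fifo
```

Decima's setting is far harder — jobs are DAGs of stages arriving continuously on a shared cluster — but the same objective drives the learned policy.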
CAPRI: efficient inference of cancer progression models from cross-sectional data
We devise a novel inference algorithm to effectively solve the cancer
progression model reconstruction problem. Our empirical analysis of the
accuracy and convergence rate of our algorithm, CAncer PRogression Inference
(CAPRI), shows that it outperforms the state-of-the-art algorithms addressing
similar problems.
Motivation: Several cancer-related genomic datasets have become available (e.g.,
The Cancer Genome Atlas, TCGA), typically involving hundreds of patients. At
present, most of these data are aggregated in a cross-sectional fashion
providing all measurements at the time of diagnosis. Our goal is to infer cancer
progression models from such data. These models are represented as directed
acyclic graphs (DAGs) of collections of selectivity relations, where a mutation
in a gene A selects for a later mutation in a gene B. Gaining insight into the
structure of such progressions has the potential to improve both the
stratification of patients and personalized therapy choices.
Results: The CAPRI algorithm relies on a scoring method based on a
probabilistic theory developed by Suppes, coupled with bootstrap and maximum
likelihood inference. The resulting algorithm is efficient, achieves high
accuracy, and has good complexity as well as good convergence properties.
CAPRI performs especially well in the presence of noise in the data, and with
limited sample sizes. Moreover, CAPRI, in contrast to other approaches, robustly
reconstructs different types of confluent trajectories despite irregularities
in the data. We also report on an ongoing investigation using CAPRI to study
atypical Chronic Myeloid Leukemia, in which we uncovered non-trivial
selectivity relations and exclusivity patterns among key genomic events.
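The Suppes-style scoring underlying CAPRI can be sketched with a simplified check on cross-sectional binary mutation calls (hypothetical data below; CAPRI additionally combines this with bootstrap and maximum likelihood inference, which this sketch omits): "A selects for B" requires temporal priority, P(A) > P(B), and probability raising, P(B | A) > P(B | ¬A).

```python
# Simplified sketch of a Suppes-style prima facie causation test on
# cross-sectional binary mutation data (illustration only; CAPRI's full
# scoring with bootstrap and maximum likelihood is not shown).

def suppes_prima_facie(samples, a, b):
    """samples: list of dicts mapping gene -> 0/1 mutation calls.
    Tests whether mutation a plausibly selects for mutation b."""
    n = len(samples)
    n_a = sum(s[a] for s in samples)
    p_a = n_a / n
    p_b = sum(s[b] for s in samples) / n
    p_b_given_a = sum(s[b] for s in samples if s[a]) / n_a
    p_b_given_not_a = sum(s[b] for s in samples if not s[a]) / (n - n_a)
    # Temporal priority and probability raising:
    return p_a > p_b and p_b_given_a > p_b_given_not_a

# Hypothetical cohort: A is frequent and B occurs only alongside A.
data = [
    {"A": 1, "B": 1}, {"A": 1, "B": 1}, {"A": 1, "B": 0},
    {"A": 1, "B": 0}, {"A": 0, "B": 0}, {"A": 0, "B": 0},
]
assert suppes_prima_facie(data, "A", "B") is True
assert suppes_prima_facie(data, "B", "A") is False
```

Edges passing this test form candidate selectivity relations; assembling them into a DAG of confluent progression trajectories is where the rest of the CAPRI pipeline comes in.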