Property Testing for Bounded Degree Databases

Author: Adler I
Harwath F
Publication venue: Schloss Dagstuhl -- Leibniz-Zentrum fuer Informatik
Publication date: 01/01/2018

Aiming at extremely efficient algorithms for big data sets, we introduce property testing of relational databases of bounded degree. Our model generalises the bounded degree model for graphs (Goldreich and Ron, STOC 1997). We prove that in this model, if the databases have bounded tree-width, then every query definable in monadic second-order logic with modulo counting is testable with a constant number of oracle queries and polylogarithmic running time. This is the first logical meta-theorem in property testing of sparse models. Furthermore, we discuss conditions for the existence of uniform and non-uniform testers.

Dagstuhl Research Online Publication Server

White Rose Research Online

Exploring Differential Obliviousness

Author: Beimel Amos
Nissim Kobbi
Zaheri Mohammad
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019)
Publication date: 01/01/2019

In a recent paper, Chan et al. [SODA ’19] proposed a relaxation of the notion of (full) memory obliviousness, which was introduced by Goldreich and Ostrovsky [J. ACM ’96] and extensively researched by cryptographers. The new notion, differential obliviousness, requires that any two neighboring inputs exhibit similar memory access patterns, where the similarity requirement is that of differential privacy. Chan et al. demonstrated that differential obliviousness allows achieving improved efficiency for several algorithmic tasks, including sorting, merging of sorted lists, and range query data structures. In this work, we continue the exploration of differential obliviousness, focusing on algorithms that do not necessarily examine all their input. This choice is motivated by the fact that the existence of logarithmic overhead ORAM protocols implies that differential obliviousness can yield at most a logarithmic improvement in efficiency for computations that need to examine all their input. In particular, we explore property testing, where we show that differential obliviousness yields an almost linear improvement in overhead in the dense graph model, and at most quadratic improvement in the bounded degree model. We also explore tasks where a non-oblivious algorithm would need to explore different portions of the input, where the latter would depend on the input itself, and where we show that such a behavior can be maintained under differential obliviousness, but not under full obliviousness. Our examples suggest that there would be benefits in further exploring which classes of computational tasks are amenable to differential obliviousness.

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

gMark: Schema-Driven Generation of Graphs and Queries

Author: Advokaat Nicky
Bagan Guillaume
Bonifati Angela
Ciucanu Radu
Fletcher George H. L.
Lemay Aurélien
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/11/2016

Massive graph data sets are pervasive in contemporary application domains. Hence, graph database systems are becoming increasingly important. In the experimental study of these systems, it is vital that the research community has shared solutions for the generation of database instances and query workloads having predictable and controllable properties. In this paper, we present the design and engineering principles of gMark, a domain- and query language-independent graph instance and query workload generator. A core contribution of gMark is its ability to target and control the diversity of properties of both the generated instances and the generated workloads coupled to these instances. Further novelties include support for regular path queries, a fundamental graph query paradigm, and schema-driven selectivity estimation of queries, a key feature in controlling workload chokepoints. We illustrate the flexibility and practical usability of gMark by showcasing the framework's capabilities in generating high-quality graphs and workloads, and its ability to encode user-defined schemas across a variety of application domains. Comment: Accepted in November 2016. URL: http://ieeexplore.ieee.org/document/7762945/. in IEEE Transactions on Knowledge and Data Engineering 201

arXiv.org e-Print Archive

Crossref

Repository TU/e

Pure OAI Repository

HAL Clermont Université

INRIA a CCSD electronic archive server

HAL

Hal-Diderot

Inductive queries for a drug designing robot scientist

Author: A. Lingas
C. Hansch
C.A. Lipinski
D.R. Jones
H. Blockeel
J. Matousek
L. De Raedt
R.D. King
T. Gärtner
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

It is increasingly clear that machine learning algorithms need to be integrated in an iterative scientific discovery loop, in which data is queried repeatedly by means of inductive queries and where the computer provides guidance to the experiments that are being performed. In this chapter, we summarise several key challenges in achieving this integration of machine learning and data mining algorithms in methods for the discovery of Quantitative Structure Activity Relationships (QSARs). We introduce the concept of a robot scientist, in which all steps of the discovery process are automated; we discuss the representation of molecular data such that knowledge discovery tools can analyse it, and we discuss the adaptation of machine learning and data mining algorithms to guide QSAR experiments.

Lirias

Crossref

Bournemouth University Research Online

The University of Manchester - Institutional Repository

DIAL UCLouvain

Genomic stuff: Governing the (im)matter of life

Author: Palsson G
Prainsack B
Publication venue: Research Today Publications
Publication date: 01/08/2011

Emphasizing the context of what has often been referred to as “scarce natural resources”, in particular forests, meadows, and fishing stocks, Elinor Ostrom’s important work Governing the Commons (1990) presents an institutional framework for discussing the development and use of collective action with respect to environmental problems. In this article we discuss extensions of Ostrom’s approach to genes and genomes and explore its limits and usefulness. With the new genetics, we suggest, not only has the biological gaze been turned inward to the management and mining of the human body, but the very notion of the “biological” has also been destabilized. This shift and destabilization, we argue, which is the result of human refashioning and appropriation of “life itself”, raises important questions about the relevance and applicability of Ostrom’s institutional framework in the context of what we call “genomic stuff”: genomic material, data, and information.

Directory of Open Access Journals

King's Research Portal

Utrecht University Repository

Brunel University Research Archive