    Personalization by Partial Evaluation

    The central contribution of this paper is to model personalization by the programmatic notion of partial evaluation. Partial evaluation is a technique used to automatically specialize programs, given incomplete information about their input. The methodology presented here models a collection of information resources as a program (which abstracts the underlying schema of organization and flow of information), partially evaluates the program with respect to user input, and recreates a personalized site from the specialized program. This enables a customizable methodology called PIPE that supports the automatic specialization of resources, without enumerating the interaction sequences beforehand. Issues relating to the scalability of PIPE, information integration, sessionizing scenarios, and case studies are presented.
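
    As a rough illustration of the idea (a minimal sketch of our own, not the authors' PIPE implementation), the Python below partially evaluates a toy "site program" over user attributes; the site structure and attribute names are hypothetical.

        def specialize(program, known):
            """Partially evaluate `program` with respect to the `known` inputs.

            `program` is either a leaf (a resource name) or a branch of the
            form (variable, {value: subprogram}). Branches over known
            variables are folded away; branches over unknown variables are
            kept, with their subtrees recursively specialized.
            """
            if isinstance(program, str):      # leaf: a concrete resource
                return program
            var, branches = program
            if var in known:                  # static input: fold the branch away
                return specialize(branches[known[var]], known)
            return (var, {value: specialize(sub, known)  # dynamic: residualize
                          for value, sub in branches.items()})

        # A toy "site schema": first branch on department, then on role.
        site = ("dept", {
            "cs":   ("role", {"student": "cs_student.html", "staff": "cs_staff.html"}),
            "math": ("role", {"student": "math_student.html", "staff": "math_staff.html"}),
        })

        # Specializing w.r.t. dept="cs" yields a personalized residual site
        # that only asks about the remaining unknown input, role.
        print(specialize(site, {"dept": "cs"}))
        # ('role', {'student': 'cs_student.html', 'staff': 'cs_staff.html'})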

    Integrity Constraints Revisited: From Exact to Approximate Implication

    Integrity constraints such as functional dependencies (FDs) and multi-valued dependencies (MVDs) are fundamental in database schema design. Likewise, probabilistic conditional independences (CIs) are crucial for reasoning about multivariate probability distributions. The implication problem studies whether a set of constraints (antecedents) implies another constraint (consequent); it has been investigated in both the database and the AI literature, under the assumption that all constraints hold exactly. However, many applications today consider constraints that hold only approximately. In this paper we define an approximate implication as a linear inequality between the degree of satisfaction of the antecedents and the consequent, and we study the relaxation problem: when does an exact implication relax to an approximate implication? We use information theory to define the degree of satisfaction, and prove several results. First, we show that any implication from a set of data dependencies (MVDs+FDs) can be relaxed to a simple linear inequality with a factor at most quadratic in the number of variables; when the consequent is an FD, the factor can be reduced to 1. Second, we prove that there exists an implication between CIs that does not admit any relaxation; however, we prove that every implication between CIs relaxes "in the limit". Finally, we show that the implication problem for differential constraints in market basket analysis also admits a relaxation with a factor equal to 1. Our results recover, and sometimes extend, several previously known results about the implication problem: implication of MVDs can be checked by considering only 2-tuple relations, and the implication of differential constraints for frequent item sets can be checked by considering only databases containing a single transaction.
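
    The shape of the definitions, as far as the abstract states them (the notation below is ours and should be read as a hedged illustration, not the paper's exact formalism):

        % Degree of satisfaction, measured information-theoretically: an FD
        % is scored by a conditional entropy, a CI (or MVD) by a conditional
        % mutual information; both vanish iff the constraint holds exactly.
        h(X \to Y) \;=\; H(Y \mid X)
        h(Y \perp Z \mid X) \;=\; I(Y;\, Z \mid X)
        % An exact implication \sigma_1, \dots, \sigma_n \Rightarrow \tau
        % relaxes to a linear inequality with some factor \lambda \ge 0:
        h(\tau) \;\le\; \lambda \sum_{i=1}^{n} h(\sigma_i)
        % Per the abstract: \lambda is at most quadratic in the number of
        % variables when the antecedents are MVDs+FDs, and \lambda = 1 when
        % the consequent \tau is an FD.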

    Fast Algorithms and Efficient Statistics: N-point Correlation Functions

    We present here a new algorithm for the fast computation of N-point correlation functions in large astronomical data sets. The algorithm is based on kd-trees which are decorated with cached sufficient statistics, thus allowing for orders-of-magnitude speed-ups over the naive non-tree-based implementation of correlation functions. We further discuss the use of controlled approximations within the computation, which allows for further acceleration. In summary, our algorithm now makes it possible to compute exact, all-pairs measurements of the 2-, 3- and 4-point correlation functions for cosmological data sets like the Sloan Digital Sky Survey (SDSS; York et al. 2000) and the next generation of Cosmic Microwave Background experiments (see Szapudi et al. 2000). Comment: To appear in Proceedings of MPA/MPE/ESO Conference "Mining the Sky", July 31 - August 4, 2000, Garching, Germany.
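
    The pruning idea is easy to sketch for the 2-point case. The Python below is a deliberate simplification under our own assumptions (axis-aligned bounding boxes, a single separation bin), not the paper's code: each kd-tree node caches its point count as a sufficient statistic, and a node pair is answered wholesale whenever its distance bounds fall entirely inside or outside the bin.

        import numpy as np

        class Node:
            def __init__(self, points, leaf_size=16):
                self.points = points
                self.count = len(points)               # cached sufficient statistic
                self.lo = points.min(axis=0)           # bounding-box corners
                self.hi = points.max(axis=0)
                self.left = self.right = None
                if self.count > leaf_size:
                    dim = np.argmax(self.hi - self.lo) # split the widest dimension
                    order = np.argsort(points[:, dim])
                    mid = self.count // 2
                    self.left = Node(points[order[:mid]], leaf_size)
                    self.right = Node(points[order[mid:]], leaf_size)

        def bounds(a, b):
            """Min and max possible distance between points of a and of b."""
            gap = np.maximum(0.0, np.maximum(a.lo - b.hi, b.lo - a.hi))
            span = np.maximum(a.hi - b.lo, b.hi - a.lo)
            return np.linalg.norm(gap), np.linalg.norm(span)

        def pair_count(a, b, r_min, r_max):
            """Ordered pairs (p in a, q in b) with r_min <= |p - q| < r_max."""
            d_lo, d_hi = bounds(a, b)
            if d_hi < r_min or d_lo >= r_max:
                return 0                               # prune: no pair can hit the bin
            if d_lo >= r_min and d_hi < r_max:
                return a.count * b.count               # prune: every pair hits the bin
            if a.left is None and b.left is None:      # two leaves: brute force
                d = np.linalg.norm(a.points[:, None] - b.points[None, :], axis=-1)
                return int(((d >= r_min) & (d < r_max)).sum())
            if a.left is None or (b.left is not None and b.count > a.count):
                a, b = b, a                            # recurse into the larger node
            return (pair_count(a.left, b, r_min, r_max) +
                    pair_count(a.right, b, r_min, r_max))

        # Example: pairs separated by [0.1, 0.2) among random 3-d points
        # (each unordered pair is counted twice here).
        pts = np.random.default_rng(0).random((2000, 3))
        tree = Node(pts)
        print(pair_count(tree, tree, 0.1, 0.2))

    Counts of this kind (DD, DR, RR) are what feed the standard 2-point estimators such as Landy-Szalay.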

    Ontology ranking based on the analysis of concept structures

    In view of the need to provide tools that facilitate the reuse of existing knowledge structures such as ontologies, we present in this paper a system, AKTiveRank, for the ranking of ontologies. AKTiveRank takes as input the search terms provided by a knowledge engineer and, using the output of an ontology search engine, ranks the ontologies. We apply a number of classical metrics in an attempt to investigate their appropriateness for ranking ontologies, and compare the results with a questionnaire-based human study. Our results show that AKTiveRank will have great utility, although there is potential for improvement.
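
    As a rough sketch of metric-based ranking in this spirit (the two toy measures and all names below are our own simplifications, not AKTiveRank's actual metrics), candidate ontologies can be scored on query coverage and local concept structure, then ordered by a weighted sum:

        def match_score(labels, terms):
            """Fraction of query terms matched by concept labels (partial = half)."""
            score = 0.0
            for t in terms:
                if any(t == l for l in labels):
                    score += 1.0
                elif any(t in l for l in labels):
                    score += 0.5
            return score / len(terms)

        def density_score(graph, terms):
            """Mean neighbour count of matched concepts (richer structure wins)."""
            matched = [l for l in graph if any(t in l for t in terms)]
            if not matched:
                return 0.0
            return sum(len(graph[l]) for l in matched) / len(matched)

        def rank(ontologies, terms, w_match=0.7, w_density=0.3):
            """ontologies: {name: adjacency dict over concept labels}."""
            scored = []
            for name, graph in ontologies.items():
                s = (w_match * match_score(list(graph), terms) +
                     w_density * density_score(graph, terms))
                scored.append((round(s, 3), name))
            return sorted(scored, reverse=True)

        # Toy query: which ontology covers "student" and "university" better?
        candidates = {
            "onto_a": {"student": ["person", "university"],
                       "university": ["organisation"],
                       "person": [], "organisation": []},
            "onto_b": {"pupil": ["school"], "school": []},
        }
        print(rank(candidates, ["student", "university"]))
        # [(1.15, 'onto_a'), (0.0, 'onto_b')]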

    The relational XQuery puzzle: a look-back on the pieces found so far

    Given the tremendous versatility of relational database implementations across a wide range of database problems, it seems only natural to consider them as back-ends for XML data processing. Yet the assumptions behind the language XQuery are considerably different from those in traditional RDBMSs. The underlying data model is a tree, data and results carry an intrinsic order, queries are described using explicit iteration and, after all, problems are anything but regular. Solving the relational XQuery puzzle has therefore challenged a number of research groups over the past years. The purpose of this article is to summarize and assess some of the results that have been obtained during this period to solve the puzzle. Our main focus is on the Pathfinder XQuery compiler, a full reference implementation of a purely relational XQuery processor. As we dissect its components, we relate them to other work in the field and also point to open problems and limitations in the context of relational XQuery processing.
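
    The core trick can be shown compactly. Pathfinder builds on a pre/post-order region encoding of trees (in the style of the "XPath accelerator" line of work); the Python below is our own minimal rendering of that idea, in which every node becomes a (pre, post, parent, tag) row and the descendant axis turns into a pure range predicate over the table:

        import xml.etree.ElementTree as ET

        def encode(xml_text):
            """Flatten an XML tree into rows of (pre, post, parent_pre, tag)."""
            rows, post = [], [0]
            def walk(node, parent_pre):
                pre = len(rows)
                rows.append(None)            # reserve the slot; fill after children
                for child in node:
                    walk(child, pre)
                rows[pre] = (pre, post[0], parent_pre, node.tag)
                post[0] += 1
            walk(ET.fromstring(xml_text), None)
            return rows

        def descendants(rows, v):
            """descendant axis as a pre/post region query -- no recursion."""
            pre_v, post_v = rows[v][0], rows[v][1]
            return [r for r in rows if r[0] > pre_v and r[1] < post_v]

        rows = encode("<a><b><c/></b><d/></a>")
        print(rows)
        # [(0, 3, None, 'a'), (1, 1, 0, 'b'), (2, 0, 1, 'c'), (3, 2, 0, 'd')]
        print(descendants(rows, 0))          # everything below the root <a>

    In an actual RDBMS the same predicate becomes an indexable theta-join over the node table, which is where join techniques developed in this line of work (e.g. the staircase join) come in.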

    Rational Data Base Standards: An Examination of the 1978 CODASYL DDLC Report

    The CODASYL Data Description Language Committee's 1978 Report incorporates numerous enhancements and language changes made since the earlier 1971 and 1973 reports. Unfortunately, the major design limitations associated with these earlier specifications, in particular a schema facility too closely tied to machine rather than enterprise requirements and an extremely limited subschema facility, are retained. After examining these limitations, we suggest that the recent CODASYL specifications remain inappropriate either as an instance of an ANSI/SPARC three-schema architecture or as a candidate for a national data base system standard. A long-term strategy for the development of a more rational proposal for standardization is suggested, along with a short-term strategy that permits rational planning for and implementation of data base conversions today, without concern that subsequently developed standards might render obsolete the conversion effort and the data base management system selected.