Quantifying the Loss of Acyclic Join Dependencies
Acyclic schemes possess known benefits for database design, speeding up
queries, and reducing space requirements. An acyclic join dependency (AJD) is
lossless with respect to a universal relation if joining the projections
associated with the schema results in the original universal relation. An
intuitive and standard measure of loss entailed by an AJD is the number of
redundant tuples generated by the acyclic join. Recent work has shown that the
loss of an AJD can also be characterized by an information-theoretic measure.
Motivated by the problem of automatically fitting an acyclic schema to a
universal relation, we investigate the connection between these two
characterizations of loss. We first show that the loss of an AJD is captured
using the notion of KL-Divergence. We then show that the KL-divergence can be
used to bound the number of redundant tuples. We prove a deterministic lower
bound on the percentage of redundant tuples. For an upper bound, we propose a
random database model, and establish a high probability bound on the percentage
of redundant tuples, which coincides with the lower bound for large databases.
Comment: To appear in PODS 202
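To make the two notions of loss concrete, here is a minimal sketch on a toy instance. It assumes a hypothetical relation `R(A, B, C)` and the two-bag acyclic schema `{A,B}, {B,C}`; the relation contents and all function names are invented for illustration and do not come from the paper. It also assumes that the information-theoretic loss is the KL-divergence between the empirical (uniform) distribution over `R` and its factorization along the schema, which for this two-bag schema equals the conditional mutual information I(A;C|B).

```python
from collections import Counter
from math import log

# Hypothetical universal relation over attributes (A, B, C); a set, so tuples are distinct.
R = {("a1", "b1", "c1"), ("a2", "b1", "c2"), ("a3", "b2", "c3")}

def project(rel, idxs):
    """Projection of rel onto the attribute positions in idxs (duplicates removed)."""
    return {tuple(t[i] for i in idxs) for t in rel}

def join_ab_bc(ab, bc):
    """Natural join of the projections on {A,B} and {B,C} over the shared attribute B."""
    return {(a, b, c) for (a, b) in ab for (b2, c) in bc if b == b2}

def entropy(rel, idxs):
    """Entropy (in nats) of the marginal on idxs under the uniform distribution over rel."""
    n = len(rel)
    counts = Counter(tuple(t[i] for i in idxs) for t in rel)
    return -sum((k / n) * log(k / n) for k in counts.values())

ab, bc = project(R, (0, 1)), project(R, (1, 2))
joined = join_ab_bc(ab, bc)
spurious = joined - R                 # tuples produced by the join but absent from R
rho = len(spurious) / len(R)          # relative number of redundant tuples

# I(A;C|B) = H(AB) + H(BC) - H(B) - H(ABC): the assumed information-theoretic loss.
kl = entropy(R, (0, 1)) + entropy(R, (1, 2)) - entropy(R, (1,)) - entropy(R, (0, 1, 2))

print(f"spurious tuples: {len(spurious)}, rho = {rho:.3f}")
print(f"KL loss = {kl:.3f}, log(1 + rho) = {log(1 + rho):.3f}")
```

On this toy relation the join contains 5 tuples, 2 of them spurious, and the KL loss stays below log(1 + ρ), which is the direction of the deterministic lower bound on redundant tuples that the abstract describes.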
Characterizations of Decomposable Dependency Models
Decomposable dependency models possess a number of interesting and useful
properties. This paper presents new characterizations of decomposable models in
terms of independence relationships, which are obtained by adding a single
axiom to the well-known set characterizing dependency models that are
isomorphic to undirected graphs. We also briefly discuss a potential
application of our results to the problem of learning graphical models from
data.
Comment: See http://www.jair.org/ for any accompanying file
Pure Nash Equilibria: Hard and Easy Games
We investigate complexity issues related to pure Nash equilibria of strategic
games. We show that, even in very restrictive settings, determining whether a
game has a pure Nash equilibrium is NP-hard, while deciding whether a game has
a strong Nash equilibrium is $\Sigma_2^P$-complete. We then study practically
relevant restrictions that lower the complexity. In particular, we are
interested in quantitative and qualitative restrictions on the way each player's
payoff depends on moves of other players. We say that a game has small
neighborhood if the utility function for each player depends only on (the
actions of) a logarithmically small number of other players. The dependency
structure of a game G can be expressed by a graph DG(G) or by a hypergraph
H(G). By relating Nash equilibrium problems to constraint satisfaction problems
(CSPs), we show that if G has small neighborhood and if H(G) has bounded
hypertree width (or if DG(G) has bounded treewidth), then finding pure Nash and
Pareto equilibria is feasible in polynomial time. If the game is graphical,
then these problems are LOGCFL-complete and thus in the class NC2 of highly
parallelizable problems.
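The definition of a pure Nash equilibrium in a game with small neighborhoods can be illustrated with a brute-force check. The sketch below assumes a hypothetical 3-player coordination game on a path; the neighborhood structure, the payoff function, and all names are invented for illustration. It enumerates every strategy profile, so it takes exponential time and is not the CSP-based polynomial-time method the abstract refers to.

```python
from itertools import product

# Hypothetical 3-player graphical game: each player picks 0 or 1, and its
# payoff depends only on its own action and the actions of its neighbors.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
actions = [0, 1]

def utility(player, profile):
    """Illustrative payoff: number of neighbors whose action matches the player's."""
    return sum(profile[player] == profile[q] for q in neighbors[player])

def is_pure_nash(profile):
    """True if no single player can strictly improve by deviating unilaterally."""
    for p in neighbors:
        current = utility(p, profile)
        for a in actions:
            deviation = profile[:p] + (a,) + profile[p + 1:]
            if utility(p, deviation) > current:
                return False
    return True

# Exhaustive search over all 2^3 profiles.
equilibria = [s for s in product(actions, repeat=len(neighbors)) if is_pure_nash(s)]
print(equilibria)
```

Because each utility only inspects the player's neighbors, every "no profitable deviation" condition is a local constraint over a few variables, which is the observation that lets such games be read as constraint satisfaction problems.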
Tractable Optimization Problems through Hypergraph-Based Structural Restrictions
Several variants of the Constraint Satisfaction Problem have been proposed
and investigated in the literature for modelling those scenarios where
solutions are associated with some given costs. Within these frameworks
computing an optimal solution is an NP-hard problem in general; yet, when
restricted over classes of instances whose constraint interactions can be
modelled via (nearly-)acyclic graphs, this problem is known to be solvable in
polynomial time. In this paper, larger classes of tractable instances are
singled out, by discussing solution approaches based on exploiting hypergraph
acyclicity and, more generally, structural decomposition methods, such as
(hyper)tree decompositions.
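Why acyclic constraint interactions make optimization tractable can be seen in the simplest case, a tree of binary constraints. The sketch below assumes a hypothetical instance with four variables, unary costs, and a "values must differ" constraint on every tree edge; the instance and all names are invented for illustration. It is plain bottom-up dynamic programming over the tree, not the hypergraph-based methods of the paper.

```python
from functools import lru_cache

# Hypothetical tree-structured instance: variables 0..3, each with domain {0, 1}.
domain = (0, 1)
children = {0: [1, 2], 1: [3], 2: [], 3: []}        # constraint tree rooted at 0
unary_cost = {0: [0, 2], 1: [1, 0], 2: [0, 3], 3: [2, 0]}

def allowed(parent_val, child_val):
    """Illustrative binary constraint on every tree edge: the values must differ."""
    return parent_val != child_val

@lru_cache(maxsize=None)
def best_cost(var, val):
    """Minimum cost of the subtree rooted at var, given that var takes value val."""
    total = unary_cost[var][val]
    for child in children[var]:
        options = [best_cost(child, c) for c in domain if allowed(val, c)]
        if not options:                 # no consistent value for this child
            return float("inf")
        total += min(options)
    return total

# Each (variable, value) pair is solved once, so the work is polynomial
# in the number of variables and the domain size.
optimum = min(best_cost(0, v) for v in domain)
print(optimum)
```

Structural decomposition methods generalize this idea: a (hyper)tree decomposition groups variables into bags so that the same kind of tree-shaped dynamic programming applies to instances that are only nearly acyclic.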
A formal context for closures of acyclic hypergraphs
Database constraints in the relational database model (RDBM) can be viewed as a set of rules that apply to a dataset, or as a set of axioms that can generate a (closed) set of those constraints. In this paper, we use Formal Concept Analysis (FCA) to characterize the axioms of Acyclic Hypergraphs (in the RDBM they are called Acyclic Join Dependencies). The present paper complements and generalizes previous work on FCA and database constraints.
Peer Reviewed. Postprint (author's final draft)
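For readers unfamiliar with acyclic hypergraphs, the sketch below shows the standard GYO (Graham, Yu, Ozsoyoglu) reduction for testing α-acyclicity, the notion underlying acyclic join dependencies. It is offered as background only and is not the FCA-based characterization of the paper; the function name and the example hypergraphs are invented for illustration.

```python
def is_alpha_acyclic(hyperedges):
    """GYO reduction: the hypergraph is alpha-acyclic iff it reduces to nothing."""
    edges = [set(e) for e in hyperedges]
    changed = True
    while changed:
        changed = False
        # Rule 1: delete every vertex that occurs in exactly one hyperedge.
        for e in edges:
            lonely = {v for v in e if sum(v in f for f in edges) == 1}
            if lonely:
                e -= lonely
                changed = True
        # Rule 2: delete one hyperedge that is empty or contained in another.
        for i, e in enumerate(edges):
            if not e or any(i != j and e <= f for j, f in enumerate(edges)):
                del edges[i]
                changed = True
                break
    return not edges

path = [{"A", "B"}, {"B", "C"}, {"C", "D"}]          # a chain of bags
triangle = [{"A", "B"}, {"B", "C"}, {"A", "C"}]      # the classic cyclic case
print(is_alpha_acyclic(path), is_alpha_acyclic(triangle))
```

Adding the hyperedge `{"A", "B", "C"}` to the triangle makes it α-acyclic again, since every smaller edge is then contained in the covering one.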
Justification for inclusion dependency normal form
Functional dependencies (FDs) and inclusion dependencies (INDs) are the most fundamental integrity constraints that arise in practice in relational databases. In this paper, we address the issue of normalization in the presence of FDs and INDs and, in particular, the semantic justification for Inclusion Dependency Normal Form (IDNF), a normal form which combines Boyce-Codd normal form with the restriction on the INDs that they be noncircular and key-based. We motivate and formalize three goals of database design in the presence of FDs and INDs: noninteraction between FDs and INDs, elimination of redundancy and update anomalies, and preservation of entity integrity. We show that, as for FDs, in the presence of INDs, being free of redundancy is equivalent to being free of update anomalies. Then, for each of these properties, we derive equivalent syntactic conditions on the database design. Individually, each of these syntactic conditions is weaker than IDNF, and the restriction that an FD not be embedded in the right-hand side of an IND is common to three of the conditions. However, we also show that, for these three goals of database design to be satisfied simultaneously, IDNF is both a necessary and sufficient condition.
Fast Parallel Algorithms on a Class of Graph Structures With Applications in Relational Databases and Computer Networks.
The quest for efficient parallel algorithms for graph-related problems requires not only fast computational schemes but also insight into their inherent structures that lend themselves to elegant problem-solving methods. Towards this objective, efficient parallel algorithms on a class of hypergraphs called acyclic hypergraphs and on directed hypergraphs are developed in this thesis. Acyclic hypergraphs are precisely chordal graphs and their subclasses, and they have applications in relational databases and computer networks. In this thesis, first, we present efficient parallel algorithms for the following problems on graphs: (1) determining whether a graph is strongly chordal, ptolemaic, or a block graph, and, if the graph is strongly chordal, determining the strongly perfect vertex elimination ordering; (2) determining the minimal set of edges needed to make an arbitrary graph strongly chordal, ptolemaic, or a block graph; (3) determining the minimum cardinality dominating set, connected dominating set, total dominating set, and the domatic number of a strongly chordal graph. Secondly, we show that the query implication problem ($Q_1 \to Q_2$) on two queries, which is to determine whether the data retrieved by query $Q_1$ is always a subset of the data retrieved by query $Q_2$, is not even in NP and is in fact complete in $\Pi_2^p$. We present several 'fine-grain' analyses of the query implication problem and show that query implication can be solved in polynomial time given chordal queries. Thirdly, we develop efficient parallel algorithms for manipulating directed hypergraphs H, such as finding a directed path in H, the closure of H, and the minimum equivalent hypergraph of H. We show that finding a directed path in a directed hypergraph is inherently sequential. For directed hypergraphs with fixed degree and diameter we present NC algorithms for manipulations.
Directed hypergraphs are representation schemes for functional dependencies in relational databases. Finally, we also present an efficient parallel algorithm for multi-dimensional range search. We show that a set of points in a rectangular parallelepiped can be obtained in $O(\log n)$ time with only $2 \cdot \log^2 n$ $10 \cdot \log n + 14$ processors on an EREW-PRAM. A nontrivial implementation technique on the hypercube parallel architecture is also presented. Our method can be easily generalized to the case of d-dimensional range search.
Beyond Worst-Case Analysis for Joins with Minesweeper
We describe a new algorithm, Minesweeper, that is able to satisfy stronger
runtime guarantees than previous join algorithms (colloquially, 'beyond
worst-case guarantees') for data in indexed search trees. Our first
contribution is developing a framework to measure this stronger notion of
complexity, which we call certificate complexity, that extends notions of
Barbay et al. and Demaine et al.; a certificate is a set of propositional
formulae that certifies that the output is correct. This notion captures a
natural class of join algorithms. In addition, the certificate allows us to
define a strictly stronger notion of runtime complexity than traditional
worst-case guarantees. Our second contribution is to develop a dichotomy
theorem for the certificate-based notion of complexity. Roughly, we show that
Minesweeper evaluates $\beta$-acyclic queries in time linear in the certificate
plus the output size, while for any $\beta$-cyclic query there is some instance
that takes superlinear time in the certificate (and for which the output is no
larger than the certificate size). We also extend our certificate-complexity
analysis to queries with bounded treewidth and the triangle query.
Comment: [This is the full version of our PODS'2014 paper.]