Lattices with non-Shannon Inequalities
We study the existence or absence of non-Shannon inequalities for variables
that are related by functional dependencies. Although the power-set on four
variables is the smallest Boolean lattice with non-Shannon inequalities, there
exist lattices with many more variables without non-Shannon inequalities. We
search for conditions that ensure that no non-Shannon inequalities exist. It
is demonstrated that 3-dimensional distributive lattices cannot have
non-Shannon inequalities and planar modular lattices cannot have non-Shannon
inequalities. The existence of non-Shannon inequalities is related to the
question of whether a lattice is isomorphic to a lattice of subgroups of a
group.
Comment: Ten pages. Submitted to ISIT 2015. The appendix will not appear in
the proceedings.
A Rule-Based Approach to Analyzing Database Schema Objects with Datalog
Database schema elements such as tables, views, triggers and functions are
typically defined with many interrelationships. In order to support database
users in understanding a given schema, a rule-based approach for analyzing the
respective dependencies is proposed using Datalog expressions. We show that
many interesting properties of schema elements can be systematically determined
this way. The expressiveness of the proposed analysis is illustrated with
the problem of computing induced functional dependencies for derived relations.
The propagation of functional dependencies plays an important role in data
integration and query optimization but represents an undecidable problem in
general. And yet, our rule-based analysis covers all relational operators as
well as linear recursive expressions in a systematic way showing the depth of
analysis possible by our proposal. The analysis of functional dependencies is
well-integrated in a uniform approach to analyzing dependencies between schema
elements in general.
Comment: Pre-proceedings paper presented at the 27th International Symposium
on Logic-Based Program Synthesis and Transformation (LOPSTR 2017), Namur,
Belgium, 10-12 October 2017 (arXiv:1708.07854)
Characterization of order-like dependencies with formal concept analysis
Functional Dependencies (FDs) play a key role in many fields
of the relational database model, one of the most widely used database
systems. FDs have also been applied in data analysis, data quality,
knowledge discovery and the like, but in a very limited scope, because of
their fixed semantics. To overcome this limitation, many generalizations
have been defined to relax the crisp definition of FDs. FDs and a few of
their generalizations have been characterized with Formal Concept Analysis,
which reveals itself to be an interesting unified framework for
characterizing dependencies, that is, understanding and computing them in a
formal way. In this paper, we extend this work by taking into account
order-like dependencies. Such dependencies, well defined in the database
field, consider an ordering on the domain of each attribute, and not
simply an equality relation as with standard FDs.
Peer Reviewed. Postprint (published version)
Integrity Constraints Revisited: From Exact to Approximate Implication
Integrity constraints such as functional dependencies (FD), and multi-valued
dependencies (MVD) are fundamental in database schema design. Likewise,
probabilistic conditional independences (CI) are crucial for reasoning about
multivariate probability distributions. The implication problem studies whether
a set of constraints (antecedents) implies another constraint (consequent), and
has been investigated in both the database and the AI literature, under the
assumption that all constraints hold exactly. However, many applications today
consider constraints that hold only approximately. In this paper we define an
approximate implication as a linear inequality between the degree of
satisfaction of the antecedents and consequent, and we study the relaxation
problem: when does an exact implication relax to an approximate implication? We
use information theory to define the degree of satisfaction, and prove several
results. First, we show that any implication from a set of data dependencies
(MVDs+FDs) can be relaxed to a simple linear inequality with a factor at most
quadratic in the number of variables; when the consequent is an FD, the factor
can be reduced to 1. Second, we prove that there exists an implication between
CIs that does not admit any relaxation; however, we prove that every
implication between CIs relaxes "in the limit". Finally, we show that the
implication problem for differential constraints in market basket analysis also
admits a relaxation with a factor equal to 1. Our results recover, and
sometimes extend, several previously known results about the implication
problem: implication of MVDs can be checked by considering only 2-tuple
relations, and the implication of differential constraints for frequent item
sets can be checked by considering only databases containing a single
transaction.
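
The abstract above says only that information theory is used to define the degree of satisfaction of a constraint. As a rough illustration, and not the paper's own definition, the sketch below assumes one common choice: the conditional entropy H(Y | X) of a relation under the uniform distribution over its tuples, which is zero exactly when the FD X -> Y holds. The function names `entropy` and `fd_violation` and the sample rows are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(rows, attrs):
    """Shannon entropy (in bits) of the projection of `rows` onto `attrs`,
    assuming each tuple in `rows` is equally likely."""
    counts = Counter(tuple(row[a] for a in attrs) for row in rows)
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def fd_violation(rows, lhs, rhs):
    """Conditional entropy H(rhs | lhs) = H(lhs, rhs) - H(lhs).

    Assumed measure of how far the FD lhs -> rhs is from holding:
    0 means the FD holds exactly on `rows`; larger values mean that
    `lhs` leaves more uncertainty about `rhs`.
    """
    lhs, rhs = list(lhs), list(rhs)
    return entropy(rows, lhs + rhs) - entropy(rows, lhs)


# Hypothetical relation in which zip -> city holds exactly.
exact = [
    {"zip": "10001", "city": "NYC"},
    {"zip": "10001", "city": "NYC"},
    {"zip": "94105", "city": "SF"},
]

# Adding one conflicting tuple makes the FD hold only approximately.
approximate = exact + [{"zip": "94105", "city": "Oakland"}]

print(fd_violation(exact, ["zip"], ["city"]))        # 0.0
print(fd_violation(approximate, ["zip"], ["city"]))  # 0.5
```

Under this assumed measure, an approximate implication in the sense of the abstract would bound the violation of the consequent by a constant factor times the sum of the violations of the antecedents.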
A New Statistical Parser Based on Bigram Lexical Dependencies
This paper describes a new statistical parser which is based on probabilities
of dependencies between head-words in the parse tree. Standard bigram
probability estimation techniques are extended to calculate probabilities of
dependencies between pairs of words. Tests using Wall Street Journal data show
that the method performs at least as well as SPATTER (Magerman 95, Jelinek et
al 94), which has the best published results for a statistical parser on this
task. The simplicity of the approach means the model trains on 40,000 sentences
in under 15 minutes. With a beam search strategy parsing speed can be improved
to over 200 sentences a minute with negligible loss in accuracy.
Comment: 8 pages, to appear in Proceedings of ACL 96. Uuencoded gz-compressed
postscript file created by csh script uufile.