113 research outputs found
Using concept lattices to mine functional dependencies
Concept lattices have proven to be a valuable tool for representing the knowledge in a database. In this paper we show how functional dependencies in databases can be extracted using concept lattices, not by preprocessing the original database but by providing a new closure operator. We also prove that this method generalizes the previous methods and closure operators used to find association rules in binary databases.
Postprint (published version)
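As an illustration of the closure-operator idea, here is a minimal sketch, not necessarily the operator defined in the paper. It assumes a non-empty table given as a list of dicts, and the names `closure` and `fd_holds` as well as the toy rows are hypothetical. The closure of an attribute set X is taken to be every attribute on which all pairs of tuples that agree on X also agree; X -> Y then holds exactly when Y is contained in that closure.

```python
from itertools import combinations

def closure(rows, attrs):
    """Attributes on which every pair of rows agreeing on `attrs` also agrees."""
    all_attrs = set(rows[0].keys())  # assumes a non-empty table
    closed = set(all_attrs)
    for r, s in combinations(rows, 2):
        if all(r[a] == s[a] for a in attrs):
            # Keep only the attributes this agreeing pair has in common.
            closed &= {a for a in all_attrs if r[a] == s[a]}
    return closed

def fd_holds(rows, lhs, rhs):
    """The dependency lhs -> rhs holds iff rhs lies inside the closure of lhs."""
    return set(rhs) <= closure(rows, lhs)

# Toy table, invented purely for illustration.
rows = [
    {"a": 1, "b": "x", "c": 10},
    {"a": 1, "b": "x", "c": 20},
    {"a": 2, "b": "y", "c": 10},
]

print(closure(rows, {"a"}))         # attributes determined by "a"
print(fd_holds(rows, ["a"], ["b"]))
print(fd_holds(rows, ["a"], ["c"]))
```

In this toy table the first two rows agree on "a" and on "b" but not on "c", so "a" determines "b" and does not determine "c".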
A formal context for acyclic join dependencies
Acyclic Join Dependencies (AJDs) play a crucial role in database design and normalization. In this paper, we use Formal Concept Analysis (FCA) to characterize a set of AJDs that hold in a given dataset. The present work simplifies and generalizes the characterization of Multivalued Dependencies with FCA.
Postprint (author's final draft)
A formal context for closures of acyclic hypergraphs
Database constraints in the relational database model (RDBM) can be viewed as a set of rules that apply to a dataset, or as a set of axioms that can generate a (closed) set of those constraints. In this paper, we use Formal Concept Analysis to characterize the axioms of Acyclic Hypergraphs (in the RDBM they are called Acyclic Join Dependencies). The present paper complements and generalizes previous work on FCA and database constraints.
Peer Reviewed. Postprint (author's final draft)
A New Formal Context for Symmetric Dependencies
In this paper we present a new formal context for symmetric dependencies. We study its properties and compare it with previous approaches. We also discuss how this new context may open the door to solving some open problems for symmetric dependencies.
Postprint (published version)
Characterization of order-like dependencies with formal concept analysis
Functional Dependencies (FDs) play a key role in many fields of the relational database model, one of the most widely used database systems. FDs have also been applied in data analysis, data quality, knowledge discovery and the like, but in a very limited scope, because of their fixed semantics. To overcome this limitation, many generalizations have been defined to relax the crisp definition of FDs. FDs and a few of their generalizations have been characterized with Formal Concept Analysis, which reveals itself to be an interesting unified framework for characterizing dependencies, that is, understanding and computing them in a formal way. In this paper, we extend this work by taking into account order-like dependencies. Such dependencies, well defined in the database field, consider an ordering on the domain of each attribute, and not simply an equality relation as with standard FDs.
Peer Reviewed. Postprint (published version)
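To make the contrast with equality-based FDs concrete, here is a minimal sketch under one common reading of an order-like dependency, assumed here rather than taken from the paper: X -> Y holds when, for every ordered pair of tuples, being less than or equal on every attribute of X implies being less than or equal on every attribute of Y. The function name `order_dependency_holds` and the toy rows are hypothetical.

```python
from itertools import permutations

def order_dependency_holds(rows, lhs, rhs):
    """Check the assumed order-like dependency lhs -> rhs over all ordered pairs."""
    for r, s in permutations(rows, 2):
        below_on_lhs = all(r[a] <= s[a] for a in lhs)
        below_on_rhs = all(r[b] <= s[b] for b in rhs)
        if below_on_lhs and not below_on_rhs:
            return False  # this pair violates the dependency
    return True

# Toy table, invented purely for illustration.
rows = [
    {"month": 1, "total": 100, "rank": 3},
    {"month": 2, "total": 150, "rank": 2},
    {"month": 3, "total": 150, "rank": 1},
]

print(order_dependency_holds(rows, ["month"], ["total"]))
print(order_dependency_holds(rows, ["month"], ["rank"]))
```

Here "total" never decreases as "month" grows, so the first check succeeds, while "rank" decreases and the second fails. Replacing `<=` with `==` in both comparisons recovers the standard FD test.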
Polysemy and brevity versus frequency in language
The pioneering research of G. K. Zipf on the relationship between word
frequency and other word features led to the formulation of various linguistic
laws. The most popular is Zipf's law for word frequencies. Here we focus on two
laws that have been studied less intensively: the meaning-frequency law, i.e.
the tendency of more frequent words to be more polysemous, and the law of
abbreviation, i.e. the tendency of more frequent words to be shorter. In a
previous work, we tested the robustness of these Zipfian laws for English,
roughly measuring word length in number of characters and distinguishing adult
from child speech. In the present article, we extend our study to other
languages (Dutch and Spanish) and introduce two additional measures of length:
syllabic length and phonemic length. Our correlation analysis indicates that
both the meaning-frequency law and the law of abbreviation hold overall in all
the analyzed languages.
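The kind of correlation analysis mentioned above can be sketched as follows. The abstract does not name the statistic used, so Spearman's rank correlation is an assumption here, and the helper names and the word counts are invented solely for illustration. A negative correlation between frequency and length would be consistent with the law of abbreviation.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Invented toy values, not data from the study.
frequencies = [1000, 500, 100, 50, 10]
lengths_in_characters = [2, 3, 5, 7, 9]

print(spearman(frequencies, lengths_in_characters))
```

The same function could be applied to syllabic or phonemic length, or to a polysemy count for the meaning-frequency law, by swapping the second argument.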
Computing Functional Dependencies with Pattern Structures
The treatment of many-valued data with FCA has been achieved by means of scaling. This method has some drawbacks, since the size of the resulting formal contexts usually depends on the number of different values that are present in a table, which can be very large.
Pattern structures have been shown to handle many-valued data, offering a viable and sound alternative to scaling in order to represent and analyze sets of many-valued data with FCA.
Functional dependencies have already been dealt with in FCA using the binarization of a table, that is, creating a formal context out of a set of data. Unfortunately, although this method is standard and simple, it has an important drawback: the resulting context is quadratic in the number of objects w.r.t. the original set of data.
In this paper, we examine how we can extract the functional dependencies that hold in a set of data using pattern structures. This allows us to build an equivalent concept lattice while avoiding the binarization step, and thus comes with better concept representation and computation.
Postprint (published version)
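The partition view behind this approach can be sketched as follows. This is an illustrative sketch, not the paper's algorithm: it assumes a table given as a list of dicts, and the names `partition`, `refines` and `fd_holds` are hypothetical. Each attribute set induces a partition of the row indices, and X -> Y holds exactly when the partition induced by X refines the one induced by Y. A table with n rows is handled through n row indices, whereas a binarized context built on pairs of rows would have n(n-1)/2 objects.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices by their values on `attrs`; return the set of blocks."""
    groups = defaultdict(set)
    for i, row in enumerate(rows):
        groups[tuple(row[a] for a in attrs)].add(i)
    return {frozenset(block) for block in groups.values()}

def refines(fine, coarse):
    """True if every block of `fine` is contained in some block of `coarse`."""
    return all(any(block <= other for other in coarse) for block in fine)

def fd_holds(rows, lhs, rhs):
    """lhs -> rhs holds iff the lhs partition refines the rhs partition."""
    return refines(partition(rows, sorted(lhs)), partition(rows, sorted(rhs)))

# Toy table, invented purely for illustration.
rows = [
    {"a": 1, "b": "x", "c": 10},
    {"a": 1, "b": "x", "c": 20},
    {"a": 2, "b": "y", "c": 10},
]

print(partition(rows, ["a"]))
print(fd_holds(rows, {"a"}, {"b"}))
print(fd_holds(rows, {"a"}, {"c"}))
```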
Characterizing covers of functional dependencies using FCA
Functional dependencies (FDs) can be used for various important operations on data, for instance, checking the consistency and the quality of a database (including databases that contain complex data). Consequently, a generic framework that allows mining a sound, complete, non-redundant and yet compact set of FDs is an important tool for many different applications. There are different definitions of such sets of FDs (usually called covers).
In this paper, we present the characterization of two different kinds of covers for FDs in terms of pattern structures. The convenience of such a characterization is that it allows an easy implementation of efficient mining algorithms, which can later be easily adapted to other kinds of similar dependencies. Finally, we present empirical evidence that the proposed approach can perform better than state-of-the-art FD mining algorithms on large databases.
Peer Reviewed. Postprint (published version)
Characterizing approximate-matching dependencies in formal concept analysis with pattern structures
Functional dependencies (FDs) provide valuable knowledge on the relations between attributes of a data table. A functional dependency holds when the values of an attribute can be determined by another. It has been shown that FDs can be expressed in terms of partitions of tuples that are in agreement w.r.t. the values taken by some subsets of attributes. To extend the use of FDs, several generalizations have been proposed. In this work, we study approximate-matching dependencies, which generalize FDs by relaxing the constraints on the attributes, i.e. agreement is based on a similarity relation rather than on equality. Such dependencies are attracting attention in the database field since they allow uncrisping the basic notion of FDs, extending its application to many different fields, such as data quality, data mining, behavior analysis, data cleaning or data partition, among others. We show that these dependencies can be formalized in the framework of Formal Concept Analysis (FCA) using a previous formalization introduced for standard FDs. Our new results state that, starting from the conceptual structure of a pattern structure, and generalizing the notion of relation between tuples, approximate-matching dependencies can be characterized as implications in a pattern concept lattice. We finally show how to use basic FCA algorithms to construct a pattern concept lattice that entails these dependencies after a slight and tractable binarization of the original data.
Postprint (author's final draft)
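The relaxation from equality to similarity can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's formalization: similarity is taken to be "numeric values within a per-attribute threshold", and the names `similar`, `amd_holds` and the toy rows are hypothetical. Because such a similarity relation is generally not transitive, the check works on pairs of tuples rather than on partitions.

```python
from itertools import combinations

def similar(u, v, threshold):
    """Assumed similarity relation: two values match if they differ by at most `threshold`."""
    return abs(u - v) <= threshold

def amd_holds(rows, lhs, rhs, thresholds):
    """Every pair of rows similar on all lhs attributes must be similar on all rhs attributes."""
    for r, s in combinations(rows, 2):
        if all(similar(r[a], s[a], thresholds[a]) for a in lhs):
            if not all(similar(r[b], s[b], thresholds[b]) for b in rhs):
                return False  # similar on lhs but not on rhs
    return True

# Toy table, invented purely for illustration.
rows = [
    {"temp": 20.0, "pressure": 1.00},
    {"temp": 20.4, "pressure": 1.02},
    {"temp": 25.0, "pressure": 1.30},
]

tight = {"temp": 0.5, "pressure": 0.05}
loose = {"temp": 6.0, "pressure": 0.05}

print(amd_holds(rows, ["temp"], ["pressure"], tight))
print(amd_holds(rows, ["temp"], ["pressure"], loose))
```

With the tight thresholds only the first two rows count as similar on "temp", and they are also similar on "pressure", so the dependency holds. Loosening the "temp" threshold makes the first and third rows similar on "temp" while their pressures differ too much, so it fails. Setting every threshold to zero reduces the check to a standard FD.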