Anonymization of Sensitive Quasi-Identifiers for l-diversity and t-closeness
A number of studies on privacy-preserving data mining have been published. Most of them assume that quasi-identifiers (QIDs) can be separated from sensitive attributes. For instance, they assume that address, job, and age are QIDs but not sensitive attributes, and that a disease name is a sensitive attribute but not a QID. In practice, however, any of these attributes can be both a sensitive attribute and a QID. In this paper, we refer to such attributes as sensitive QIDs, and we propose novel privacy models, namely (l1, ..., lq)-diversity and (t1, ..., tq)-closeness, together with a method that can handle sensitive QIDs. Our method is composed of two algorithms: an anonymization algorithm and a reconstruction algorithm. The anonymization algorithm, which is run by data holders, is simple but effective, whereas the reconstruction algorithm, which is run by data analyzers, can be tailored to each data analyzer's objective. Our proposed method was experimentally evaluated using real data sets.
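As background for the models named above, the following is a minimal sketch of the classical distinct l-diversity check, which assumes exactly the clean QID/sensitive split that the abstract questions. It is not the paper's (l1, ..., lq)-diversity model or its anonymization algorithm; the function name, record layout, and sample values are all hypothetical.

```python
from collections import defaultdict


def is_distinct_l_diverse(records, qid_keys, sensitive_key, l):
    """Check classical distinct l-diversity (illustrative, not the paper's model).

    Records sharing the same QID values form one group; the table passes
    if every group contains at least `l` distinct sensitive values.
    """
    groups = defaultdict(set)
    for rec in records:
        group_id = tuple(rec[k] for k in qid_keys)
        groups[group_id].add(rec[sensitive_key])
    return all(len(values) >= l for values in groups.values())


# Hypothetical, already-generalized records (made-up data).
records = [
    {"age": "30-39", "zip": "130**", "disease": "flu"},
    {"age": "30-39", "zip": "130**", "disease": "cancer"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
]

# The second group holds only one distinct disease, so l = 2 fails.
print(is_distinct_l_diverse(records, ["age", "zip"], "disease", 2))
```

With a sensitive QID such as age, the same attribute would have to appear both in `qid_keys` and as `sensitive_key`, which this classical check cannot express; that gap is what the abstract's models are meant to address.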
Matching Dependencies with Arbitrary Attribute Values: Semantics, Query Answering and Integrity Constraints
Matching dependencies (MDs) were introduced to specify the identification or
matching of certain attribute values in pairs of database tuples when some
similarity conditions are satisfied. Their enforcement can be seen as a natural
generalization of entity resolution. In what we call the "pure case" of MDs,
any value from the underlying data domain can be used for the value in common
that does the matching. We investigate the semantics and properties of data
cleaning through the enforcement of matching dependencies for the pure case. We
characterize the intended clean instances and also the "clean answers" to
queries as those that are invariant under the cleaning process. The complexity
of computing clean instances and clean answers to queries is investigated.
Tractable and intractable cases depending on the MDs and queries are
identified. Finally, we establish connections with database "repairs" under
integrity constraints. Comment: 13 pages, double column, 2 figures
CAISL: Simplification Logic for Conditional Attribute Implications
In this work, we present a sound and complete axiomatic system for conditional attribute implications (CAIs) in Triadic Concept Analysis (TCA). Our approach is strongly based on the Simplification paradigm, which offers a more suitable way for automated reasoning than the one based on Armstrong's Axioms. We also present an automated method to prove the derivability of a CAI from a set of CAIs. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech.
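To make the derivability question concrete, here is a minimal sketch for ordinary (dyadic) attribute implications, where an implication A -> B follows from a set of implications exactly when B is contained in the closure of A. It uses the textbook closure computation, not CAISL or Simplification Logic, and it ignores the conditional (triadic) part of CAIs; the function names and the sample implications are hypothetical.

```python
def closure(attrs, implications):
    """Return the closure of `attrs` under (premise, conclusion) pairs.

    Repeatedly fires every implication whose premise is already contained
    in the current set, until nothing new can be added.
    """
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in implications:
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return closed


def derives(implications, premise, conclusion):
    """True if premise -> conclusion follows from `implications`."""
    return set(conclusion) <= closure(premise, implications)


# Hypothetical implication set: a -> b, and {b, c} -> d.
imps = [({"a"}, {"b"}), ({"b", "c"}, {"d"})]

print(derives(imps, {"a", "c"}, {"d"}))  # a gives b, then {b, c} gives d
print(derives(imps, {"a"}, {"d"}))       # c is never obtained
```

The loop is quadratic in the worst case, which is enough for illustration; linear-time closure algorithms exist but are omitted here.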
Geometric lattice structure of covering and its application to attribute reduction through matroids
The reduction of covering decision systems is an important problem in data
mining, and covering-based rough sets are an efficient technique for
addressing it. Geometric lattices have been widely used in many fields,
especially in greedy algorithm design, which plays an important role in
reduction problems. Therefore, it is meaningful to combine coverings with
geometric lattices to solve the optimization problems. In this paper, we obtain
geometric lattices from coverings through matroids and then apply them to the
issue of attribute reduction. First, a geometric lattice structure of a
covering is constructed through transversal matroids. Then its atoms are
studied and used to describe the lattice. Second, considering that all the
closed sets of a finite matroid form a geometric lattice, we propose a
dependence space through matroids and study the attribute reduction issues of
the space, which realizes the application of geometric lattices to attribute
reduction. Furthermore, a special type of information system is taken as an
example to illustrate the application. In short, this work offers an
interesting perspective: using geometric lattices to study the attribute
reduction issues of information systems.