Why is the snowflake schema a good data warehouse design?
Database design for data warehouses is based on the notion of the snowflake schema and its important special case, the star schema. The snowflake schema represents a dimensional model composed of a central fact table and a set of dimension tables, which can be further broken up into subdimension tables. We formalise the concept of a snowflake schema in terms of an acyclic database schema whose join tree satisfies certain structural properties. We then define a normal form for snowflake schemas which captures their intuitive meaning with respect to a set of functional and inclusion dependencies. We show that snowflake schemas in this normal form are independent as well as separable when the relation schemas are pairwise incomparable. This implies that relations in the data warehouse can be updated independently of each other as long as referential integrity is maintained. In addition, we show that a data warehouse in snowflake normal form can be queried by joining the relation over the fact table with the relations over its dimension and subdimension tables. We also examine an information-theoretic interpretation of the snowflake schema and show that the redundancy of the primary key of the fact table is zero.
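As an illustration of the query pattern the abstract describes (joining the fact table with its dimension and subdimension tables), here is a minimal sketch using SQLite; all table and column names are hypothetical and not taken from the paper.

```python
# A minimal snowflake schema in SQLite: a central fact table (sales),
# a dimension table (product), and a subdimension table (category).
# All names here are illustrative assumptions, not from the paper.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE product  (product_id  INTEGER PRIMARY KEY, name TEXT,
                       category_id INTEGER REFERENCES category(category_id));
CREATE TABLE sales    (sale_id     INTEGER PRIMARY KEY,
                       product_id  INTEGER REFERENCES product(product_id),
                       amount      REAL);
INSERT INTO category VALUES (1, 'Beverages');
INSERT INTO product  VALUES (10, 'Coffee', 1);
INSERT INTO sales    VALUES (100, 10, 4.50);
""")

# Query the warehouse as the abstract describes: join the fact table
# with its dimension and subdimension tables.
row = cur.execute("""
    SELECT s.amount, p.name, c.name
    FROM sales s
    JOIN product  p ON s.product_id  = p.product_id
    JOIN category c ON p.category_id = c.category_id
""").fetchone()
print(row)  # (4.5, 'Coffee', 'Beverages')
```

Note how the dimension table (`product`) carries a foreign key into its subdimension (`category`): this nesting of references is what distinguishes a snowflake from a flat star schema.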
Knowledge structure, knowledge granulation and knowledge distance in a knowledge base
One of the strengths of rough set theory is the fact that an unknown target concept can be approximately characterized by existing knowledge structures in a knowledge base. Knowledge structures in knowledge bases fall into two categories: complete and incomplete. In this paper, by expressing these two kinds of knowledge structures uniformly, we first introduce four operators on a knowledge base that are adequate for generating new knowledge structures from known ones. Then, an axiomatic definition of knowledge granulation in knowledge bases is presented, under which some existing knowledge granulations become special cases. Finally, we introduce the concept of a knowledge distance for measuring the difference between two knowledge structures in the same knowledge base. We note that the knowledge distance satisfies the three properties of a distance space on all knowledge structures induced by a given universe. These results will be helpful for knowledge discovery from knowledge bases and significant for establishing a framework of granular computing in knowledge bases.
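A minimal sketch of the kind of measures discussed here, assuming one common formalisation of knowledge granulation and knowledge distance over partitions of a finite universe; the paper's own operators and axiom system may differ.

```python
# Hedged sketch: knowledge granulation and knowledge distance for
# partitions (complete knowledge structures) of a finite universe.
# The definitions below are one standard choice, not the paper's own.

def granulation(partition, n):
    """GK(K) = (1/n^2) * sum |X_i|^2 over the blocks X_i of K."""
    return sum(len(block) ** 2 for block in partition) / n ** 2

def knowledge_distance(p, q, universe):
    """D(P,Q) = (1/n^2) * sum over x of |[x]_P symmetric-difference [x]_Q|."""
    n = len(universe)
    def block_of(partition, x):
        return next(b for b in partition if x in b)
    return sum(len(block_of(p, x) ^ block_of(q, x)) for x in universe) / n ** 2

U = {1, 2, 3, 4}
finest   = [{1}, {2}, {3}, {4}]   # complete discernibility
coarsest = [{1, 2, 3, 4}]         # no discernibility
print(granulation(finest, 4))             # 0.25 (minimum, 1/n)
print(granulation(coarsest, 4))           # 1.0  (maximum)
print(knowledge_distance(finest, finest, U))    # 0.0
print(knowledge_distance(finest, coarsest, U))  # 0.75
```

With these definitions, identical knowledge structures are at distance zero, matching the distance-space properties the abstract mentions.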
Selecting Informative Features with Fuzzy-Rough Sets and its Application for Complex Systems Monitoring
One of the main obstacles facing current intelligent pattern recognition applications is dataset dimensionality. To enable these systems to be effective, a redundancy-removing step is usually carried out beforehand. Rough Set Theory (RST) has been used as such a dataset pre-processor with much success; however, it relies on a crisp dataset, and important information may be lost as a result of quantization of the underlying numerical features. This paper proposes a feature selection technique that employs a hybrid variant of rough sets, fuzzy-rough sets, to avoid this information loss. The present work retains dataset semantics, allowing for the creation of clear, readable fuzzy models. Experimental results from applying the present work to complex systems monitoring show that fuzzy-rough selection is more powerful than conventional entropy-based, PCA-based and random-based methods. Key words: feature selection; feature dependency; fuzzy-rough sets; reduct search; rule induction; systems monitoring.
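To illustrate the dependency-driven reduct search that such selectors build on, here is a crisp (non-fuzzy) QuickReduct-style sketch; the paper's actual method uses fuzzy-rough dependency to avoid quantizing numeric features, which this simplified version deliberately does not capture.

```python
# Hedged sketch: greedy rough-set feature selection driven by the
# dependency degree gamma = |POS|/|U| (crisp case only; the paper's
# fuzzy-rough variant replaces this with a fuzzy dependency function).
from collections import defaultdict

def dependency(rows, labels, features):
    """Fraction of objects whose equivalence class under `features`
    is pure with respect to the decision labels."""
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        classes[tuple(row[f] for f in features)].append(i)
    pos = sum(len(idx) for idx in classes.values()
              if len({labels[i] for i in idx}) == 1)
    return pos / len(rows)

def quickreduct(rows, labels, all_features):
    """Greedily add the feature that most increases dependency until
    the reduct matches the dependency of the full feature set."""
    full = dependency(rows, labels, all_features)
    reduct = []
    while dependency(rows, labels, reduct) < full:
        best = max((f for f in all_features if f not in reduct),
                   key=lambda f: dependency(rows, labels, reduct + [f]))
        reduct.append(best)
    return reduct

rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
labels = [0, 0, 1, 1]
print(quickreduct(rows, labels, [0, 1, 2]))  # [0]
```

In this toy dataset, feature 0 alone determines the label, so the greedy search stops after one step.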
The posterity of Zadeh's 50-year-old paper: A retrospective in 101 Easy Pieces – and a Few More
This article was commissioned by the 22nd IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) to celebrate the 50th anniversary of Lotfi Zadeh's seminal 1965 paper on fuzzy sets. In addition to Lotfi's original paper, this note itemizes 100 citations of books and papers deemed "important (significant, seminal, etc.)" by 20 of the 21 living IEEE CIS Fuzzy Systems pioneers. Each of the 20 contributors supplied 5 citations, and Lotfi's paper makes the overall list a tidy 101, as in "Fuzzy Sets 101". This note is not a survey in any real sense of the word, but the contributors did offer short remarks to indicate the reason for inclusion (e.g., historical, topical, seminal, etc.) of each citation. Citation statistics are easy to find and notoriously erroneous, so we refrain from reporting them - almost. The exception is that, according to Google Scholar on April 9, 2015, Lotfi's 1965 paper had been cited 55,479 times.
Knowledge Granulation, Rough Entropy and Uncertainty Measure in Incomplete Fuzzy Information System
Many real-world problems deal with ordering objects rather than classifying them, although most research in data analysis has focused on the latter. One extension of classical rough sets that takes ordering properties into account is the dominance-based rough set approach, which is mainly based on substituting a dominance relation for the indiscernibility relation. In this paper, we address knowledge measures and reduction in incomplete fuzzy information systems using this approach. First, new definitions of knowledge granulation and rough entropy are given, and some important properties of them are investigated. Then, a dominance matrix for the knowledge granulation and rough entropy measures is obtained, which can be used to eliminate redundant attributes in an incomplete fuzzy information system. Lastly, a matrix algorithm for knowledge reduction is proposed. An example illustrates the validity of this method and shows that it is applicable to complex fuzzy systems. Experiments are also made to show the performance of the newly proposed algorithm.
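A sketch of rough entropy in the classical, complete case, assuming the standard partition-based definition; the paper generalises such uncertainty measures to incomplete fuzzy information systems with dominance relations, which this sketch does not cover.

```python
# Hedged sketch: rough entropy of a partition of a finite universe,
# in the classical (complete, non-fuzzy) setting only.
import math

def rough_entropy(partition, n):
    """E_r(K) = sum (|X_i|/n) * log2|X_i| over the blocks X_i.
    It is 0 for the finest partition and log2(n) for the coarsest."""
    return sum(len(b) / n * math.log2(len(b)) for b in partition)

print(rough_entropy([{1}, {2}, {3}, {4}], 4))  # 0.0
print(rough_entropy([{1, 2}, {3, 4}], 4))      # 1.0
print(rough_entropy([{1, 2, 3, 4}], 4))        # 2.0
```

In reduction algorithms of the kind the abstract describes, an attribute is a candidate for elimination when dropping it leaves such an uncertainty measure unchanged.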
CRIS-IR 2006
The recognition of entities and their relationships in document collections is an important step towards the discovery of latent knowledge, as well as a support for knowledge management applications. The challenge lies in how to extract and correlate entities, aiming to answer key knowledge management questions such as: who works with whom, on which projects, with which customers and on what research areas. The present work proposes a knowledge mining approach supported by information retrieval and text mining tasks, whose core is based on the correlation of textual elements through the LRD (Latent Relation Discovery) method. Our experiments show that LRD outperforms other correlation methods. We also present an application to demonstrate the approach over knowledge management scenarios.
Fundação para a Ciência e a Tecnologia (FCT)
Denmark's Electronic Research Library
Subsethood Measures of Spatial Granules
Subsethood, which measures the degree of the set-inclusion relation, is predominant in fuzzy set theory. This paper introduces some basic concepts of spatial granules, the coarse-fine relation, and operations such as meet, join, quotient meet and quotient join. All atomic granules can be hierarchized by the set-inclusion relation, and all granules can be hierarchized by the coarse-fine relation. Viewing an information system from the micro and the macro perspectives, we obtain a micro knowledge space and a macro knowledge space, from which a rough set model and a spatial rough granule model are respectively derived. The classical rough set model is a special case of the rough set model induced from the micro knowledge space, while the spatial rough granule model will play a pivotal role in the problem-solving of structures. We discuss twelve axioms of monotone increasing subsethood and twelve corresponding axioms of monotone decreasing supsethood, and generalize subsethood and supsethood to conditional granularity and conditional fineness respectively. We develop five conditional granularity measures and five conditional fineness measures, and prove that each conditional granularity or fineness measure satisfies its corresponding twelve axioms even though its subsethood or supsethood measure holds only one of the two boundary conditions. We further define five conditional granularity entropies and five conditional fineness entropies, each of which satisfies only part of the boundary conditions but all ten monotone conditions.
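As a baseline for the subsethood measures discussed above, here is Kosko's classical fuzzy subsethood measure; the paper's spatial-granule measures and their twelve axioms generalize this notion, so this sketch only shows the starting point.

```python
# Hedged sketch: Kosko's classical fuzzy subsethood measure
# S(A, B) = sum(min(a_i, b_i)) / sum(a_i), the degree to which
# fuzzy set A (given by its membership values) is contained in B.

def subsethood(a, b):
    total = sum(a)
    if total == 0:      # the empty fuzzy set is contained in everything
        return 1.0
    return sum(min(x, y) for x, y in zip(a, b)) / total

A = [0.2, 0.5, 0.0]
B = [0.4, 0.9, 0.3]
print(subsethood(A, B))  # 1.0 (A <= B pointwise, so A is fully included)
print(subsethood(B, A))  # 0.4375 = (0.2 + 0.5 + 0.0) / (0.4 + 0.9 + 0.3)
```

Note the asymmetry: S(A, B) = 1 exactly when A is pointwise below B, which is the crisp boundary condition that the paper's axiom systems examine and relax.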
- …