Skip to main content
Article thumbnail
Location of Repository

Why is the snowflake schema a good data warehouse design?

By Mark Levene and George Loizou

Abstract

Database design for data warehouses is based on the notion of the snowflake schema and its important special case, the star schema. The snowflake schema represents a dimensional model which is composed of a central fact table and a set of constituent dimension tables which can be further broken up into subdimension tables. We formalise the concept of a snowflake schema in terms of an acyclic database schema whose join tree satisfies certain structural properties. We then define a normal form for snowflake schemas which captures its intuitive meaning with respect to a set of functional and inclusion dependencies. We show that snowflake schemas in this normal form are independent as well as separable when the relation schemas are pairwise incomparable. This implies that relations in the data warehouse can be updated independently of each other as long as referential integrity is maintained. In addition, we show that a data warehouse in snowflake normal form can be queried by joining the relation over the fact table with the relations over its dimension and subdimension tables. We also examine an information-theoretic interpretation of the snowflake schema and show that the redundancy of the primary key of the fact table is zero

Topics: csis
Publisher: Elsevier
Year: 2003
OAI identifier: oai:eprints.bbk.ac.uk.oai2:195

Suggested articles

Citations

  1. (1983). A characterization of globally consistent databases and their access paths. doi
  2. (1999). A Guided Tour of Relational Databases and Beyond. doi
  3. (1992). aih¨ a. The Design of Relational Databases.
  4. (1987). An information-theoretic analysis of relational databases - Part I: Data dependencies and information metric. doi
  5. (1997). An overview of data warehousing and OLAP technology. doi
  6. (1996). Building the Data Warehouse. doi
  7. (1979). Computational problems related to the design of normal form relational schemas. doi
  8. (2000). Conceptual data warehouse design.
  9. (1998). Conceptual design of data warehouses from E/R schemes. doi
  10. (1983). Degrees of acyclicity for hypergraphs and relational database schemes. doi
  11. (1990). Distance in Graphs. doi
  12. (1991). Evaluation of queries in independent database schemes. doi
  13. (1988). Existence of extensions and product extensions for discrete probability distributions. doi
  14. (1986). Functional and inclusion dependencies: A graph theoretic approach. doi
  15. (2001). Guaranteeing no interaction between functional dependencies and tree-like inclusion dependencies. doi
  16. (1999). How to prevent interaction of functional and inclusion dependencies. Information Processing Letters, doi
  17. (1987). Independent and separable database schemes. doi
  18. (1991). Independent database schemes under functional and inclusion dependencies. doi
  19. (2000). Justification for inclusion dependency normal form. doi
  20. (1985). Making Decisions. doi
  21. (1995). Multi-table joins through bitmapped join indices. doi
  22. (1998). Normal forms for multidimensional databases. doi
  23. (1983). On the desirability of acyclic database schemes. doi
  24. (1991). Relational databases: A tutorial for statisticians.
  25. (1986). Statistical treatment of the information content of a database. doi
  26. (1982). Testing satisfaction of functional dependencies. doi
  27. (1998). The Data Warehouse Lifecycle Toolkit: Expert Methods for Designing, Developing and Deploying Data Warehouses.
  28. (2000). The Data Webhouse Toolkit: Building the Web-Enabled Data Warehouse.
  29. (1992). The Design of Relational Databases.
  30. (1987). The theory of probabilistic databases.
  31. (1983). Towards a sound view integration methodology. doi

To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.