A Horizontal Fragmentation Algorithm for the Fact Relation in a Distributed Data Warehouse

Abstract

Data warehousing is one of the major research topics of applied-side database investigators. Most of the work to date has focused on building large centralized systems: integrated repositories, founded on pre-existing systems, on which all corporate-wide data are based. Unfortunately, this approach is very expensive and tends to ignore the advantages realized during the past decade in the areas of distribution and support for data localization in a geographically dispersed corporate structure. Further, it would be unwise to enforce a centralized data warehouse when the operational systems exist over a widely distributed geographical area. A distributed data warehouse supports decision makers by providing a single view of the data even though those data are physically distributed across multiple data warehouses in multiple systems at different branches. The architecture and design of distributed data warehouses are currently considered an important research problem that needs investigation. This research investigates building distributed data warehouses, with particular emphasis on distribution design for the data warehouse environment. The article provides an architectural model for a distributed data warehouse, a formal definition of the relational data model for the data warehouse, and a methodology for distributed data warehouse design, along with a "horizontal" fragmentation algorithm for the fact relation.
This research contributes to the problem of distributed data warehouse architecture and design by extending the preliminary architecture model presented in [8]: it proposes a distributed data warehouse system architecture and describes the functionality of its components.

Keywords: distributed data warehouse architecture, distributed data warehouse design, horizontal fragmentation.
