14,727 research outputs found

    Formal design of data warehouse and OLAP systems : a dissertation presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Information Systems at Massey University, Palmerston North, New Zealand

    A data warehouse is a single data store in which data from multiple sources is integrated for business-oriented online analytical processing (OLAP) across an entire organisation. The rationale for a single, integrated store is to ensure a consistent view of organisational business performance, independent of the different angles of business perspectives. Because it covers a wide range of subjects, data warehouse design is a highly complex, lengthy and error-prone process. Furthermore, business analytical tasks change over time, which changes the requirements for the OLAP systems. Data warehouse and OLAP systems are therefore dynamic, and the design process is continuous. In this thesis, we propose a method that is integrated, formal and application-tailored in order to overcome the complexity problem, deal with the system dynamics, and improve the quality of the system and its chance of success. Our method comprises three important parts: the general ASM method with types, the application-tailored design framework for data warehouse and OLAP, and the schema integration method with a set of provably correct refinement rules. By using the ASM method, we are able to model both data and operations in a uniform conceptual framework, which enables us to design an integrated approach for data warehouse and OLAP design. The freedom given by the ASM method allows us to model the system at an abstract level that is easy to understand for both users and designers. More specifically, the language allows us to use terms from the user domain, unbiased by the terms used in computer systems. The pseudo-code-like transition rules, which give the simplest form of operational semantics in ASMs, are close enough to programming languages for designers to understand. Furthermore, these rules are rooted in mathematics, which helps improve the quality of the system design. By extending ASMs with types, the modelling language is tailored for data warehouses with terms that are well developed for data-intensive applications, which makes it easy to model schema evolution as refinements in dynamic data warehouse design. By providing the application-tailored design framework, we break down the design complexity by business processes (also called subjects in data warehousing) and design concerns. By designing the data warehouse by subjects, our method resembles Kimball's "bottom-up" approach. However, with the schema integration method, our method resolves the stovepipe issue of that approach. By building up a data warehouse iteratively in an integrated framework, our method not only results in an integrated data warehouse, but also resolves the issues of complexity and delayed ROI (Return On Investment) in Inmon's "top-down" approach. By dealing with user change requests in the same way as new subjects, and by modelling data and operations explicitly in a three-tier architecture, namely the data sources, the data warehouse and the OLAP (Online Analytical Processing) system, our method facilitates dynamic design with system integrity. By introducing a notion of refinement specific to schema evolution, namely schema refinement, to capture the notion of schema dominance in schema integration, we are able to build a set of correctness-proven refinement rules. By providing this set of refinement rules, we simplify the designers' work in verifying design correctness. Nevertheless, we do not aim for a complete set of rules, because schemas can be integrated in many different ways, nor do we prescribe a particular way of integration, so that designers remain free to follow their preferred design. Furthermore, given the flexibility of the process, our method can easily be extended to address newly emerging design issues.

    Why is the snowflake schema a good data warehouse design?

    Database design for data warehouses is based on the notion of the snowflake schema and its important special case, the star schema. The snowflake schema represents a dimensional model which is composed of a central fact table and a set of constituent dimension tables which can be further broken up into subdimension tables. We formalise the concept of a snowflake schema in terms of an acyclic database schema whose join tree satisfies certain structural properties. We then define a normal form for snowflake schemas which captures its intuitive meaning with respect to a set of functional and inclusion dependencies. We show that snowflake schemas in this normal form are independent as well as separable when the relation schemas are pairwise incomparable. This implies that relations in the data warehouse can be updated independently of each other as long as referential integrity is maintained. In addition, we show that a data warehouse in snowflake normal form can be queried by joining the relation over the fact table with the relations over its dimension and subdimension tables. We also examine an information-theoretic interpretation of the snowflake schema and show that the redundancy of the primary key of the fact table is zero
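The structure described in the abstract above (a fact table, a dimension table, and a subdimension table, queried by joining along the foreign keys) can be illustrated with a minimal sketch. It uses Python's standard `sqlite3` module; the table names (`sales`, `product`, `category`), the columns, and the sample rows are hypothetical and are not taken from the paper.

```python
import sqlite3


def build_snowflake(conn):
    """Create a tiny, hypothetical snowflake schema and load sample rows."""
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        -- Subdimension table: product categories.
        CREATE TABLE category (
            category_id   INTEGER PRIMARY KEY,
            category_name TEXT NOT NULL
        );
        -- Dimension table: products, each pointing at one category.
        CREATE TABLE product (
            product_id   INTEGER PRIMARY KEY,
            product_name TEXT NOT NULL,
            category_id  INTEGER NOT NULL REFERENCES category(category_id)
        );
        -- Fact table: one row per sale, pointing at one product.
        CREATE TABLE sales (
            sale_id    INTEGER PRIMARY KEY,
            product_id INTEGER NOT NULL REFERENCES product(product_id),
            amount     REAL NOT NULL
        );
        INSERT INTO category VALUES (1, 'Foam'), (2, 'Adhesive');
        INSERT INTO product VALUES (10, 'Sheet', 1), (11, 'Block', 1), (12, 'Glue', 2);
        INSERT INTO sales VALUES (100, 10, 5.0), (101, 11, 7.0), (102, 12, 3.0);
        """
    )


def totals_by_category(conn):
    """Join fact -> dimension -> subdimension and sum amounts per category."""
    rows = conn.execute(
        """
        SELECT c.category_name, SUM(s.amount)
        FROM sales AS s
        JOIN product AS p ON p.product_id = s.product_id
        JOIN category AS c ON c.category_id = p.category_id
        GROUP BY c.category_name
        ORDER BY c.category_name
        """
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_snowflake(connection)
    print(totals_by_category(connection))
```

With foreign keys enabled, an insert into `sales` that refers to a non-existent product is rejected, which is one simple way to see the "referential integrity is maintained" condition at work. The sketch does not attempt to check the paper's normal form or its independence and separability results.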

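    Using Data Mining with the Apriori Method in a Data Warehouse with a Snowflake Schema for a Self-Evaluation Information System (Case Study: Faculty of Informatics, IT Telkom) [Pemanfaatan Data mining Dengan Metode Apriori Dalam Data Warehouse Dengan Snowflake Schema untuk Sistem Informasi Evaluasi Diri (Studi Kasus Fakultas Informatika IT Telkom)]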

    Faculty executives need information to carry out internal evaluation in order to assess faculty performance, that is, self-evaluation. A self-evaluation information system needs data that is integrated from various sources and must be able to present knowledge derived from processing that data. A data warehouse is a method for integrating data, and snowflake schema modelling is used here to save storage space. Information is then delivered using association rule mining with the Apriori algorithm. The data warehouse was designed with the four-step dimensional design process; once the fact and dimension tables had been formed, data mining was performed to find association rules as the means of information delivery. The analysis leads to the conclusion that the data warehouse meets the criteria of the four characteristics of a data warehouse, and that the snowflake schema is suitable for designing the data warehouse for this information system, particularly for student data. In data mining, the larger the minimum support and minimum confidence values, the stronger the resulting rules. The rules obtained can then assist the self-evaluation process. Keywords: data warehouse, snowflake schema, association rule, data mining, apriori
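The role of minimum support and minimum confidence in Apriori-based association rule mining, as mentioned in the abstract above, can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the student-style sample transactions (`gpa_high`, `on_time`, and so on) are hypothetical.

```python
from itertools import combinations

# Hypothetical transactions: each set holds the attributes of one student record.
SAMPLE_TRANSACTIONS = [
    {"gpa_high", "on_time"},
    {"gpa_high", "on_time"},
    {"gpa_high", "late"},
    {"gpa_low", "late"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search; returns {itemset: support}."""
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that reach the minimum support.
        kept = {}
        for candidate in current:
            s = support(candidate, transactions)
            if s >= min_support:
                kept[candidate] = s
        frequent.update(kept)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in kept for b in kept if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = [
            c for c in candidates
            if all(frozenset(sub) in kept for sub in combinations(c, k - 1))
        ]
    return frequent


def association_rules(frequent, min_confidence):
    """Derive rules (antecedent, consequent, support, confidence)."""
    rules = []
    for itemset, sup in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                a = frozenset(antecedent)
                # confidence(A -> B) = support(A and B) / support(A)
                confidence = sup / frequent[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, sup, confidence))
    return rules


if __name__ == "__main__":
    found = frequent_itemsets(SAMPLE_TRANSACTIONS, min_support=0.5)
    for rule in association_rules(found, min_confidence=0.8):
        print(rule)
```

Raising either threshold can only shrink the set of rules returned, so the rules that survive are the ones with higher support and confidence. On the sample data, lowering `min_confidence` from 0.8 to 0.6 admits one additional rule.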

    Data warehouse schema for monitoring key performance indicators (KPIs) for university teaching and learning using goal oriented approach

    The growth and development of universities, like those of other organisations, depend on their ability to plan strategically and to implement development blueprints that are in line with their vision and mission statements. The realisation of these statements, which are often abstracted into goals and sub-goals and linked to their respective actors, is best measured by defined key performance indicators (KPIs). In universities that handle moderately large and heterogeneous data, developing a data warehouse is important. Specifically, Universiti Utara Malaysia (UUM) does not yet have a data warehouse for monitoring its organisational KPIs. This study therefore proposes a data warehouse schema for the university's teaching and learning KPIs, using a Requirement Goal Analysis for Data Warehouse KPI (ReGADaK) approach, which is an extension of goal-oriented requirement analysis and design (GRAnD). The proposed schema highlights the facts, dimensions, attributes and measures of UUM's teaching and learning unit. The measures from the goal analysis of this unit serve as the basis for developing the related university KPIs. The proposed data warehouse schema is evaluated through expert review, prototyping and usability evaluation. The findings from these evaluations suggest that the proposed schema is suitable and practicable for monitoring the university's teaching and learning KPIs

    Comparing the Understandability of Alternative Data Warehouse Schemas: An Empirical Study

    An easily understood data warehouse model enables users to better identify and retrieve its data. It also makes it easier for users to suggest changes to its structure and content. Through an exploratory, empirical study, we compared the understandability of the star and traditional relational schemas. The results of our experiment contradict previous findings and show schema type did not lead to significant performance differences for a content identification task. Further, the relational schema actually led to slightly better results for a schema augmentation task. We discuss the implications of these findings for data warehouse design and future research

    A Design Comparison: Data Warehouse Schema versus Conventional Relational Database Schema

    Initially, relational databases served both operational and decision support systems. As the information society experienced exponential growth in the amount of data to be stored, a line was drawn between transactional databases and decision support databases. Unlike a traditional database, a data warehouse draws its data from a number of pre-existing databases (developed from relational schemas). This conceptual paper discusses traditional database schema design and data warehouse schema architectural design strategies that could serve as guiding principles for learners and beginners in database management systems. It explores the stages in the development processes of the two. Subject orientation, data integration, non-volatility of data, and time variance are the key issues considered in differentiating traditional database designs from data warehouse schema designs. The paper also presents design modelling techniques and addresses logical data models for the data warehouse schema and the traditional relational database

    Database Schema as a Graph: A Methodology for Data Warehouse Design


    Data Warehouse Design for Production Performance Analysis at PT. Urecel Indonesia [RANCANGAN DATA WAREHOUSE UNTUK ANALISIS KINERJA PRODUKSI DI PT. URECEL INDONESIA]

    PT. Urecel Indonesia is engaged in the manufacture and production of polyurethane foam. Its production business process is closely linked to the functions of three divisions: marketing, purchasing, and production. Each division stores a large amount of data, which has led to an accumulation of data, and the data held by the divisions is not integrated. Managers often do not get the information they need to evaluate production performance, because information is presented by each division separately and drawn from different data sources. This study discusses the design of a data warehouse to solve these problems. The research follows the System Development Life Cycle (SDLC). The data warehouse is designed using a snowflake schema, and extract, transform and load (ETL) is performed on the data relevant to production performance. The results of the ETL process are then stored in the data warehouse for analysis. The implementation and analysis show that the data warehouse designed in this study can be used to measure the production performance of PT. Urecel Indonesia