Formal design of data warehouse and OLAP systems : a dissertation presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Information Systems at Massey University, Palmerston North, New Zealand
A data warehouse is a single data store, where data from multiple data sources is integrated for online business analytical processing (OLAP) of an entire organisation. The rationale for being single and integrated is to ensure a consistent view of the organisational business performance, independent of the different angles of business perspectives. Due to its wide coverage of subjects, data warehouse design is a highly complex, lengthy and error-prone process. Furthermore, the business analytical tasks change over time, which results in changes in the requirements for the OLAP systems. Thus, data warehouse and OLAP systems are rather dynamic and the design process is continuous. In this thesis, we propose a method that is integrated, formal and application-tailored to overcome the complexity problem, deal with the system dynamics, and improve the quality of the system and its chance of success.
Our method comprises three important parts: the general ASM method with types, the application-tailored design framework for data warehouse and OLAP, and the schema integration method with a set of provably correct refinement rules.
By using the ASM method, we are able to model both data and operations in a uniform conceptual framework, which enables us to design an integrated approach for data warehouse and OLAP design. The freedom given by the ASM method allows us to model the system at an abstract level that is easy to understand for both users and designers. More specifically, the language allows us to use terms from the user domain, unbiased by the terms used in computer systems. The pseudo-code-like transition rules, which give the simplest form of operational semantics in ASMs, are close enough to programming languages for designers to understand. Furthermore, these rules are rooted in mathematics, which assists in improving the quality of the system design.
By extending the ASMs with types, the modelling language is tailored for data warehouse with the terms that are well developed for data-intensive applications, which makes it easy to model the schema evolution as refinements in the dynamic data warehouse design.
By providing the application-tailored design framework, we break down the design complexity by business processes (also called subjects in data warehousing) and design concerns. By designing the data warehouse by subjects, our method resembles Kimball's "bottom-up" approach. However, with the schema integration method, our method resolves the stovepipe issue of that approach. By building up a data warehouse iteratively in an integrated framework, our method not only results in an integrated data warehouse, but also resolves the issues of complexity and delayed ROI (Return On Investment) in Inmon's "top-down" approach. By dealing with user change requests in the same way as new subjects, and modelling data and operations explicitly in a three-tier architecture, namely the data sources, the data warehouse and the OLAP (Online Analytical Processing) tier, our method facilitates dynamic design with system integrity.
By introducing a notion of refinement specific to schema evolution, namely schema refinement, for capturing the notion of schema dominance in schema integration, we are able to build a set of correctness-proven refinement rules. By providing this set of refinement rules, we simplify the designers' work in verifying design correctness. Nevertheless, we do not aim for a complete set of rules, because there are many different ways to integrate schemas, nor do we prescribe a single way of integration, so that designers remain free to follow their preferred design.
Furthermore, given its flexibility in the process, our method can easily be extended to address newly emerging design issues.
Why is the snowflake schema a good data warehouse design?
Database design for data warehouses is based on the notion of the snowflake schema and its important special case, the star schema. The snowflake schema represents a dimensional model which is composed of a central fact table and a set of constituent dimension tables which can be further broken up into subdimension tables. We formalise the concept of a snowflake schema in terms of an acyclic database schema whose join tree satisfies certain structural properties. We then define a normal form for snowflake schemas which captures its intuitive meaning with respect to a set of functional and inclusion dependencies. We show that snowflake schemas in this normal form are independent as well as separable when the relation schemas are pairwise incomparable. This implies that relations in the data warehouse can be updated independently of each other as long as referential integrity is maintained. In addition, we show that a data warehouse in snowflake normal form can be queried by joining the relation over the fact table with the relations over its dimension and subdimension tables. We also examine an information-theoretic interpretation of the snowflake schema and show that the redundancy of the primary key of the fact table is zero
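The querying claim in the abstract above (a fact table joined with its dimension and subdimension tables) can be illustrated with a minimal sketch. The table names `sales`, `product` and `category`, their columns, and the sample rows are all hypothetical and do not come from the paper; Python's built-in `sqlite3` module is used only for convenience.

```python
import sqlite3

# Hypothetical snowflake schema: fact table `sales`, dimension table `product`,
# and subdimension table `category`. All names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (
    category_id   INTEGER PRIMARY KEY,
    category_name TEXT NOT NULL
);
CREATE TABLE product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT NOT NULL,
    category_id  INTEGER NOT NULL REFERENCES category(category_id)
);
CREATE TABLE sales (
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    amount     REAL NOT NULL
);
INSERT INTO category VALUES (1, 'Beverages'), (2, 'Snacks');
INSERT INTO product  VALUES (10, 'Tea', 1), (11, 'Coffee', 1), (12, 'Crackers', 2);
INSERT INTO sales    VALUES (100, 10, 4.0), (101, 11, 6.0), (102, 12, 2.5), (103, 10, 3.0);
""")

# Query by joining the fact table with its dimension and subdimension tables.
totals = conn.execute("""
SELECT c.category_name, SUM(s.amount)
FROM sales AS s
JOIN product  AS p ON p.product_id  = s.product_id
JOIN category AS c ON c.category_id = p.category_id
GROUP BY c.category_name
ORDER BY c.category_name
""").fetchall()

print(totals)
```

In this sketch the foreign keys stand in for the inclusion dependencies mentioned in the abstract: each table can be loaded on its own as long as every referenced key exists.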
Utilising Data Mining with the Apriori Method in a Data Warehouse with a Snowflake Schema for a Self-Evaluation Information System (Case Study: Faculty of Informatics, IT Telkom)
Faculty executives need information when carrying out internal evaluation to assess the faculty's performance, that is, self-evaluation. A self-evaluation information system needs data integrated from various sources and must be able to present the knowledge obtained from processing that data. A data warehouse is a method for integrating data, and here it is modelled with a snowflake schema to save storage space. The information is then delivered using association rule mining with the Apriori algorithm. The data warehouse is designed with the four-step dimensional design process; once the fact and dimension tables have been formed, data mining is performed to find association rules, which serve as the information delivery. The analysis leads to the conclusion that the data warehouse meets the four characteristics of a data warehouse and that the snowflake schema is suitable for designing the data warehouse for this information system, particularly for student data. In data mining, the higher the minimum support and minimum confidence values, the stronger the resulting rules. The rules obtained can then help in the self-evaluation process.
Keywords: data warehouse, snowflake schema, association rule, data mining, apriori
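To make the roles of minimum support and minimum confidence concrete, here is a minimal sketch of level-wise (Apriori-style) association rule mining. It is not the study's implementation: the function names, the thresholds, and the toy transactions (item labels such as `gpa_high` and `on_time`) are assumptions chosen only for illustration.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Level-wise search: size-k candidates are built from frequent size-(k-1) sets."""
    items = sorted({item for t in transactions for item in t})
    level = [frozenset([i]) for i in items
             if support(frozenset([i]), transactions) >= min_support]
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset, transactions)
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Apriori pruning: keep a candidate only if all its (k-1)-subsets are frequent.
        level = [c for c in candidates
                 if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
                 and support(c, transactions) >= min_support]
    return frequent


def association_rules(frequent, min_confidence):
    """Return (lhs, rhs, support, confidence) for rules meeting `min_confidence`."""
    rules = []
    for itemset, sup in frequent.items():
        for size in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), size):
                lhs = frozenset(lhs)
                confidence = sup / frequent[lhs]  # support(lhs and rhs) / support(lhs)
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, sup, confidence))
    return rules


# Hypothetical toy data: each transaction is the set of attributes of one student.
transactions = [
    frozenset({"gpa_high", "on_time"}),
    frozenset({"gpa_high", "on_time"}),
    frozenset({"gpa_high", "late"}),
    frozenset({"gpa_low", "late"}),
]

found = association_rules(frequent_itemsets(transactions, 0.5), 0.8)
for lhs, rhs, sup, conf in found:
    print(sorted(lhs), "->", sorted(rhs), "support", sup, "confidence", conf)
```

With these toy thresholds only one rule survives (`on_time` implies `gpa_high`); raising either threshold can only shrink the set of rules that is kept, which is one way to read the abstract's remark about rule strength.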
Data warehouse schema for monitoring key performance indicators (KPIs) for university teaching and learning using goal oriented approach
The growth and development of universities, just as of other organisations, depend on their ability to strategically plan and implement development blueprints that are in line with their vision and mission statements. The actualization of these statements, which are often abstracted into goals and sub-goals and linked to their respective actors, is better measured by defined key performance indicators (KPIs). In universities that handle modestly large and heterogeneous data, the development of a data warehouse is important. Specifically, Universiti Utara Malaysia (UUM) is yet to have a data warehouse for monitoring its organisational KPIs. This study therefore proposes a data warehouse schema for the university’s teaching and learning KPIs using a Requirement Goal Analysis for Data Warehouse KPI (ReGADaK) approach, which is an extension of goal-oriented requirement analysis and design (GRAnD). The proposed schema highlights the facts, dimensions, attributes and measures of UUM’s teaching and learning unit. The measures from the goal analysis of this unit serve as the basis for developing the related university KPIs. The proposed data warehouse schema is evaluated through expert review, prototyping and usability evaluation. The findings from the evaluation processes suggest that the proposed data warehouse schema is suitable and practicable for monitoring the university’s teaching and learning KPIs.
Comparing the Understandability of Alternative Data Warehouse Schemas: An Empirical Study
An easily understood data warehouse model enables users to better identify and retrieve its data. It also makes it easier for users to suggest changes to its structure and content. Through an exploratory, empirical study, we compared the understandability of the star and traditional relational schemas. The results of our experiment contradict previous findings and show schema type did not lead to significant performance differences for a content identification task. Further, the relational schema actually led to slightly better results for a schema augmentation task. We discuss the implications of these findings for data warehouse design and future research
A Design Comparison: Data Warehouse Schema versus Conventional Relational Database Schema
Initially, relational databases served both operational and decision support systems. As the information society experiences exponential growth in the amount of data and information to be stored in a database, a line has been drawn between transactional databases and decision support databases. Unlike a traditional database, a data warehouse is built from a number of preexisting databases (developed from relational schemas). This conceptual paper discusses traditional database schema design and data warehouse schema architectural design strategies that could serve as guiding principles for learners and beginners in database management systems. It explores the stages in the development processes of the two. Subject orientation, data integration, non-volatility of data, and time variance are the key characteristics under consideration that differentiate data warehouse schema designs from traditional databases. It also presents design modelling techniques and addresses logical data models for data warehouse schemas and traditional relational databases.
DATA WAREHOUSE DESIGN FOR PRODUCTION PERFORMANCE ANALYSIS AT PT. URECEL INDONESIA
PT. Urecel Indonesia is engaged in the manufacturing and production of polyurethane foam. The production business process is closely linked to the functions of three divisions: marketing, purchasing, and production. Each division stores a large amount of data, which has led to an accumulation of data. The data in the divisions are not integrated with one another. Managers often do not get the information they need to evaluate production performance, because the information is presented separately by each division and drawn from different data sources. This study discusses the design of a data warehouse to solve these problems. The research follows the System Development Life Cycle (SDLC). The data warehouse is designed using a snowflake schema, while extract, transform, and load (ETL) is performed on the data relating to production performance needs. The results of the ETL process are then stored in the data warehouse for analysis. The implementation results and the analysis carried out show that the data warehouse design produced in this study can be used to measure the production performance of PT. Urecel Indonesia.
- …