1,696 research outputs found
Why is the snowflake schema a good data warehouse design?
Database design for data warehouses is based on the notion of the snowflake schema and its important special case, the star schema. The snowflake schema represents a dimensional model composed of a central fact table and a set of constituent dimension tables, which can be further broken up into subdimension tables. We formalise the concept of a snowflake schema in terms of an acyclic database schema whose join tree satisfies certain structural properties. We then define a normal form for snowflake schemas that captures their intuitive meaning with respect to a set of functional and inclusion dependencies. We show that snowflake schemas in this normal form are independent as well as separable when the relation schemas are pairwise incomparable. This implies that relations in the data warehouse can be updated independently of each other as long as referential integrity is maintained. In addition, we show that a data warehouse in snowflake normal form can be queried by joining the relation over the fact table with the relations over its dimension and subdimension tables. We also examine an information-theoretic interpretation of the snowflake schema and show that the redundancy of the primary key of the fact table is zero.
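The querying and update claims in this abstract can be illustrated with a minimal sketch using Python's standard `sqlite3` module. The table names (`sales`, `product`, `category`), the columns, and the sample rows are all hypothetical and do not come from the paper; the sketch only shows a fact table joined with a dimension and a subdimension table, with foreign keys enforcing referential integrity.

```python
import sqlite3

def build_warehouse():
    """Create a tiny in-memory snowflake schema (hypothetical tables and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript("""
        -- subdimension table
        CREATE TABLE category (
            category_id   INTEGER PRIMARY KEY,
            category_name TEXT NOT NULL
        );
        -- dimension table, referencing its subdimension
        CREATE TABLE product (
            product_id   INTEGER PRIMARY KEY,
            product_name TEXT NOT NULL,
            category_id  INTEGER NOT NULL REFERENCES category(category_id)
        );
        -- fact table, referencing the dimension
        CREATE TABLE sales (
            sale_id    INTEGER PRIMARY KEY,
            product_id INTEGER NOT NULL REFERENCES product(product_id),
            amount     REAL NOT NULL
        );
        INSERT INTO category VALUES (1, 'Books'), (2, 'Games');
        INSERT INTO product VALUES (10, 'Novel', 1), (11, 'Atlas', 1), (20, 'Chess', 2);
        INSERT INTO sales VALUES (100, 10, 12.0), (101, 11, 30.0),
                                 (102, 20, 25.0), (103, 10, 8.0);
    """)
    return conn

def sales_by_category(conn):
    """Join fact -> dimension -> subdimension and aggregate the measure."""
    return conn.execute("""
        SELECT c.category_name, SUM(s.amount)
        FROM sales AS s
        JOIN product  AS p ON p.product_id  = s.product_id
        JOIN category AS c ON c.category_id = p.category_id
        GROUP BY c.category_name
        ORDER BY c.category_name
    """).fetchall()

if __name__ == "__main__":
    print(sales_by_category(build_warehouse()))
```

With foreign keys enabled, inserting a `sales` row whose `product_id` has no matching `product` row raises `sqlite3.IntegrityError`, which is the kind of referential-integrity condition the abstract attaches to independent updates.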
An Alternative Relational OLAP Modeling Approach
Schema design is one of the fundamentals of database theory and practice. In this paper, we discuss the problem of locally valid dimensional attributes in a classification hierarchy of a typical OLAP scenario. In a first step, we show that the traditional star and snowflake schema approach is not feasible in this very natural case of a hierarchy. Therefore, we sketch two alternative modeling approaches resulting in practical solutions and a seamless extension of the traditional star and snowflake schema approach. In a pure relational approach, we replace each dimension table of a star / snowflake schema by a set of views directly reflecting the classification hierarchy. The second approach takes advantage of object-relational extensions. Using object-relational techniques for the relational representation of a multidimensional OLAP scenario is a novel approach and promises a clean and smooth schema design.
Using Data Mining with the Apriori Method in a Data Warehouse with a Snowflake Schema for a Self-Evaluation Information System (Case Study: Faculty of Informatics, IT Telkom)
Faculty executives need information when carrying out internal evaluation, or self-evaluation, to assess the faculty's performance. A self-evaluation information system requires data integrated from various sources and must be able to present the knowledge obtained from processing that data. A data warehouse is a method for data integration, and it is modeled here with a snowflake schema to save storage space. The information is then delivered using association rule mining with the Apriori algorithm. The data warehouse is designed using the four-step dimensional design process; once the fact and dimension tables have been formed, data mining is performed to find the association rules that serve as the information delivery. From the analysis it can be concluded that the data warehouse meets the four characteristics of a data warehouse and that the snowflake schema is suitable for designing the data warehouse for this information system, particularly for student data. In data mining, the higher the minimum support and minimum confidence values, the stronger the resulting rules. The rules obtained can then help in the self-evaluation process.
Keywords: data warehouse, snowflake schema, association rule, data mining, apriori
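The roles of minimum support and minimum confidence mentioned in this abstract can be sketched in a few lines of Python. This is not the full Apriori algorithm (it omits candidate generation and pruning) and it is not the authors' implementation; the student attributes, the transactions, and the function names are hypothetical. It only shows how the two measures are computed and how the thresholds filter candidate rules.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante else 0.0

def filter_rules(transactions, candidate_rules, min_support, min_confidence):
    """Keep only the rules that reach both thresholds."""
    kept = []
    for antecedent, consequent in candidate_rules:
        s = support(transactions, set(antecedent) | set(consequent))
        c = confidence(transactions, antecedent, consequent)
        if s >= min_support and c >= min_confidence:
            kept.append((antecedent, consequent))
    return kept

# Hypothetical student records, one set of attribute values per student.
TRANSACTIONS = [
    frozenset({"gpa_high", "on_time"}),
    frozenset({"gpa_high", "on_time"}),
    frozenset({"gpa_high", "late"}),
    frozenset({"gpa_low", "late"}),
]

# Hypothetical candidate rules as (antecedent, consequent) pairs.
CANDIDATES = [
    (("gpa_high",), ("on_time",)),
    (("gpa_low",), ("late",)),
]

if __name__ == "__main__":
    print(filter_rules(TRANSACTIONS, CANDIDATES, 0.2, 0.6))
    print(filter_rules(TRANSACTIONS, CANDIDATES, 0.5, 0.6))
```

In this toy data, raising the minimum support from 0.2 to 0.5 drops the rule that is backed by only one of the four records, so fewer rules survive and each is backed by more records.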
A Pattern Based Approach for Re-engineering Non-Ontological Resources into Ontologies
With the goal of speeding up the ontology development process, ontology engineers are starting to reuse, as much as possible, available ontologies and non-ontological resources such as classification schemes, thesauri, lexicons and folksonomies that already have some degree of consensus. The reuse of such non-ontological resources necessarily involves their re-engineering into ontologies. Non-ontological resources are highly heterogeneous in their data model and contents: they encode different types of knowledge, and they can be modeled and implemented in different ways. In this paper we present (1) a typology for non-ontological resources, (2) a pattern based approach for re-engineering non-ontological resources into ontologies, and (3) a use case of the proposed approach.
Benchmarking Summarizability Processing in XML Warehouses with Complex Hierarchies
Business Intelligence plays an important role in decision making. Based on
data warehouses and Online Analytical Processing, a business intelligence tool
can be used to analyze complex data. Still, summarizability issues in data
warehouses cause ineffective analyses that may become critical problems to
businesses. To settle this issue, many researchers have studied and proposed
various solutions, both in relational and XML data warehouses. However, they
find difficulty in evaluating the performance of their proposals since the
available benchmarks lack complex hierarchies. In order to contribute to
summarizability analysis, this paper proposes an extension to the XML warehouse
benchmark (XWeB) with complex hierarchies. The benchmark enables us to generate
XML data warehouses with scalable complex hierarchies as well as
summarizability processing. We experimentally demonstrated that complex
hierarchies can definitely be included into a benchmark dataset, and that our
benchmark is able to compare two alternative approaches dealing with
summarizability issues. Comment: 15th International Workshop on Data Warehousing and OLAP (DOLAP
2012), Maui, United States (2012)
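As a generic illustration of one kind of summarizability problem (not the benchmark or either of the approaches compared in the paper), the following Python sketch shows how a non-strict hierarchy, in which a child has more than one parent, makes category totals add up to more than the true grand total. The products, categories, and amounts are hypothetical.

```python
from collections import defaultdict

def rollup(sales, categories_of):
    """Sum each product's sales into every category it belongs to."""
    totals = defaultdict(float)
    for product, amount in sales.items():
        for category in categories_of[product]:
            totals[category] += amount
    return dict(totals)

def is_strict(categories_of):
    """A hierarchy level is strict when every child has exactly one parent."""
    return all(len(parents) == 1 for parents in categories_of.values())

# Hypothetical measure values per product.
SALES = {"p1": 10.0, "p2": 20.0}

# Strict: each product rolls up to exactly one category.
STRICT = {"p1": ["A"], "p2": ["B"]}

# Non-strict: p2 belongs to both A and B, so it is counted twice.
NON_STRICT = {"p1": ["A"], "p2": ["A", "B"]}

if __name__ == "__main__":
    true_total = sum(SALES.values())
    for name, hierarchy in (("strict", STRICT), ("non-strict", NON_STRICT)):
        rolled = sum(rollup(SALES, hierarchy).values())
        print(name, is_strict(hierarchy), rolled, true_total)
```

Summing the category totals of the non-strict hierarchy counts `p2` twice, which is why aggregates over such hierarchies cannot simply be reused at coarser levels.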
Automatic Schema Design for Co-Clustered Tables
Schema design for analytical workloads provides opportunities to index, cluster, partition and/or materialize. With these opportunities, the complexity of finding the right setup also rises. In this paper we present an automatic schema design approach for a table co-clustering scheme called Bitwise Dimensional Co-Clustering, aimed at schemas with a moderate number of dimensions, but not limited to typical star and snowflake schemas. The goal is to design one primary schema and keep the knobs to turn to a minimum while providing a robust schema for a wide range of queries. In our approach a clustered schema is derived by trying to apply dimensions throughout the whole schema and co-cluster as many tables as possible according to at least one common dimension. Our approach is based on the assumption that initially foreign key relationships and a set of dimensions are defined based on classic DDL.