    Why is the snowflake schema a good data warehouse design?

    Database design for data warehouses is based on the notion of the snowflake schema and its important special case, the star schema. The snowflake schema represents a dimensional model which is composed of a central fact table and a set of constituent dimension tables which can be further broken up into subdimension tables. We formalise the concept of a snowflake schema in terms of an acyclic database schema whose join tree satisfies certain structural properties. We then define a normal form for snowflake schemas which captures its intuitive meaning with respect to a set of functional and inclusion dependencies. We show that snowflake schemas in this normal form are independent as well as separable when the relation schemas are pairwise incomparable. This implies that relations in the data warehouse can be updated independently of each other as long as referential integrity is maintained. In addition, we show that a data warehouse in snowflake normal form can be queried by joining the relation over the fact table with the relations over its dimension and subdimension tables. We also examine an information-theoretic interpretation of the snowflake schema and show that the redundancy of the primary key of the fact table is zero.
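
The structure described in this abstract (a fact table, a dimension table, and a subdimension table, queried by joining them) can be illustrated with a minimal sketch. The table names `sales`, `product`, and `category`, the sample rows, and both function names are assumptions made for illustration; they do not come from the paper, and the sketch does not implement the paper's normal form.

```python
import sqlite3


def build_snowflake_demo():
    """Create a tiny in-memory snowflake schema (hypothetical tables and data)."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        -- Subdimension table: a dimension attribute normalised into its own table.
        CREATE TABLE category (
            category_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        -- Dimension table: references its subdimension.
        CREATE TABLE product (
            product_id  INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            category_id INTEGER NOT NULL REFERENCES category(category_id)
        );
        -- Fact table: references the dimension and carries the measure.
        CREATE TABLE sales (
            sale_id    INTEGER PRIMARY KEY,
            product_id INTEGER NOT NULL REFERENCES product(product_id),
            amount     REAL NOT NULL
        );
        """
    )
    conn.executemany("INSERT INTO category VALUES (?, ?)",
                     [(1, "Books"), (2, "Games")])
    conn.executemany("INSERT INTO product VALUES (?, ?, ?)",
                     [(10, "Novel", 1), (11, "Atlas", 1), (20, "Chess", 2)])
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                     [(1, 10, 12.0), (2, 11, 30.0), (3, 20, 25.0), (4, 10, 8.0)])
    return conn


def total_by_category(conn):
    """Join fact -> dimension -> subdimension and sum the measure per category."""
    return conn.execute(
        """
        SELECT c.name, SUM(s.amount)
        FROM sales AS s
        JOIN product  AS p ON p.product_id  = s.product_id
        JOIN category AS c ON c.category_id = p.category_id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()


if __name__ == "__main__":
    print(total_by_category(build_snowflake_demo()))
```

With foreign keys enabled, each table can be updated on its own as long as every reference still resolves; an insert into `sales` that names a missing product is rejected with `sqlite3.IntegrityError`.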

    An Alternative Relational OLAP Modeling Approach

    Schema design is one of the fundamentals of both database theory and practice. In this paper, we discuss the problem of locally valid dimensional attributes in a classification hierarchy of a typical OLAP scenario. In a first step, we show that the traditional star and snowflake schema approach is not feasible in this very natural case of a hierarchy. Therefore, we sketch two alternative modeling approaches resulting in practical solutions and a seamless extension of the traditional star and snowflake schema approach. In a pure relational approach, we replace each dimension table of a star / snowflake schema by a set of views directly reflecting the classification hierarchy. The second approach takes advantage of object-relational extensions. Using object-relational techniques for the relational representation of a multidimensional OLAP scenario is a novel approach and promises a clean and smooth schema design.

    Using Data Mining with the Apriori Method in a Data Warehouse with a Snowflake Schema for a Self-Evaluation Information System (Case Study: Faculty of Informatics, IT Telkom)

    Faculty executives need information to carry out internal evaluation and assess the faculty's performance, that is, self-evaluation. A self-evaluation information system requires data integrated from various sources and must be able to present the knowledge obtained by processing that data. A data warehouse is a method for data integration, and snowflake schema modeling is used here to save storage space. The information is then delivered using association rule mining with the Apriori algorithm. The data warehouse is designed with the four-step dimensional design process; once the fact table and dimension tables have been formed, data mining is performed to discover the association rules that serve as its information delivery. The analysis concludes that the data warehouse satisfies the four characteristic properties of a data warehouse and that the snowflake schema is suitable for designing the data warehouse for this information system, particularly for student data. In data mining, the larger the minimum support and minimum confidence values, the stronger the resulting rules. The rules obtained can then assist the self-evaluation process. Keywords: data warehouse, snowflake schema, association rule, data mining, apriori
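
The minimum support and minimum confidence thresholds mentioned in this abstract are defined on two standard measures, which a short sketch can make concrete. This is not the full Apriori algorithm and not the authors' implementation; the function names and the student-record items (`gpa_high`, `on_time`, and so on) are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    wanted = frozenset(itemset)
    hits = sum(1 for t in transactions if wanted <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    ante = support(transactions, antecedent)
    if ante == 0:
        return 0.0  # The rule never applies, so report zero confidence.
    both = support(transactions, set(antecedent) | set(consequent))
    return both / ante


# Hypothetical student records; each set is one transaction.
RECORDS = [
    {"gpa_high", "on_time"},
    {"gpa_high", "on_time"},
    {"gpa_high", "late"},
    {"gpa_low", "late"},
]

if __name__ == "__main__":
    print(support(RECORDS, {"gpa_high", "on_time"}))       # 2 of 4 records
    print(confidence(RECORDS, {"gpa_high"}, {"on_time"}))  # 2 of the 3 gpa_high records
```

A rule such as `gpa_high -> on_time` is kept only if both values reach the chosen thresholds, so raising either threshold leaves fewer rules, each backed by more of the data.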

    A Pattern Based Approach for Re-engineering Non-Ontological Resources into Ontologies

    With the goal of speeding up the ontology development process, ontology engineers are starting to reuse as much as possible available ontologies and non-ontological resources such as classification schemes, thesauri, lexicons and folksonomies, that already have some degree of consensus. The reuse of such non-ontological resources necessarily involves their re-engineering into ontologies. Non-ontological resources are highly heterogeneous in their data model and contents: they encode different types of knowledge, and they can be modeled and implemented in different ways. In this paper we present (1) a typology for non-ontological resources, (2) a pattern based approach for re-engineering non-ontological resources into ontologies, and (3) a use case of the proposed approach.

    Benchmarking Summarizability Processing in XML Warehouses with Complex Hierarchies

    Business Intelligence plays an important role in decision making. Based on data warehouses and Online Analytical Processing, a business intelligence tool can be used to analyze complex data. Still, summarizability issues in data warehouses cause ineffective analyses that may become critical problems to businesses. To settle this issue, many researchers have studied and proposed various solutions, both in relational and XML data warehouses. However, they find difficulty in evaluating the performance of their proposals since the available benchmarks lack complex hierarchies. In order to contribute to summarizability analysis, this paper proposes an extension to the XML warehouse benchmark (XWeB) with complex hierarchies. The benchmark enables us to generate XML data warehouses with scalable complex hierarchies as well as summarizability processing. We experimentally demonstrated that complex hierarchies can definitely be included into a benchmark dataset, and that our benchmark is able to compare two alternative approaches dealing with summarizability issues. Comment: 15th International Workshop on Data Warehousing and OLAP (DOLAP 2012), Maui, United States (2012)

    Automatic Schema Design for Co-Clustered Tables

    Schema design for analytical workloads provides opportunities to index, cluster, partition and/or materialize. With these opportunities, the complexity of finding the right setup also rises. In this paper we present an automatic schema design approach for a table co-clustering scheme called Bitwise Dimensional Co-Clustering, aimed at schemas with a moderate number of dimensions, but not limited to typical star and snowflake schemas. The goal is to design one primary schema and keep the knobs to turn to a minimum while providing a robust schema for a wide range of queries. In our approach a clustered schema is derived by trying to apply dimensions throughout the whole schema and co-cluster as many tables as possible according to at least one common dimension. Our approach is based on the assumption that initially foreign key relationships and a set of dimensions are defined based on classic DDL.