
    Implementing data-driven decision support system based on independent educational data mart

    Decision makers in the educational field always seek new technologies and tools that provide solid, fast answers to support the decision-making process. They need a platform that utilizes students' academic data and turns it into knowledge for making the right strategic decisions. In this paper, a roadmap for implementing a data-driven decision support system (DSS) based on an educational data mart is presented. The independent data mart is built on students' degrees in 8 subjects at a private school (Al-Iskandaria Primary School in Basrah province, Iraq). The DSS implementation roadmap starts from pre-processing the paper-based data source and ends with providing three categories of online analytical processing (OLAP) queries (multidimensional OLAP, desktop OLAP and web OLAP). A key performance indicator (KPI) is implemented as an essential part of the educational DSS to measure school performance. The static evaluation method shows that the proposed DSS meets the privacy, security and performance requirements, with no errors found after inspecting the DSS knowledge base. The evaluation shows that a data-driven DSS based on an independent data mart with KPIs and OLAP is one of the best platforms to support short- to long-term academic decisions.
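    The kind of KPI the abstract describes can be sketched as a simple aggregate over a grades fact table. The function, field names, sample data and pass mark below are all illustrative assumptions, not taken from the paper:

    ```python
    # Hypothetical sketch: a school-performance KPI computed over a star-schema
    # fact table of student degrees. Field names and the pass mark are invented.

    def pass_rate_kpi(facts, pass_mark=50):
        """Return the per-subject pass rate as a fraction of graded records."""
        totals, passed = {}, {}
        for row in facts:  # each row: {"subject": ..., "degree": ...}
            subj = row["subject"]
            totals[subj] = totals.get(subj, 0) + 1
            if row["degree"] >= pass_mark:
                passed[subj] = passed.get(subj, 0) + 1
        return {s: passed.get(s, 0) / totals[s] for s in totals}

    grades = [
        {"subject": "Math", "degree": 72},
        {"subject": "Math", "degree": 45},
        {"subject": "Science", "degree": 88},
    ]
    print(pass_rate_kpi(grades))  # → {'Math': 0.5, 'Science': 1.0}
    ```

    In a real deployment such a KPI would typically be an OLAP measure over the data mart rather than an in-memory loop, but the aggregation logic is the same.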

    Analyse en ligne (OLAP) de documents

    Thesis also available on the website of Université Paul Sabatier, Toulouse 3: http://thesesups.ups-tlse.fr/160/
    Data warehouses and OLAP (On-Line Analytical Processing) systems provide methods and tools for analysing data from enterprise information systems. However, only 20% of the data in a corporate information system can be processed with current OLAP systems. The remaining 80%, i.e. documents, stays out of reach of OLAP systems for lack of adapted tools and processes. To address this issue we propose a multidimensional conceptual model for representing analysis concepts. The model rests on a single concept that models both analysis subjects and analysis axes. We define an aggregation function for textual data in order to obtain a summarised view of the information extracted from documents; this function summarises a set of keywords into a smaller and more general set. We introduce a core of manipulation operators that allow analyses to be specified and refined using the concepts of the model. We also propose a design process for integrating data extracted from documents into an OLAP system, covering the design of the conceptual schema, the analysis of the document sources and the loading process. To validate these proposals, we have implemented a prototype.
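    The textual aggregation the thesis describes, summarising a keyword set into a smaller, more general one, can be illustrated with a roll-up over a term hierarchy. The hierarchy and terms below are invented for illustration; the thesis defines its own aggregation function:

    ```python
    # Minimal sketch of keyword aggregation: roll each keyword up to a more
    # general parent term and deduplicate. The hierarchy here is invented.

    HYPERNYMS = {
        "OLAP": "data analysis",
        "data warehouse": "data analysis",
        "XML": "document format",
        "PDF": "document format",
    }

    def generalize(keywords, hierarchy=HYPERNYMS):
        """Map each keyword to its parent term; unknown terms pass through."""
        return sorted({hierarchy.get(k, k) for k in keywords})

    print(generalize(["OLAP", "data warehouse", "XML"]))
    # → ['data analysis', 'document format']
    ```

    Three keywords collapse into two more general ones, which is the summarised view the abstract refers to.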

    Requirement modeling for data warehouse using goal-UML approach: the case of health care

    Decision makers use a Data Warehouse (DW) to perform analysis on business information. DW development is a long-term process with a high risk of failure, and it is difficult to estimate future decision-making requirements. Further, current DW design does not consider both early and late requirements analysis during development, especially when using a Unified Modeling Language (UML) approach. It is therefore crucial that DW modeling approaches cover both early and late requirements analysis in the DW design. A case study was conducted on Malaysia Rural Health Care (MRH) to gather the requirements for the DW design. A goal-oriented approach was used to analyze the early requirements, which were later mapped to a UML approach to produce a new DW modeling method called Goal-UML (G-UML). The proposed approach highlights the mapping of the DW conceptual schema to a class diagram to produce a complete MRH-DW design. The correctness of the DW design was evaluated through expert reviews. The G-UML method can contribute to DW development and serve as a guideline for DW developers to produce an improved DW design that meets all user requirements.

    Data Cube Approximation and Mining using Probabilistic Modeling

    On-line Analytical Processing (OLAP) techniques commonly used in data warehouses allow data cubes to be explored along different analysis axes (dimensions) and at different abstraction levels in a dimension hierarchy. However, such techniques are not aimed at mining multidimensional data. Since data cubes are nothing but multi-way tables, we propose to analyze the potential of two probabilistic modeling techniques, namely non-negative multi-way array factorization and log-linear modeling, with the ultimate objective of compressing and mining aggregate, multidimensional values. With the first technique, we compute the set of components that best fits the initial data set and whose superposition coincides with the original data; with the second, we identify a parsimonious model (i.e., one with a reduced set of parameters), highlight strong associations among dimensions and discover possible outliers in data cells. A real-life example is used to (i) discuss the potential benefits of the modeling output for cube exploration and mining, (ii) show how OLAP queries can be answered approximately, and (iii) illustrate the strengths and limitations of these modeling approaches.
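    The log-linear side of this idea can be sketched on a two-way slice of a cube: fit the independence model (the simplest log-linear model) and flag cells whose Pearson residuals are large as candidate outliers. The table values below are invented for illustration and the paper's actual models are richer than plain independence:

    ```python
    # Sketch: independence model on a 2-way cube slice, with Pearson residuals
    # used to spot cells that deviate strongly from the fitted model.

    import math

    def independence_residuals(table):
        """Return expected counts and Pearson residuals under independence."""
        row_tot = [sum(r) for r in table]
        col_tot = [sum(c) for c in zip(*table)]
        n = sum(row_tot)
        expected = [[rt * ct / n for ct in col_tot] for rt in row_tot]
        resid = [[(o - e) / math.sqrt(e) for o, e in zip(orow, erow)]
                 for orow, erow in zip(table, expected)]
        return expected, resid

    sales = [[30, 10], [10, 30]]          # observed cell counts (invented)
    exp, res = independence_residuals(sales)
    outliers = [(i, j) for i, r in enumerate(res)
                for j, v in enumerate(r) if abs(v) > 2]
    print(outliers)  # every cell deviates strongly from independence here
    ```

    The expected counts also show how approximate query answering works: an OLAP query can be answered from the compact fitted model instead of the full cube, at the cost of the residual error.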

    Conceptual models as basis for integrated information warehouse development

    Research in the field of information warehousing mostly focuses on technical aspects. Only recently have some contributions appeared dealing with methodical aspects of information warehouse development processes. Given the central role information warehouses play for management, a development method is presented that concentrates strictly on management views. Language concepts are developed that allow information warehouses to be specified from the management's perspective, and a representation formalism supporting this language is presented. Methodically, the language construction is based on constructive philosophy. Conceptual models are used as meta information in later development phases, and it is shown how metadata for ETL and OLAP tools available on the market can be generated from the conceptual models. The approach presented has been verified by means of a prototype.

    A survey of approaches to organizing the physical level in DBMSs

    In this paper we survey various DBMS physical design options. We consider both vertical and horizontal partitioning, and briefly cover replication. The survey is not limited to local systems but also includes distributed ones; the latter add an interesting new question: how to actually distribute data among several processing nodes. Besides theoretical approaches we consider practical ones, implemented in contemporary DBMSs. We cover these aspects not only from the user's perspective, but also from those of the architect and the programmer.
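    One of the distribution questions the survey raises, how to place data among processing nodes, is most simply answered by hash partitioning on a key. The function, row layout and node count below are illustrative assumptions, not drawn from the survey:

    ```python
    # Illustrative sketch of horizontal (hash) partitioning: rows are assigned
    # to processing nodes by hashing the partitioning key. All names invented.

    def hash_partition(rows, key, n_nodes):
        """Assign each row to one of n_nodes buckets by hashing row[key]."""
        nodes = [[] for _ in range(n_nodes)]
        for row in rows:
            nodes[hash(row[key]) % n_nodes].append(row)
        return nodes

    rows = [{"id": i, "val": i * 10} for i in range(6)]
    parts = hash_partition(rows, "id", 3)
    print([len(p) for p in parts])  # → [2, 2, 2]
    ```

    Real systems (e.g. PostgreSQL's declarative partitioning) also offer range and list partitioning, which trade the uniform spread of hashing for locality on range queries.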

    CubiST++: Evaluating Ad-Hoc CUBE Queries Using Statistics Trees

    We report on a new, efficient encoding for the data cube which results in a drastic speed-up of OLAP queries that aggregate along any combination of dimensions over numerical and categorical attributes. We focus on a class of queries called cube queries, which return aggregated values rather than sets of tuples. Our approach, termed CubiST++ (Cubing with Statistics Trees Plus Families), represents a drastic departure from existing relational (ROLAP) and multidimensional (MOLAP) approaches in that it does not use the view lattice to compute and materialize new views from existing views in some heuristic fashion. Instead, CubiST++ encodes all possible aggregate views in the leaves of a new data structure called a statistics tree (ST) during a one-time scan of the detailed data. To optimize queries involving constraints on hierarchy levels of the underlying dimensions, we select and materialize a family of candidate trees, which represent superviews over the different hierarchical levels of the dimensions. Given a query, our query evaluation algorithm selects the smallest tree in the family that can provide the answer. Extensive evaluations of our prototype implementation have demonstrated its superior run-time performance and scalability compared with existing MOLAP and ROLAP systems.
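    The core idea, materializing every aggregate view in a single scan of the detail data, can be mimicked for a tiny schema with a flat dictionary instead of a statistics tree. This is only a toy analogue of the effect, not the CubiST++ structure itself; all names and data are invented:

    ```python
    # Toy version of "encode every aggregate view in one scan": one pass over
    # the detail rows updates a sum for every subset of the dimensions.

    from itertools import combinations

    def all_views(rows, dims, measure):
        """One scan over rows, accumulating sums for every group-by subset."""
        views = {}
        for row in rows:
            for r in range(len(dims) + 1):
                for subset in combinations(dims, r):
                    key = (subset, tuple(row[d] for d in subset))
                    views[key] = views.get(key, 0) + row[measure]
        return views

    rows = [{"year": 2020, "region": "N", "sales": 5},
            {"year": 2020, "region": "S", "sales": 3},
            {"year": 2021, "region": "N", "sales": 2}]
    cube = all_views(rows, ("year", "region"), "sales")
    print(cube[((), ())])              # → 10  (grand total)
    print(cube[(("year",), (2020,))])  # → 8   (total for 2020)
    ```

    A statistics tree achieves the same coverage far more compactly by sharing structure between views, which is what makes the one-time scan practical for real cubes.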

    Flexible Integration and Efficient Analysis of Multidimensional Datasets from the Web

    If numeric data from the Web are brought together, natural scientists can compare climate measurements with estimations, financial analysts can evaluate companies based on balance sheets and daily stock market values, and citizens can explore GDP per capita across several data sources. However, the heterogeneity and size of the data remain a problem. This work presents methods to query a uniform view, the Global Cube, of available datasets from the Web, building on Linked Data query approaches.

    Specification of management views in information warehouse projects

    Given the major role information warehouses play for management, an approach for specifying management views within the requirements specification phase is presented. Based on a framework relating development phases and abstraction layers, the roles of documents within development processes are organised. The importance of using management views as metadata and parameters in later development phases is elaborated. The transformation of management views into logical data mart schemas and report queries is shown formally by means of algorithms. Development phases are integrated based on meta-level relationships.

    EM-OLAP Framework - Econometric Model Transformation Method for OLAP Design in Intelligence Systems

    Econometrics is currently one of the most popular approaches to economic analysis. To better support advances in these areas, it is necessary to bring econometric problems into econometric intelligent systems. The article describes an econometric OLAP framework that supports the design of a multidimensional database for econometric analyses, increasing the effectiveness of developing econometric intelligent systems. The first part of the article establishes formal rules for the new transformation of the econometric model (TEM) method, which transforms an econometric model into a multidimensional schema using mathematical notation. For the proposed TEM method, the authors pay attention to measuring the quality and understandability of the multidimensional schema and compare it with the original TEM-CM method. In the second part of the article, the authors create a multidimensional database prototype according to the new TEM method and design an OLAP application for econometric analysis.
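    The flavor of such a transformation can be guessed at: treat the dependent variable of an econometric model as the fact measure and each explanatory variable as a candidate dimension. This sketch is an assumption about the general idea, not the actual rules of the TEM method, and every name in it is invented:

    ```python
    # Loose sketch: derive a minimal star-schema description from an
    # econometric model specification. Not the paper's TEM rules.

    def model_to_schema(dependent, explanatory):
        """Dependent variable -> fact measure; explanatory -> dimensions."""
        return {
            "fact_table": {"measures": [dependent]},
            "dimensions": [{"name": v, "levels": [v, "all"]}
                           for v in explanatory],
        }

    schema = model_to_schema("gdp", ["investment", "consumption"])
    print(schema["fact_table"]["measures"])  # → ['gdp']
    ```

    The real method also has to handle lagged variables, hierarchies and quality criteria for the resulting schema, which is where its formal rules come in.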