    A model-based software architecture for XML data and metadata integration in data warehouse systems

    This project is carried out to develop a system prototype of an electronic tendering (e-Tender) system. Several steps were taken, starting with information gathering and analysis, followed by prototype development, and ending with system testing. The prototype was further tested with real users to analyze the document flow speed. In conclusion, the e-Tendering system offers a better approach than the manual tender process. The document flow speed increased by 58.5%, which suggests a more efficient process

    Using Fuzzy Linguistic Representations to Provide Explanatory Semantics for Data Warehouses

    A data warehouse integrates large amounts of extracted and summarized data from multiple sources for direct querying and analysis. While it provides decision makers with easy access to such historical and aggregate data, the real meaning of the data has been ignored. For example, whether a total sales amount of 1,000 items indicates good or bad sales performance is still unclear. From the decision makers' point of view, the semantics that convey the meaning of the data, rather than the raw numbers, are what matter. In this paper, we explore the use of fuzzy technology to provide these semantics for the summarizations and aggregates developed in data warehousing systems. A three-layered data warehouse semantic model, consisting of quantitative (numerical) summarization, qualitative (categorical) summarization, and quantifier summarization, is proposed for capturing and explicating the semantics of warehoused data. Based on the model, several algebraic operators are defined. We also extend the SQL language to allow for flexible queries against such enhanced data warehouses
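
    As a rough illustration of the "1,000 items" question above, the sketch below maps a numeric total onto linguistic terms with trapezoidal membership functions. It is not the paper's three-layered model or its operators; the term names (`low`, `medium`, `high`), the breakpoints, and the helper names are all hypothetical and chosen only for illustration.

```python
def trapezoid(x, a, b, c, d):
    """Membership of x in a trapezoidal fuzzy set.

    a and d are the feet (membership 0), b and c the shoulders (membership 1).
    """
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (d - x) / (d - c)       # falling edge


# Hypothetical linguistic terms for "total items sold" (assumed breakpoints).
SALES_TERMS = {
    "low": (-1, 0, 500, 900),
    "medium": (500, 900, 1100, 1500),
    "high": (1100, 1500, 10**9, 10**9 + 1),
}


def describe_sales(total):
    """Return the membership degree of a total in every linguistic term."""
    return {term: trapezoid(total, *params) for term, params in SALES_TERMS.items()}


def best_label(total):
    """Pick the linguistic term with the highest membership degree."""
    degrees = describe_sales(total)
    return max(degrees, key=degrees.get)


if __name__ == "__main__":
    for total in (300, 800, 1000, 1400):
        print(total, describe_sales(total), best_label(total))
```

    Under these assumed breakpoints a total of 1,000 items is fully "medium", while 800 items is partly "low" and mostly "medium"; different breakpoints would of course give a different reading.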

    Advanced Implementation Techniques for Scientific Data Warehouses

    Data warehouses using a multidimensional view of data have become very popular in both business and science in recent years. Data warehouses for scientific purposes such as medicine and bio-chemistry pose several great challenges to existing data warehouse technology. Data warehouses usually use pre-aggregated data to ensure fast query response. However, pre-aggregation cannot be used in practice if the dimension structures or the relationships between facts and dimensions are irregular. A technique for overcoming this limitation and some experimental results are presented. Queries over scientific data warehouses often need to reference data that is external to the data warehouse, e.g., data that is too complex to be handled by current data warehouse technology, data that is "owned" by other organizations, or data that is updated frequently. An exampl

    Web usage mining for UUM learning care using association rules

    The enormous amount of information on the World Wide Web makes it an obvious candidate for data mining research. The application of data mining techniques to the World Wide Web is referred to as Web mining, a term that has been used in three distinct ways: Web Content Mining, Web Structure Mining, and Web Usage Mining. E-Learning is one Web-based application that faces large amounts of data. To discover the usage patterns and user behaviors of the university E-Learning portal (UUM Educare), this paper implements the high-level process of Web usage mining using a basic association rules algorithm, the Apriori algorithm. Web usage mining consists of three main phases: Data Preprocessing, Pattern Discovery, and Pattern Analysis. The main resource, server log files, serves as the raw data, which must pass through all of the Web usage mining phases to produce the final result, a set of rules. The Web usage mining approach is combined with the basic association rules algorithm, Apriori, to optimize the content of the university E-Learning portal. Finally, this paper presents an overview of the results with analysis, and Web administrators can use the findings to take suitable actions
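
    The pattern discovery phase can be sketched with a small, plain-Python Apriori. This is a minimal sketch, not the paper's implementation: it assumes the preprocessing phase has already turned the server log into one set of visited pages per session, and the page names, thresholds, and function names below are hypothetical.

```python
from itertools import combinations


def apriori(sessions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    sessions = [frozenset(s) for s in sessions]
    n = len(sessions)

    def support(itemset):
        # Fraction of sessions that contain every page in the itemset.
        return sum(itemset <= s for s in sessions) / n

    current = {frozenset([page]) for s in sessions for page in s}
    frequent = {}
    k = 1
    while current:
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemset candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Derive (antecedent, consequent, support, confidence) rules."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                confidence = sup / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, sup, confidence))
    return rules


# Hypothetical sessions, standing in for preprocessed portal log data.
SESSIONS = [
    {"login", "course", "forum"},
    {"login", "course"},
    {"login", "quiz"},
    {"login", "course", "quiz"},
    {"forum"},
]

if __name__ == "__main__":
    frequent = apriori(SESSIONS, min_support=0.4)
    for lhs, rhs, sup, conf in association_rules(frequent, min_confidence=0.9):
        print(sorted(lhs), "->", sorted(rhs), f"support={sup:.2f} confidence={conf:.2f}")
```

    The pattern analysis phase would then filter and interpret such rules; that step depends on the portal and is not shown.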

    Research on Materialized View Selection

    The view selection problem in the field of data warehouses is defined, followed by a discussion of related issues such as the cost model, benefit function, cost computation, constraints, and view indexes. Then three categories of view selection methods, namely static, dynamic, and hybrid methods, are discussed. For each category, some representative work is introduced. Finally, some future trends in this area are discussed. Supported by the National Natural Science Foundation of China under Grant No.60473051 (国家自然科学基金); the National High-Tech Research and Development Plan of China under Grant Nos.2007AA01Z191, 2006AA01Z230 (国家高技术研究发展计划(863)
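
    To make the roles of a cost model and a benefit function concrete, here is a minimal sketch of a greedy static selection. The abstract does not specify the survey's own formulas, so everything here is an assumption: a linear cost model (a query costs as many rows as the smallest materialized view that can answer it), a benefit equal to the total cost saved, a constraint of at most `k` extra views, and a hypothetical four-view lattice.

```python
def greedy_select(sizes, answers, top, k):
    """Greedily pick up to k views to materialize in addition to the base view.

    sizes:   view name -> row count (assumed linear cost model)
    answers: view name -> set of views (queries) it can answer, itself included
    top:     the base view, which is always materialized
    """
    selected = {top}

    def cost(query):
        # Cheapest already-materialized view that can answer the query.
        return min(sizes[v] for v in selected if query in answers[v])

    def benefit(view):
        # Total rows saved over all queries the view can answer.
        return sum(max(0, cost(q) - sizes[view]) for q in answers[view])

    for _ in range(k):
        candidates = [v for v in sizes if v not in selected]
        if not candidates:
            break
        best = max(candidates, key=benefit)
        if benefit(best) <= 0:
            break  # nothing left that reduces query cost
        selected.add(best)
    return selected


# Hypothetical lattice over (product, store): "ps" is the base view.
SIZES = {"ps": 100, "p": 50, "s": 80, "none": 1}
ANSWERS = {
    "ps": {"ps", "p", "s", "none"},
    "p": {"p", "none"},
    "s": {"s", "none"},
    "none": {"none"},
}

if __name__ == "__main__":
    print(greedy_select(SIZES, ANSWERS, top="ps", k=2))
```

    With these assumed sizes, the first pick is `p` (it saves 50 rows on each of two queries) and the second is `none`. Dynamic and hybrid methods would revisit the choice as the workload changes, which this static sketch does not attempt.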

    Development of an approach for real-time SOLAP analysis: adaptation to the needs of outdoor sports activities

    In recent years, different lines of work have been carried out independently within the same research center (Centre de Recherche en Géomatique de l'Université Laval). These include work focused on acquiring and processing spatial data in outdoor sports on the one hand, and work focused on exploring and analyzing spatial data with a SOLAP solution on the other. Combining these efforts made it possible to meet new expectations, and in particular a new application: evaluating and analyzing the performance of athletes practicing an outdoor sport using data computed from GPS observations. Indeed, from GPS observations, the athlete's position, speed, and acceleration can be computed precisely. However, no software made it possible to analyze the newly collected data quickly and easily. Yet coaches of high-level athletes want fast and accurate data on current performance so that they can immediately adapt training and foster the athlete's success. SOLAP technology offers users a very intuitive client interface for spatio-temporal analysis. However, the way it operated did not allow new data obtained from GPS observations to be added quickly. This research therefore aimed to develop an approach that meets the real-time SOLAP analysis needs found in certain applications, particularly in high-level sport. We also verified that a SOLAP solution used in business management to support decision making can be transposed to the analysis of athlete performance. To this end, a just-in-time SOLAP, named SOLAP-SPORT, was developed as part of this research project