458 research outputs found

    Data warehouse stream view update with multiple streaming.

    The main objective of data warehousing is to store information representing an integration of base data from single or multiple data sources over an extended period of time. To provide fast access to the data, regardless of the availability of the data source, data warehouses often use materialized views. Materialized views are able to provide aggregation on some attributes to help Decision Support Systems. Updating materialized views in response to modifications in the base data is called materialized view maintenance. In some applications, for example, the stock market and banking systems, the source data is updated so frequently that we can consider them as a continuous stream of data. To keep the materialized view updated with respect to changes in the base tables in a traditional way will cause query response times to increase. This thesis proposes a new view maintenance algorithm for multiple streaming which improves semi-join methods and hash filter methods. Our proposed algorithm is able to update a view which joins two base tables where both of the base tables are in the form of data streams (always changing). By using a timestamp, building updategrams in parallel and by optimizing the joining cost between two data sources it can reduce the query response time or execution time significantly. Dept. of Computer Science. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .A336. Source: Masters Abstracts International, Volume: 44-03, page: 1391. Thesis (M.Sc.)--University of Windsor (Canada), 2005
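
The abstract above describes keeping a join view current while both base tables change as streams. As a rough illustration only, the sketch below uses a generic symmetric hash join: each arriving tuple is joined against the index of the other stream, so only the delta is computed. This is not the thesis's algorithm; the class `JoinView`, its method names, and the use of the later of the two timestamps are assumptions made for this example.

```python
from collections import defaultdict


class JoinView:
    """Hypothetical incrementally maintained view of R JOIN S on a shared key."""

    def __init__(self):
        self.left = defaultdict(list)   # key -> [(row, timestamp)] seen on stream R
        self.right = defaultdict(list)  # key -> [(row, timestamp)] seen on stream S
        self.rows = []                  # materialized join result

    def insert_left(self, key, row, ts):
        # Join only the new R tuple against what S has delivered so far.
        delta = [(key, row, other, max(ts, other_ts))
                 for other, other_ts in self.right[key]]
        self.left[key].append((row, ts))
        self.rows.extend(delta)
        return delta

    def insert_right(self, key, row, ts):
        # Symmetric case for a new S tuple.
        delta = [(key, other, row, max(ts, other_ts))
                 for other, other_ts in self.left[key]]
        self.right[key].append((row, ts))
        self.rows.extend(delta)
        return delta


view = JoinView()
first = view.insert_left("ACME", {"price": 10.0}, ts=1)    # no partner yet
second = view.insert_right("ACME", {"volume": 500}, ts=2)  # produces one view row
```

Each insert costs time proportional to the number of matching tuples on the other side, rather than recomputing the whole join.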

    Data warehouse stream view update with hash filter.

    A data warehouse usually contains large amounts of information representing an integration of base data from one or more external data sources over a long period of time to provide fast query response time. It stores materialized views which provide aggregation (SUM, MAX, MIN, COUNT and AVG) on some measure attributes of interest for data warehouse users. The process of updating materialized views in response to the modification of the base data is called materialized view maintenance. Some data warehouse application domains, like stock markets, credit cards, automated banking and web log domains, depend on data sources updated as continuous streams of data. In particular, electronic stock trading markets such as the NASDAQ generate large volumes of data, in bursts of up to 4,200 messages per second. This thesis proposes a new view maintenance algorithm (StreamVup), which improves on semi-join methods by using hash filters. The new algorithm first reduces the number of bytes transported through the network for stream tuples, and second reduces the cost of join operations during view update by eliminating the recomputation of view updates caused by newly arriving duplicate tuples. (Abstract shortened by UMI.) Dept. of Computer Science. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2003 .I85. Source: Masters Abstracts International, Volume: 42-05, page: 1753. Adviser: C. I. Ezeife. Thesis (M.Sc.)--University of Windsor (Canada), 2003
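
To make the idea of skipping duplicate stream tuples during aggregate view maintenance concrete, here is a minimal sketch. It is not StreamVup itself, and it does not model the network-side semi-join filtering the abstract mentions. The class `AggregateView`, the `tuple_id` field, and the choice of SHA-1 digests as the filter are assumptions made for illustration.

```python
import hashlib


class AggregateView:
    """Hypothetical aggregate view that ignores tuples it has already applied."""

    def __init__(self):
        self.seen = set()   # digests of tuples already applied (the "hash filter")
        self.groups = {}    # key -> running SUM, COUNT, MIN, MAX

    def apply(self, key, value, tuple_id):
        digest = hashlib.sha1(f"{key}|{value}|{tuple_id}".encode()).digest()
        if digest in self.seen:
            return False    # duplicate: no view recomputation needed
        self.seen.add(digest)
        group = self.groups.get(key)
        if group is None:
            self.groups[key] = {"sum": value, "count": 1, "min": value, "max": value}
        else:
            group["sum"] += value
            group["count"] += 1
            group["min"] = min(group["min"], value)
            group["max"] = max(group["max"], value)
        return True

    def avg(self, key):
        # AVG is derived from SUM and COUNT rather than stored.
        group = self.groups[key]
        return group["sum"] / group["count"]


view = AggregateView()
view.apply("XYZ", 10, tuple_id=1)
view.apply("XYZ", 10, tuple_id=1)   # re-delivered tuple, filtered out
view.apply("XYZ", 30, tuple_id=2)
```

An exact set of digests grows with the stream; a bounded structure would trade memory for occasional false positives, which this sketch does not attempt.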

    Data degradation to enhance privacy for the Ambient Intelligence

    Increasing research in ubiquitous computing techniques towards the development of an Ambient Intelligence raises issues regarding privacy. To gain the data needed to enable applications in this Ambient Intelligence to offer smart services to users, sensors will monitor users' behavior to fill personal context histories. Those context histories will be stored on database/information systems which we consider as honest: they can be trusted now, but might be subject to attacks in the future. Making this assumption implies that protecting context histories by means of access control might not be enough. To reduce the impact of possible attacks, we propose to use limited retention techniques. In our approach, we present applications with a degraded set of data with a retention delay attached to it which matches both application requirements and users' privacy wishes. Data degradation can be twofold: the accuracy of context data can be lowered such that the less privacy-sensitive parts are retained, and context data can be transformed such that only particular abilities for applications remain available. Retention periods can be specified to trigger irreversible removal of the context data from the system.
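
One way to picture accuracy degradation with retention periods is a policy that coarsens a location sample as it ages and drops it once the last period expires. The sketch below is an illustration under assumptions, not the paper's mechanism: the `POLICY` thresholds, the `LocationSample` type, and rounding coordinates as the degradation step are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical policy: (maximum age in seconds, decimal places kept).
POLICY = [(3600, 4), (86400, 2), (604800, 0)]


@dataclass
class LocationSample:
    lat: float
    lon: float
    recorded_at: float  # seconds since some epoch


def degrade(sample: LocationSample, now: float, policy=POLICY) -> Optional[LocationSample]:
    """Return the sample at the accuracy allowed for its age, or None once expired."""
    age = now - sample.recorded_at
    for max_age, digits in policy:
        if age <= max_age:
            return LocationSample(round(sample.lat, digits),
                                  round(sample.lon, digits),
                                  sample.recorded_at)
    return None  # past the final retention period: the sample must be removed


sample = LocationSample(52.23891, 6.89312, recorded_at=0.0)
fresh = degrade(sample, now=10.0)
day_old = degrade(sample, now=7200.0)
expired = degrade(sample, now=700000.0)
```

For the removal to be irreversible, a store would need to overwrite the original record with the degraded one at each step; this function only computes what the record should become.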

    Mining Query Plans for Finding Candidate Queries and Sub-Queries for Materialized Views in BI Systems Without Cube Generation

    Materialized views are important for optimizing Business Intelligence (BI) systems when they are designed without data cubes. Selecting candidate queries for materialized views from a large number of queries is a challenging task. Most past work finds frequent queries in the past workload and creates materialized views from them, either by manually analyzing the workload or by applying approximate string matching algorithms to the query text. Most existing methods suggest complete queries but ignore query components, such as sub-queries, for the creation of materialized views. This paper presents a novel method to determine on which queries and query components materialized views can be created to optimize aggregate and join queries, by mining a database of query execution plans which are in the form of binary trees. The proposed algorithm showed significant improvement in terms of a larger number of optimized queries because it uses the execution plan tree of the query as the basis for selecting queries to be optimized with materialized views, rather than the query text used by traditional methods. For selecting a correct set of queries to be optimized using materialized views, the paper proposes an efficient specialized frequent tree component mining algorithm with novel heuristics to prune the search space. These frequent components are used to determine the possible set of candidate queries for creation of materialized views. Experimentation on standard, real and synthetic data sets, and also the theoretical basis, proved that the proposed method is able to optimize a large number of queries with fewer materialized views and showed a significant improvement in performance compared to traditional methods.
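
The idea of mining plan trees rather than query text can be sketched with a naive support count over plan subtrees. This is not the paper's specialized algorithm and has none of its pruning heuristics: it exhaustively enumerates every complete subtree. The nested-tuple plan encoding, the operator names, and the functions `signature`, `subtrees`, and `frequent_components` are assumptions for this example.

```python
from collections import Counter


def signature(node):
    """Canonical string for a plan subtree given as (operator, child, child, ...)."""
    op, children = node[0], node[1:]
    if not children:
        return op
    return f"{op}({','.join(signature(child) for child in children)})"


def subtrees(node):
    """Yield the node and every subtree below it."""
    yield node
    for child in node[1:]:
        yield from subtrees(child)


def frequent_components(plans, min_support):
    """Return subtree signatures that occur in at least min_support plans."""
    support = Counter()
    for plan in plans:
        # A set counts each distinct subtree once per plan.
        support.update({signature(tree) for tree in subtrees(plan)})
    return {sig: count for sig, count in support.items() if count >= min_support}


join = ("HashJoin", ("Scan:sales",), ("Scan:store",))
plans = [
    ("Aggregate", join),
    ("Sort", join),
    ("Aggregate", ("Scan:sales",)),
]
candidates = frequent_components(plans, min_support=2)
```

Here the shared join subtree surfaces as a candidate even though no two complete queries are identical, which is the kind of component that text-based matching would miss.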

    Maintenance Modification Algorithms and its Implementation on object oriented data warehouse

    A data warehouse (DW) is a database used for reporting. This paper describes a modification algorithm and its implementation on an object-oriented data warehouse. A data warehouse collects information and data from source databases to support the analytical processing of decision support functions and acts as an information provider. Initial research on data warehouses focused on the relational data model. In this paper, the concept of an object-oriented data warehouse is introduced, along with modification maintenance algorithms and their implementation to maintain consistency between the data warehouse and the source database.

    K2/Kleisli and GUS: Experiments in Integrated Access to Genomic Data Sources

    The integration of heterogeneous data sources and software systems is a major issue in the biomedical community and several approaches have been explored: linking databases, on-the-fly integration through views, and integration through warehousing. In this paper we report on our experiences with two systems that were developed at the University of Pennsylvania: an integration system called K2, which has primarily been used to provide views over multiple external data sources and software systems; and a data warehouse called GUS which downloads, cleans, integrates and annotates data from multiple external data sources. Although the view and warehouse approaches each have their advantages, there is no clear winner. Therefore, users must consider how the data is to be used, what the performance guarantees must be, and how much programmer time and expertise is available to choose the best strategy for a particular application.

    Conceptual design of an XML FACT repository for dispersed XML document warehouses and XML marts

    Since the introduction of eXtensible Markup Language (XML), XML repositories have gained a foothold in many global (and government) organizations, where e-Commerce and e-business models have matured in handling daily transactional data among heterogeneous information systems in multiple data formats. Due to this, the amount of data available for the enterprise decision-making process is increasing exponentially and is being stored and/or communicated in XML. This presents an interesting challenge to investigate models, frameworks and techniques for organizing and analyzing such voluminous, yet distributed XML documents for business intelligence in the form of XML warehouse repositories and XML marts. In this paper, we address this issue and propose a view-driven approach for modelling and designing a Global XML FACT (GxFACT) repository under the MDA initiatives. Here we propose the GxFACT using logically grouped, geographically dispersed XML document warehouses and Document Marts in a global enterprise setting. To deal with organizations' evolving decision-making needs, we also provide three design strategies for building and managing such a GxFACT in the context of modelling further hierarchical dimensions and/or global document warehouses.

    Data Warehouse Technology and Application in Data Centre Design for E-government
