
    Knowledge and Metadata Integration for Warehousing Complex Data

    With the ever-growing availability of so-called complex data, especially on the Web, decision-support systems such as data warehouses must store and process data that are not only numerical or symbolic. Warehousing and analyzing such data requires the joint exploitation of metadata and domain-related knowledge, which must thereby be integrated. In this paper, we survey the types of knowledge and metadata that are needed for managing complex data, discuss the issue of knowledge and metadata integration, and propose a CWM-compliant integration solution that we incorporate into an XML complex data warehousing framework we previously designed. Comment: 6th International Conference on Information Systems Technology and its Applications (ISTA 07), Kharkiv, Ukraine (2007).

    A Join Index for XML Data Warehouses

    XML data warehouses form an interesting basis for decision-support applications that exploit complex data. However, native-XML database management systems (DBMSs) currently offer limited performance, and ways to optimize them must be investigated. In this paper, we propose a new join index that is specifically adapted to the multidimensional architecture of XML warehouses. It eliminates join operations while preserving the information contained in the original warehouse. A theoretical study and experimental results demonstrate the efficiency of our join index. They also show that native XML DBMSs can compete with XML-compatible, relational DBMSs when warehousing and analyzing XML data. Comment: 2008 International Conference on Information Resources Management (Conf-IRM 08), Niagara Falls, Canada (2008).
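
    The abstract does not detail the index structure, but a join index in this setting can be read as a precomputed mapping from facts to the dimension members they reference, built once so that analytical queries no longer pay for per-query joins. The sketch below is a minimal Python illustration of that idea; the file names, element names, and the *_id attributes are hypothetical, not the paper's actual schema.

```python
# Minimal sketch of a precomputed join index for an XML warehouse.
# Hypothetical layout: facts.xml holds <fact id="..."> elements whose
# <dimension>_id attributes reference <member id="..."> elements stored
# in one XML document per dimension. Names are illustrative only.
import xml.etree.ElementTree as ET

def build_join_index(facts_file, dimension_files):
    """Map each fact id to the dimension member elements it references,
    so analytical queries can filter facts without per-query joins."""
    # Load every dimension member once, keyed by (dimension name, member id).
    members = {}
    for dim_name, path in dimension_files.items():
        for member in ET.parse(path).getroot().iter("member"):
            members[(dim_name, member.get("id"))] = member

    index = {}
    for fact in ET.parse(facts_file).getroot().iter("fact"):
        refs = {}
        for dim_name in dimension_files:
            ref = fact.get(f"{dim_name}_id")  # e.g. customer_id="42"
            if ref is not None:
                refs[dim_name] = members[(dim_name, ref)]
        index[fact.get("id")] = refs
    return index

# Usage (hypothetical files): build the index once, then query it directly.
# idx = build_join_index("facts.xml", {"customer": "dim_customer.xml",
#                                      "date": "dim_date.xml"})
# french = [f for f, dims in idx.items()
#           if dims["customer"].findtext("country") == "France"]
```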

    Data Mining-based Fragmentation of XML Data Warehouses

    With the multiplication of XML data sources, many XML data warehouse models have been proposed to handle data heterogeneity and complexity in a way relational data warehouses fail to achieve. However, XML-native database systems currently suffer from limited performance, both in terms of manageable data volume and response time. Fragmentation helps address both these issues. Derived horizontal fragmentation is typically used in relational data warehouses and can definitely be adapted to the XML context. However, the number of fragments produced by classical algorithms is difficult to control. In this paper, we propose a k-means-based fragmentation approach that makes it possible to control the number of fragments through its k parameter. We experimentally compare its efficiency to classical derived horizontal fragmentation algorithms adapted to XML data warehouses and show its superiority.
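
    As one plausible reading of the approach (the abstract does not give the exact data representation), each warehouse row can be encoded by the workload predicates that select it, and k-means then groups rows that are accessed together into exactly k fragments. The Python sketch below illustrates this principle only; the predicate encoding, the scikit-learn call, and the sample data are assumptions for illustration, not the paper's implementation.

```python
# Illustrative k-means-driven horizontal fragmentation (not the paper's
# exact algorithm): rows are encoded by the workload predicates selecting
# them, and k-means groups co-accessed rows into exactly k fragments.
import numpy as np
from sklearn.cluster import KMeans

def fragment(rows, predicates, k):
    """rows: list of dicts; predicates: boolean functions over a row;
    k: the requested number of fragments (the parameter bounding fragment count)."""
    # Binary usage matrix: entry (i, j) = 1 if predicate j selects row i.
    usage = np.array([[1 if p(r) else 0 for p in predicates] for r in rows])
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(usage)
    fragments = [[] for _ in range(k)]
    for row, label in zip(rows, labels):
        fragments[label].append(row)
    return fragments

# Hypothetical usage: three workload predicates, two fragments requested.
# rows = [{"country": "FR", "year": 2006}, {"country": "UA", "year": 2007}]
# preds = [lambda r: r["country"] == "FR",
#          lambda r: r["year"] >= 2007,
#          lambda r: r["country"] == "UA"]
# frags = fragment(rows, preds, k=2)
```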

    MaxPart: An Efficient Search-Space Pruning Approach to Vertical Partitioning

    Vertical partitioning is the process of subdividing the attributes of a relation into groups, creating fragments. It represents an effective way of improving performance in database systems where a significant percentage of query processing time is spent on full scans of tables. Most of the proposed approaches to vertical partitioning in databases use a pairwise affinity to cluster the attributes of a given relation. The affinity measures the frequency with which a pair of attributes is accessed simultaneously. Attributes having high affinity are clustered together so as to create fragments containing as many strongly connected attributes as possible. However, such fragments can be obtained directly and efficiently by using maximal frequent itemsets. This knowledge-discovery technique better reflects the closeness or affinity when more than two attributes are involved, and it makes the partitioning process faster and more accurate. In this paper, an approach to vertical partitioning based on maximal frequent itemsets is proposed to efficiently search for an optimized solution by judiciously pruning the potential search space. Moreover, we propose an analytical cost model to evaluate the produced partitions. Experimental studies show that the cost of the partitioning process can be substantially reduced using only a limited set of potential fragments. They also demonstrate the effectiveness of our approach in partitioning both small and large tables.
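
    The abstract describes the general idea rather than the algorithm's internals, so the sketch below only illustrates the underlying principle: treat each workload query as a transaction of accessed attributes, mine the frequent attribute sets, and keep the maximal ones as candidate vertical fragments. The brute-force enumeration and the sample workload are assumptions; MaxPart's actual pruning strategy and cost model are not reproduced here.

```python
# Illustrative workload-driven vertical partitioning via maximal frequent
# itemsets (not the MaxPart algorithm itself): each query is a transaction
# of accessed attributes; frequently co-accessed attribute sets become
# candidate fragments, and only the maximal ones are kept.
from itertools import combinations

def maximal_frequent_attribute_sets(query_attrs, min_support):
    """query_attrs: one set of accessed attribute names per workload query.
    Returns the frequent attribute sets having no frequent proper superset."""
    # Brute-force support counting over every subset of each query
    # (adequate for a sketch; exponential in the size of a query).
    support = {}
    for attrs in query_attrs:
        for size in range(1, len(attrs) + 1):
            for combo in combinations(sorted(attrs), size):
                support[combo] = support.get(combo, 0) + 1
    frequent = [c for c, count in support.items() if count >= min_support]
    # An itemset is maximal if no frequent strict superset of it exists.
    return [set(c) for c in frequent
            if not any(set(c) < set(other) for other in frequent)]

# Hypothetical workload of three queries over attributes a, b, c, d:
# workload = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}]
# maximal_frequent_attribute_sets(workload, min_support=2)  # -> [{"a", "b"}]
```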

    Authenticating Query Results in Edge Computing

    DOI: 10.1109/ICDE.2004.1320027. Proceedings - International Conference on Data Engineering (ICDE 2004), vol. 20, pp. 560-571.

    Open archival information systems for database preservation

    Integrated Master's thesis in Informatics and Computing Engineering. Universidade do Porto, Faculdade de Engenharia. 201