7,549 research outputs found

    XML warehouse modelling and querying

    Integrating XML documents into data warehouses is a major issue for decision-support data processing and business intelligence, as this type of data is increasingly used in organisations’ information systems. However, current warehousing systems do not manage documents as they manage data extracted from relational databases. We have therefore developed a multidimensional model based on the Unified Modeling Language (UML) to describe an XML Document Warehouse (XDW). The resulting warehouse diagram is a star schema (StarCD) whose fact represents the class of documents to be analyzed and whose dimensions correspond to analysis criteria extracted from the structure of the documents. The standard XQuery language can express queries on XML documents, but it is not suitable for analyzing a warehouse because its syntax is too complex for non-IT specialists. This paper presents a new language that is aimed at decision-makers and allows OLAP queries to be applied to an XDW described by a StarCD.

    Data cube computational model with Hadoop MapReduce

    XML has become a widely used and well-structured data format for digital document handling and message transmission. To find useful knowledge in XML data, data warehouse and OLAP applications that support decision making should be developed. Apache Hadoop is an open-source cloud computing framework that provides a distributed file system for large-scale data processing. In this paper, we discuss an XML data cube model that offers complete views of XML data, and present a basic algorithm to implement its building process on Hadoop. To improve efficiency, an optimized algorithm better suited to this kind of XML data is also proposed. The experimental results given in the paper demonstrate the effectiveness of our optimization strategies.
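    As a rough illustration of the idea in this abstract, the sketch below builds a full data cube from XML records in a map/reduce style. It is not the paper's algorithm: it is the naive approach in which every record is emitted once per cuboid, simulated in a single Python process rather than on Hadoop. The sample document, the element and attribute names (`sale`, `region`, `year`, `product`, `amount`), and all function names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from itertools import combinations

# Hypothetical sample data: element and attribute names are assumptions.
SAMPLE_XML = """
<sales>
  <sale region="EU" year="2010" product="book"><amount>10</amount></sale>
  <sale region="EU" year="2011" product="book"><amount>5</amount></sale>
  <sale region="US" year="2010" product="pen"><amount>7</amount></sale>
</sales>
"""

DIMENSIONS = ("region", "year", "product")  # assumed analysis dimensions
ALL = "*"  # marker for a dimension that has been aggregated away


def map_record(elem):
    """Map phase: emit one (cell key, measure) pair per cuboid for a record."""
    measure = float(elem.findtext("amount"))
    values = [elem.get(name) for name in DIMENSIONS]
    positions = range(len(DIMENSIONS))
    # Every subset of dimensions defines one cuboid (2^n in total).
    for size in range(len(DIMENSIONS) + 1):
        for kept in combinations(positions, size):
            key = tuple(values[i] if i in kept else ALL for i in positions)
            yield key, measure


def reduce_cells(pairs):
    """Reduce phase: sum the measures that share the same cell key."""
    totals = defaultdict(float)
    for key, measure in pairs:
        totals[key] += measure
    return dict(totals)


def build_cube(xml_text):
    """Parse the XML text, run the map phase per record, then reduce."""
    root = ET.fromstring(xml_text.strip())
    pairs = (pair for sale in root.iter("sale") for pair in map_record(sale))
    return reduce_cells(pairs)


cube = build_cube(SAMPLE_XML)
```

    Here `cube[("EU", "*", "*")]` holds the total amount for the EU region across all years and products. Emitting each record to all 2^n cuboids is simple but generates a large volume of intermediate pairs, which is the kind of cost an optimized algorithm would aim to reduce.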

    Modeling views in the layered view model for XML using UML

    In data engineering, view formalisms are used to provide flexibility to users and user applications by allowing them to extract and elaborate data from the stored data sources. Meanwhile, since its introduction, Extensible Markup Language (XML) has been fast emerging as the dominant standard for storing, describing, and interchanging data among various web and heterogeneous data sources. In combination with XML Schema, XML provides rich facilities for defining and constraining user-defined data semantics and properties, a feature that is unique to XML. In this context, it is interesting to investigate traditional database features, such as view models and view design techniques, for XML. However, traditional view formalisms are strongly coupled to the data language and its syntax, so supporting views for semi-structured data models proves to be a difficult task. Therefore, in this paper we propose a Layered View Model (LVM) for XML with conceptual and schemata extensions. Our work is threefold. First, we propose an approach that separates the implementation and conceptual aspects of views, providing a clear separation of concerns and allowing the analysis and design of views to be separated from their implementation. Second, we define representations to express and construct these views at the conceptual level. Third, we define a view transformation methodology for XML views in the LVM, which carries out automated transformation to a view schema and a view query expression in an appropriate query language. In addition, to validate and apply the LVM concepts, methods and transformations developed, we propose a view-driven application development framework with the flexibility to develop web and database applications for XML at varying levels of abstraction.

    XLDM: an xlink-based multidimensional metamodel

    The growth of data available on the Internet and the improvement of ways to handle it constitute an important issue when designing a data model. In this context, XML provides the necessary formalism to establish a standard for representing and exchanging data. Since data warehouse technologies are often used for data analysis, it is necessary to define a data cube model for XML. However, data representation in XML may generate syntactic, semantic and structural heterogeneity problems in XML documents, which are not considered by related approaches. Solving these problems requires the definition of a data schema. This paper proposes a metamodel to specify XML document cubes, based on relationships between elements and XML documents. This approach solves the XML data heterogeneity problems by taking advantage of data schema definition and of relationships defined by XLink. The methodology used provides formal rules to define the concepts proposed. This formalism is then instantiated using XML Schema and XLink. The paper also presents a case study in the medical field and a comparison with XBRL Dimensions, a financial and multidimensional data model which uses XLink.

    XWeB: the XML Warehouse Benchmark

    With the emergence of XML as a standard for representing business data, new decision support applications are being developed. These XML data warehouses aim at supporting On-Line Analytical Processing (OLAP) operations that manipulate irregular XML data. To ensure the feasibility of these new tools, important performance issues must be addressed. Performance is customarily assessed with the help of benchmarks. However, decision support benchmarks do not currently support XML features. In this paper, we introduce the XML Warehouse Benchmark (XWeB), which aims at filling this gap. XWeB derives from the relational decision support benchmark TPC-H. It is mainly composed of a test data warehouse, which is based on a unified reference model for XML warehouses and features XML-specific structures, and its associated XQuery decision support workload. XWeB's usage is illustrated by experiments on several XML database management systems.

    XML content warehousing: Improving sociological studies of mailing lists and web data

    In this paper, we present the guidelines for an XML-based approach to the sociological study of Web data, such as the analysis of mailing lists or databases available online. The use of an XML warehouse is a flexible solution for storing and processing this kind of data. We propose an implemented solution and show possible applications with our case study of profiles of experts involved in W3C standard-setting activity. We illustrate the sociological use of semi-structured databases by presenting our XML Schema for mailing-list warehousing. An XML Schema allows many additions or crossings of data sources without modifying existing data sets, while allowing possible structural evolution. We also show that the existence of hidden data implies increased complexity for traditional SQL users. XML content warehousing allows both exhaustive warehousing and recursive queries through contents, with far less dependence on the initial storage. We finally present the possibility of exporting the data stored in the warehouse to commonly used advanced software devoted to sociological analysis.