3,513 research outputs found

    Modeling views in the layered view model for XML using UML

    Get PDF
    In data engineering, view formalisms are used to provide flexibility to users and user applications by allowing them to extract and elaborate data from the stored data sources. Conversely, since the introduction of Extensible Markup Language (XML), it is fast emerging as the dominant standard for storing, describing, and interchanging data among various web and heterogeneous data sources. In combination with XML Schema, XML provides rich facilities for defining and constraining user-defined data semantics and properties, a feature that is unique to XML. In this context, it is interesting to investigate traditional database features, such as view models and view design techniques for XML. However, traditional view formalisms are strongly coupled to the data language and its syntax, thus it proves to be a difficult task to support views in the case of semi-structured data models. Therefore, in this paper we propose a Layered View Model (LVM) for XML with conceptual and schemata extensions. Here our work is three-fold; first we propose an approach to separate the implementation and conceptual aspects of the views that provides a clear separation of concerns, thus, allowing analysis and design of views to be separated from their implementation. Secondly, we define representations to express and construct these views at the conceptual level. Thirdly, we define a view transformation methodology for XML views in the LVM, which carries out automated transformation to a view schema and a view query expression in an appropriate query language. Also, to validate and apply the LVM concepts, methods and transformations developed, we propose a view-driven application development framework with the flexibility to develop web and database applications for XML, at varying levels of abstraction

    XML content warehousing: Improving sociological studies of mailing lists and web data

    Get PDF
    In this paper, we present the guidelines for an XML-based approach for the sociological study of Web data such as the analysis of mailing lists or databases available online. The use of an XML warehouse is a flexible solution for storing and processing this kind of data. We propose an implemented solution and show possible applications with our case study of profiles of experts involved in W3C standard-setting activity. We illustrate the sociological use of semi-structured databases by presenting our XML Schema for mailing-list warehousing. An XML Schema allows many adjunctions or crossings of data sources, without modifying existing data sets, while allowing possible structural evolution. We also show that the existence of hidden data implies increased complexity for traditional SQL users. XML content warehousing allows altogether exhaustive warehousing and recursive queries through contents, with far less dependence on the initial storage. We finally present the possibility of exporting the data stored in the warehouse to commonly-used advanced software devoted to sociological analysis

    XML Reconstruction View Selection in XML Databases: Complexity Analysis and Approximation Scheme

    Full text link
    Query evaluation in an XML database requires reconstructing XML subtrees rooted at nodes found by an XML query. Since XML subtree reconstruction can be expensive, one approach to improve query response time is to use reconstruction views - materialized XML subtrees of an XML document, whose nodes are frequently accessed by XML queries. For this approach to be efficient, the principal requirement is a framework for view selection. In this work, we are the first to formalize and study the problem of XML reconstruction view selection. The input is a tree TT, in which every node ii has a size cic_i and profit pip_i, and the size limitation CC. The target is to find a subset of subtrees rooted at nodes i1,⋯ ,iki_1,\cdots, i_k respectively such that ci1+⋯+cik≤Cc_{i_1}+\cdots +c_{i_k}\le C, and pi1+⋯+pikp_{i_1}+\cdots +p_{i_k} is maximal. Furthermore, there is no overlap between any two subtrees selected in the solution. We prove that this problem is NP-hard and present a fully polynomial-time approximation scheme (FPTAS) as a solution

    Extracting, Transforming and Archiving Scientific Data

    Get PDF
    It is becoming common to archive research datasets that are not only large but also numerous. In addition, their corresponding metadata and the software required to analyse or display them need to be archived. Yet the manual curation of research data can be difficult and expensive, particularly in very large digital repositories, hence the importance of models and tools for automating digital curation tasks. The automation of these tasks faces three major challenges: (1) research data and data sources are highly heterogeneous, (2) future research needs are difficult to anticipate, (3) data is hard to index. To address these problems, we propose the Extract, Transform and Archive (ETA) model for managing and mechanizing the curation of research data. Specifically, we propose a scalable strategy for addressing the research-data problem, ranging from the extraction of legacy data to its long-term storage. We review some existing solutions and propose novel avenues of research.Comment: 8 pages, Fourth Workshop on Very Large Digital Libraries, 201
    • …
    corecore