
    A unified view of data-intensive flows in business intelligence systems: a survey

    Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data from a multitude of data sources in user-preferred and analysis-ready formats. To meet the complex requirements of next-generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time, operational data flows that integrate source data at runtime. Both academia and industry thus need a clear understanding of the foundations of data-intensive flows and of the challenges of moving towards next-generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next-generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing the challenges still to be addressed and showing how current solutions can be applied to them.
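    As a minimal illustration of the batched ETL pattern the survey contrasts with on-demand flows, the sketch below extracts rows from a hypothetical CSV source, transforms them into an analysis-ready shape, and loads them into a SQLite table standing in for the data warehouse. The file name, columns, and table schema are all invented for the example.

```python
# Minimal batched ETL sketch: extract rows from a (hypothetical) CSV source,
# transform them into an analysis-ready shape, and load them into a SQLite
# table standing in for the data warehouse. All names are illustrative.
import csv
import sqlite3

def extract(path):
    with open(path, newline="") as f:
        yield from csv.DictReader(f)

def transform(rows):
    for row in rows:
        # Normalize the customer name and cast the amount to a number.
        yield (row["customer"].strip().title(), float(row["amount"]))

def load(records, db="warehouse.db"):
    con = sqlite3.connect(db)
    con.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    con.executemany("INSERT INTO sales VALUES (?, ?)", records)
    con.commit()
    con.close()

load(transform(extract("sales_2024.csv")))  # one batch run of the flow
```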

    Modeling views in the layered view model for XML using UML

    In data engineering, view formalisms are used to provide flexibility to users and user applications by allowing them to extract and elaborate data from the stored data sources. Meanwhile, since its introduction, the Extensible Markup Language (XML) has fast emerged as the dominant standard for storing, describing, and interchanging data among various web and heterogeneous data sources. In combination with XML Schema, XML provides rich facilities for defining and constraining user-defined data semantics and properties, a feature that is unique to XML. In this context, it is interesting to investigate traditional database features, such as view models and view design techniques, for XML. However, traditional view formalisms are strongly coupled to the data language and its syntax, so supporting views over semi-structured data models proves difficult. In this paper we therefore propose a Layered View Model (LVM) for XML with conceptual and schemata extensions. Our work here is threefold: first, we propose an approach to separate the implementation and conceptual aspects of views, providing a clear separation of concerns and allowing the analysis and design of views to be separated from their implementation. Second, we define representations to express and construct these views at the conceptual level. Third, we define a view transformation methodology for XML views in the LVM, which carries out automated transformation to a view schema and a view query expression in an appropriate query language. To validate and apply the LVM concepts, methods, and transformations developed, we also propose a view-driven application development framework with the flexibility to develop web and database applications for XML at varying levels of abstraction.
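    The LVM itself is specified in the paper rather than here; purely as a rough illustration of what an XML view is (a query expression that derives a restructured document from a stored source), the following sketch materializes a simple view using Python's xml.etree.ElementTree. All element names and the selection criterion are invented.

```python
# Rough illustration of an XML view: a query that derives a restructured
# document from a stored XML source. Element names are invented.
import xml.etree.ElementTree as ET

source = ET.fromstring("""
<library>
  <book year="2004"><title>Views for XML</title><author>A. Author</author></book>
  <book year="1999"><title>Older Title</title><author>B. Author</author></book>
</library>""")

def recent_books_view(doc, since=2000):
    """Materialize a view: titles of books published in or after `since`."""
    view = ET.Element("recentBooks")
    for book in doc.iterfind("book"):
        if int(book.get("year")) >= since:
            ET.SubElement(view, "title").text = book.findtext("title")
    return view

print(ET.tostring(recent_books_view(source), encoding="unicode"))
# <recentBooks><title>Views for XML</title></recentBooks>
```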

    Modified Query-Roles Based Access Control Model (Q-RBAC) for Interactive Access of Ontology Data

    The data access model plays an important role in accessing and querying the data stored in a database. It provides access rights and authorization for accessing data in a database, and it distinguishes the access boundaries between administrators and users: database administrators can create certain policies either from the client application side or directly from the database side, depending upon the nature of the running application. However, emerging ontology repositories have forced some database developers to adopt most of their access policies from traditional database systems, and many of these policies were inherited from relational databases. This method of adopting or borrowing access policies from other storage systems has created an unnecessary layer between the ontology repository and the database. Most emerging ontology repositories lack an independent access model that provides or distinguishes access rights between administrators and users or between ontology data. This paper proposes an improved access layer for the ontology repository with an additional user-policy creation layer, which increases data security and also improves the performance of querying data. Our effort relies on re-modifying the traditional role-based access control model into the proposed model, organized around rich user policies and a query rewriting layer. Although it is associated with the query module, the proposed model has an additional security layer to restrict unauthorized users from accessing stored data, improving querying and data access performance.
    Keywords: access methods, access control, role-based access control model, Oracle NoSQL database, virtual data layer, ontology query
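    The paper describes its model only at a high level; the sketch below merely illustrates the general pattern it builds on, namely role-based access enforced by rewriting each query against a per-role policy before execution. Role names, the policy shape, and the SQL-style query are invented for the example.

```python
# Sketch of role-based access enforcement by query rewriting: the caller's
# role selects a policy predicate that is appended to the query before it
# reaches the store. Roles, predicates, and the query shape are invented.
ROLE_POLICIES = {
    "admin":   None,                          # unrestricted access
    "analyst": "classification = 'public'",   # row-level restriction
}

def rewrite(sql, role):
    if role not in ROLE_POLICIES:
        raise PermissionError(f"no access policy defined for role {role!r}")
    policy = ROLE_POLICIES[role]
    if policy is None:
        return sql                            # e.g. administrators
    # Naive clause splicing, sufficient for a sketch.
    joiner = " AND " if " where " in sql.lower() else " WHERE "
    return sql + joiner + policy

print(rewrite("SELECT subject, predicate, object FROM triples", "analyst"))
# SELECT subject, predicate, object FROM triples WHERE classification = 'public'
```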

    Introduction to the TPLP special issue, logic programming in databases: From Datalog to semantic-web rules

    Much has happened in data and knowledge base research since the introduction of the relational model in Codd (1970), and its strong logical foundations have influenced its advances ever since. Logic has long been a common ground where Database and Artificial Intelligence research competed and collaborated with each other (Abiteboul et al. 1995). The product of this joint effort has been a set of logic-based formalisms, such as the Relational Calculus (Codd 1970), Datalog (Ceri et al. 1990), and Description Logics (Baader et al. 2007), capturing not only the structure but also the semantics of data in an explicit way, thus enabling complex inference procedures. This special issue contains three rigorously reviewed articles addressing problems that span from Query Answering to Data Mining. All these contributions have their roots in the foundational formalisms of Data and Knowledge Bases, such as Logic Programming, Description Logic, and Hybrid Logics, representing a clear example of the effort that the Database and Semantic-Web communities are making to bridge the various schools of thought in modern Data and Knowledge Management.
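    As a concrete reminder of the kind of logic-based formalism this issue builds on, here is a naive bottom-up evaluation of the classic Datalog ancestor program, sketched in Python; the facts are invented.

```python
# Naive bottom-up evaluation of the classic Datalog program
#   ancestor(X, Y) :- parent(X, Y).
#   ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y).
# Evaluation iterates the rules until a fixpoint is reached.
parent = {("ann", "bob"), ("bob", "carol"), ("carol", "dave")}

def ancestors(parent_facts):
    facts = set(parent_facts)                  # rule 1: every parent edge
    while True:
        derived = {(x, y)                      # rule 2: join parent with ancestor
                   for (x, z) in parent_facts
                   for (w, y) in facts if w == z}
        if derived <= facts:                   # fixpoint: nothing new derivable
            return facts
        facts |= derived

print(sorted(ancestors(parent)))
# [('ann', 'bob'), ('ann', 'carol'), ('ann', 'dave'), ('bob', 'carol'), ...]
```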

    An Approach to Conceptual Schema Evolution

    In this work we will analyse the conceptual foundations of user-centric content management. Content management often involves the integration of content created from different points of view. Current modeling techniques, and especially current systems, lack sufficient support for handling these situations. Although schema integration is undecidable in general, we will introduce a conceptual model together with a modeling and maintenance methodology that simplifies content integration in many practical situations. We will define a conceptual model based on the Higher-Order Entity Relationship Model that combines the advantages of schema-oriented modeling techniques, like ER modeling, with element-driven paradigms, like approaches for semistructured data management. This model is ready to support contextual reasoning based on local model semantics. For the special case of schema evolution based on schema versioning, we will derive the compatibility relation between local models by tracking the dependencies of schema revisions. Additionally, we will discuss implementational facets, such as storage aspects for structurally flexible content and the generation of adaptive user interfaces based on a conceptual interaction model.
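    The compatibility relation itself is derived in the thesis; purely as a loose sketch of the underlying bookkeeping, one can record schema revisions as a dependency graph and treat a local model as compatible with a later version when that version is reachable through tracked revisions. Version names and edges below are invented, and the actual relation may well be richer than plain reachability.

```python
# Loose sketch: schema revisions tracked as a dependency graph, with
# compatibility taken as reachability along recorded revision steps.
# Version names and edges are invented.
REVISIONS = {                      # version -> versions directly derived from it
    "v1": {"v2"},
    "v2": {"v3", "v3-branch"},
    "v3": set(),
    "v3-branch": set(),
}

def compatible(old, new, revisions=REVISIONS):
    """True if `new` was derived from `old` through tracked revisions."""
    frontier, seen = {old}, set()
    while frontier:
        version = frontier.pop()
        if version == new:
            return True
        seen.add(version)
        frontier |= revisions.get(version, set()) - seen
    return False

print(compatible("v1", "v3"))         # True: v1 -> v2 -> v3
print(compatible("v3-branch", "v3"))  # False: parallel branches
```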

    Implementation of Web Query Languages Reconsidered

    Visions of the next generation Web, such as the "Semantic Web" or the "Web 2.0", have triggered the emergence of a multitude of data formats. These formats have different characteristics as far as the shape of data is concerned (for example, tree- vs. graph-shaped). They are accompanied by a puzzlingly large number of query languages, each limited to one data format. Thus, a key feature of the Web, namely to make it possible to access anything published by anyone, is compromised. This thesis is devoted to versatile query languages capable of accessing data in a variety of Web formats. The issue is addressed from three angles: language design; common, yet uniform, semantics; and common, yet uniform, evaluation. The thesis is accordingly divided into three parts. First, we consider the query language Xcerpt as an example of the advocated class of versatile Web query languages. Using this concrete exemplar allows us to clarify and discuss the vision of versatility in detail. Second, a number of query languages, XPath, XQuery, SPARQL, and Xcerpt, are translated into a common intermediary language, CIQLog. This language has a purely logical semantics, which makes it easily amenable to optimizations. As a side effect, this provides, to the best of our knowledge, the first logical semantics for XQuery and SPARQL. It is a very useful tool for understanding the commonalities and differences of the considered languages. Third, the intermediate logical language is translated into a query algebra, CIQCAG. The core feature of CIQCAG is that it scales from tree- to graph-shaped data and queries without efficiency losses when tree data and queries are considered: it is shown that, in these cases, optimal complexities are achieved. CIQCAG is also shown to evaluate each of the aforementioned query languages with a complexity at least as good as the best known evaluation methods so far. For example, navigational XPath is evaluated with space complexity O(q·d) and time complexity O(q·n), where q is the query size, n the data size, and d the depth of the (tree-shaped) data. CIQCAG is further shown to provide linear-time and linear-space evaluation of tree-shaped queries for a larger class of graph-shaped data than any method previously proposed. This larger class of graph-shaped data, called continuous-image graphs (CIGs for short), is introduced for the first time in this thesis. A (directed) graph is a CIG if its nodes can be totally ordered in such a manner that, for this order, the children of any node form a continuous interval. CIQCAG achieves these properties by employing a novel data structure, called a sequence map, that allows an efficient evaluation of tree-shaped queries, or of the tree-shaped cores of graph-shaped queries, on any graph-shaped data. While being ideally suited to trees and CIGs, the data structure gracefully degrades to unrestricted graphs, still yielding a remarkably efficient evaluation on graph-shaped data that only a few edges keep from being trees or CIGs.
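    The CIG definition above is concrete enough to check directly: under a candidate total order, every node's children must occupy a contiguous block of positions. The sketch below verifies this property for a given order (finding such an order is the harder problem and is not attempted here); the example graph is invented.

```python
# Check the continuous-image-graph (CIG) property for a *given* total order:
# under that order, the children of every node must occupy a contiguous
# block of positions. The example graph is invented.
def is_cig_order(children, order):
    pos = {node: i for i, node in enumerate(order)}
    for kids in children.values():
        positions = sorted(pos[kid] for kid in kids)
        if positions and positions[-1] - positions[0] + 1 != len(positions):
            return False   # a gap: these children form no continuous interval
    return True

graph = {"a": {"b", "c"}, "b": {"c", "d"}, "c": set(), "d": set()}
print(is_cig_order(graph, ["a", "b", "c", "d"]))  # True: {b,c} and {c,d} contiguous
```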