3,615 research outputs found

    A unified view of data-intensive flows in business intelligence systems : a survey

    Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet complex requirements of next generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that are still to be addressed, and how the current solutions can be applied for addressing these challenges. Peer Reviewed. Postprint (author's final draft).

    Who Owns Data In The Enterprise? Rethinking Data Ownership in Time of Big Data and Analytics

    Today, a myriad of data is generated via connected devices and digital applications. With recent advances in artificial intelligence (AI), companies are seeking new opportunities to monetize data. This goes along with improving their capabilities to manage big data and analytics (BDA). A critical factor that is often cited concerning the ‘soft’ aspects of BDA is data ownership, i.e. clarifying the fundamental rights and responsibilities for data. Scholars have investigated data ownership for operational systems and data warehouses, where the purpose of data processing is known. In the BDA context, defining accountabilities for data ownership is more challenging, because data is stored in data lakes and used for new, previously unknown purposes. Based on insights from three case studies with extensive experience in BDA, we identify ownership principles and three data ownership types: data, data platform, and data product. By redefining the concept of data ownership, our research answers fundamental questions about how data management changes with BDA, extending existing concepts on data ownership and contributing to the data governance literature.

    Data ownership revisited: clarifying data accountabilities in times of big data and analytics

    Today, a myriad of data is generated via connected devices and digital applications. In order to benefit from these data, companies have to develop their capabilities related to big data and analytics (BDA). A critical factor that is often cited concerning the “soft” aspects of BDA is data ownership, i.e., clarifying the fundamental rights and responsibilities for data. IS research has investigated data ownership for operational systems and data warehouses, where the purpose of data processing is known. In the BDA context, defining accountabilities for data is more challenging because data are stored in data lakes and used for previously unknown purposes. Based on four case studies, we identify ownership principles and three distinct types: data, data platform, and data product ownership. Our research answers fundamental questions about how data management changes with BDA and lays the foundation for future research on data and analytics governance.

    Materializing baseline views for deviation detection exploratory OLAP

    The final publication is available at link.springer.com. Alert-raising and deviation detection in OLAP and exploratory search concerns calling the user’s attention to variations and non-uniform data distributions, or directing the user to the most interesting exploration of the data. In this paper, we are interested in the ability of a data warehouse to continuously monitor new data, and to update accordingly a particular type of materialized views recording statistics, called baselines. It should be possible to detect deviations at various levels of aggregation, and baselines should be fully integrated into the database. We propose Multi-level Baseline Materialized Views (BMV), including the mechanisms to build, refresh and detect deviations. We also propose an incremental approach and formula for refreshing baselines efficiently. An experimental setup proves the concept and shows its efficiency. Peer Reviewed. Postprint (author's final draft).
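
    The abstract does not give the refresh formula, so the following is only a minimal sketch of the general idea, under stated assumptions: each baseline cell keeps a count, a sum, and a sum of squares, which can be updated from new rows without rescanning history, and a deviation is flagged with a simple z-score test. The names `Baseline`, `refresh`, `deviations`, and the threshold of 3.0 are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


class Baseline:
    """Running statistics for one cell at one aggregation level (assumed design)."""

    def __init__(self):
        self.n = 0      # number of observations
        self.s = 0.0    # sum of values
        self.ss = 0.0   # sum of squared values

    def add(self, x):
        # Incremental update: no need to revisit earlier rows.
        self.n += 1
        self.s += x
        self.ss += x * x

    def mean(self):
        return self.s / self.n

    def std(self):
        # Population standard deviation derived from the running sums.
        var = max(self.ss / self.n - self.mean() ** 2, 0.0)
        return math.sqrt(var)


def cell_keys(row, levels):
    """Yield one key per aggregation level, e.g. country, then country+city."""
    for level in levels:
        yield (level, tuple(row[d] for d in level))


def refresh(baselines, new_rows, levels, measure):
    """Fold new rows into every baseline cell they belong to."""
    for row in new_rows:
        for key in cell_keys(row, levels):
            baselines[key].add(row[measure])


def deviations(baselines, new_rows, levels, measure, threshold=3.0):
    """Return (cell key, value, z-score) for rows far from their baseline."""
    found = []
    for row in new_rows:
        for key in cell_keys(row, levels):
            b = baselines.get(key)
            if b is None or b.n < 2 or b.std() == 0:
                continue  # not enough history to judge
            z = abs(row[measure] - b.mean()) / b.std()
            if z > threshold:
                found.append((key, row[measure], z))
    return found
```

    A typical loop would call `deviations` on an incoming batch first and `refresh` afterwards, so that new data is compared against the baseline as it stood before the batch arrived.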

    Multi-Objective Materialized View Selection in Data-Intensive Flows

    In this thesis we present Forge, a tool for automating multi-objective materialization of intermediate results in data-intensive flows, driven by a set of different quality objectives. We report initial evaluation results, showing the feasibility and efficiency of our approach.

    Multidimensional database modelling with differentiated multiple aggregations

    Many solutions have been defined for multidimensional database modelling. These propositions consider the same aggregation function to determine the values of an indicator according to different levels of granularity in the multidimensional space. We provide a more flexible conceptual model that supports multiple differentiated aggregations. Multiple aggregations allow associating different aggregation functions to the same measure for each dimension and for each hierarchy. Differentiated aggregation allows specific aggregations at each level (parameter). Our model is based on a double graphical formalism, expressive enough to control the validity of aggregation functions. We also study the consequences of this conceptual modelling for building lattices of pre-computed aggregates in a relational online analytical processing (R-OLAP) environment.
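
    As a rough illustration of what "different aggregation functions to the same measure for each dimension" can mean in practice, the sketch below rolls up a stock measure with a sum along a store dimension and with a mean along a time dimension. The rule table `AGG_RULES`, the function `roll_up`, and the sample dimensions are hypothetical and do not reproduce the paper's formalism.

```python
from collections import defaultdict
from statistics import mean

# Assumed rule table: one aggregation function per (dimension, target level).
AGG_RULES = {
    ("store", "city"): sum,   # stock adds up across stores in a city
    ("day", "month"): mean,   # stock is averaged over the days of a month
}


def roll_up(rows, dim, level, parent_of, measure, agg_rules=AGG_RULES):
    """Roll `measure` up along `dim` to `level`, using that level's own function.

    `parent_of` maps each member of `dim` to its parent at `level`. The column
    keeps its name (`dim`) but holds the coarser member afterwards.
    """
    agg = agg_rules[(dim, level)]
    groups = defaultdict(list)
    for row in rows:
        coords = {k: v for k, v in row.items() if k != measure}
        coords[dim] = parent_of[row[dim]]
        groups[tuple(sorted(coords.items()))].append(row[measure])
    return [{**dict(key), measure: agg(values)} for key, values in groups.items()]
```

    Because the functions differ per dimension, the order of roll-ups can matter in general, which is one reason a model needs to control which aggregations are valid at which level.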

    Theory borrowing in IT-rich contexts : lessons from IS strategy research

    While indigenous theorizing in information systems has clear merits, theory borrowing will not, and should not, be eschewed given its appeal and usefulness. In this article, we aim to increase our understanding of how borrowed theories are modified in IT-rich contexts. We present a framework in which we discuss how two recontextualization approaches, specification and distinction, help to increase the IT-richness of borrowed constructs and relationships. In doing so, we use several illustrative examples from information systems strategy. The framework can be used by researchers as a tool to explore the multitude of ways in which a theory from another discipline can yield an understanding of IT phenomena.

    E-CRM and CMS systems: potential for more dynamic businesses

    Any change in a customer’s behaviour affects the customer’s value. In addition, profitability and economic viability also change. Most companies still do not fully know the characteristics of their customer base. They find it difficult to define criteria that segment their customer base to find high-value customers. They need to focus on target selections to carry on with marketing campaigns which involve high investments. Given the potential of e-CRM and CMS as powerful tools to guide customer-oriented understanding and analysis, greater attention is required. Several companies, operating within the same business and having access to the same information and technology, differ in e-CRM performance. Without sufficient evidence, managers are prone to making investment decisions that are neither efficient nor effective. So it is imperative to base the decision of e-CRM and CMS adoption not only on their analytical power, but also on economic viability criteria for sustainable business dynamic

    ViewDF: a Flexible Framework for Incremental View Maintenance in Stream Data Warehouses

    Because of the increasing data sizes and demands for low latency in modern data analysis, the traditional data warehousing technologies are greatly pushed beyond their limits. Several stream data warehouse (SDW) systems, which are warehouses that ingest append-only data feeds and support frequent refresh cycles, have been proposed, including different methods to improve the responsiveness of the systems. Materialized views are critical in large-scale data warehouses due to their ability to speed up queries. Thus, an SDW maintains layers of materialized views. Materialized view maintenance in SDW systems introduces new challenges. However, some of the existing SDW systems do not address the maintenance of views while others employ view maintenance techniques that are not efficient. This thesis presents ViewDF, a flexible framework for incremental maintenance of materialized views in SDW systems that generalizes existing techniques and enables new optimizations for views defined with operators that are common in stream analytics. We give a special view definition (ViewDF) to enhance the traditional way of creating views in SQL by being able to reference any partition of any table. We describe a prototype system based on this idea, which allows users to write ViewDFs directly and can automatically translate a broad class of queries into ViewDFs. Several optimizations are proposed and experiments show that our proposed system can improve view maintenance time by a factor of two or more in practical settings.
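
    The abstract does not show ViewDF syntax, so the sketch below is not the thesis's method. It only illustrates, under stated assumptions, the general idea of incremental maintenance over append-only partitions: a sliding-window count view keeps one partial aggregate per source partition, adds the newest partial, and subtracts the expired one instead of recomputing the whole window. The class `WindowedCountView` and its method names are hypothetical.

```python
from collections import Counter, deque


class WindowedCountView:
    """Per-key counts over the last `window` source partitions (assumed design)."""

    def __init__(self, window):
        self.window = window
        self.partials = deque()   # one partial aggregate per retained partition
        self.totals = Counter()   # the materialized view contents

    def on_new_partition(self, records):
        """Apply one newly arrived partition and return the refreshed view."""
        partial = Counter(records)
        self.partials.append(partial)
        self.totals.update(partial)          # add the new partition's counts

        if len(self.partials) > self.window:
            expired = self.partials.popleft()
            self.totals.subtract(expired)    # remove the partition that left the window
            for key in [k for k, v in self.totals.items() if v == 0]:
                del self.totals[key]         # drop keys that no longer appear

        return dict(self.totals)
```

    Each refresh touches only the newest and the oldest partition, so its cost depends on partition size rather than on the window length.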